A serverless AWS-based application that automatically processes Instagram direct messages containing video clips, identifies songs using audio recognition, and sends the results back to users via email and Instagram messages.
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Instagram     │    │   SQS Lambda    │    │  Search Engine  │    │ Email Notifier  │
│   Messages      │───▶│   (Trigger)     │───▶│     Lambda      │───▶│     Lambda      │
│                 │    │                 │    │                 │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │                      │
         │                      ▼                      ▼                      ▼
         │             ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
         │             │   SQS Queue     │    │   Audio API     │    │   Gmail SMTP    │
         │             │                 │    │   (RapidAPI)    │    │                 │
         │             └─────────────────┘    └─────────────────┘    └─────────────────┘
         │
         ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   S3 Bucket     │    │    DynamoDB     │    │   CloudWatch    │
│ (Session Store) │    │   (Sessions)    │    │     (Logs)      │
└─────────────────┘    └─────────────────┘    └─────────────────┘
- AWS CLI configured with appropriate permissions
- Docker installed (for Lambda layer creation)
- Python 3.9+ for local development
- GitHub account with repository
- GitHub Personal Access Token (for CI/CD)
This project includes a complete CI/CD pipeline using AWS CodePipeline. This is the recommended deployment method.
- Review the Quick Start Guide:
# See QUICK-START.md for a 3-step setup process
- Deploy the CI/CD Pipeline:
PowerShell (Windows):
.\deploy-cicd-pipeline.ps1
Bash (Linux/Mac):
chmod +x deploy-cicd-pipeline.sh
./deploy-cicd-pipeline.sh
- Push Code to Trigger Automatic Deployment:
git add .
git commit -m "Deploy via CI/CD"
git push origin main
- ✅ Self-updating - Updates itself when pipeline.yaml changes
- ✅ Automatically builds Lambda functions and layers on every push
- ✅ Packages dependencies with correct architecture for Lambda
- ✅ Uploads artifacts to S3
- ✅ Deploys to TEST environment automatically
- ✅ Requires manual approval before production
- ✅ Deploys to PRODUCTION after approval
- ✅ Sends notifications on success/failure and for approvals
- ✅ Safe deployment with test-then-production flow
- 📘 Detailed Setup: See CICD-SETUP.md for complete documentation
- 🏗️ Build Configuration: See buildspec.yml for build process
- ☁️ Pipeline Infrastructure: See cft/pipeline.yaml for pipeline template
If you prefer manual deployment instead of CI/CD:
# Deploy S3 bucket for deployment artifacts
aws cloudformation deploy \
--template-file cft/Deployment-bucket.yaml \
--stack-name instabot-deployment-bucket
# Deploy secrets (update with your credentials first)
aws cloudformation deploy \
--template-file cft/secrets.yaml \
--stack-name instabot-secrets \
--capabilities CAPABILITY_IAM
# Deploy main application stack
aws cloudformation deploy \
--template-file cft/Application-stack.yaml \
--stack-name instabot-app \
--parameter-overrides \
SecretsStackName=instabot-secrets \
S3BucketStackName=instabot-deployment-bucket \
--capabilities CAPABILITY_IAM
Option A: Use the manual build script:
PowerShell:
.\scripts\manual-build.ps1 -S3Bucket your-bucket-name
Bash:
chmod +x scripts/manual-build.sh
./scripts/manual-build.sh your-bucket-name
Option B: Follow the manual packaging instructions below, then upload the ZIP files to your S3 deployment bucket and update the Lambda functions.
Create a Lambda layer with shared dependencies using Docker:
# Create layer for instabot_sqs_lambda dependencies
docker run --rm --entrypoint "" \
-v "${PWD}/instabot_sqs_lambda:/var/task" \
public.ecr.aws/lambda/python:3.12 \
sh -c "cd /var/task && pip install -r requirements.txt -t ./python"
# Zip the layer
cd instabot_sqs_lambda && zip -r ../instabot_layer.zip python && cd ..
Package each Lambda function:
# Package SQS Lambda
cd instabot_sqs_lambda/
zip -r ../instabot_sqs_lambda.zip . --exclude="python/*"
cd ..
# Package Search Engine Lambda
cd instabot_search_engine/
zip -r ../instabot_search_engine.zip .
cd ..
# Package Email Notifier Lambda
cd instabot_email_notifier/
zip -r ../instabot_email_notifier.zip .
cd ..
Purpose: Entry point that monitors Instagram DMs and queues processing tasks
Functionality:
- 🔐 Authenticates with Instagram using stored S3 session
- 📬 Checks for unread direct messages containing video clips
- 🎬 Extracts video URLs from Instagram clips/reels
- 📤 Sends extracted data to SQS queue for processing
- 🗑️ Hides processed message threads
Key Features:
- Session persistence via S3 for reliable Instagram authentication
- Automatic fallback to credential-based login if session expires
- Robust error handling and logging
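The session fallback listed above could be structured roughly as in this sketch. The three callables (`load_session`, `login_with_credentials`, `save_session`) are hypothetical stand-ins for the S3 download, the Instagram login, and the S3 upload; none of these names come from the project itself.

```python
from typing import Callable, Optional


def get_session(
    load_session: Callable[[], Optional[dict]],
    login_with_credentials: Callable[[], dict],
    save_session: Callable[[dict], None],
) -> dict:
    """Return a usable session, preferring the stored one.

    All three callables are hypothetical: in a real Lambda they would wrap
    the S3 read, the Instagram login, and the S3 write.
    """
    try:
        session = load_session()
    except Exception:
        # A missing or unreadable session file is treated as "no session".
        session = None

    if session:
        return session

    # Fallback: log in with username/password and persist the new session.
    session = login_with_credentials()
    save_session(session)
    return session
```

Passing the storage and login steps in as callables keeps the fallback logic testable without touching S3 or Instagram.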
Environment Variables:
username=your_instagram_username
password=your_instagram_password
SQS_QUEUE_URL=your_sqs_queue_url
Bucket_name=your_s3_bucket_name
Dependencies: instagrapi, pillow, pandas, psycopg2-binary
Purpose: Processes audio from videos and identifies songs using AI recognition
Functionality:
- 🎵 Downloads audio from Instagram video URLs
- ✂️ Extracts and processes audio segments using FFmpeg
- 🤖 Sends audio to RapidAPI's Shazam-like service for song identification
- 🔄 Tries multiple time segments if initial recognition fails
- 🔗 Generates YouTube and YouTube Music links for identified songs
- 📧 Triggers email notification with results
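How the links are built is not documented here. One plausible approach, shown as a sketch only, is to turn the recognized title and artist into search URLs. The function name `build_song_links` and the choice of search URLs (rather than direct video links) are assumptions.

```python
from urllib.parse import quote_plus


def build_song_links(title: str, artist: str) -> dict:
    """Build YouTube and YouTube Music search links for a song.

    Assumption: search-result URLs are good enough; the project may
    resolve direct video links instead.
    """
    # quote_plus encodes spaces as '+' and escapes characters such as '/'.
    query = quote_plus(f"{title} {artist}".strip())
    return {
        "youtube": f"https://www.youtube.com/results?search_query={query}",
        "youtube_music": f"https://music.youtube.com/search?q={query}",
    }
```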
Key Features:
- Multi-attempt recognition strategy (tries 3 different time segments)
- Automatic temporary file cleanup
- Base64 audio encoding for API compatibility
- Batch processing support via SQS
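The multi-attempt strategy and Base64 encoding can be sketched as below. `read_segment` and `recognize` are hypothetical callables standing in for the FFmpeg extraction and the RapidAPI request, and the start offsets `(0, 10, 20)` are illustrative rather than the values the project uses.

```python
import base64
from typing import Callable, Iterable, Optional


def recognize_with_retries(
    read_segment: Callable[[int], bytes],
    recognize: Callable[[str], Optional[dict]],
    start_offsets: Iterable[int] = (0, 10, 20),  # assumed offsets, in seconds
) -> Optional[dict]:
    """Try each audio segment in turn until one is recognized."""
    for offset in start_offsets:
        audio = read_segment(offset)
        if not audio:
            # Nothing extracted at this offset (e.g. clip too short).
            continue
        # The API is described as taking Base64 text, so encode raw bytes.
        encoded = base64.b64encode(audio).decode("ascii")
        result = recognize(encoded)
        if result:
            return result
    # No segment produced a match.
    return None
```

Returning `None` after the last attempt lets the caller decide whether to notify the user that no song was found.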
Environment Variables:
api_host=your_rapidapi_host
api_key=your_rapidapi_key
content_type=text/plain
email_lambda_name=your_email_lambda_arn
Dependencies: requests
Lambda Layer Requirements: FFmpeg binary for audio processing
Purpose: Sends formatted email notifications and Instagram responses with song results
Functionality:
- 📧 Composes HTML email with song identification results
- 📊 Creates formatted table with song names and streaming links
- 📮 Sends email via Gmail SMTP
- 💬 Sends song links back to original Instagram thread
- 🗑️ Cleans up Instagram conversation after sending results
Key Features:
- Professional HTML email formatting
- Dual delivery: email + Instagram DM
- Instagram session management with S3 persistence
- Comprehensive error handling and logging
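As an illustration of the HTML email with a results table, here is a minimal sketch using only the standard library. The function name `build_results_email`, the subject line, and the `title`/`youtube` keys of each song dict are assumptions, not the project's actual data shape.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from html import escape


def build_results_email(sender: str, receiver: str, songs: list) -> MIMEMultipart:
    """Compose a multipart email with a plain-text and an HTML table part."""
    # Escape user-visible values so titles like "A & B" render correctly.
    rows = "".join(
        f'<tr><td>{escape(s["title"])}</td>'
        f'<td><a href="{escape(s["youtube"])}">YouTube</a></td></tr>'
        for s in songs
    )
    html = (
        "<html><body><table border='1'>"
        "<tr><th>Song</th><th>Link</th></tr>"
        f"{rows}</table></body></html>"
    )
    plain = "\n".join(f'{s["title"]}: {s["youtube"]}' for s in songs)

    msg = MIMEMultipart("alternative")
    msg["Subject"] = "Song identification results"  # assumed subject
    msg["From"] = sender
    msg["To"] = receiver
    # Mail clients prefer the last alternative they can render, so HTML goes last.
    msg.attach(MIMEText(plain, "plain"))
    msg.attach(MIMEText(html, "html"))
    return msg
```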
Environment Variables:
sender_email=your_gmail_address
sender_password=your_gmail_app_password
receiver_email=recipient_email_address
instagram_username=your_instagram_username
instagram_password=your_instagram_password
bucket_name=your_s3_bucket_name
Dependencies: instagrapi, pillow, pandas, psycopg2-binary
{
"instagram_username": "your_instagram_username",
"instagram_password": "your_instagram_password",
"gmail_username": "your_gmail_address",
"gmail_password": "your_gmail_app_password",
"rapidapi_key": "your_rapidapi_key",
"rapidapi_host": "shazam-api-host"
}
Update the CloudFormation template or Lambda console with:
- Instagram credentials for DM access
- Gmail credentials for email notifications
- RapidAPI credentials for song recognition
- S3 bucket names for session storage
- SQS queue URLs for message passing
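A misconfigured variable is easier to diagnose if each function checks its configuration at startup. The sketch below does this for the Search Engine Lambda's variables listed earlier; the helper name `load_config` and the fail-fast behavior are assumptions, not part of the project.

```python
import os

# Variable names taken from the Search Engine Lambda section of this README.
REQUIRED_VARS = ("api_host", "api_key", "email_lambda_name")


def load_config(required=REQUIRED_VARS, environ=None) -> dict:
    """Read required settings and fail fast if any are missing or empty."""
    env = os.environ if environ is None else environ
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in required}
```

Calling `load_config()` at the top of a handler surfaces a missing setting as one clear error in CloudWatch instead of a later `KeyError`.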
# Install dependencies
pip install -r requirements.txt
# Set environment variables
export instagram_username="your_username"
export instagram_password="your_password"
# ... other variables
# Test individual functions
python -c "from instabot_sqs_lambda.lambda_function import lambda_handler; lambda_handler({}, {})"
- 📊 CloudWatch Logs: Monitor function execution and errors
- 📈 CloudWatch Metrics: Track invocation counts and duration
- 🚨 SQS DLQ: Failed messages are sent to Dead Letter Queue
- 📧 Email notifications provide user-facing status updates
- Lambda: Serverless function execution
- SQS: Message queuing between functions
- S3: Session storage and deployment artifacts
- DynamoDB: Session metadata storage
- CloudWatch: Logging and monitoring
- Secrets Manager: Secure credential storage
- IAM: Permission management
- Instagram (via the instagrapi library, not the official Graph API): Direct message access
- RapidAPI (Shazam): Audio recognition service
- Gmail SMTP: Email delivery
- YouTube: Music link generation
- Message Detection: SQS Lambda monitors Instagram DMs
- Content Extraction: Extracts video URLs from messages
- Queue Processing: Sends data to SQS for async processing
- Audio Analysis: Search Engine Lambda downloads and processes audio
- Song Recognition: Identifies songs using external API
- Result Delivery: Email Notifier sends results via email and Instagram
- Cleanup: Removes processed messages and temporary files
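Between the queue and the Search Engine Lambda, each SQS-triggered invocation receives a batch of records whose `body` field holds the queued message. A minimal sketch of unpacking that batch follows; the `Records`/`body` layout is the standard Lambda SQS event shape, while the `video_url` and `thread_id` keys are assumed field names for the queued payload.

```python
import json


def extract_jobs(event: dict) -> list:
    """Turn an SQS-triggered Lambda event into a list of processing jobs."""
    jobs = []
    for record in event.get("Records", []):
        # Each record body is the JSON string the producer put on the queue.
        body = json.loads(record["body"])
        jobs.append(
            {
                "video_url": body["video_url"],      # assumed key
                "thread_id": body.get("thread_id"),  # assumed key, optional
            }
        )
    return jobs
```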
Instagram Authentication Fails:
- Check credentials in Secrets Manager
- Verify S3 session file exists and is valid
- If the account has two-factor authentication enabled, supply the verification code at login; Instagram does not offer app-specific passwords
Song Recognition Not Working:
- Verify RapidAPI key and host are correct
- Check FFmpeg layer is attached to Search Engine Lambda
- Audio quality may be too poor for recognition
Email Delivery Issues:
- Use a Gmail App Password (requires 2-Step Verification); Google has retired the "Less secure app access" option
- Check sender/receiver email addresses
- Verify SMTP settings (Gmail: smtp.gmail.com:587)
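To check the SMTP settings in isolation, a short script along these lines can help. It is a sketch only: `send_via_gmail` and its `smtp_factory` parameter are hypothetical names, and the password is assumed to be a Gmail App Password.

```python
import smtplib
import ssl
from email.message import EmailMessage


def send_via_gmail(msg, user, app_password, smtp_factory=smtplib.SMTP):
    """Send a message through smtp.gmail.com:587 using STARTTLS.

    smtp_factory is injectable so the flow can be exercised without a network.
    """
    with smtp_factory("smtp.gmail.com", 587) as server:
        # Port 587 starts unencrypted; upgrade to TLS before authenticating.
        server.starttls(context=ssl.create_default_context())
        server.login(user, app_password)
        server.send_message(msg)


def build_test_message(sender: str, receiver: str) -> EmailMessage:
    """Create a tiny plain-text message for a connectivity check."""
    msg = EmailMessage()
    msg["Subject"] = "SMTP connectivity test"
    msg["From"] = sender
    msg["To"] = receiver
    msg.set_content("If you can read this, the SMTP settings work.")
    return msg
```

An authentication error at `login` usually points at the password type, while a timeout at connection time points at network or port settings.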
# List the Lambda log groups
aws logs describe-log-groups --log-group-name-prefix "/aws/lambda/instabot"
# Monitor SQS queue
aws sqs get-queue-attributes --queue-url YOUR_QUEUE_URL --attribute-names All
# Check S3 session file
aws s3 ls s3://your-bucket-name/instagram_session.json
This project is licensed under the MIT License - see the LICENSE file for details.
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
For issues and questions:
- Check the troubleshooting section above
- Review CloudWatch logs for detailed error messages
- Ensure all environment variables are properly configured
- Verify AWS permissions are correctly set
- This application processes Instagram content - ensure compliance with Instagram's Terms of Service
- Audio recognition API has usage limits - monitor your RapidAPI quotas
- Keep your credentials secure and rotate them regularly
- Test thoroughly in a development environment before production deployment