A production-ready FastAPI application that combines intelligent image processing with AI-powered YouTube video summarization. Features comprehensive monitoring with Datadog integration, multi-database support (MongoDB, PostgreSQL, SQL Server), and flexible deployment options including Kubernetes and Synology NAS.
- Python 3.9 or higher
- Docker (optional, for containerized deployment)
- PostgreSQL, MongoDB, or SQL Server (at least one database)
- AWS Account (for S3, Rekognition, and SES)
- OpenAI API Key (for AI features)
- Datadog Account (optional, for monitoring)
- Notion Account (optional, for YouTube summaries storage)
- AWS S3 Integration - Secure image upload and storage
- Amazon Rekognition - Automated label detection, text extraction, and content moderation
- Multi-Database Support - Store metadata in MongoDB, PostgreSQL, or SQL Server
- Content Safety - Automatic detection of questionable content and "bug" images
- Smart Error Detection - Identifies images containing error text for debugging
- Amazon SES Integration - Email notifications for error reporting (replaced SendGrid)
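The error-text detection described above can be sketched as a small keyword filter over the text lines a service like Rekognition returns. This is a hypothetical illustration of the idea, not the repo's actual `src/amazon.py` logic:

```python
import re

# Keywords that suggest an image is a screenshot of an error (illustrative list)
ERROR_PATTERN = re.compile(
    r"(error|exception|traceback|fatal|failed)", re.IGNORECASE
)

def contains_error_text(detected_lines: list[str]) -> bool:
    """Return True if any detected text line looks like an error message."""
    return any(ERROR_PATTERN.search(line) for line in detected_lines)

print(contains_error_text(["TypeError: 'NoneType' object", "line 42"]))  # True
print(contains_error_text(["sunset over the beach"]))                    # False
```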
- AI-Powered Summarization - Uses OpenAI to generate intelligent video summaries
- Batch Processing - Process multiple YouTube videos simultaneously with configurable strategies
- Transcript Processing - Extracts and processes YouTube video transcripts with fallback mechanisms
- Notion Integration - Automatically save video summaries to Notion databases
- Metadata Extraction - Retrieves video details including title, channel, and publication date
- Multiple URL Support - Accept various YouTube URL formats (standard, shorts, mobile)
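Multi-format URL support is commonly implemented by matching a few URL shapes and pulling out the 11-character video ID. A minimal sketch of the idea — not the repo's actual parser — covering standard, short, shorts, and mobile URLs:

```python
import re
from typing import Optional

# Common YouTube URL shapes (illustrative, not exhaustive)
_VIDEO_ID_PATTERN = r"(?:v=|/shorts/|/embed/|youtu\.be/)([A-Za-z0-9_-]{11})"

def extract_video_id(url: str) -> Optional[str]:
    """Return the 11-character video ID, or None if the URL doesn't match."""
    match = re.search(_VIDEO_ID_PATTERN, url)
    return match.group(1) if match else None

print(extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ"))  # dQw4w9WgXcQ
print(extract_video_id("https://youtu.be/dQw4w9WgXcQ"))                 # dQw4w9WgXcQ
print(extract_video_id("https://m.youtube.com/shorts/dQw4w9WgXcQ"))     # dQw4w9WgXcQ
```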
- Datadog APM - Full application performance monitoring with distributed tracing
- LLM Observability - Track AI model performance and costs
- Custom Event Tracking - Content moderation alerts, bug detection events
- Runtime Profiling - CPU and memory performance analysis
- Health Monitoring - Application health checks and uptime tracking
- Docker Containerization - Multi-stage builds with Python 3.9 slim base
- Kubernetes Support - Ready for container orchestration with manifest files
- GitHub Actions CI/CD - Automated testing, building, and deployment pipeline
- Flexible Deployment - Support for Docker, Kubernetes, Synology NAS, and cloud platforms
- Environment Management - Separate configurations for dev/staging/production
- Database Migration Tools - SQL scripts for easy database setup and schema management
- `GET /images?backend=mongo|postgres|sqlserver` - Retrieve all stored images
- `POST /add_image?backend=mongo|postgres|sqlserver` - Upload and process images
- `DELETE /delete_image/{id}?backend=mongo|postgres|sqlserver` - Remove images and metadata

- `POST /api/v1/upload-image-amazon/` - Upload image directly to Amazon S3
- `DELETE /api/v1/delete-one-s3/{key}` - Delete single object from S3
- `DELETE /api/v1/delete-all-s3` - Delete all objects from S3

- `POST /api/v1/summarize-youtube` - Generate AI summary of a single YouTube video
- `POST /api/v1/batch-summarize-youtube` - Process multiple YouTube videos with batch strategies
- `POST /api/v1/save-youtube-to-notion` - Save video summaries directly to Notion

- `GET /api/v1/openai-hello` - Service health check
- `GET /api/v1/openai-gen-image/{prompt}` - Generate images using DALL-E 3

- `GET /api/v1/mongo/get-image-mongo/{id}` - Retrieve image from MongoDB
- `DELETE /api/v1/mongo/delete-all-mongo/{key}` - Delete all items by key from MongoDB
- `GET /api/v1/postgres/get-image-postgres/{id}` - Retrieve image from PostgreSQL
- `GET /api/v1/sqlserver/get-image-sqlserver/{id}` - Retrieve image from SQL Server

- `GET /datadog-hello` - Datadog integration health check
- `POST /datadog-event` - Send custom events to Datadog
- `GET /datadog-events` - Retrieve Datadog events
- `POST /app-event/{event_type}` - Track application-specific events
- `POST /track-api-request` - Log API request metrics
- `POST /bug-detection-event` - Report bug detection events

- `GET /` - Root endpoint with welcome message
- `GET /health` - Application health status with detailed service checks
- `GET /test-sqlserver` - Test SQL Server connection
- `GET /timeout-test?timeout=N` - Performance testing endpoint
- `POST /create_post` - Demo endpoint for external API integration
1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd demo-fastapi
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Configure environment:

   ```bash
   cp env.example .env
   # Edit .env with your development credentials
   ```

4. Set up databases (optional):

   ```bash
   # Interactive database setup
   ./scripts/setup-databases.sh
   ```

5. Run the application:

   ```bash
   python main.py
   ```

The API will be available at `http://localhost:8080`.
1. Build and run with Docker:

   ```bash
   # Basic build
   docker build -t fastapi-app .
   docker run -p 8080:8080 --env-file .env fastapi-app
   ```

2. Using the build script:

   ```bash
   # Local development with auto-run
   ./scripts/build.sh --local --run

   # Build without cache
   ./scripts/build.sh --no-cache --local --run

   # Clean existing containers and rebuild
   ./scripts/build.sh --clean --run
   ```

3. Docker Compose:

   ```bash
   docker-compose up

   # Using Podman
   ./scripts/build.sh --podman --run

   # Using Rancher Desktop
   ./scripts/build.sh --rancher --run
   ```

```bash
# OpenAI Configuration
OPENAI_API_KEY=sk-your-openai-api-key

# AWS Services (S3, Rekognition, SES)
AMAZON_KEY_ID=your-aws-access-key-id
AMAZON_KEY_SECRET=your-aws-secret-access-key
AMAZON_S3_BUCKET=your-s3-bucket-name
SES_REGION=us-west-2
SES_FROM_EMAIL=your-verified-email@domain.com

# Database Configuration
# MongoDB (optional)
MONGO_CONN=mongodb://your-mongodb-connection
MONGO_USER=your-mongo-username
MONGO_PW=your-mongo-password

# PostgreSQL (optional)
PGHOST=your-postgres-host
PGPORT=5432
PGDATABASE=your-database-name
PGUSER=your-postgres-username
PGPASSWORD=your-postgres-password

# SQL Server (optional)
SQLSERVER_ENABLED=true        # Set to false to disable
SQLSERVERHOST=your-sqlserver-host
SQLSERVERPORT=1433
SQLSERVERDB=your-database-name
SQLSERVERUSER=your-sqlserver-username
SQLSERVERPW=your-sqlserver-password

# Notion Integration (optional)
NOTION_API_KEY=secret_your-notion-key
NOTION_DATABASE_ID=your-notion-database-id

# Datadog Monitoring
DD_API_KEY=your-datadog-api-key
DD_APP_KEY=your-datadog-app-key
DD_AGENT_HOST=192.168.1.100   # Your Datadog agent host
DD_TRACE_AGENT_PORT=8126
DD_PROFILING_ENABLED=true
DD_DBM_PROPAGATION_MODE=full  # Enable DB monitoring

# LLM Observability
DD_LLMOBS_ENABLED=true
DD_LLMOBS_ML_APP=youtube-summarizer
DD_LLMOBS_EVALUATORS=ragas_faithfulness,ragas_context_precision,ragas_answer_relevancy

# Application Configuration
DD_SERVICE=fastapi-app
DD_ENV=production
DD_VERSION=1.0
BUG_REPORT_EMAIL=your-email@domain.com

# SonarQube Integration (optional)
SONAR_TOKEN=your-sonarqube-token
SONAR_HOST_URL=https://your-sonarqube-server.com
```

The application gracefully handles missing services:
- Databases: MongoDB, PostgreSQL, and SQL Server can be used independently or together
- Notion Integration: YouTube summaries work without Notion
- Datadog: Monitoring is optional for development
- Amazon SES: Email notifications are optional (fallback available)
- SonarQube: Code quality analysis (configured in `sonar-project.properties`)
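This kind of graceful degradation is typically implemented by gating each integration on its environment variables at startup. A sketch of the pattern — the variable names match the template above, but this is not the app's actual startup code:

```python
import os

def service_enabled(*env_vars: str) -> bool:
    """A service counts as enabled only if all its env vars are set and non-empty."""
    return all(os.environ.get(v) for v in env_vars)

# Example: decide which optional integrations to wire up
os.environ["NOTION_API_KEY"] = "secret_demo"
os.environ["NOTION_DATABASE_ID"] = "demo-db"

use_notion = service_enabled("NOTION_API_KEY", "NOTION_DATABASE_ID")
use_datadog = service_enabled("DD_API_KEY", "DD_APP_KEY")

print(use_notion)   # True
print(use_datadog)  # likely False, unless the DD keys are set in your shell
```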
```bash
# Enable SQL Server
SQLSERVER_ENABLED=true

# Disable SQL Server (faster startup, no SQL Server dependency)
SQLSERVER_ENABLED=false
```

```bash
# View status of all databases
curl http://localhost:8000/api/v1/database-status

# View configuration (without passwords)
curl http://localhost:8000/api/v1/database-config

# Test SQL Server connection specifically
curl http://localhost:8000/test-sqlserver
```

```bash
# Get images from PostgreSQL
curl "http://localhost:8000/images?backend=postgres"

# Get images from MongoDB
curl "http://localhost:8000/images?backend=mongo"

# Get images from SQL Server
curl "http://localhost:8000/images?backend=sqlserver"
```

See `docs/DATABASE_CONFIGURATION.md` for the complete database configuration guide.
- Multi-stage build for optimal image size
- Python 3.9 slim base image
- Security hardening with non-root user (PUID/PGID)
- Environment optimization for container runtime
- Health checks for container orchestration
```bash
# Standard build for production
docker build -t fastapi-app .

# Platform-specific builds
docker build --platform linux/amd64 -t fastapi-app .

# No-cache build
docker build --no-cache -t fastapi-app .
```

```bash
# Build image for Synology DS923+
./scripts/build.sh

# This creates a .tar file on your Desktop for import into Synology Container Manager
# Port mapping: 9000:8080
# Environment: Use .env.production values
```

```python
# See examples/youtube_batch_usage.py for complete examples
import asyncio

from examples.youtube_batch_usage import YouTubeBatchClient


async def batch_process_videos():
    client = YouTubeBatchClient()

    # Process multiple videos
    urls = [
        "https://youtube.com/watch?v=video1",
        "https://youtube.com/watch?v=video2",
        "https://youtube.com/watch?v=video3",
    ]

    result = await client.process_batch(
        urls=urls,
        strategy="parallel_individual",
        save_to_notion=True,
        max_parallel=3,
    )
    print(f"Processed {len(result['results'])} videos")


asyncio.run(batch_process_videos())
```

```bash
# Interactive database setup wizard
./scripts/setup-databases.sh

# Quick PostgreSQL setup
psql -h localhost -U username -d database_name -f sql/quick_setup_postgres.sql

# Fix and setup PostgreSQL
psql -h localhost -U username -d database_name -f sql/fix_and_setup_postgres.sql

# SQL Server setup
sqlcmd -S localhost -U sa -d database_name -i sql/sqlserver_schema.sql
```

```bash
# Set up GitHub Secrets for CI/CD
./scripts/setup-secrets.sh

# Clear sensitive environment variables
./scripts/clear-secrets.sh
```

```bash
# All tests
pytest

# With coverage
pytest --cov=src

# With Datadog Test Optimization (CI)
pytest --ddtrace -v

# Specific test files
pytest tests/test_basic.py
pytest tests/test_simple.py
pytest tests/mongo_test.py

# Run test script
./scripts/test.sh
```

```bash
# Test YouTube URL processing and transcript retrieval
python test_youtube_urls.py
```

This utility tests:
- Video ID extraction from different URL formats
- Transcript retrieval functionality
- Full video processing pipeline
Tests are designed to gracefully handle missing external services:
- Mock external APIs when credentials are unavailable
- Skip integration tests for unconfigured services
- Isolated unit tests for core functionality
```
demo-fastapi/
├── main.py                          # FastAPI application with Datadog integration
├── requirements.txt                 # Python dependencies
├── Dockerfile                       # Multi-stage container build
├── docker-compose.yml               # Local development services
├── env.example                      # Environment variable template
├── pytest.ini                       # Test configuration
├── scripts/                         # Build and deployment automation
│   ├── build.sh                     # Multi-platform Docker builds
│   ├── deploy.sh                    # Production deployment with GitHub Secrets
│   ├── setup-secrets.sh             # GitHub Secrets management
│   ├── setup-databases.sh           # Database setup and migration helper
│   ├── setup-runner.sh              # Self-hosted GitHub runner setup
│   ├── setup-sonarqube-monitoring.sh  # SonarQube Datadog monitoring
│   ├── add-apm-to-sonarqube.sh      # Add APM to SonarQube
│   └── test.sh                      # Test automation
├── src/                             # Application modules
│   ├── amazon.py                    # AWS S3, Rekognition, SES integration
│   ├── mongo.py                     # MongoDB operations
│   ├── postgres.py                  # PostgreSQL operations
│   ├── sqlserver.py                 # SQL Server operations
│   ├── openai_service.py            # OpenAI API integration
│   ├── datadog.py                   # Custom monitoring and events
│   └── services/                    # Additional service integrations
│       ├── youtube_service.py       # Single video processing
│       ├── youtube_batch_service.py # Batch video processing
│       ├── youtube_transcript_fallback.py  # Transcript fallback handling
│       └── notion_service.py        # Notion database integration
├── tests/                           # Test suite
│   ├── conftest.py                  # Test configuration and fixtures
│   ├── test_basic.py                # API endpoint tests
│   ├── test_simple.py               # Unit tests
│   └── mongo_test.py                # Database integration tests
├── sql/                             # Database schemas and migrations
│   ├── postgres_schema.sql          # PostgreSQL table definitions
│   ├── sqlserver_schema.sql         # SQL Server table definitions
│   └── *.sql                        # Migration and setup scripts
├── examples/                        # Usage examples
│   └── youtube_batch_usage.py       # Batch processing examples
├── docs/                            # Additional documentation
│   ├── GMKTEC_MIGRATION.md          # GMKTec host migration guide
│   ├── SQL_SERVER_SETUP.md          # SQL Server configuration guide
│   └── YOUTUBE_BATCH_PROCESSING.md  # YouTube batch processing guide
├── .github/workflows/               # CI/CD pipelines
│   ├── deploy.yaml                  # GitHub-hosted deployment (manual)
│   ├── deploy-self-hosted.yaml      # Self-hosted runner deployment (auto)
│   └── datadog-security.yml         # Datadog security scanning
├── datadog-conf.d/                  # Datadog monitoring configurations
│   ├── docker.d/                    # Docker container monitoring
│   ├── sonarqube.d/                 # SonarQube monitoring
│   └── github_runner.yaml           # GitHub runner monitoring
├── k8s-fastapi-app.yaml             # Kubernetes application manifest
├── k8s-datadog-agent.yaml           # Kubernetes Datadog agent manifest
├── docker-compose.sonarqube-with-apm.yml  # SonarQube with APM config
├── sonar-project.properties         # SonarQube project configuration
├── runner.env.example               # GitHub runner environment template
├── test_youtube_urls.py             # YouTube URL processing test utility
├── static-analysis.datadog.yml      # Datadog static analysis configuration
└── tailscale-acl-example.json       # Tailscale ACL configuration example
```
- Upload - Images uploaded to AWS S3
- Analysis - Amazon Rekognition extracts labels, text, and checks content
- Storage - Metadata stored in MongoDB, PostgreSQL, or SQL Server
- Monitoring - Content moderation and error detection events sent to Datadog
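The moderation step in this flow typically reduces to filtering labels by a confidence threshold. A hypothetical sketch using the shape of Rekognition's `detect_moderation_labels` response (the real `src/amazon.py` may differ):

```python
def flagged_labels(moderation_labels, min_confidence=80.0):
    """Return names of moderation labels at or above the confidence threshold.

    `moderation_labels` mimics Rekognition's detect_moderation_labels
    response: a list of dicts with 'Name' and 'Confidence' keys.
    """
    return [
        label["Name"]
        for label in moderation_labels
        if label["Confidence"] >= min_confidence
    ]

labels = [
    {"Name": "Violence", "Confidence": 92.5},
    {"Name": "Suggestive", "Confidence": 55.0},
]
print(flagged_labels(labels))  # ['Violence']
```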
- URL Parsing - Extract video ID from multiple YouTube URL formats
- Transcript Retrieval - Get video transcript with multiple fallback mechanisms
- AI Summarization - Generate summary using OpenAI GPT with custom instructions
- Batch Processing - Handle multiple videos with configurable parallel processing strategies
- Optional Storage - Save to Notion database with metadata and tags
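A "parallel_individual" batch strategy can be approximated with an `asyncio.Semaphore` that caps concurrency at `max_parallel`. This sketch stubs out the per-video work and is not the service's actual implementation:

```python
import asyncio

async def summarize_video(url: str) -> str:
    """Stand-in for the real per-video pipeline (transcript + OpenAI summary)."""
    await asyncio.sleep(0.01)  # simulate I/O-bound work
    return f"summary of {url}"

async def process_batch(urls, max_parallel=3):
    semaphore = asyncio.Semaphore(max_parallel)

    async def bounded(url):
        async with semaphore:  # at most max_parallel videos in flight
            return await summarize_video(url)

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(bounded(u) for u in urls))

results = asyncio.run(process_batch([f"https://youtu.be/video{i}" for i in range(5)]))
print(len(results))  # 5
```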
- Request Tracing - Every API call tracked with Datadog APM
- Error Detection - Custom events for content moderation and bug detection
- Performance Metrics - CPU, memory, and response time monitoring
- LLM Observability - OpenAI usage tracking with RAGAS evaluators for model quality assessment
- Database Monitoring - APM trace correlation with database queries (DBM)
- Runtime Profiling - CPU and memory profiling for performance optimization
- Custom Events API - Send application-specific events to Datadog
- Code Quality - SonarQube analysis for bugs, vulnerabilities, and code smells
- Static Analysis - Datadog and SonarQube code quality checks
```bash
docker run -p 8080:8080 --env-file .env fastapi-app
```

```bash
kubectl apply -f k8s-fastapi-app.yaml
kubectl apply -f k8s-datadog-agent.yaml
```

- Build image: `./scripts/build.sh`
- Transfer the `.tar` file to the Synology
- Import via Container Manager
- Configure port mapping 9000:8080
The repository includes comprehensive CI/CD pipelines:
- Self-Hosted Runner (`.github/workflows/deploy-self-hosted.yaml`) - Optimized for GMKTec local deployment
- GitHub-Hosted Runner (`.github/workflows/deploy.yaml`) - Cloud-based fallback option (manual trigger only)
- Security Scanning (`.github/workflows/datadog-security.yml`) - Static analysis and security checks
- Code Quality - SonarQube integration for automated code analysis on every push
- Runner Setup Script - Use `scripts/setup-runner.sh` for easy self-hosted runner configuration
- Runner Environment - Configure with `runner.env.example` for a Docker-based GitHub runner
With the self-hosted runner on GMKTec, deployment is automatic on push to main:
- Port 9000 for production
- Local Docker deployment
- No SSH or Tailscale needed
```bash
# Quick setup for self-hosted runner
cp runner.env.example .env.runner

# Edit .env.runner with your GitHub token and Docker group ID (988 for GMKTec)

# Run the setup script
./scripts/setup-runner.sh
# Choose option 2 to start the runner
```

For manual deployment:

```bash
# Using the deployment script
./scripts/deploy.sh --local

# Or trigger workflow manually
gh workflow run deploy-self-hosted.yaml
```

Access your monitoring dashboards:
- Datadog APM: Distributed tracing and performance metrics
- Datadog LLM Observability: Track AI model performance and costs
- SonarQube: Code quality metrics at your configured SonarQube instance
- FastAPI Docs: `http://localhost:8080/docs`
- ReDoc: `http://localhost:8080/redoc`
- Health Check: `http://localhost:8080/health`
The project is configured for SonarQube analysis:
- Automatic analysis on every push (GitHub Actions)
- Configured exclusions for test and migration files
- Python 3.9-3.12 compatibility
- Optional APM monitoring (see `docker-compose.sonarqube-with-apm.yml`)
Datadog monitoring configurations are available in `datadog-conf.d/`:
- Docker container monitoring (`docker.d/`)
- SonarQube monitoring via HTTP checks and JMX (`sonarqube.d/`)
- GitHub runner process monitoring (`github_runner.yaml`)
Detailed guides are available in the docs/ directory:
- GMKTec Migration Guide - Detailed instructions for migrating to GMKTec host
- SQL Server Setup Guide - Complete SQL Server configuration and troubleshooting
- YouTube Batch Processing Guide - Advanced YouTube video processing strategies
- Environment Variable Management - No secrets in code, comprehensive `.env` configuration
- CORS Configuration - Controlled cross-origin access with configurable origins
- Content Moderation - Automatic detection of inappropriate content using Amazon Rekognition
- Error Tracking - Structured logging with Datadog integration for security monitoring
- Container Security - Non-root user execution with PUID/PGID support
- Static Security Analysis - Automated security scanning with Datadog's Python security rulesets
- Tailscale Support - Optional secure network access for remote deployments
- OAuth Integration - Secure authentication for CI/CD pipelines
- Local Deployment - Self-hosted runner supports local deployment without network exposure
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request
See LICENSE.md for license information.