Minimal backend + Chrome extension to analyze YouTube comments with a HuggingFace model.
Features
- Fetch top-level YouTube comments via YouTube Data API v3
- Sentiment analysis (multilingual model)
- FastAPI endpoint /analyze
- Chrome extension popup to trigger analysis and show counts
- MLflow for model tracking
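As a rough illustration of the first feature, the sketch below builds a `commentThreads.list` request URL for the YouTube Data API v3 and pulls the top-level comment text out of a response. The function names are hypothetical and are not taken from this repo; the actual fetch code in `src/` may paginate, retry, or use a client library instead.

```python
from urllib.parse import urlencode

API_URL = "https://www.googleapis.com/youtube/v3/commentThreads"


def build_comment_threads_url(video_id: str, api_key: str, max_results: int = 50) -> str:
    """Build a commentThreads.list URL for one page of top-level comments."""
    params = {
        "part": "snippet",
        "videoId": video_id,
        # The API accepts between 1 and 100 results per page.
        "maxResults": min(max(max_results, 1), 100),
        "textFormat": "plainText",
        "key": api_key,
    }
    return f"{API_URL}?{urlencode(params)}"


def extract_top_level_texts(response: dict) -> list[str]:
    """Return the text of each top-level comment in a commentThreads.list response."""
    texts = []
    for item in response.get("items", []):
        snippet = item["snippet"]["topLevelComment"]["snippet"]
        texts.append(snippet["textDisplay"])
    return texts
```

Replies are not included because only the `topLevelComment` of each thread is read.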
Setup
- Create .env from the template:
cp .env.template .env
- Fill in your API key in .env:
YOUTUBE_API_KEY=...
YOUTUBE_VIDEO_ID=... # optional for tests
- Install dependencies:
uv sync
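To make the role of the .env file concrete, here is a minimal sketch of reading `KEY=VALUE` pairs with the standard library only. It is an assumption about how the values could be consumed; the project may well use a dotenv library or a settings class instead, and `load_env_file` and `require_api_key` are hypothetical names.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse KEY=VALUE lines from a .env-style file, skipping blanks and comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "# optional for tests".
        value = value.split(" #", 1)[0].strip().strip("'\"")
        values[key.strip()] = value
    return values


def require_api_key(values: dict[str, str]) -> str:
    """Return YOUTUBE_API_KEY from the parsed file or the process environment."""
    key = values.get("YOUTUBE_API_KEY") or os.environ.get("YOUTUBE_API_KEY", "")
    if not key:
        raise RuntimeError("YOUTUBE_API_KEY is not set; fill it in .env")
    return key
```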
Run the API
uv run uvicorn --app-dir src youtube_sentiment.main:app --reload --port 8001
Test the API
curl -X POST http://127.0.0.1:8001/analyze \
-H "Content-Type: application/json" \
-d '{"video_id":"VIDEO_ID","max_comments":50}'
Run tests
uv run pytest
Integration tests (require .env values):
RUN_INTEGRATION_TESTS=1 uv run pytest -m integration
Chrome Extension
- Go to chrome://extensions
- Enable Developer mode
- Click Load unpacked and select this repo folder
- Open a YouTube watch page
- Click the extension icon → Analyze Comments
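The extension's flow (read the video ID from the watch page, then call the API) can be approximated from Python for debugging. The sketch below is only an illustration: the helper names are hypothetical, the port is the one used in the uvicorn command in this README, and the payload fields mirror the curl example. The response format is not documented here, so the request is built but not parsed.

```python
import json
from urllib.parse import parse_qs, urlparse
from urllib.request import Request

API_BASE = "http://127.0.0.1:8001"  # port taken from the uvicorn command in this README


def video_id_from_watch_url(url: str) -> str:
    """Return the `v` query parameter of a YouTube watch URL."""
    query = parse_qs(urlparse(url).query)
    ids = query.get("v", [])
    if not ids:
        raise ValueError(f"no video id in URL: {url}")
    return ids[0]


def build_analyze_request(watch_url: str, max_comments: int = 50) -> Request:
    """Build (but do not send) the POST /analyze request for a watch page."""
    payload = {
        "video_id": video_id_from_watch_url(watch_url),
        "max_comments": max_comments,
    }
    return Request(
        f"{API_BASE}/analyze",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the API running, passing the result to `urllib.request.urlopen` sends the same request as the curl command.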
Notes
- The model cache is stored in .hf-cache/ (override with HF_CACHE_DIR).
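One plausible way to implement this override, shown only as a sketch (the function name is hypothetical and the repo's own logic may differ), is to fall back to `.hf-cache` when `HF_CACHE_DIR` is unset or empty:

```python
import os
from pathlib import Path

DEFAULT_CACHE_DIR = ".hf-cache"


def resolve_hf_cache_dir(environ=None) -> Path:
    """Return HF_CACHE_DIR if it is set and non-empty, else the default .hf-cache."""
    env = os.environ if environ is None else environ
    return Path(env.get("HF_CACHE_DIR") or DEFAULT_CACHE_DIR)
```

The resulting path could then be passed as the `cache_dir` argument when loading the model with `from_pretrained`.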
DVC (Data Versioning)
- DVC is used to track datasets, comment dumps, and model artifacts without putting large files in Git.
- This project is initialized for DVC, but no pipelines are defined yet (we will add them later).
- The DVC remote is currently a local placeholder; swap to s3://... or gs://... when you move storage to S3 or GCS.
MLflow Model Registry (Local)
- Start MLflow server:
mlflow server \
--backend-store-uri sqlite:///mlflow.db \
--default-artifact-root ./mlruns \
--host 127.0.0.1 \
--port 5000
- Register the pretrained model:
uv run python scripts/register_model.py
CI/CD Secrets
- MLFLOW_TRACKING_URI (GitHub Actions secret): MLflow tracking server URL used by CI workflows.
- KUBECONFIG (GitHub Actions secret): kubeconfig contents for the deploy workflow.
Deployment (Kubernetes)
- Build & push happens on main via CI (GHCR).
- Deploy manually via GitHub Actions → Deploy Backend.
  - Requires the KUBECONFIG secret.
  - Deploys only if a Production model exists in MLflow.
  - Optionally provide image_tag to deploy a specific SHA.
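The "Production model exists" gate can be pictured as a simple check over the registered model versions. The sketch below assumes each version is available as a dict with a `current_stage` key, mirroring the field of the same name on MLflow's model version objects; the function names are hypothetical and the actual workflow may query MLflow differently.

```python
def has_production_version(versions: list[dict]) -> bool:
    """Return True if any registered model version is in the Production stage."""
    return any(v.get("current_stage") == "Production" for v in versions)


def check_deploy_allowed(versions: list[dict]) -> None:
    """Raise if no Production version exists, so the deploy step can stop early."""
    if not has_production_version(versions):
        raise RuntimeError("no Production model version found in MLflow; refusing to deploy")
```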
Local apply (optional):
kubectl apply -f k8s/configmap-dev.yaml
kubectl apply -f k8s/secret.yaml
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml
Architecture
- Doc: docs/architecture.md
- Link: Architecture Doc