YouTube Sentiment Analysis MLOps Project

Minimal backend + Chrome extension to analyze YouTube comments with a HuggingFace model.

Features

- Fetch top-level YouTube comments via YouTube Data API v3
- Sentiment analysis (multilingual model)
- FastAPI endpoint /analyze
- Chrome extension popup to trigger analysis and show counts
- MLflow for model tracking
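
As a rough illustration of the comment-fetching feature, the sketch below builds a `commentThreads.list` request URL for the YouTube Data API v3 and pulls the top-level comment text out of a response payload. The function names are hypothetical and are not taken from this repo; the actual backend may use a client library instead.

```python
from urllib.parse import urlencode

# Public endpoint for commentThreads.list in YouTube Data API v3.
API_URL = "https://www.googleapis.com/youtube/v3/commentThreads"


def build_comment_threads_url(video_id: str, api_key: str, max_results: int = 50) -> str:
    """Build the request URL (hypothetical helper, not the repo's code)."""
    params = {
        "part": "snippet",
        "videoId": video_id,
        # The API accepts 1 to 100 results per page, so clamp the value.
        "maxResults": max(1, min(max_results, 100)),
        "textFormat": "plainText",
        "key": api_key,
    }
    return f"{API_URL}?{urlencode(params)}"


def extract_top_level_comments(payload: dict) -> list[str]:
    """Return the text of each top-level comment in a response payload."""
    comments = []
    for item in payload.get("items", []):
        snippet = item["snippet"]["topLevelComment"]["snippet"]
        comments.append(snippet["textDisplay"])
    return comments
```

Fetching more than 100 comments would additionally require following `nextPageToken` across pages, which this sketch leaves out.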

Setup

Create .env from the template:
```
cp .env.template .env
```

Fill in your API key in .env:

YOUTUBE_API_KEY=...
YOUTUBE_VIDEO_ID=...   # optional for tests
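
For illustration, the following sketch shows one way a `.env` file in this shape could be read and validated using only the standard library. The helper names are hypothetical; the project may well rely on a library such as python-dotenv or a settings class instead.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and full-line comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "# optional for tests".
        value = value.split(" #", 1)[0].strip().strip('"').strip("'")
        values[key.strip()] = value
    return values


def require_api_key(settings: dict[str, str]) -> str:
    """Fail early with a clear message when the API key is missing."""
    key = settings.get("YOUTUBE_API_KEY", "")
    if not key:
        raise RuntimeError("YOUTUBE_API_KEY is not set; add it to .env")
    return key
```

Failing at startup rather than on the first request makes a missing key easier to diagnose.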

Install dependencies:
```
uv sync
```

Run the API

uv run uvicorn --app-dir src youtube_sentiment.main:app --reload --port 8001

Test the API

curl -X POST http://127.0.0.1:8001/analyze \
  -H "Content-Type: application/json" \
  -d '{"video_id":"VIDEO_ID","max_comments":50}'
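
The same request can be sent from Python. This is a minimal standard-library sketch that mirrors the curl call above; the function names are hypothetical, and the shape of the JSON response is not documented here, so it is returned as-is.

```python
import json
import urllib.request


def build_analyze_request(video_id, max_comments=50, base_url="http://127.0.0.1:8001"):
    """Build a POST request to /analyze with the same JSON body as the curl example."""
    body = json.dumps({"video_id": video_id, "max_comments": max_comments}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(video_id, max_comments=50):
    """Send the request; this needs the API to be running locally on port 8001."""
    request = build_analyze_request(video_id, max_comments)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Calling `analyze("VIDEO_ID")` with a real video ID should return whatever JSON the endpoint produces.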

Run tests

uv run pytest

Integration tests (require .env values):

RUN_INTEGRATION_TESTS=1 uv run pytest -m integration
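
The gating implied by `RUN_INTEGRATION_TESTS` could look roughly like the sketch below. It is an assumption about how the flag might be checked, not the repo's actual test configuration; the helper name and the default list of required settings are hypothetical.

```python
def integration_skip_reason(environ, required=("YOUTUBE_API_KEY",)):
    """Return a reason to skip integration tests, or None when they can run.

    `environ` is any mapping, for example os.environ. Which settings are
    required is an assumption; adjust `required` to match the real tests.
    """
    if environ.get("RUN_INTEGRATION_TESTS") != "1":
        return "set RUN_INTEGRATION_TESTS=1 to run integration tests"
    missing = [name for name in required if not environ.get(name)]
    if missing:
        return "missing settings: " + ", ".join(missing)
    return None


# In a conftest.py, the result could feed pytest's skip machinery, e.g.
#   reason = integration_skip_reason(os.environ)
#   pytest.mark.skipif(reason is not None, reason=reason or "")
```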

Chrome Extension

- Go to chrome://extensions
- Enable Developer mode
- Click Load unpacked and select this repo folder
- Open a YouTube watch page
- Click the extension icon → Analyze Comments

Notes

The model cache is stored in .hf-cache/ (override with HF_CACHE_DIR).
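
A minimal sketch of how the cache location might be resolved, assuming the override is read from the environment at startup (the helper name is hypothetical):

```python
import os
from pathlib import Path


def resolve_hf_cache_dir(environ=None) -> Path:
    """Use HF_CACHE_DIR when set and non-empty, otherwise fall back to .hf-cache/."""
    environ = os.environ if environ is None else environ
    return Path(environ.get("HF_CACHE_DIR") or ".hf-cache")
```

The resulting path could then be passed as the `cache_dir` argument when loading the model with the transformers library, though whether the backend does exactly this is an assumption.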

DVC (Data Versioning)

DVC tracks datasets, comment dumps, and model artifacts so that large files stay out of Git.
The project is initialized for DVC, but no pipelines are defined yet; they will be added later.
The DVC remote is currently a local placeholder; switch it to s3://... or gs://... when storage moves to S3 or GCS.

MLflow Model Registry (Local)

Start MLflow server:

mlflow server \
  --backend-store-uri sqlite:///mlflow.db \
  --default-artifact-root ./mlruns \
  --host 127.0.0.1 \
  --port 5000

Register the pretrained model:
```
uv run python scripts/register_model.py
```

CI/CD Secrets

MLFLOW_TRACKING_URI (GitHub Actions secret): MLflow tracking server URL used by CI workflows.
KUBECONFIG (GitHub Actions secret): kubeconfig contents for deploy workflow.

Deployment (Kubernetes)

CI builds the image and pushes it to GHCR on pushes to main.
Deploy manually by running the Deploy Backend workflow in GitHub Actions:
- Requires KUBECONFIG secret.
- Deploys only if a Production model exists in MLflow.
- Optionally provide image_tag to deploy a specific SHA.
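
The "Production model exists" gate can be pictured with the sketch below. It works on plain dictionaries carrying the `version` and `current_stage` fields that MLflow reports for model versions. How the deploy workflow actually performs this check is not shown in this README, so the function names here are hypothetical.

```python
def latest_production_version(versions):
    """Return the highest-numbered version in the Production stage, or None."""
    production = [v for v in versions if v.get("current_stage") == "Production"]
    if not production:
        return None
    return max(production, key=lambda v: int(v["version"]))


def can_deploy(versions) -> bool:
    """Deployment proceeds only when at least one Production version exists."""
    return latest_production_version(versions) is not None
```

In practice the list of versions would come from the MLflow tracking server referenced by `MLFLOW_TRACKING_URI`, for example through the MLflow client.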

Local apply (optional):

kubectl apply -f k8s/configmap-dev.yaml
kubectl apply -f k8s/secret.yaml
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml

Architecture

Doc: docs/architecture.md
