Live feed fixes and UI theme rebrand #7

Merged
CodeNinjaSarthak merged 60 commits into main from dev
Mar 18, 2026
@CodeNinjaSarthak
Owner

Summary

  • Fix live feed race conditions: deduplicate comments, guard against missing fields, and cap feed at 100 messages
  • Fix cluster title text wrapping and overflow in ClustersPanel
  • Rebrand UI theme: swap fonts to Azeret Mono + Outfit, replace cyan accent with orange (#FF6B35), adjust
    surface/text colors for both dark and light themes
  • Add subtle grid background, staggered sidebar/column entrance animations, and approve button hover glow
  • Harden AuthContext with null-safe token parsing and logout on corrupt state
  • Fix scheduler worker timezone-aware datetime comparison
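The scheduler fix in the last bullet concerns a common Python pitfall: ordering a naive `datetime` against a timezone-aware one raises `TypeError`. A minimal sketch of one way to guard against it follows. The helper names `to_utc` and `is_expired` are hypothetical and not taken from `workers/scheduler/worker.py`, and treating naive values as UTC is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def to_utc(dt: datetime) -> datetime:
    """Return an aware UTC datetime.

    Assumption: naive inputs were stored as UTC, so they are tagged, not shifted.
    """
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def is_expired(expires_at: datetime, now: Optional[datetime] = None) -> bool:
    """Compare two timestamps safely, whether or not they carry tzinfo."""
    current = now if now is not None else datetime.now(timezone.utc)
    return to_utc(expires_at) <= to_utc(current)
```

Normalizing both sides before comparing keeps the check valid regardless of whether the database driver returns naive or aware values.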

Files changed

  • frontend/src/index.css — Full color palette + font swap, animations, light theme refresh
  • frontend/index.html — Updated Google Fonts import
  • frontend/src/components/Dashboard/QuestionsFeed.jsx — Dedup, null guards, feed cap
  • frontend/src/components/Dashboard/YouTubePanel.jsx — Cluster title overflow fix
  • frontend/src/context/AuthContext.jsx — Null-safe token init, corrupt state handling
  • workers/scheduler/worker.py — UTC-aware datetime fix

Test plan

  • Verify live feed deduplicates and caps at 100 messages
  • Confirm cluster titles wrap properly on long text
  • Check dark and light theme render correctly with new palette
  • Verify sidebar/column entrance animations play on page load
  • Test login/logout flow with corrupted localStorage token
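The first test-plan item (dedup plus a cap of 100) can be expressed as a small pure function. The sketch below is written in Python for illustration only; the actual logic lives in the React component `QuestionsFeed.jsx`. The field names `id` and `text`, the newest-first ordering, and the function name `merge_feed` are assumptions.

```python
from typing import Any, Dict, List

MAX_FEED = 100  # cap stated in the PR summary


def merge_feed(
    feed: List[Dict[str, Any]],
    incoming: Any,
    limit: int = MAX_FEED,
) -> List[Dict[str, Any]]:
    """Return the feed with `incoming` prepended, skipping bad or duplicate items."""
    # Guard against missing fields: ignore events without an id or text.
    if not isinstance(incoming, dict):
        return feed
    comment_id = incoming.get("id")
    if comment_id is None or not incoming.get("text"):
        return feed
    # Deduplicate: an event whose id is already in the feed is dropped.
    if any(item["id"] == comment_id for item in feed):
        return feed
    # Newest first (assumed), truncated to the cap.
    return ([incoming] + feed)[:limit]
```

Returning the original list unchanged for rejected events makes the "no change" cases easy to assert in a test.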

CodeNinjaSarthak and others added 30 commits March 7, 2026 22:02
…HNSW indexes

   - Refactor Redis pub/sub relay into WebSocketManager.start_subscriber()
     with exponential backoff (1→30s) and auto-reconnect on failure
   - Fix Redis channel prefix: ws:session:{id} → ws:{id} (manager + posting worker)
   - Add publish() helper on WebSocketManager for future use
   - Tune DB connection pool per worker: pool_size=2, max_overflow=3 (budget: 45 total)
   - Add Alembic migration c3d4e5f6a7b8: HNSW indexes on comments and rag_documents embeddings
   - Add scripts/truncate_embeddings.sql utility to clear embeddings before re-indexing
   - Set postgres max_connections=100 in docker-compose
   - Switch Redis maxmemory-policy from allkeys-lru to volatile-lru (preserve non-expiring keys)
   - Run uvicorn with --workers 2 in api.Dockerfile
   - Remove legacy vanilla-JS frontend files (replaced by React/Vite)
feat: infrastructure hardening — WS subscriber refactor, DB pooling, HNSW indexes
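The reconnect behavior described for `start_subscriber()` (backoff from 1s up to 30s) can be sketched as follows. This is not the WebSocketManager code: the doubling factor, the `ConnectionError` trigger, and the names `backoff_delays` and `run_with_reconnect` are assumptions for illustration.

```python
import time
from typing import Callable, Iterator, Optional


def backoff_delays(
    initial: float = 1.0, maximum: float = 30.0, factor: float = 2.0
) -> Iterator[float]:
    """Yield delays that grow geometrically from `initial` and are capped at `maximum`."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * factor, maximum)


def run_with_reconnect(
    subscribe: Callable[[], None],
    sleep: Callable[[float], None] = time.sleep,
    max_attempts: Optional[int] = None,
) -> bool:
    """Call `subscribe` until it returns cleanly, sleeping between failed attempts."""
    delays = backoff_delays()
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        attempt += 1
        try:
            subscribe()  # blocks while the subscription is healthy
            return True
        except ConnectionError:
            sleep(next(delays))  # wait longer after each consecutive failure
    return False
```

Injecting `sleep` keeps the loop testable without real waiting.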
- Added exception handling for WebSocketDisconnect to ensure proper disconnection.
- Wrapped websocket.close calls in try-except blocks to prevent unhandled exceptions during closure.
- Enhanced error handling for invalid authentication and forbidden access scenarios.
…id clustering

  Switch from accumulate-N-then-cluster to immediate per-comment clustering using
  pgvector cosine distance. Each embedded question is assigned to the nearest existing
  centroid (threshold 0.65) or starts a new cluster. Centroids update incrementally via
  running mean (L2-normalized). Cluster titles auto-summarize via Gemini at comment_count=3.
  Answer generation is gated to new-cluster creation and milestones {3,10,25} to avoid
  job explosion. Adds HNSW cosine index on clusters.centroid_embedding.
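The assignment rule above can be illustrated in plain Python, without pgvector. This is a sketch under stated assumptions, not the worker's code: the 0.65 threshold is read as a maximum cosine distance, embeddings are normalized before being averaged, and the names `Cluster` and `assign` are hypothetical.

```python
import math
from typing import List, Tuple

DISTANCE_THRESHOLD = 0.65       # from the commit message; read here as max cosine distance
ANSWER_MILESTONES = {3, 10, 25}  # from the commit message


def l2_normalize(v: List[float]) -> List[float]:
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def cosine_distance(a: List[float], b: List[float]) -> float:
    a, b = l2_normalize(a), l2_normalize(b)
    return 1.0 - sum(x * y for x, y in zip(a, b))


class Cluster:
    def __init__(self, embedding: List[float]) -> None:
        self.centroid = l2_normalize(embedding)
        self.count = 1

    def add(self, embedding: List[float]) -> None:
        """Fold one embedding into the centroid via a running mean, then re-normalize."""
        e = l2_normalize(embedding)
        n = self.count
        mean = [(c * n + x) / (n + 1) for c, x in zip(self.centroid, e)]
        self.centroid = l2_normalize(mean)
        self.count = n + 1


def assign(clusters: List[Cluster], embedding: List[float]) -> Tuple[Cluster, bool]:
    """Return (cluster, should_generate_answer) for one embedded question."""
    nearest = min(
        clusters, key=lambda c: cosine_distance(c.centroid, embedding), default=None
    )
    if nearest is None or cosine_distance(nearest.centroid, embedding) > DISTANCE_THRESHOLD:
        cluster = Cluster(embedding)
        clusters.append(cluster)
        return cluster, True  # answer generation on new-cluster creation
    nearest.add(embedding)
    return nearest, nearest.count in ANSWER_MILESTONES
```

In the real pipeline the nearest-centroid lookup would be a pgvector query served by the HNSW index; the linear scan here only stands in for it.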

  Files changed:
  - workers/common/schemas.py — ClusteringPayload: comment_ids+trigger_type → comment_id
  - workers/embeddings/worker.py — remove Redis counter/batch logic, enqueue single-comment task immediately after
   embedding
  - workers/clustering/worker.py — full rewrite: pgvector nearest-centroid, incremental mean, summarize_cluster at
   3, milestone-gated answer generation
  - backend/app/services/gemini/client.py — add summarize_cluster() method
  - backend/alembic/versions/d4e5f6a7b8c9_add_clusters_centroid_hnsw.py — new migration
  - backend/app/services/websocket/manager.py — fix duplicate block IndentationError
…responsive layout improvements

  - Add GET /dashboard/clusters/{cluster_id}/representative endpoint
    using pgvector cosine distance to find the closest comment to centroid
  - Display "Most asked" representative question in ClustersPanel cards
  - Debounce WS-triggered refetches (500ms) in MetricsCards and ClustersPanel
  - Rework responsive breakpoints: 1200px (2-col), 768px (1-col reordered)
  - Add system design specification report (docs/SYSTEM_DESIGN_REPORT.md)

…, and debounce tuning

- Add Redis pub/sub WebSocket event publishing to classification, clustering,
  answer generation, and YouTube polling workers
- Add mock YouTube polling worker with config toggle (mock_youtube, mock_message_interval)
- Increase debounce delay from 500ms to 1000ms in ClustersPanel and MetricsCards
- Add verify_mock_data script
…onfidence threshold

- Extract classification prompt into class constants (system instruction, few-shot examples, JSON schema)
- Use Gemini structured output (response_schema + response_mime_type) instead of raw text parsing
- Add CONFIDENCE_THRESHOLD=0.4 gate: low-confidence questions are logged but not forwarded to embedding
- Add CLAUDE.md to .gitignore
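The confidence gate from this commit reduces to a small predicate. A minimal sketch, assuming the classifier returns an `is_question` flag and a `confidence` score; the `Classification` type, the `should_forward` name, and the choice that exactly 0.4 passes are all assumptions.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)

CONFIDENCE_THRESHOLD = 0.4  # value from the commit message


@dataclass
class Classification:
    is_question: bool
    confidence: float


def should_forward(result: Classification) -> bool:
    """Decide whether a classified comment continues to the embedding stage."""
    if not result.is_question:
        return False
    if result.confidence < CONFIDENCE_THRESHOLD:
        # Low-confidence questions are logged but not forwarded.
        logger.info("Skipping low-confidence question (%.2f)", result.confidence)
        return False
    return True
```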
Metrics queries were unscoped — any authenticated teacher could see
aggregate counts across all teachers. Now all three queries (active_sessions,
questions_processed, answers_generated) JOIN to StreamingSession and filter
by teacher_id == current_user.id, matching the ownership pattern used
throughout the rest of the API.
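The scoping fix can be illustrated with a self-contained `sqlite3` example. The table and column names below are hypothetical stand-ins for the project's models, and the real code presumably goes through its ORM rather than raw SQL. What matters is the JOIN to the session table and the filter on the owning teacher.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory schema: two teachers, one session each."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE streaming_sessions (id INTEGER PRIMARY KEY, teacher_id INTEGER NOT NULL);
        CREATE TABLE questions (id INTEGER PRIMARY KEY, session_id INTEGER NOT NULL);
        INSERT INTO streaming_sessions (id, teacher_id) VALUES (1, 1), (2, 2);
        INSERT INTO questions (id, session_id) VALUES (1, 1), (2, 1), (3, 2);
        """
    )
    return conn


def questions_processed(conn: sqlite3.Connection, teacher_id: int) -> int:
    """Count only the questions that belong to sessions owned by `teacher_id`."""
    row = conn.execute(
        """
        SELECT COUNT(*)
        FROM questions AS q
        JOIN streaming_sessions AS s ON s.id = q.session_id
        WHERE s.teacher_id = ?
        """,
        (teacher_id,),
    ).fetchone()
    return row[0]
```

An unscoped `SELECT COUNT(*) FROM questions` would return 3 for every teacher in this data set, which is the leak the commit describes.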
Enable multiprocess mode so FastAPI and all 6 workers write to a shared
/tmp/prometheus_multiproc directory. FastAPI exposes GET /metrics using
MultiProcessCollector; Grafana Alloy scrapes it every 15s and forwards
to Grafana Cloud via remote_write.

Instrumented: http_requests_total and http_request_duration_seconds via
RequestContextMiddleware; worker_items_processed_total,
worker_processing_duration_seconds, worker_errors_total, and queue_depth
in each worker's polling loop.
…int errors

The title update at 3 comments was committed without notifying the dashboard
via Redis pub/sub. Also fix pre-existing ruff errors (unused imports,
missing noqa annotations, f-strings without placeholders) across
workers and scripts.
apiFetch intercepts 401s, refreshes the token once via /auth/refresh,
and retries the original request. A promise lock deduplicates concurrent
refresh calls. AuthContext checks token expiry on mount and refreshes
before rendering children.
…tion=64

Aligns clusters.centroid_embedding index params with comments and rag_documents indexes.
Failed probe in half_open state was not resetting _opened_at, leaving
the circuit permanently half_open. Now restarts the recovery timeout
on probe failure as intended.
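A stripped-down breaker shows the intended state transitions. This is a sketch, not the project's implementation: it opens on the first failure (no failure-count threshold), and every name except `_opened_at` is an assumption.

```python
import time
from typing import Callable


class CircuitBreaker:
    """Simplified breaker: closed -> open -> half_open -> closed or open."""

    def __init__(
        self,
        recovery_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._recovery_timeout = recovery_timeout
        self._clock = clock
        self._state = "closed"
        self._opened_at = 0.0

    @property
    def state(self) -> str:
        # An open circuit becomes half_open once the recovery timeout has elapsed.
        if self._state == "open" and self._clock() - self._opened_at >= self._recovery_timeout:
            self._state = "half_open"
        return self._state

    def allow_request(self) -> bool:
        return self.state != "open"

    def record_success(self) -> None:
        self._state = "closed"

    def record_failure(self) -> None:
        # Applies to a failed half_open probe as well: reopen the circuit and
        # reset _opened_at so the recovery timeout starts over.
        self._state = "open"
        self._opened_at = self._clock()
```

Passing a fake `clock` lets a test step through open, half_open, a failed probe, and recovery without sleeping.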
…beddings workers

Pure refactor — zero behavior change. Enables unit testing of worker
logic without running the polling loop.
17 tests covering observable behavior contracts. Uses fakeredis for
queue tests and real DB + mocked Gemini boundary for integration tests.
No tests coupled to internal state or implementation details.
config.py now imports queue constants instead of duplicating strings;
Makefile and start_dev.sh set PYTHONPATH so the import resolves;
stub comments added to runner.py and trigger_monitor.
…t ordering

Atomic CAS on approve_answer (409 on double-approve), try/except around
redis publish in youtube_posting and answer_generation, publish-before-commit
in classification/embeddings/clustering, cluster_summary_failed event on
summarization failure.
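The compare-and-set on approval can be sketched with a conditional `UPDATE` whose affected-row count decides the outcome. The schema, the `AnswerConflict` exception, and its mapping to HTTP 409 are assumptions for illustration; the project's `approve_answer` may differ in detail.

```python
import sqlite3


class AnswerConflict(Exception):
    """Raised when the answer is not in the expected state; maps to HTTP 409."""

    status_code = 409


def approve_answer(conn: sqlite3.Connection, answer_id: int) -> None:
    """Approve an answer only if it is still pending (atomic compare-and-set)."""
    cur = conn.execute(
        "UPDATE answers SET status = 'approved' WHERE id = ? AND status = 'pending'",
        (answer_id,),
    )
    conn.commit()
    if cur.rowcount == 0:
        # No row matched: the answer was already approved (or does not exist).
        raise AnswerConflict(f"answer {answer_id} is not pending")
```

Because the state check and the write happen in one statement, two concurrent approvals cannot both succeed. A production handler would likely distinguish a missing answer (404) from a double approval (409).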
…traint handling, gemini key validation

Wrap decrypt_data in try/except at 3 call sites in youtube.py, add rollback
on embedding failure in document_service.py, handle IntegrityError on manual
question flush, and add startup validator for gemini_api_key.
…istener cleanup

Token refresh now logs errors and clears invalid refresh tokens instead of
silently returning null. AuthContext logout wrapped in useCallback with
localStorage read to fix stale closure. YouTubePanel OAuth message listener
cleaned up on unmount via ref.
Strip HTML tags from manual questions, YouTube comment text, and author
names to prevent stored XSS. Move CSRF state deletion after DB commit
so tokens aren't lost if the OAuth exchange fails mid-flow.
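One way to strip tags using only the standard library is an `html.parser` subclass that keeps text nodes and drops everything else. This is a hedged sketch: the commit does not say which sanitizer the project uses, and the `strip_html` name and the choice to discard `script`/`style` bodies are assumptions.

```python
from html.parser import HTMLParser
from typing import List


class _TextExtractor(HTMLParser):
    """Collect text nodes, ignoring tags and the contents of script/style."""

    def __init__(self) -> None:
        super().__init__()
        self._parts: List[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._parts.append(data)

    def text(self) -> str:
        return "".join(self._parts)


def strip_html(value: str) -> str:
    """Return `value` with HTML tags removed and surrounding whitespace trimmed."""
    parser = _TextExtractor()
    parser.feed(value)
    parser.close()  # flush any text still buffered by the parser
    return parser.text().strip()
```

The parser decodes character references, so `&lt;b&gt;` comes back as the literal text `<b>`. The result is plain text and should still be escaped when rendered.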
Add flake8 per-file-ignores to match ruff config for sys.path-dependent
imports. Expand pylint disable list to suppress false positives (alembic
no-member, side-effect imports, circuit breaker protected-access).
…wers

Wire ModerationService into classification and answer_generation workers
to gate unsafe content before further pipeline processing.
Replace task stubs with real implementations, add APScheduler-based
scheduler worker for periodic maintenance (daily quota reset, hourly
token cleanup), and wire it into start_dev.sh.
Cover four untested areas: RAG document CRUD + ownership boundaries,
WebSocket auth/reject flows, moderation pipeline contract (comment +
answer rejection), and scheduler task logic (quota reset + token cleanup).
…ness

Delete 6 unused stub files with TODO placeholders, remove empty directories.
Update README: fix native dev instructions, add Features and Known Limitations sections.
…inor UI hardening

QuestionsFeed deduplicates WS events and defers processing until initial fetch completes;
YouTubePanel handles 401 gracefully; cluster titles wrap to 2 lines; scheduler logs run times after start.
@CodeNinjaSarthak CodeNinjaSarthak self-assigned this Mar 18, 2026
@CodeNinjaSarthak CodeNinjaSarthak merged commit ffc3e65 into main Mar 18, 2026
2 checks passed