saivarun2001/54

Electoral Search PWA

A premium, mobile-first PWA for searching electoral roll data with bilingual support (English + Telugu).

Project Structure

├── apps/web          # Next.js 14 App Router + Tailwind + Supabase
├── packages/db       # SQL Migrations for Supabase
└── packages/ingest   # Ingestion worker and scripts

Quick Start

cd apps/web
npm install
npm run dev

Environment Setup

Create .env.local in apps/web:

NEXT_PUBLIC_SUPABASE_URL=your_project_url
NEXT_PUBLIC_SUPABASE_ANON_KEY=your_anon_key

For the ingestion worker, also set:

SUPABASE_SERVICE_ROLE_KEY=your_service_role_key

Runbook

How to Add a Document (Drive Link)

  1. Go to /admin in the app (requires admin role)
  2. Click "Add Document"
  3. Paste your Google Drive sharing link
  4. Select document type (Structured/Scanned)
  5. Click "Add to Queue"

How to Run Ingestion Locally

cd packages/ingest
npx ts-node worker.ts                           # Process all queued
npx ts-node worker.ts --document_id=<uuid>      # Process specific doc

How to Add Allowed Users

Run in Supabase SQL Editor:

INSERT INTO electoral.allowed_users (email, role)
VALUES ('user@example.com', 'admin');

How to Recover from Failures

  1. Go to /admin/queue
  2. Find the failed document
  3. Click "Retry", or reprocess it from the CLI with npx ts-node worker.ts --document_id=<uuid>

Data Privacy Notes

  • PDFs larger than 50 MB should be added as Drive links rather than uploaded
  • The service role key is NEVER exposed to the frontend
  • All voter data is protected by Row Level Security

Deployment (Vercel)

  1. Import the repository and set apps/web as the Root Directory
  2. Add environment variables
  3. Deploy

V2 Development Environment

V2 is a fully isolated development environment for handwritten OCR ingestion, which is still unstable.

Branching Strategy

Branch        Purpose                   Deploys To
main          V1 Production (stable)    electoral.app
release/v1    V1 hotfixes only          electoral.app
v2            V2 Development            v2.electoral.app

V2 Infrastructure

  • Supabase: civic-intel-v2 (xvdretunljtbkyzmmnok)
  • Vercel: 54-v2 project (create manually)
  • GitHub Actions: v2 environment with approval gates

V2 Data Model

V2 uses a "canonical view" pattern for handwritten OCR:

voters (base)  ←→  voters_patch (corrections)
                        ↓
              voters_canonical_v2 (merged view)

Key tables:

  • voters_candidate - Raw OCR extractions (may have errors)
  • voters_patch - Verified corrections
  • qa_issues - Quality issues tracking
  • source_assets - Images and textract outputs
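
The canonical view pattern can be illustrated with a minimal Python sketch. It assumes that a patch row carries only the corrected fields for a voter, that rows are keyed by a hypothetical voter id, and that a non-null patch value overrides the base value (roughly what COALESCE(patch.col, base.col) would do in SQL). The real voters_canonical_v2 view is defined in the database, and its columns may differ.

```python
from typing import Dict, Optional

Row = Dict[str, Optional[str]]


def canonical_row(base: Row, patch: Optional[Row]) -> Row:
    """Merge a base row with its verified patch; non-null patch fields win."""
    merged = dict(base)  # copy so the base row is never mutated
    if patch is None:
        return merged
    for field, value in patch.items():
        if value is not None:  # a null patch field means "no correction"
            merged[field] = value
    return merged


def canonical_view(voters: Dict[str, Row], patches: Dict[str, Row]) -> Dict[str, Row]:
    """Build the merged view for every voter, patched or not."""
    return {vid: canonical_row(row, patches.get(vid)) for vid, row in voters.items()}


# Hypothetical sample data: "Rarnesh" stands in for an OCR misread.
voters = {
    "v1": {"name": "Rarnesh", "house_no": "12-3"},
    "v2": {"name": "Sita", "house_no": "4"},
}
patches = {"v1": {"name": "Ramesh", "house_no": None}}
canonical = canonical_view(voters, patches)
```

Keeping corrections in a separate table means the raw extraction is never overwritten, so a bad patch can be dropped without re-running OCR.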

Working on V2

git checkout v2
cp apps/web/.env.v2.template apps/web/.env.local
# Fill in SUPABASE_SERVICE_ROLE_KEY from dashboard
npm run dev

Rollback Procedure

V1 remains independent at all times:

  1. V1 code: main branch, unaffected by v2
  2. V1 data: Division 54 Supabase project, separate from V2
  3. V1 deploy: deploy-production.yml only touches main

Phase History

  • Phase 1: Foundation skeleton + PWA
  • Phase 2: Supabase Auth + RLS + real search
  • Phase 3: Documents & Ingestion system
  • Phase 4: Voice Search + Read Aloud
  • Phase 5: Voice-First Upgrade
  • Phase 6: Production Hardening & Quality Gates
  • Phase 7: Self-Improving Search + Telugu/Roman Superpowers + Active Learning
  • Phase 8: Durable Ingestion + Scale Spine (Zero-Duplicate, Zero-Stuck)
  • Phase 9: Secure Data Foundation + Search (RLS, Audit Logging)
  • Phase 10: Field Operations (Offline-First, Visits, Issues, Sync)
  • Phase 11: Intelligence Layer (Analytics, ML Scoring, Dashboards, Segments)
  • Phase 12: Production Launch & Corporate Hardening (CI/CD, Security, DR)

Phase 7: Search & Active Learning

Any-Script Search

Search works with both Telugu and Roman scripts:

  • Type "Ramesh" → finds "రమేష్"
  • Type "శ్రీనివాస్" → finds matching records
  • Fuzzy matching handles typos and OCR errors
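
This README does not say which algorithm backs the fuzzy matching. As an illustration only, the sketch below uses character-trigram similarity, a common technique for typo-tolerant name search. The function names and the 0.3 threshold are assumptions, not values from this project.

```python
from typing import List, Set, Tuple


def trigrams(text: str) -> Set[str]:
    """Return the set of padded character trigrams of a string."""
    cleaned = text.lower().strip()
    if not cleaned:
        return set()
    padded = f"  {cleaned} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two trigram sets, between 0.0 and 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def fuzzy_search(query: str, names: List[str], threshold: float = 0.3) -> List[Tuple[str, float]]:
    """Return names scoring at or above the threshold, best match first."""
    scored = [(name, similarity(query, name)) for name in names]
    matches = [pair for pair in scored if pair[1] >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


# "Ramsh" is a typo for "Ramesh"; the other names share too few trigrams.
results = fuzzy_search("Ramsh", ["Ramesh", "Suresh", "Lakshmi"])
```

Because the comparison works on Unicode code points, the same functions apply to Telugu strings as well as Roman ones.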

How Normalization Works

Each voter record has normalized fields:

  • name_norm_te - Telugu normalized name
  • name_norm_roman - Roman (ITRANS) normalized name
  • address_norm_te/roman - Normalized addresses
  • search_blob - Combined search field

The system uses indic-transliteration for Telugu ↔ Roman conversion.
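
A rough sketch of how the normalized fields might be derived is shown below. To keep it runnable without extra dependencies, the Telugu to Roman step is passed in as a hypothetical to_roman callable; in the real pipeline that role would be played by indic-transliteration. The cleaning rules (NFC, lowercasing, dropping punctuation) and the expanded names address_norm_te and address_norm_roman are assumptions.

```python
import unicodedata
from typing import Callable, Dict


def normalize_text(text: str) -> str:
    """NFC-normalize, lowercase, replace punctuation/symbols, collapse spaces."""
    text = unicodedata.normalize("NFC", text).lower()
    # Drop punctuation (P*) and symbols (S*) by Unicode category so that
    # Telugu vowel signs and the virama (category M*) are preserved.
    kept = "".join(" " if unicodedata.category(ch)[0] in "PS" else ch for ch in text)
    return " ".join(kept.split())


def build_normalized_fields(name: str, address: str,
                            to_roman: Callable[[str], str]) -> Dict[str, str]:
    """Compute the normalized columns and the combined search_blob."""
    name_te = normalize_text(name)
    address_te = normalize_text(address)
    fields = {
        "name_norm_te": name_te,
        "name_norm_roman": normalize_text(to_roman(name_te)),
        "address_norm_te": address_te,
        "address_norm_roman": normalize_text(to_roman(address_te)),
    }
    # search_blob concatenates every non-empty normalized field.
    fields["search_blob"] = " ".join(v for v in fields.values() if v)
    return fields


# Toy stand-in for a transliterator: a one-word lookup table (hypothetical).
TOY_TABLE = {"రమేష్": "ramesh"}


def toy_to_roman(text: str) -> str:
    return " ".join(TOY_TABLE.get(word, word) for word in text.split())


fields = build_normalized_fields("రమేష్", "", toy_to_roman)
```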

How to Run Backfill Jobs

cd packages/ingest
python -m normalize.backfill

This populates normalized columns for all existing voters.

How Review Tasks Are Generated

The queue_reviews.py script uses:

  1. Uncertainty sampling - Low confidence records
  2. Diversity sampling - Random pages across documents
  3. Random sampling - Avoid blind spots
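
The three strategies can be combined roughly as in the sketch below. It is not the code in queue_reviews.py: the record fields (id, document_id), the per-document and random quotas, and the choice to skip records with source_confidence = 1.0 are assumptions made for illustration.

```python
import random
from typing import Any, Dict, List

Record = Dict[str, Any]


def select_for_review(records: List[Record], threshold: float = 0.6,
                      per_document: int = 1, random_extra: int = 1,
                      seed: int = 0) -> Dict[str, str]:
    """Map each selected record id to the strategy that picked it."""
    rng = random.Random(seed)
    # Assumption: a confidence of 1.0 means the record was already reviewed.
    pool = [r for r in records if r["source_confidence"] < 1.0]
    chosen: Dict[str, str] = {}

    # 1. Uncertainty sampling: everything below the confidence threshold.
    for r in pool:
        if r["source_confidence"] < threshold:
            chosen[r["id"]] = "uncertainty"

    # 2. Diversity sampling: a few random records from every document.
    by_doc: Dict[str, List[Record]] = {}
    for r in pool:
        by_doc.setdefault(r["document_id"], []).append(r)
    for doc_records in by_doc.values():
        for r in rng.sample(doc_records, min(per_document, len(doc_records))):
            chosen.setdefault(r["id"], "diversity")

    # 3. Random sampling: extra picks from whatever is still unselected.
    remaining = [r for r in pool if r["id"] not in chosen]
    for r in rng.sample(remaining, min(random_extra, len(remaining))):
        chosen[r["id"]] = "random"
    return chosen


records = [
    {"id": "a", "document_id": "d1", "source_confidence": 0.4},
    {"id": "b", "document_id": "d1", "source_confidence": 0.9},
    {"id": "c", "document_id": "d2", "source_confidence": 0.95},
    {"id": "d", "document_id": "d2", "source_confidence": 1.0},
]
chosen = select_for_review(records)
```

A fixed seed makes a run reproducible, which helps when comparing review queues before and after a threshold change.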

Run manually:

cd packages/ingest
python -m normalize.queue_reviews

Tuning Thresholds

  • source_confidence < 0.6 → triggers review
  • source_confidence = 1.0 → marked as reviewed
  • Adjust both thresholds in queue_reviews.py and in the admin review page

Phase 8: Durable Ingestion

Phase 8 implements durable, idempotent ingestion with:

  • Job leasing with heartbeats
  • Event logging for traceability
  • Artifact storage for replay
  • Natural key constraints for deduplication
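
Job leasing with heartbeats can be sketched as follows. This is an in-memory stand-in, not the durable module itself: the LeaseTable class, its method names, and the 30-second TTL are hypothetical, and a real implementation would keep leases in a database table and update them atomically.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Lease:
    worker_id: str
    expires_at: float


class LeaseTable:
    """Tracks which worker currently owns each job."""

    def __init__(self, ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self.leases: Dict[str, Lease] = {}

    def acquire(self, job_id: str, worker_id: str) -> bool:
        """Take the job unless another worker holds an unexpired lease."""
        now = self.clock()
        lease = self.leases.get(job_id)
        if lease and lease.expires_at > now and lease.worker_id != worker_id:
            return False
        self.leases[job_id] = Lease(worker_id, now + self.ttl)
        return True

    def heartbeat(self, job_id: str, worker_id: str) -> bool:
        """Extend the lease; fails if it expired or belongs to someone else."""
        now = self.clock()
        lease = self.leases.get(job_id)
        if lease is None or lease.worker_id != worker_id or lease.expires_at <= now:
            return False
        lease.expires_at = now + self.ttl
        return True


table = LeaseTable()
owned = table.acquire("doc-1", "worker-a")    # worker-a takes the job
blocked = table.acquire("doc-1", "worker-b")  # worker-b is refused for now
```

If a worker crashes it stops sending heartbeats, its lease expires, and another worker can pick the job up, which is how a stuck job gets unstuck.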

Running Durable Ingestion

cd packages/ingest
python -m durable.run

Phase 9: Secure Data Foundation

  • Expanded voter schema with family, contact, benefits fields
  • Role-based RLS policies (admin, leader, booth_incharge, field_worker)
  • Audit logging for all sensitive operations
  • Advanced search RPC with fuzzy matching

Phase 10: Field Operations

Offline-first field ops system:

  • IndexedDB storage with booth packs
  • Visit logging with sync queue
  • Issue tracking and schemes
  • Service worker for background sync

Phase 11: Intelligence Layer

Analytics and ML scoring:

  • Materialized views for dashboards
  • Segment builder with query DSL
  • Turnout propensity and priority models
  • SHAP-based explainability

Phase 12: Production Hardening

Corporate-grade deployment:

  • CI/CD with GitHub Actions
  • k6 load testing suite
  • Security headers and OWASP checklist
  • Backup/DR with PITR
  • Compliance documentation
