samarth2018/Databricks_Hackathon
# Artha-Nyaya Suite

A Databricks-native platform that protects India's 300M+ first-time digital finance users across the complete fraud lifecycle — prevention, detection, legal guidance, government scheme access, and complaint filing — in 10 Indian languages with voice support.

Detailed setup instructions: `how_to_run.md` · Interactive architecture: `Architecture_Diagram.html` · Presentation: `Presentation.html`


## Architecture

```mermaid
flowchart LR
    U["User\n10 languages + voice"] --> APP["Databricks App\n(Gradio)"]

    subgraph Modules["5 Connected Modules"]
      SA["Saavdhaan\nLending Analyzer"]
      SU["Suraksha\nFraud Detection"]
      AD["Adhikar\nRights Chatbot"]
      SM["Samriddhi\nScheme Navigator"]
      NI["Nivaaran\nComplaint Drafter"]
    end

    APP --> SA & SU & AD & SM & NI

    subgraph RAG["RAG Pipeline (orchestration.py)"]
      direction LR
      R1["Translate\nSarvam Mayura"] --> R2["Rewrite\nChat Memory"] --> R3["Retrieve\nVector Search / FAISS"] --> R4["Generate\nLlama-4-Maverick"] --> R5["Translate Back"]
    end

    subgraph ML["Spark MLlib + MLflow"]
      F["GBT Fraud Classifier\n@champion"]
      P["KMeans Personas\n@champion"]
    end

    subgraph Lakehouse["Databricks Lakehouse"]
      DT["Delta Tables\n(CDF enabled)"]
      VS["Vector Search Index\n(databricks-bge-large-en)"]
      VOL["UC Volume\nFAISS + app_cache"]
      UC["Unity Catalog\nworkspace.default"]
    end

    SA & AD & SM & NI --> RAG
    SU --> F
    SM --> P
    F & P --> DT
    RAG --> VS & VOL
    VS --> DT
```
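The five-stage RAG flow in the diagram can be sketched as a simple left-to-right composition. This is an illustrative outline only; every function below is a hypothetical stub, not the real `orchestration.py` code.

```python
# Illustrative sketch of the five-stage RAG flow (hypothetical stubs,
# not the real orchestration.py). Each stage is a plain callable, so
# the pipeline is just a left-to-right composition.

def translate_to_english(text, lang):
    # Stand-in for the Sarvam Mayura translation call.
    return text if lang == "en" else f"[en] {text}"

def rewrite_with_memory(query, history):
    # Stand-in for chat-memory rewriting: fold the previous turn into
    # the query so follow-ups like "that" become resolvable.
    return f"{history[-1]} -> {query}" if history else query

def retrieve(query):
    # Stand-in for Vector Search / FAISS retrieval.
    return [f"doc about: {query}"]

def generate(query, docs):
    # Stand-in for the Llama-4-Maverick completion call.
    return f"Answer to '{query}' using {len(docs)} document(s)"

def translate_back(text, lang):
    return text if lang == "en" else f"[{lang}] {text}"

def rag_pipeline(user_text, lang="en", history=()):
    english = translate_to_english(user_text, lang)
    rewritten = rewrite_with_memory(english, list(history))
    docs = retrieve(rewritten)
    answer = generate(rewritten, docs)
    return translate_back(answer, lang)

print(rag_pipeline("What is BNS 318?", lang="hi"))
```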

## How Databricks Components Connect

```
User (10 languages + voice)
  │
  ▼
Databricks App (Gradio UI, OAuth M2M service principal)
  │
  ├──► Saavdhaan ──► RAG Pipeline ──► Vector Search (unified_corpus, CDF auto-sync)
  ├──► Suraksha  ──► Spark MLlib GBT (MLflow fraud_detector@champion) ──► Delta: upi_transactions
  ├──► Adhikar   ──► RAG Pipeline ──► Vector Search ──► FAISS fallback (UC Volume)
  ├──► Samriddhi ──► Spark MLlib KMeans (MLflow persona_kmeans@champion) + RAG
  ├──► Nivaaran  ──► RAG Pipeline ──► Llama-4-Maverick (primary) / sarvam-m (fallback)
  └──► Performance ──► Delta tables via SQL Statement API (live metrics)
```
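The Vector Search → FAISS fallback shown for Adhikar boils down to a try/except chain: query the managed index first, and fall back to the locally cached FAISS index if that fails. A minimal sketch with hypothetical retriever functions (the real logic lives in `lib/retrieval.py`):

```python
# Sketch of the primary/fallback retrieval pattern: Databricks Vector
# Search first, FAISS on a UC Volume as fallback. Function names are
# hypothetical stand-ins, not the real lib/retrieval.py API.

class RetrievalError(Exception):
    pass

def vector_search_query(query):
    # Stand-in for the Vector Search client call; here it simulates an
    # unreachable endpoint to exercise the fallback path.
    raise RetrievalError("endpoint unreachable")

def faiss_query(query):
    # Stand-in for querying the FAISS index stored on the UC Volume.
    return [f"faiss hit for: {query}"]

def retrieve_with_fallback(query):
    try:
        return vector_search_query(query)
    except RetrievalError:
        # Primary index unavailable: serve from the FAISS fallback.
        return faiss_query(query)

print(retrieve_with_fallback("digital lending rules"))
```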

## Data Pipeline (Notebook Execution)

```
Google Drive (seed files)
    ↓ notebook 00b
UC Volume: /Volumes/workspace/default/project_files/
    ↓ notebooks 01-06
Delta Tables: upi_transactions, bns_sections, rbi_circulars, gov_schemes, ...
    ↓ notebook 07
unified_corpus (merged, Change Data Feed enabled)
    ↓ notebook 08                    ↓ notebook 09
Vector Search Index              FAISS Index on UC Volume
(databricks-bge-large-en)        (all-MiniLM-L6-v2)
    ↓ notebook 10                    ↓ notebook 11
MLflow: fraud_detector@champion   MLflow: persona_kmeans@champion
```
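Notebook 07's merge step can be pictured as normalizing each source table into a shared schema before concatenating. Below is a toy pure-Python version; the actual job runs in Spark over Delta tables with CDF enabled, and the field names here are invented for illustration.

```python
# Toy version of the notebook-07 merge: several source corpora are
# normalized into one schema and concatenated into a unified corpus.
# Field names are illustrative; the real job is a Spark pipeline over
# Delta tables with Change Data Feed enabled.

def to_corpus_rows(source_name, records, text_field):
    # Normalize one source into the shared (source, doc_id, text) schema.
    return [
        {"source": source_name,
         "doc_id": f"{source_name}-{i}",
         "text": r[text_field]}
        for i, r in enumerate(records)
    ]

bns = [{"section_text": "Section 318: cheating"}]
rbi = [{"circular_text": "Digital lending guidelines"}]

unified_corpus = (
    to_corpus_rows("bns_sections", bns, "section_text")
    + to_corpus_rows("rbi_circulars", rbi, "circular_text")
)
print(len(unified_corpus))  # 2
```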

## Tech Stack

| Layer | Technologies |
|-------|--------------|
| Databricks Platform | Delta Lake (CDF), Unity Catalog, Spark MLlib, MLflow, Vector Search, Databricks Apps, Llama-4-Maverick, SQL Statement API, OAuth M2M |
| Indian AI | Sarvam Mayura (translation), Saaras v3 (STT), Bulbul v3 (TTS), sarvam-m (fallback LLM) |
| Legal/Financial Data | BNS 2023, BNS-IPC Mapping, RBI Digital Lending Circulars, MyScheme.gov.in, BhashaBench |
| Languages | English, Hindi, Tamil, Telugu, Bengali, Marathi, Kannada, Malayalam, Gujarati, Punjabi |
| Fallbacks | LLM: Databricks → Sarvam |
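The Databricks → Sarvam LLM fallback is a provider chain: try each backend in order and return the first success. A sketch with stub providers (the real client is `lib/llm_client.py`; the function names below are hypothetical):

```python
# Sketch of the LLM fallback chain: Databricks-hosted Llama-4-Maverick
# first, sarvam-m on failure. Both providers are stubs here; the real
# client lives in lib/llm_client.py.

def call_databricks_llm(prompt):
    # Simulate a failing primary endpoint to exercise the fallback.
    raise TimeoutError("serving endpoint timed out")

def call_sarvam_llm(prompt):
    return f"sarvam-m reply to: {prompt}"

def complete(prompt):
    for provider in (call_databricks_llm, call_sarvam_llm):
        try:
            return provider(prompt)
        except Exception:
            continue  # try the next provider in the chain
    raise RuntimeError("all LLM providers failed")

print(complete("hello"))
```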

## How to Run

### Prerequisites

- Databricks workspace with Unity Catalog enabled (DBR 14.3+)
- Databricks CLI installed (docs)
- Sarvam AI API key
- HuggingFace token (for the BhashaBench datasets)

### Step 1: Export Credentials

```shell
export DATABRICKS_HOST="https://<your-workspace>.cloud.databricks.com"
export DATABRICKS_TOKEN="<your-personal-access-token>"
```

### Step 2: Store Secrets

```shell
databricks secrets create-scope artha-nyaya
databricks secrets put-secret artha-nyaya sarvam_api_key
databricks secrets put-secret artha-nyaya hf_token
```
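The repository layout describes `lib/secrets.py` as "three-tier secret loading". One plausible shape for such a lookup is sketched below: environment variable first, then the Databricks secret scope, then a default. The tier order and names are assumptions, and the `dbutils` call only resolves inside a Databricks runtime.

```python
import os

# Illustrative three-tier secret lookup (assumed order, not the real
# lib/secrets.py): environment variable -> Databricks secret scope ->
# caller-supplied default.

def get_secret(name, scope="artha-nyaya", default=None):
    # Tier 1: environment variable (local development).
    value = os.environ.get(name.upper())
    if value:
        return value
    # Tier 2: Databricks secret scope. The global `dbutils` object
    # only exists inside a Databricks runtime, so a NameError here
    # simply means we are running elsewhere.
    try:
        return dbutils.secrets.get(scope=scope, key=name)  # noqa: F821
    except NameError:
        pass
    # Tier 3: explicit default.
    return default

os.environ["SARVAM_API_KEY"] = "demo-key"
print(get_secret("sarvam_api_key"))  # demo-key
```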

### Step 3: Sync Repo to Workspace

```shell
databricks sync . /Workspace/Users/<your-email>/Databricks_Hackathon
```

### Step 4: Run Notebooks in Order

Open each notebook in the Databricks UI and run it on a cluster, in the order below:

| # | Notebook | Purpose | Required? |
|---|----------|---------|-----------|
| 1 | `00_setup_secrets_and_volume.py` | Create secret scope, catalog, volume; verify APIs | Yes |
| 2 | `00b_sync_repo_data_to_volume.py` | Download seed data from Google Drive | Yes |
| 3 | `01_ingest_upi_transactions.py` | Ingest UPI transactions | Yes |
| 4 | `02_ingest_bns_sections.py` | Ingest BNS 2023 legal sections | Yes |
| 5 | `03_ingest_bns_ipc_mapping.py` | Ingest BNS-IPC mapping | Yes |
| 6 | `04_ingest_rbi_circulars.py` | Ingest RBI circulars | Yes |
| 7 | `05_ingest_gov_schemes.py` | Ingest government schemes | Yes |
| 8 | `06_ingest_bhashabench_eval.py` | Ingest BhashaBench benchmarks | Yes |
| 9 | `07_build_unified_corpus.py` | Merge into unified corpus (CDF enabled) | Yes |
| 10 | `08_setup_vector_search.py` | Create Vector Search endpoint + index | Yes |
| 11 | `09_build_faiss_fallback.py` | Build FAISS fallback index | Yes |
| 12 | `10_train_fraud_model.py` | Train GBT fraud classifier (MLflow) | Yes |
| 13 | `11_train_persona_clusters.py` | Train KMeans personas | Yes |
| 14 | `12_evaluate_rag_bhashabench.py` | RAG evaluation | Optional |
| 15 | `13_smoke_test_end_to_end.py` | End-to-end smoke test | Optional |
| 16 | `14_notebook_ui_fallback.py` | Notebook UI fallback | Skip |
| 17 | `15_grant_app_permissions.py` | Grant service principal permissions | Yes (after Step 6) |
| 18 | `16_current_permissions.py` | Permissions audit | Optional |

### Step 5: Create the App

```shell
databricks apps create artha-nyaya-suite --description "Artha-Nyaya Suite"
```

### Step 6: Find the Client ID & Grant Permissions

1. In the Databricks UI, go to **Compute → Apps → artha-nyaya-suite**
2. Copy the **Service Principal Client ID**
3. Grant secret scope access:
   ```shell
   databricks secrets put-acl artha-nyaya <CLIENT_ID> READ
   ```
4. Run notebook 15 in the Databricks UI (enter the Client ID in the widget). This grants `USE_CATALOG`, `USE_SCHEMA`, `READ_VOLUME`, `SELECT` on tables, `EXECUTE` on models, and the secret scope READ ACL.
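The grants above can be expressed as plain Unity Catalog SQL. The sketch below builds the statements as strings; the exact statement forms and object names are my assumptions based on the list above (the real notebook presumably runs them, plus the `EXECUTE` grant on the registered models, via `spark.sql()`).

```python
# Sketch of the Unity Catalog grants issued to the app's service
# principal. Statement forms and object names are assumptions for
# illustration; the EXECUTE-on-models grant is omitted here.

def grant_statements(client_id, catalog="workspace", schema="default"):
    sp = f"`{client_id}`"  # service principal application/client ID
    return [
        f"GRANT USE CATALOG ON CATALOG {catalog} TO {sp}",
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO {sp}",
        f"GRANT READ VOLUME ON VOLUME {catalog}.{schema}.project_files TO {sp}",
        f"GRANT SELECT ON SCHEMA {catalog}.{schema} TO {sp}",
    ]

for stmt in grant_statements("<CLIENT_ID>"):
    print(stmt)
```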

### Step 7: Deploy

```shell
databricks apps deploy artha-nyaya-suite \
  --source-code-path /Workspace/Users/<your-email>/Databricks_Hackathon
```

The app URL appears in the output of `databricks apps get artha-nyaya-suite` and in the app logs.

For full details, CLI permission commands, and troubleshooting, see `how_to_run.md`.


## Demo Steps

### The Connected Journey (recommended 5-minute flow)

#### 1. Saavdhaan — Predatory Lending Detection

- Paste these example terms into the text box:
  ```
  Interest rate: 36% per week. Processing fee: 15% of loan amount.
  We will access your contacts, photos, and location. Recovery agents
  may contact your family members. Late fee: Rs 500 per day.
  ```
- Click **"Analyze for Safety"**
- Review the safety scorecard (0-100) with flagged violations, plus a RAG legal analysis citing RBI guidelines
- Two buttons appear: **"Know Your Rights"** and **"Report / File Complaint"**
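To show what a 0-100 safety scorecard with flagged violations can look like under the hood, here is a toy rule-based scorer run against the example terms above. The rules and weights are invented for this illustration; the real Saavdhaan module pairs checks like these with RAG-based legal analysis.

```python
import re

# Toy rule-based safety scorer. Patterns and penalty weights are
# invented for illustration, not Saavdhaan's actual rules.
RULES = [
    (r"%\s*per\s*week", "weekly interest rate", 40),
    (r"contacts|photos|location", "invasive data access", 30),
    (r"family members", "harassment of family", 20),
    (r"per\s*day", "per-day late fee", 10),
]

def analyze(terms):
    # Collect every rule that matches, then subtract penalties from a
    # perfect score of 100, flooring at 0.
    flags = [(label, penalty) for pattern, label, penalty in RULES
             if re.search(pattern, terms, re.IGNORECASE)]
    score = max(0, 100 - sum(p for _, p in flags))
    return score, [label for label, _ in flags]

terms = ("Interest rate: 36% per week. We will access your contacts, "
         "photos, and location. Recovery agents may contact your family "
         "members. Late fee: Rs 500 per day.")
print(analyze(terms))  # all four rules fire, so the score bottoms out at 0
```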

#### 2. Cross-Module Navigation → Nivaaran

- Click **"Report / File Complaint"**
- The app jumps to the Nivaaran tab with the complaint type pre-filled as "Digital Lending App Harassment" and the description pre-filled with the flagged terms
- Click **"Generate Complaint"** → a formal legal draft with BNS sections and RBI citations

#### 3. Suraksha — Fraud Detection

- Navigate to the Suraksha tab
- Click **"Load Transactions"** (demo user: `ramesh.kumar@oksbi`)
- Review the transaction table with FRAUD FLAGGED rows
- Click **"Explain Flagged Fraud"** → the AI explains why, with legal references
- Click **"Know Your Rights"** → jumps to Adhikar with suggested questions shown

#### 4. Adhikar — Legal Rights Chatbot

- A yellow hint box shows suggested questions carried over from Suraksha
- Type: `What is BNS section 318 about cheating?`
- Follow up with: `What's the punishment for that?`
- The system resolves "that" using conversation memory and retrieves the correct answer
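The follow-up resolution works because the rewrite stage sees recent chat turns. Here is a sketch of how such a rewrite prompt might be assembled; the prompt wording and function name are illustrative, not the actual `orchestration.py` implementation.

```python
# Sketch of assembling a query-rewrite prompt from chat memory: the
# last few turns plus the follow-up are packed into one prompt asking
# the LLM for a standalone question. Wording is illustrative only.

def build_rewrite_prompt(history, follow_up, max_turns=4):
    recent = history[-max_turns:]  # keep the prompt short
    lines = [f"{role}: {text}" for role, text in recent]
    return (
        "Rewrite the final question so it stands alone, resolving "
        "pronouns from the conversation.\n\n"
        + "\n".join(lines)
        + f"\nuser: {follow_up}\nStandalone question:"
    )

history = [
    ("user", "What is BNS section 318 about cheating?"),
    ("assistant", "BNS 318 defines and punishes cheating."),
]
prompt = build_rewrite_prompt(history, "What's the punishment for that?")
print(prompt)
```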

#### 5. Samriddhi — Government Schemes

- Navigate to the Samriddhi tab
- Click **"Find Schemes For Me"** → KMeans persona matching → relevant government schemes displayed

#### 6. Performance Dashboard

- Open the Performance tab → live fraud model AUC (0.9999), RAG proxy accuracy (62%), architecture overview

## Results

| Metric | Value |
|--------|-------|
| Fraud Model AUC | 0.9999 |
| Fraud-class F1 | 0.9668 |
| Fraud Precision | 0.9711 |
| Fraud Recall | 0.9626 |
| Training Data | 5M+ rows |
| RAG Proxy Accuracy (BhashaBench Hindi) | 62.0% |
| RAG Token F1 | 0.384 |
| RAG Avg Latency | 4.8 s |
| Languages Supported | 10 |
| Connected Modules | 5 + Performance dashboard |

## Repository Layout

```
Databricks_Hackathon/
├── app/
│   ├── main.py                   # Gradio entry point (6 tabs + cross-module nav)
│   ├── fraud_module.py           # Suraksha — fraud detection
│   ├── rights_module.py          # Adhikar — legal rights chatbot
│   ├── scheme_module.py          # Samriddhi — scheme matching
│   ├── nivaaran_module.py        # Nivaaran — complaint drafter
│   ├── saavdhaan_module.py       # Saavdhaan — predatory lending detector
│   ├── eval_module.py            # Performance — eval metrics dashboard
│   ├── orchestration.py          # RAG pipeline (translate → retrieve → generate)
│   ├── voice_module.py           # Voice I/O (STT + TTS)
│   └── ui_helpers.py             # Shared UI formatting
├── lib/
│   ├── sarvam_client.py          # Sarvam API client
│   ├── llm_client.py             # LLM client (Databricks + Sarvam fallback)
│   ├── retrieval.py              # Unified retriever (Vector Search + FAISS)
│   ├── app_cache.py              # Parquet cache for app container
│   └── secrets.py                # Three-tier secret loading
├── notebooks/                    # 00-16: run in order (see How to Run)
├── app.yaml                      # Databricks App deployment config
├── requirements.txt              # Python dependencies
├── how_to_run.md                 # Detailed setup & deployment guide
├── Architecture_Diagram.html     # Interactive architecture diagram
├── Presentation.html             # 4-slide presentation deck
└── README.md
```

Built for the Databricks Hackathon.
