## Sanchalak - Bringing Schemes Closer to Farmers

## Background

India’s 13+ crore farmers struggle to access government schemes due to low awareness, complex eligibility, and language barriers. Despite high smartphone penetration, welfare schemes remain underutilized. Farmers rely on opaque intermediaries and face friction at every step, from discovery to application. There’s a clear need for a unified, accessible, and language-inclusive support system to bridge this gap.

## Problem


Farmers struggle to access welfare schemes due to fragmented processes, lack of awareness, and limited language support. The system is not voice-friendly or user-centric. As a result, many miss out on benefits, rely on middlemen, and face delays. This leads to underutilization of funds and distrust in public systems.


## Customer



The primary beneficiaries are small and marginal farmers in rural India. Their core challenge is accessing government schemes due to language barriers, low digital literacy, and bureaucratic complexity. Today, they rely on intermediaries, face long delays, and often miss out on entitlements. They need a simple, trustworthy, and localized way to understand and apply for schemes.



## Value Proposition



Solving this enables direct access to welfare benefits for millions of underserved farmers, reducing dependency on intermediaries. It can increase scheme utilization, improve rural livelihoods, and enhance trust in public systems. For governments, it ensures better policy delivery and measurable impact. For farmers, it means timely support, clarity, and empowerment in their native language.



## Product (Experience)


Users interact via a multilingual voice-first interface, making it accessible to non-literate farmers. They speak their land or crop details, and the assistant explains eligible schemes, auto-fills applications, and tracks status. The experience is conversational, intuitive, and localized. Unlike static portals, it offers personalized guidance and real-time support in the farmer’s language.


## Objectives (SMART or Business Goals)



* Enable 80%+ accuracy in scheme eligibility detection based on user input.

* Reduce application time by 60% through autofill and document guidance.

* Reach 1 lakh+ farmers across 3 states within the first 6 months.

* Achieve 40%+ scheme application conversion rate from user interactions.

* Maintain 90%+ satisfaction score through multilingual conversational UX.

Success means high engagement, increased scheme adoption, and demonstrable improvement in government service delivery.
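
Because eligibility detection is multi-label, "accuracy" can be measured in more than one way, and the objectives above do not say which one is meant. The sketch below shows two common candidates: exact-match accuracy (the full predicted scheme set must match) and mean Jaccard overlap (partial credit). The function names and sample labels are illustrative assumptions, not part of the project.

```python
from typing import List, Set


def exact_match_accuracy(truth: List[Set[str]], pred: List[Set[str]]) -> float:
    """Share of farmers whose predicted scheme set equals the true set exactly."""
    return sum(t == p for t, p in zip(truth, pred)) / len(truth)


def mean_jaccard(truth: List[Set[str]], pred: List[Set[str]]) -> float:
    """Average overlap |T & P| / |T | P| per farmer, giving partial credit."""
    scores = []
    for t, p in zip(truth, pred):
        union = t | p
        # Two empty sets count as a perfect match.
        scores.append(len(t & p) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


# Hypothetical labels for two farmers.
truth = [{"scheme_a", "scheme_b"}, {"scheme_c"}]
pred = [{"scheme_a"}, {"scheme_c"}]

print(exact_match_accuracy(truth, pred))  # 0.5
print(mean_jaccard(truth, pred))          # 0.75
```

The same predictions score 50% on one metric and 75% on the other, so the 80% target is only meaningful once the metric is fixed.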

## Risks and Challenges





1. Context Management: Misinterpreting user inputs in multi-turn conversations can lead to incorrect scheme recommendations. Mitigated by robust NLP pipelines and context tracking with fallback handling.

2. Data Privacy: Handling personal and land data poses privacy risks. Addressed through encryption, consent mechanisms, and compliance with data protection norms.

3. Language Ambiguity: Diverse dialects may confuse the voice assistant. Mitigated by regional NLP tuning and continuous speech model refinement.

4. Operational Scalability: Scaling across states with different scheme rules is complex. Mitigated by modularizing scheme logic and localizing configurations.
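
The context-tracking mitigation in item 1 could look roughly like the slot-filling sketch below. It assumes an upstream NER step that returns entities plus a confidence score. The slot names, the `0.6` threshold, and the returned action strings are all hypothetical choices for illustration.

```python
from typing import Dict, Optional

# Assumed slot names; the real set would come from the scheme schema.
REQUIRED_SLOTS = ("state", "crop", "land_hectares")


class ConversationContext:
    """Accumulates entities across turns so earlier answers are not lost."""

    def __init__(self) -> None:
        self.slots: Dict[str, str] = {}

    def update(self, entities: Dict[str, str]) -> None:
        # Store only known, non-empty slots; a later turn may correct an earlier one.
        for name, value in entities.items():
            if name in REQUIRED_SLOTS and value:
                self.slots[name] = value

    def next_missing(self) -> Optional[str]:
        return next((s for s in REQUIRED_SLOTS if s not in self.slots), None)


def handle_turn(ctx: ConversationContext, entities: Dict[str, str],
                confidence: float, threshold: float = 0.6) -> str:
    """Return the next action: 'fallback', 'ask:<slot>', or 'recommend'."""
    if confidence < threshold:
        # Low-confidence input is not stored; the user is asked to repeat.
        return "fallback"
    ctx.update(entities)
    missing = ctx.next_missing()
    return f"ask:{missing}" if missing else "recommend"


ctx = ConversationContext()
print(handle_turn(ctx, {"crop": "paddy"}, confidence=0.4))  # fallback
print(handle_turn(ctx, {"crop": "paddy"}, confidence=0.9))  # ask:state
print(handle_turn(ctx, {"state": "Bihar", "land_hectares": "1.5"}, confidence=0.8))  # recommend
```

Discarding low-confidence turns rather than storing them is one way to keep a misheard answer from contaminating later recommendations.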

## Task Type



Sanchalak involves a multi-label classification task to identify eligible government schemes based on user input (land, crop, location, income, etc.). It also includes NER (Named Entity Recognition) and intent classification for voice query understanding.
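
To make the multi-label framing concrete, here is a minimal rule-based baseline, not the trained classifier itself: each scheme is one label, and a farmer profile can match any number of them. The scheme names, profile fields, and thresholds are invented for illustration and do not reflect real eligibility rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class FarmerProfile:
    state: str
    land_hectares: float
    crop: str
    annual_income_inr: int


# Hypothetical schemes and thresholds, for illustration only.
RULES: Dict[str, Callable[[FarmerProfile], bool]] = {
    "smallholder_income_support": lambda p: p.land_hectares <= 2.0,
    "paddy_crop_insurance": lambda p: p.crop == "paddy",
    "low_income_input_subsidy": lambda p: p.annual_income_inr < 100_000,
}


def eligible_schemes(profile: FarmerProfile) -> List[str]:
    """Return every scheme label whose rule matches (multi-label output)."""
    return sorted(name for name, rule in RULES.items() if rule(profile))


profile = FarmerProfile(state="Bihar", land_hectares=1.5,
                        crop="paddy", annual_income_inr=150_000)
print(eligible_schemes(profile))
# ['paddy_crop_insurance', 'smallholder_income_support']
```

A baseline like this can also serve as a sanity check against which a learned classifier's predictions are compared.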



## Data



Sanchalak uses semi-structured text data extracted from government scheme PDFs and portals using Tesseract OCR and BeautifulSoup, then standardized into a schema-spec.yml format. This includes eligibility criteria, benefits, documents required, and application processes.
Data is manually validated and labeled to align with NER and classification needs. It will be refreshed quarterly as new schemes are launched or updated.
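
The manual validation step could be supported by a small structural check on each scheme record before it is accepted. The sketch below assumes a record with one key per content area named above; the key names and the sample record are assumptions, since the actual layout of `schema-spec.yml` is not shown here.

```python
from typing import Any, Dict, List

# Assumed keys, mirroring the four content areas named above.
REQUIRED_KEYS = ("eligibility", "benefits", "documents_required", "application_process")


def validate_scheme(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    if not str(record.get("name", "")).strip():
        problems.append("missing name")
    for key in REQUIRED_KEYS:
        value = record.get(key)
        if not isinstance(value, list) or not value:
            problems.append(f"'{key}' must be a non-empty list")
    return problems


# Hypothetical record, e.g. as loaded from a parsed YAML file.
record = {
    "name": "Example Scheme",
    "eligibility": ["land_hectares <= 2"],
    "benefits": ["Direct income support"],
    "documents_required": [],  # OCR missed this section
    "application_process": ["Register on the portal"],
}
print(validate_scheme(record))
# ["'documents_required' must be a non-empty list"]
```

Records that fail the check can be routed to the annotators instead of silently entering the scheme database.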



## Plan / Roadmap



**Phase 1: Problem & Data Understanding (Weeks 1-2)**

* Finalize scheme categories, user personas, and data schema.


* Clean and validate semi-structured scheme data.

**Phase 2: MVP Development (Week 3)**

* Build voice-to-text + intent/NER pipeline.

* Deploy scheme classifier and response generator.

* Integrate with WhatsApp/Telegram via Twilio/Bot API.

**Phase 3: Pilot Launch (Week 4)**

* Onboard ~1,000 farmers across 2 districts.

* Collect feedback on UX, accuracy, and language gaps.

**Phase 4: Scale and Optimize (Weeks 5-6)**

* Retrain with new utterances, improve context handling.

* Expand scheme coverage and add 3 more languages.


## Continuous Improvement



The system will improve through active learning from user interactions and feedback loops. Misclassifications and new utterances will be logged and used to retrain the NER and classification models every month. Scheme database updates will be automated, with a quarterly review to keep them current. Voice model accuracy will be improved over time by refining the speech models on region-specific accents.
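
One simple way to decide which logged utterances go to annotators before the monthly retraining is least-confidence sampling, a common active-learning strategy. The sketch below is an assumption about how that selection might work; the function name, the `0.7` threshold, and the sample predictions are illustrative.

```python
from typing import Dict, List, Tuple


def select_for_review(predictions: List[Tuple[str, Dict[str, float]]],
                      threshold: float = 0.7, limit: int = 100) -> List[str]:
    """Pick utterances whose top intent probability is below the threshold.

    Results are ordered least confident first and capped at `limit`.
    """
    scored = [(max(probs.values()), text) for text, probs in predictions if probs]
    uncertain = sorted((score, text) for score, text in scored if score < threshold)
    return [text for _, text in uncertain[:limit]]


# Hypothetical logged predictions: (utterance, intent -> probability).
logged = [
    ("what schemes for paddy", {"find_scheme": 0.95, "track_status": 0.05}),
    ("my form where", {"find_scheme": 0.45, "track_status": 0.55}),
    ("money not come", {"find_scheme": 0.35, "track_status": 0.65}),
]
print(select_for_review(logged))
# ['my form where', 'money not come']
```

Capping the queue with `limit` keeps the labeling workload within what part-time annotators can handle each cycle.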

## Resources – Human



* NLP/ML Engineer (1–2): To develop and maintain classification, NER, and speech models.

* Full-Stack Developer (1): For API, bot integration, and dashboard support.

* Domain Expert (1): For interpreting government schemes and maintaining the schema-spec.yml.

* Data Annotator (1–2, part-time): For validating OCR data and labeling utterances.

Total: 4–6 people.



## Resources – Compute



* Training: Requires a GPU (e.g., NVIDIA T4 or A100) for model fine-tuning (NER, classifier, ASR).

* Serving: CPU-based inference is sufficient for the WhatsApp/Telegram deployment, with a fallback to a lightweight cloud GPU instance if load grows.

* Hosting on AWS/GCP with scalable container orchestration (e.g., ECS, Cloud Run) is recommended.

