An AI-driven chatbot for the INGRES (India Ground Water Resource Estimation System) platform that enables users (researchers, planners, policymakers, and the public) to query groundwater resource data using natural language.
The system provides:
- Natural language query interface for groundwater data
- Support for multiple states and districts across India
- Historical trends and comparisons
- Automated data visualization
- RESTful API for integration

Supported query types, with example queries:
- Database Query (Text only)
  - What is the groundwater status in Central Delhi in 2024
  - groundwater extraction in Haridwar 2023
  - rainfall in Dehradun district
  - Compare Delhi and Uttarakhand in 2022
- LLM Query (Knowledge from documents)
  - What is groundwater recharge?
  - Explain the difference between confined and unconfined aquifers
  - What does semi-critical groundwater status mean?
  - How is the stage of groundwater extraction calculated?
  - Tell me about groundwater sustainability
  - What factors affect groundwater availability?
- Hybrid Query (Text + Graph)
  - Show rainfall chart for Central Delhi 2024
  - Plot extraction trend in Dehradun from 2021 to 2024
  - Display bar graph of groundwater stage in Haridwar
  - Visualize rainfall data for Delhi 2024
  - Create a graph showing extraction in Uttarakhand districts
  - Show me a chart of water availability in North Delhi
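
To illustrate how a query might be routed to one of the three types, here is a minimal sketch. It is not the project's actual routing logic: the function name `classify_query`, the chart keywords, and the `KNOWN_PLACES` set are all assumptions made for this example.

```python
import re

# Illustrative cues only; the real parser's rules are not shown in this README.
CHART_PATTERN = re.compile(r"\b(chart|plot|graph|visuali[sz]e)\b", re.IGNORECASE)
YEAR_PATTERN = re.compile(r"\b20\d{2}\b")
# Hypothetical place list; a real system would read it from the database.
KNOWN_PLACES = {"delhi", "uttarakhand", "mizoram", "kerala", "haridwar", "dehradun"}


def classify_query(text: str) -> str:
    """Return 'hybrid', 'database', or 'llm' for a user query."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    mentions_data = bool(words & KNOWN_PLACES) or bool(YEAR_PATTERN.search(text))
    if CHART_PATTERN.search(text):
        return "hybrid"      # a chart was requested (text + graph)
    if mentions_data:
        return "database"    # a place or a year points at stored records
    return "llm"             # conceptual question, answered from documents
```

Under these assumed rules, a chart keyword takes priority, so "Show rainfall chart for Central Delhi 2024" is treated as hybrid even though it also names a place and a year.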
```
D:\INGRES
├── app/
│   ├── __init__.py
│   ├── main.py          # FastAPI application
│   ├── database.py      # MongoDB connection and queries
│   ├── parser.py        # Natural language query parser
│   └── visualizer.py    # Data visualization generator
│
├── data/
│   ├── insert_db.py     # Script to insert JSON data into MongoDB
│   ├── Delhi_2022-23.json
│   ├── Delhi_2024-25.json
│   ├── UK_2023-24.json  # Uttarakhand data
│   ├── Mizoram_2022-23.json
│   └── Kerala_2022-23.json
│
└── README.md
```
- Database: MongoDB (local/cloud instance)
- Backend: Python 3.8+ with FastAPI
- NLP: Custom query parser with regex-based entity extraction
- Visualization: Matplotlib, Pandas
- API Documentation: Auto-generated Swagger UI via FastAPI
1️⃣ Set Up the Python Environment
```bash
# Create virtual environment
python -m venv .venv
# Activate (Windows)
.venv\Scripts\activate
# Activate (Linux/Mac)
source .venv/bin/activate
# Install dependencies
pip install fastapi uvicorn pymongo pandas matplotlib
```
2️⃣ Install and Start MongoDB
Download MongoDB Community Edition
Start MongoDB service:
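```bash
# Windows
net start MongoDB
# Linux (systemd)
sudo systemctl start mongod
# macOS (Homebrew install; the formula name may carry a version suffix)
brew services start mongodb-community
```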
3️⃣ Load Groundwater Data
```bash
cd D:\INGRES
python data/insert_db.py
```
This will:
- Connect to MongoDB at mongodb://localhost:27017
- Create database: ingres_bot
- Create collection: groundwater_state
- Insert groundwater data from JSON files
- Create a unique index on the combination of STATE, DISTRICT, and assessment_year
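
The loading steps can be sketched roughly as follows. This is not the contents of `data/insert_db.py`: the function names, the assumption that each JSON file holds an array of records, and the choice of a single compound unique index with upserts are all illustrative. The `collection` argument is expected to behave like a pymongo collection (`create_index` and `update_one` are real pymongo methods).

```python
import json
from pathlib import Path

# Key fields named in this README; one document per state, district, and year.
KEY_FIELDS = ("STATE", "DISTRICT", "assessment_year")


def upsert_records(collection, records):
    """Insert or update each record; returns how many were processed."""
    # Compound unique index (1 means ascending order in pymongo).
    collection.create_index([(field, 1) for field in KEY_FIELDS], unique=True)
    for record in records:
        key = {field: record[field] for field in KEY_FIELDS}
        # Upsert so that re-running the loader does not hit duplicate-key errors.
        collection.update_one(key, {"$set": record}, upsert=True)
    return len(records)


def load_file(collection, path):
    """Load one JSON file, assumed to contain a list of record objects."""
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    return upsert_records(collection, records)
```

With pymongo installed, the collection could be obtained as `MongoClient("mongodb://localhost:27017")["ingres_bot"]["groundwater_state"]` and passed to `load_file` once per data file.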
4️⃣ Start the API Server
```bash
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```
The API will be available at:
- API Root: http://localhost:8000
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
📊 Dataset Information
Current Coverage
- States: Delhi, Uttarakhand, Mizoram, Kerala
- Years: 2022-2024
- Districts: 40+ districts across covered states

Data Fields
- Rainfall (mm)
- Groundwater Recharge (ham, i.e. hectare-metres)
- Groundwater Extraction (domestic, industrial, irrigation)
- Stage of Groundwater Extraction (%)
- Annual Extractable Resources
- Net Future Availability

Field Naming Convention
- NC = Non-Confined aquifer (Delhi, Mizoram, Kerala)
- C = Confined aquifer (Uttarakhand)
- The system automatically handles both conventions
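
One way such normalization could work is sketched below. The actual field names in the JSON files are not shown in this README, so the `_NC` / `_C` suffix layout, the `normalize_record` function, and the `aquifer_convention` key are all hypothetical.

```python
import re

# Hypothetical naming: a metric name followed by "_NC" or "_C".
# The real field names in the data files may be laid out differently.
SUFFIX = re.compile(r"_(NC|C)$")


def normalize_record(record: dict) -> dict:
    """Return a copy whose suffixed keys are renamed to the bare metric name."""
    normalized = {}
    for key, value in record.items():
        match = SUFFIX.search(key)
        if match:
            # Drop the suffix and remember which convention the record used.
            normalized[key[: match.start()]] = value
            normalized.setdefault("aquifer_convention", match.group(1))
        else:
            normalized[key] = value
    return normalized
```

After a step like this, query code can read a single key per metric regardless of which state the record came from.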
🔌 API Endpoints
Main Query Endpoint
GET /query?user_query=What is the groundwater level in Central Delhi in 2024?
Response:
```json
{
  "query": "What is the groundwater level in Central Delhi in 2024?",
  "response": "Groundwater Status for CENTRAL district, DELHI in 2024:\n\nExtraction Stage: 75.62%\nStatus: Semi-critical...",
  "data_count": 1,
  "summary": {
    "total_records": 1,
    "states": ["DELHI"],
    "districts": ["CENTRAL"],
    "years": [2024]
  }
}
```
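
When calling the endpoint from code, the query text needs to be URL-encoded. The sketch below uses only the Python standard library; the helper names `build_query_url` and `ask` are illustrative, and the base URL assumes the server was started on port 8000 as described above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # assumes the local uvicorn server


def build_query_url(user_query: str, base_url: str = BASE_URL) -> str:
    """Return the /query URL with user_query percent-encoded."""
    return f"{base_url}/query?{urlencode({'user_query': user_query})}"


def ask(user_query: str, base_url: str = BASE_URL) -> dict:
    """Send the query and decode the JSON body (requires a running server)."""
    with urlopen(build_query_url(user_query, base_url)) as response:
        return json.load(response)
```

For example, `build_query_url("Delhi groundwater 2024")` returns `http://localhost:8000/query?user_query=Delhi+groundwater+2024`.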
Other Endpoints
- GET /states - List all available states
- GET /districts?state=DELHI - List districts for a state
- GET /years - Get available years
- GET /compare - Compare multiple states/districts
- GET /trends - Get time series data
- GET /health - Health check
💬 Example Queries
- What is the groundwater level in Central Delhi in 2024?
- Compare Delhi and Uttarakhand groundwater in 2023
- Show rainfall data for Dehradun district
- What is the extraction stage in Haridwar?
- Compare North and South districts in Delhi
🧩 System Features
Natural Language Processing
- Extracts states, districts, years, and metrics from queries
- Supports comparison queries (vs, compare, between)
- Handles year ranges (2022-2024)
- Detects multiple entities for complex queries
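
A minimal sketch of regex-based extraction along these lines is shown below. It is not the code in `app/parser.py`: the patterns, the `extract_entities` function, and the hard-coded `STATES` tuple are assumptions for illustration, and district and metric extraction are left out for brevity.

```python
import re

YEAR_RANGE = re.compile(r"\b(20\d{2})\s*(?:-|to)\s*(20\d{2})\b")
YEAR = re.compile(r"\b20\d{2}\b")
COMPARISON = re.compile(r"\b(vs|versus|compare|between)\b", re.IGNORECASE)
# Hypothetical lookup; a real parser would load state names from the database.
STATES = ("delhi", "uttarakhand", "mizoram", "kerala")


def extract_entities(text: str) -> dict:
    """Pull states, years, and a comparison flag out of a free-text query."""
    lowered = text.lower()
    span = YEAR_RANGE.search(text)
    if span:
        # Expand "2021 to 2024" or "2021-2024" into every year in the range.
        years = list(range(int(span.group(1)), int(span.group(2)) + 1))
    else:
        years = sorted({int(y) for y in YEAR.findall(text)})
    return {
        "states": [s for s in STATES if re.search(rf"\b{s}\b", lowered)],
        "years": years,
        "is_comparison": bool(COMPARISON.search(text)),
    }
```

Matching whole words (`\b`) keeps a state name from being detected inside a longer word, and a query that names two states is naturally reported as having multiple entities.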
Smart Response Generation
- Converts technical data into human-readable explanations
- Provides status interpretations (Safe, Semi-critical, Over-exploited)
- Includes breakdowns for domestic, irrigation, and industrial use
- Offers actionable insights
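
The status interpretation can be illustrated with a short sketch. The thresholds below are the ones commonly cited from the GEC-2015 assessment methodology (which also defines a "Critical" band between Semi-critical and Over-exploited); whether this project uses exactly these cut-offs is an assumption, and both function names are illustrative.

```python
def stage_of_extraction(annual_extraction: float, extractable_resource: float) -> float:
    """Stage of groundwater extraction as a percentage of the extractable resource."""
    if extractable_resource <= 0:
        raise ValueError("extractable_resource must be positive")
    return 100.0 * annual_extraction / extractable_resource


def categorize(stage_percent: float) -> str:
    """Map a stage percentage to a status label (assumed GEC-2015 bands)."""
    if stage_percent <= 70:
        return "Safe"
    if stage_percent <= 90:
        return "Semi-critical"
    if stage_percent <= 100:
        return "Critical"
    return "Over-exploited"
```

With these bands, the 75.62% stage in the sample response above maps to "Semi-critical", which matches the status shown there.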
Data Flexibility
- Handles both Confined (C) and Non-Confined (NC) aquifer data
- Automatically normalizes field names across states
- Supports adding new states and districts without changes to the query code
👨‍💻 Development
Adding New States
1. Place the JSON file in the data/ folder
2. Update data/insert_db.py to include the new file
3. Run: python data/insert_db.py
4. The system automatically detects the new states/districts
Testing Queries
```bash
# Test basic query
curl "http://localhost:8000/query?user_query=Delhi%20groundwater%202024"
# Get raw data
curl "http://localhost:8000/query?user_query=Delhi%202024&include_raw=true"
# Check available states
curl http://localhost:8000/states
```
📝 Notes
- The system is state-agnostic: new states and districts are picked up from the loaded data rather than hard-coded
- MongoDB indexes keep lookups by state, district, and year fast as the dataset grows
- The API follows REST principles and returns standard HTTP status codes
- All responses include natural language explanations for non-technical users
🔮 Future Enhancements
- Real-time data updates from the INGRES portal
- Multi-language support
- Historical trend analysis with ML predictions
- Export reports in PDF/Excel formats
👥 Team
- Backend developers working on Python/FastAPI: Nitin, Nakul
- Frontend team integrating with the API endpoints: Krishna, Prakriti
- Data team managing groundwater datasets: Sarovar, Rajan