Welcome to the Text to SQL Chatbot project! This project bridges the gap between non-technical team members and database interactions, allowing users to query databases using natural language instead of SQL.
π Table of Contents
- π Project Overview
- π§ Features
- π οΈ Installation
- π Usage
- πΌ Architecture
- π Evaluation
- π License
In many companies, team members need access to data stored in SQL databases but lack the technical skills to write SQL queries. This chatbot allows users to ask questions in natural language, converts those questions into SQL queries, and retrieves results from the database.
The project uses Google's Gemini AI model to interpret natural language questions and generate appropriate SQL queries for a MySQL database containing sales, customer, and product data.
- Natural Language to SQL Conversion: Uses Google Gemini AI to convert user questions into SQL statements
- MySQL Database Integration: Connects directly to a MySQL database to execute generated queries
- CSV Data Support: Works with data imported from CSV files into MySQL tables
- Structured Query Generation: Creates precise SQL queries based on the database schema
- Jupyter Notebook Implementation: Complete solution implemented in a Jupyter notebook for easy modification
To get started with the Text to SQL Chatbot, follow these steps:
-
Clone the Repository:
git clone https://github.com/yourusername/Text-to-SQL-Chatbot.git cd Text-to-SQL-Chatbot
-
Install Required Libraries:
pip install langchain langchain_google_genai pymysql
-
Set Up MySQL Database:
- Create a MySQL database named
text_to_sql
- Import the CSV files from the
Data_CSV
folder into corresponding tables - Ensure the database schema matches the one referenced in the notebook
- Create a MySQL database named
-
Configure API Keys:
- Obtain a Google Gemini API key
- Replace
'API_KEY_HERE'
in the notebook with your actual API key
- Open the
Gemini Chatbot.ipynb
notebook in Jupyter - Update the database connection parameters if needed:
host = 'localhost' port = '3306' username = 'root' password = 'root' database_schema = 'text_to_sql'
- Run all cells in the notebook
- Enter your natural language questions when prompted
- The system will convert your question to SQL, execute it, and return the results
Example questions:
- "What is the total 'Line Total' for Geiss Company?"
- "Show me the top 5 products by sales amount"
- "What are the sales figures for the South region in 2021?"
The architecture of the Text to SQL Chatbot consists of the following components:
- Data Source: CSV files containing sales, customer, product, and regional data
- Database: MySQL database where the data is stored in structured tables
- LangChain Framework: Orchestrates the workflow between components
- Google Gemini AI: Processes natural language and generates SQL queries
- SQL Execution Engine: Runs the generated queries against the database
- Result Formatter: Returns the query results in a readable format
The performance of the chatbot can be evaluated using various metrics, including:
- Query Accuracy: How accurately the AI generates the correct SQL queries
- Response Time: The time taken to return results after a question is asked
- Schema Understanding: How well the system understands the database structure
- Complex Query Handling: Ability to handle joins, aggregations, and filters
This project is licensed under the MIT License. See the LICENSE file for more details.