AI Data Analyst: Natural Language to SQL Query Generator

This project is a Python-based application that acts as an AI Data Analyst. It takes a user's question in natural language, converts it into a valid SQL query, and executes it against a database to fetch the answer. This is achieved using a Large Language Model (LLM) through Ollama and the LangChain framework.

Features

  • Database Setup: Includes a script to create a sample SQLite database (amazon.db) with tables for Customers, Products, Orders, and Order Items, and populates them with dummy data.
  • Web Interface: A user-friendly web interface built with Streamlit to interact with the AI Data Analyst.
  • Schema Extraction: Automatically inspects the database to extract its schema (tables and columns); a short sketch of this step appears after this list.
  • Text-to-SQL Conversion: Utilizes the deepseek-r1:8b model via Ollama and LangChain to convert a natural language prompt into an SQL query based on the extracted schema.
  • Query Execution: Connects to the database, runs the generated SQL query, and retrieves the results.
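
As a quick illustration of the Schema Extraction step, the snippet below is a minimal sketch (not the repository's exact code) that builds a compact schema description with SQLAlchemy's inspector, assuming amazon.db already exists:

from sqlalchemy import create_engine, inspect

# Describe each table as "Table(col1, col2, ...)" so the text can be fed into the LLM prompt.
engine = create_engine("sqlite:///amazon.db")
inspector = inspect(engine)
schema = "\n".join(
    f"{table}({', '.join(col['name'] for col in inspector.get_columns(table))})"
    for table in inspector.get_table_names()
)
print(schema)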

Project Structure

AI_Data_Analyst/
├── create_database.py   # Script to create and populate the database
├── main.py              # Main application for Text-to-SQL logic
├── frontend.py          # Streamlit web interface for the application
├── amazon.db            # SQLite database file (generated by create_database.py)
└── README.md            # This file

Prerequisites

Before you begin, ensure you have the following installed:

  1. Python 3.8+
  2. Ollama: You need to have Ollama installed and running on your machine. You can download it from ollama.com.

Setup and Execution

Follow these steps to set up and run the project.

1. Clone the Repository

Clone this repository (or download it) and navigate into the project directory.
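
For example, assuming the repository is hosted at github.com/naresh5422/AI_Data_Analyst:

git clone https://github.com/naresh5422/AI_Data_Analyst.git
cd AI_Data_Analyst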

2. Install Required Python Packages

It's recommended to use a virtual environment.

# Create and activate a virtual environment (optional but recommended)
python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

# Install dependencies (streamlit is needed for the frontend.py web interface)
pip install sqlalchemy langchain-ollama langchain-core streamlit

3. Set up the LLM with Ollama

Make sure the Ollama application is running. Then, pull the deepseek-r1:8b model by running the following command in your terminal:

ollama pull deepseek-r1:8b
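
To confirm the model is reachable from Python, an optional quick check (not part of the project files) is to invoke it once through langchain-ollama and look for any reply:

from langchain_ollama import ChatOllama

# Any response here means Ollama is running and deepseek-r1:8b is available locally.
llm = ChatOllama(model="deepseek-r1:8b")
print(llm.invoke("Reply with the single word OK.").content)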

4. Create the Database

From the project directory, run the create_database.py script to generate the amazon.db file and populate it with sample data:

python create_database.py

You should see the output: Database and tables created with dummy data.
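
For reference, the script boils down to something like the sketch below; the actual table definitions and dummy rows live in create_database.py, so the columns and values shown here are simplified assumptions:

import sqlite3

# Simplified sketch: the real script creates Customers, Products, Orders,
# and Order Items tables and inserts a fuller set of dummy rows.
conn = sqlite3.connect("amazon.db")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE IF NOT EXISTS Customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        email       TEXT
    )
""")
cur.executemany(
    "INSERT INTO Customers (name, email) VALUES (?, ?)",
    [("Alice Johnson", "alice@example.com"), ("Bob Smith", "bob@example.com")],
)
conn.commit()
conn.close()
print("Database and tables created with dummy data.")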

5. Run the Main Application

Execute the main.py script to perform the Text-to-SQL conversion and query the database.

python main.py

The script will use the default prompt "Tell me the names of all the customers" to generate an SQL query, execute it, and print the results.

Example Output:

Result: [('Alice Johnson',), ('Bob Smith',), ('Catherine Lee',), ('David Brown',), ('Emma Wilson',), ('Frank Harris',), ('Grace Taylor',), ('Henry Davis',), ('Isabella Moore',), ('Jack Martin',)]
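
Under the hood, the flow in main.py can be sketched roughly as follows; the prompt wording, variable names, and output clean-up are assumptions rather than the repository's exact code:

import re

from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama
from sqlalchemy import create_engine, inspect, text

# 1. Extract the schema (tables and columns) from amazon.db.
engine = create_engine("sqlite:///amazon.db")
inspector = inspect(engine)
schema = "\n".join(
    f"{t}({', '.join(c['name'] for c in inspector.get_columns(t))})"
    for t in inspector.get_table_names()
)

# 2. Ask the model to turn the question into SQL for that schema.
prompt = ChatPromptTemplate.from_template(
    "Given this SQLite schema:\n{schema}\n"
    "Return only a SQL query that answers: {question}"
)
llm = ChatOllama(model="deepseek-r1:8b")
raw = (prompt | llm).invoke(
    {"schema": schema, "question": "Tell me the names of all the customers"}
).content

# 3. deepseek-r1 emits its reasoning in <think> tags and often wraps the
#    answer in markdown fences, so strip both before running the query.
sql = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)
sql = sql.replace("```sql", "").replace("```", "").strip()

# 4. Execute the generated query and print the rows.
with engine.connect() as conn:
    print("Result:", conn.execute(text(sql)).fetchall())

6. Run the Web Interface

To try the Streamlit interface in frontend.py, make sure streamlit is installed (it is included in the pip command from step 2), then launch the app with Streamlit's standard run command and open the local URL it prints:

streamlit run frontend.py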
