AI For BI SQL Server using LangGraph

This repository provides an advanced, AI agent-powered T-SQL assistant for Microsoft SQL Server 2019, leveraging LangChain, LangGraph, Azure OpenAI, and Python data visualization. The assistant can interpret natural language business intelligence (BI) questions, generate optimized SQL queries, execute them, explain results, and produce charts when possible.

Features

Natural Language to SQL: Converts business questions into correct, efficient T-SQL queries for SQL Server 2019.
Schema-aware: Uses schema metadata to ensure queries only reference valid tables/columns.
Error Handling & Retries: Attempts to correct failed queries using error feedback.
Result Explanation: Provides clear, human-readable answers based on query results.
Automated Chart Generation: Attempts to visualize results using Matplotlib and Pandas, with chart type chosen via LLM.
Workflow Orchestration: Modular agent workflow powered by LangGraph.

Code Structure Overview

The core logic resides in app.ipynb, which orchestrates the following workflow:

Environment Setup
- Loads connection strings and Azure OpenAI credentials from environment variables.
- Establishes SQL Server connection using SQLAlchemy.
Schema Extraction
- Retrieves schema metadata from [ExecutiveDashboard].[dbo].[schema_metadata].
- Formats schema info for LLM prompting.
Agent State Modeling
- Defines AgentState (using Pydantic) to track question, schema, SQL code, result, answers, chart code, and more.
LangGraph Workflow Nodes
- Relevance Check: Determines if the question relates to warehouse data.
- Schema Fetch: Loads schema info into state.
- SQL Query Generation: Uses LLM to translate the question + schema into a T-SQL query (with NOLOCK hints).
- Query Execution: Runs SQL and records result/errors.
- Validation: Checks for execution errors.
- Retry: If errors, prompts the LLM for corrections (up to MAX_RETRIES).
- Result Explanation: Uses LLM to summarize the result.
- Visualization: If possible, LLM generates Python code for a chart and the notebook executes it, embedding the image.
Orchestration
- State transitions are managed via LangGraph's StateGraph, with conditional edges for retries and error handling.
Usage Example
- Example queries show how to run the assistant and display answers, SQL, and charts.

Installation

Clone the repository

git clone https://github.com/UtkPatAI25/AI_For_BI_SQL_Server_using_Langgraph.git
cd AI_For_BI_SQL_Server_using_Langgraph

Install dependencies
- Recommended: Use a virtual environment.
```
pip install -r requirements.txt
```
- Required packages include:
  - python-dotenv
  - sqlalchemy
  - pandas
  - matplotlib
  - langchain
  - langgraph
  - langchain-openai
  - pydantic
  - azure-openai (if needed, for Azure LLM access)

Set up environment variables

Create a .env file with:

py-connectionString=your_sqlalchemy_connection_string
AZURE_OPENAI_DEPLOYMENT_NAME=your_azure_deployment
AZURE_OPENAI_ENDPOINT=your_azure_endpoint
AZURE_OPENAI_API_KEY=your_api_key
AZURE_OPENAI_API_VERSION=2024-12-01-preview

Usage

Open app.ipynb in Jupyter Notebook or VS Code, and run the cells. You can interact with the assistant using the provided examples:

answer, sql_code, final_state = run_sql_assistant("Top 10 distributor by sales in May 2025")
print("LLM Answer:\n", answer)
print("SQL Query Executed:\n", sql_code)

if final_state.get("viz_image"):
    from IPython.display import Image
    import base64
    display(Image(data=base64.b64decode(final_state["viz_image"])))
elif final_state.get("viz_reason"):
    print("No chart generated:", final_state["viz_reason"])

How It Works

Flow Diagram

Node Details

check_relevance_node: Filters out unrelated questions.
get_schema_node: Loads schema info for context.
build_query_node: Prompts LLM to generate SQL (schema-constrained, NOLOCK, T-SQL only).
execute_query_node: Runs SQL against database.
validate_node: Looks for errors in execution.
retry_query_node: Attempts to fix failed queries using error feedback.
explain_result_node: LLM generates a human-readable answer from result.
analyze_and_generate_visualization_node: Asks LLM to write Python code for charts, executes code, and embeds image.

Security & Reliability

SQL queries are strictly limited to schema metadata.
NOLOCK hints are enforced to avoid locks.
LLM-generated Python code for charts is executed in a controlled environment.
Error handling, retries, and clear explanations are integrated.

Customization

Schema Source: Adjust the schema extraction query as needed.
LLM Prompting: Modify system prompts for different SQL dialects or business rules.
Visualization Code: Extend chart support by customizing the LLM prompt and chart execution logic.

Limitations

Requires up-to-date schema metadata in [schema_metadata] table.
LLM accuracy depends on prompt quality and model configuration.
Visualization is limited to what the LLM can infer from sample data and columns.

License

MIT License (see LICENSE file).

Contact

For questions, issues, or contributions, open an issue or reach out via GitHub.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
app.ipynb		app.ipynb
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

AI For BI SQL Server using LangGraph

Features

Code Structure Overview

Installation

Usage

How It Works

Flow Diagram

Node Details

Security & Reliability

Customization

Limitations

License

Contact

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

AI For BI SQL Server using LangGraph

Features

Code Structure Overview

Installation

Usage

How It Works

Flow Diagram

Node Details

Security & Reliability

Customization

Limitations

License

Contact

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages