# AI Data Science Assistant Chatbot 

This project is a local, cost-free data science assistant built with LangChain, pandas, and the Llama 3 model served through Ollama.
Users provide a CSV file and ask natural-language questions to extract insights from structured data; no hand-written analysis code or API keys are required.
The project showcases Retrieval-Augmented Generation (RAG) and tool-augmented LLMs to simulate a conversational data analyst
capable of answering statistical, comparative, and exploratory questions over real-world datasets.


In [None]:
# Make sure to have:
# 1. Python installed
# 2. Ollama installed from https://ollama.com
# 3. Run this in a Jupyter notebook OR as a script

# Uncomment below if needed 
# !pip install pandas langchain langchain-community langchain-experimental tabulate

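Prerequisite 2 can be checked from Python before going further. The following is a minimal sketch, assuming Ollama listens on its default local address (`http://localhost:11434`) and that the model was pulled under the name `llama3`; the helper name `ollama_has_model` is hypothetical and not part of any library.

```python
import json
import urllib.error
import urllib.request

# Assumed default address of a local Ollama server; adjust if yours differs.
OLLAMA_URL = "http://localhost:11434"


def ollama_has_model(model: str = "llama3",
                     base_url: str = OLLAMA_URL,
                     timeout: float = 2.0) -> bool:
    """Return True if the server responds and lists a model named `model`."""
    try:
        # GET /api/tags lists the models available locally.
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Server not reachable, timed out, or returned something unexpected.
        return False
    names = [m.get("name", "") for m in payload.get("models", [])]
    # Model names usually carry a tag such as "llama3:latest"; compare the base name.
    return any(name.split(":")[0] == model for name in names)


print("llama3 available:", ollama_has_model())
```

If this prints `False`, starting the Ollama application and running `ollama pull llama3` in a terminal is the likely fix.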
In [None]:
import pandas as pd
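# In LangChain releases that ship langchain_community, the pandas agent lives in langchain_experimental
from langchain_experimental.agents import create_pandas_dataframe_agent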
from langchain_community.llms import Ollama

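The loading step expects a file named `sample.csv` in the working directory. If no dataset is at hand, a small stand-in file can be generated first. This is only a sketch: the column names `country`, `sales`, and `revenue` and all values are hypothetical, chosen to match the example questions asked later, and are not taken from the original dataset.

```python
import pandas as pd

# Hypothetical example data; columns and values are assumptions for illustration.
sample = pd.DataFrame(
    {
        "country": ["USA", "USA", "Germany", "Germany", "Japan", "Japan"],
        "sales": [120, 80, 150, 130, 90, 110],
        "revenue": [2400.0, 1600.0, 3300.0, 2860.0, 1980.0, 2420.0],
    }
)

# Write the file that the loading step reads; index=False avoids an extra unnamed column.
sample.to_csv("sample.csv", index=False)
```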
In [None]:
# Step 1: Load your CSV dataset
# Replace this with your actual file path
csv_path = "sample.csv"  # Example file name; the file must exist in the working directory
df = pd.read_csv(csv_path)

# Preview the data
print("Dataset preview:")
print(df.head())

In [None]:
# Step 2: Initialize the local LLaMA 3 model via Ollama
# Make sure the Ollama server is running and the model has been downloaded
# (for example with 'ollama pull llama3' or 'ollama run llama3' in a terminal)
llm = Ollama(model="llama3")

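# Step 3: Create the LangChain agent to interface with your DataFrame
# Note: the agent executes model-generated Python code. Recent langchain-experimental
# releases require opting in explicitly; drop the flag if your version rejects it.
agent = create_pandas_dataframe_agent(llm, df, verbose=True, allow_dangerous_code=True)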

# Step 4: Ask a question about your data
question = "Which country has the highest average sales?"

print("\nQuestion:")
print(question)
print("\nAnswer:")
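# invoke() replaces the deprecated run(); the answer is under the "output" key
response = agent.invoke({"input": question})["output"]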
print(response)

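Because the agent's answer comes from model-generated code, it is worth cross-checking against a direct pandas computation. A minimal sketch, assuming the dataset has `country` and `sales` columns (hypothetical names; the small DataFrame below only stands in for the loaded `df`):

```python
import pandas as pd

# Stand-in for the loaded DataFrame; column names and values are assumptions.
df_check = pd.DataFrame(
    {
        "country": ["USA", "USA", "Germany", "Germany", "Japan", "Japan"],
        "sales": [120, 80, 150, 130, 90, 110],
    }
)

# Average sales per country, then the country with the largest average.
avg_sales = df_check.groupby("country")["sales"].mean()
top_country = avg_sales.idxmax()

print(avg_sales)
print("Highest average sales:", top_country)
```

With a real dataset, replacing `df_check` with `df` and adjusting the column names gives a reference value to compare with the agent's response.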
In [None]:
# Step 5: Ask more questions
# Example:
response = agent.invoke({"input": "Show me total revenue for all countries"})["output"]
print("\nAnswer:")
print(response)

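Asking several follow-up questions in a row can be wrapped in a small helper. The sketch below assumes the agent exposes `invoke` and returns a dict with an `"output"` key, as LangChain agent executors do; the function name `ask_all` is hypothetical. Errors are caught per question so that one failed answer (small local models sometimes produce output the agent cannot parse) does not stop the rest.

```python
from typing import Any, Dict, Iterable


def ask_all(agent: Any, questions: Iterable[str]) -> Dict[str, str]:
    """Send each question to the agent and collect the answers by question."""
    answers: Dict[str, str] = {}
    for question in questions:
        try:
            result = agent.invoke({"input": question})
            # Agent executors return a dict; fall back to the raw result otherwise.
            if isinstance(result, dict) and "output" in result:
                answers[question] = str(result["output"])
            else:
                answers[question] = str(result)
        except Exception as exc:  # keep going if one question fails
            answers[question] = f"[error] {exc}"
    return answers
```

Usage would look like `answers = ask_all(agent, ["Show me total revenue for all countries", "How many rows are there?"])`, followed by printing each question and answer pair.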