# Lesson 2: Interacting with CSV Data

## Set up and connect to the Azure OpenAI endpoint

**Note**: The pre-configured cloud resource grants you access to the Azure OpenAI GPT model. The key and endpoint provided are intended for teaching purposes only. Your notebook environment is already set up with the necessary keys, which may differ from those the instructor used during filming. The code in this copy has been changed from `AzureChatOpenAI` to `ChatOpenAI`, so it reads the key from the `OPENAI_API_KEY` environment variable.

In [3]:
import os
import pandas as pd
from IPython.display import Markdown, HTML, display
from langchain.schema import HumanMessage
from langchain_openai import ChatOpenAI  # Changed from AzureChatOpenAI

model = ChatOpenAI(
    model_name="gpt-3.5-turbo-0125",  # or "gpt-4" depending on your access
    api_key=os.getenv("OPENAI_API_KEY"),  # Make sure you have set this environment variable
    #temperature=2
)
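
`os.getenv("OPENAI_API_KEY")` returns `None` when the variable is unset, which can lead to a less obvious error further along. A minimal sketch of a fail-fast check follows; `require_env` is a hypothetical helper name, not part of LangChain or the original lesson.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.getenv(name)
    if not value:
        # Covers both "unset" and "set to an empty string".
        raise RuntimeError(f"Environment variable {name!r} is not set.")
    return value


# In the notebook, this could replace the bare os.getenv call:
# api_key = require_env("OPENAI_API_KEY")
```

Checking up front keeps the error message close to its cause, before any model or agent object is built.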

## Load the dataset

**Note**: To access the data locally, use the following code:

```
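# Create the target folder if it does not exist yet.
os.makedirs("data", exist_ok=True)
# Notebook shell escape: download the CSV into ./data/
!wget https://covidtracking.com/data/download/all-states-history.csv -P ./data/
file_url = "./data/all-states-history.csv"
# Load the CSV and replace missing values with 0.
df = pd.read_csv(file_url).fillna(value=0)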
```

In [None]:
# Point this at your own CSV file; missing values are replaced with 0.
df = pd.read_csv("C:/Users/harik/filename.CSV").fillna(value=0)
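
To see what `.fillna(value=0)` does before handing the data to an agent, here is a small sketch on an in-memory CSV. The sample rows and column names are made up for illustration and are not taken from the lesson's dataset.

```python
import io

import pandas as pd

# Hypothetical two-row sample with one empty cell per row.
sample_csv = "state,positive,death\nCA,10,\nNY,,2\n"

raw = pd.read_csv(io.StringIO(sample_csv))
filled = raw.fillna(value=0)

# Count missing cells before and after filling.
missing_before = int(raw.isna().sum().sum())
missing_after = int(filled.isna().sum().sum())
print(missing_before, missing_after)
print(filled)
```

Filling with 0 keeps sums and comparisons from failing on `NaN`, but it also means the agent cannot tell a true zero from a missing value, which matters for averages.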

## Prepare the LangChain dataframe agent

In [None]:
# Build a pandas dataframe agent backed by the chat model
from langchain.agents.agent_types import AgentType
from langchain_experimental.agents.agent_toolkits import create_pandas_dataframe_agent

agent = create_pandas_dataframe_agent(
    llm=model,
    df=df,
    verbose=True,
    allow_dangerous_code=True  # Explicit opt-in: the agent runs LLM-generated Python code
)

agent.invoke("how many rows are there?")
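
With `verbose=True` the cell prints the agent's intermediate steps, while the call itself returns a result object. A minimal sketch for pulling out only the final answer, assuming the result is a dict with an `"output"` key (the shape returned by recent LangChain agent executors); `extract_answer` is a hypothetical helper name.

```python
def extract_answer(result) -> str:
    """Return the final answer text from an agent result."""
    if isinstance(result, dict):
        # Assumption: the executor returns {"input": ..., "output": ...}.
        return str(result.get("output", ""))
    # Fall back to the string form for any other return type.
    return str(result)


# Stand-in result for illustration; in the notebook you would pass
# the value returned by agent.invoke(...).
fake_result = {"input": "how many rows are there?", "output": "placeholder answer"}
print(extract_answer(fake_result))
```

For a question like the row count, you can compare the extracted answer against `len(df)` to check that the agent computed it rather than guessed.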

## Design your prompt and ask your question

In [None]:
'''
CSV_PROMPT_PREFIX = """
First set the pandas display options to show all the columns,
get the column names, then answer the question.
"""

CSV_PROMPT_SUFFIX = """
- **DO NOT MAKE UP AN ANSWER OR USE PRIOR KNOWLEDGE,
ONLY USE THE RESULTS OF THE CALCULATIONS YOU HAVE DONE**.
- **ALWAYS** explain how you got
to the answer on a section that starts with: "\n\nExplanation:\n".
In the explanation, mention the column names that you used to get
to the final answer.
"""
'''

QUESTION = "what is the profit during year 2024?"

# agent.invoke(CSV_PROMPT_PREFIX + QUESTION + CSV_PROMPT_SUFFIX)
agent.invoke(QUESTION)
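
The commented-out call sandwiches the question between a prefix and a suffix of instructions. Because the cell wraps both constants in a `'''` string, they are never actually defined there. The sketch below defines them and composes the full prompt; `build_prompt` is a hypothetical helper name, and the final `agent.invoke` call is left commented because it needs a live model.

```python
CSV_PROMPT_PREFIX = """
First set the pandas display options to show all the columns,
get the column names, then answer the question.
"""

CSV_PROMPT_SUFFIX = """
- **DO NOT MAKE UP AN ANSWER OR USE PRIOR KNOWLEDGE,
ONLY USE THE RESULTS OF THE CALCULATIONS YOU HAVE DONE**.
- **ALWAYS** explain how you got
to the answer on a section that starts with: "\n\nExplanation:\n".
In the explanation, mention the column names that you used to get
to the final answer.
"""


def build_prompt(question: str) -> str:
    """Place the question between the instruction prefix and suffix."""
    return CSV_PROMPT_PREFIX + question + CSV_PROMPT_SUFFIX


prompt = build_prompt("how many rows are there?")
print(prompt)

# In the notebook:
# agent.invoke(prompt)
```

Keeping the instructions in constants lets you reuse the same guardrails for every question and change only the question text.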