https://python.langchain.com/docs/use_cases/sql/csv/

In [None]:
import pandas as pd
from pyprojroot import here

In [None]:
df = pd.read_csv(here("data/for upload/titanic.csv"))
print(df.shape)
print(df.columns.tolist())
display(df.head(3))

### **SQL**

Using SQL to interact with CSV data is the recommended approach because it is easier to limit permissions and sanitize queries than with arbitrary Python.
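To make the point about limiting permissions concrete, here is a minimal sketch using only the standard-library `sqlite3` module, separate from the SQLAlchemy engine used in this notebook. It opens a SQLite file through a `mode=ro` URI so that SQLite itself rejects writes. The helper name `open_read_only`, the `passengers` table, and its rows are made up for illustration.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_read_only(db_file):
    """Open an existing SQLite file in read-only mode (hypothetical helper)."""
    # as_uri() yields a file:// URI; mode=ro asks SQLite to refuse any write.
    uri = f"{Path(db_file).resolve().as_uri()}?mode=ro"
    return sqlite3.connect(uri, uri=True)


# Build a throwaway database with made-up rows (not the real Titanic data).
demo_db = Path(tempfile.mkdtemp()) / "demo.db"
writer = sqlite3.connect(demo_db)
writer.execute("CREATE TABLE passengers (name TEXT, age REAL)")
writer.executemany(
    "INSERT INTO passengers VALUES (?, ?)", [("A", 1.0), ("B", 30.0)]
)
writer.commit()
writer.close()

# Reads work as usual on the read-only connection.
reader = open_read_only(demo_db)
rows = reader.execute("SELECT name FROM passengers WHERE age < 2").fetchall()

# Writes are refused by SQLite, whatever SQL text is submitted.
try:
    reader.execute("DELETE FROM passengers")
    write_blocked = False
except sqlite3.OperationalError:
    write_blocked = True

print(rows, write_blocked)
```

A restriction enforced by the database like this is hard to replicate when a model is allowed to run arbitrary Python against a DataFrame.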

Most SQL databases (DuckDB, SQLite, etc.) make it easy to load a CSV file as a table. Once you’ve done this, you can use all of the chain and agent-creating techniques outlined in the SQL use case guide. Here’s a quick example of how we might do this with SQLite:

In [None]:
from langchain_community.utilities import SQLDatabase
from sqlalchemy import create_engine
db_file = here("data") / "test_sqldb.db"
db_path = f"sqlite:///{db_file}"

engine = create_engine(db_path)
# if_exists="replace" lets this cell be re-run without failing because the table already exists.
df.to_sql("titanic", engine, index=False, if_exists="replace")

For multiple CSV files, we can create a single SQL database with one table per file:
- `df1.to_sql("csv1_name", engine, index=False)`
- `df2.to_sql("csv2_name", engine, index=False)`
- ...
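The one-table-per-file pattern above can be sketched as a small loop. This is only an illustration under stated assumptions: the helper name `load_csvs_to_sqlite`, the rule for deriving table names from file stems, and the choice of `if_exists="replace"` are not from the original notebook, and the demo CSV contents are made up. A plain `sqlite3` connection is used here for self-containment; `to_sql` also accepts the SQLAlchemy `engine` used elsewhere in this notebook.

```python
import sqlite3
import tempfile
from pathlib import Path

import pandas as pd


def load_csvs_to_sqlite(csv_paths, conn):
    """Load each CSV into its own table, named after the file stem (hypothetical helper)."""
    table_names = []
    for path in map(Path, csv_paths):
        # Derive a simple table name: lower-case, spaces and dashes to underscores.
        table = path.stem.lower().replace(" ", "_").replace("-", "_")
        pd.read_csv(path).to_sql(table, conn, index=False, if_exists="replace")
        table_names.append(table)
    return table_names


# Demo with two tiny, made-up CSV files in a temporary directory.
tmp_dir = Path(tempfile.mkdtemp())
(tmp_dir / "first file.csv").write_text("a,b\n1,2\n3,4\n")
(tmp_dir / "second.csv").write_text("x\n9\n")

conn = sqlite3.connect(":memory:")
names = load_csvs_to_sqlite(
    [tmp_dir / "first file.csv", tmp_dir / "second.csv"], conn
)
first_count = conn.execute("SELECT COUNT(*) FROM first_file").fetchone()[0]
second_value = conn.execute("SELECT x FROM second").fetchone()[0]
print(names, first_count, second_value)
```

Naming tables after the files keeps the schema readable for the agent, which sees only table and column names when deciding which table to query.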

In [None]:
db = SQLDatabase(engine=engine)
print(db.dialect)
print(db.get_usable_table_names())
db.run("SELECT * FROM titanic WHERE Age < 2;")

In [None]:
df[df["Age"]<2]

### **Create an agent to interact with the Database**

In [None]:
import os
from dotenv import load_dotenv
print("Environment variables are loaded:", load_dotenv())
print("test by reading a variable:", os.getenv("OPENAI_API_TYPE"))

In [None]:
from langchain.chat_models import AzureChatOpenAI

model_name = os.getenv("gpt_deployment_name")
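# Indexing os.environ raises a KeyError right away if a credential is missing.
# These two values are not passed to AzureChatOpenAI below, so the client is
# assumed to pick up its credentials from the environment.
azure_openai_api_key = os.environ["OPENAI_API_KEY"]
azure_openai_endpoint = os.environ["OPENAI_API_BASE"]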
llm = AzureChatOpenAI(
    openai_api_version=os.getenv("OPENAI_API_VERSION"),
    azure_deployment=model_name,
    model_name=model_name,
    temperature=0.0)

In [None]:
from langchain_community.agent_toolkits import create_sql_agent
agent_executor = create_sql_agent(llm, db=db, agent_type="openai-tools", verbose=True)

In [None]:
agent_executor.invoke({"input": "what's the average age of survivors"})
