# Chat with your CSV files using LangChain and OpenAI

In this notebook we use LangChain and OpenAI to build a question-answering system over a CSV file. We use the tracklist.csv file from my Spotify repository on [GitHub](https://github.com/rubentak/Spotify).



### First install langchain and openai if they are not already installed

In [13]:
# os ships with Python, so only langchain and openai need installing
# !pip install -q langchain openai


### Load the libraries

In [1]:
from langchain.document_loaders import CSVLoader
from langchain.indexes import VectorstoreIndexCreator
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI
import os

Get your OpenAI API key here: https://platform.openai.com/account/api-keys

### Set the environment variable and load the CSV file

In [2]:
os.environ["OPENAI_API_KEY"] = "sk-<your key here>"
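
Hardcoding the key in a notebook makes it easy to leak when the notebook is shared. As a minimal sketch of an alternative, the helper below reads the key from the environment and only prompts when it is missing. The name `get_openai_key` and its `prompt` parameter are hypothetical and not part of LangChain or the OpenAI client.

```python
import os
from getpass import getpass


def get_openai_key(env_var="OPENAI_API_KEY", prompt=getpass):
    """Return the API key from the environment, prompting only if it is missing."""
    key = os.environ.get(env_var)
    if not key:
        # getpass hides the typed key so it does not end up in the cell output
        key = prompt(f"Enter {env_var}: ")
        os.environ[env_var] = key
    return key
```

Calling `get_openai_key()` once before creating the chain leaves `OPENAI_API_KEY` set for the rest of the session.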

In [5]:
# Point a loader at the CSV file (the rows are read when the index is built)
loader = CSVLoader(file_path='../data/tracklist.csv')
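
To see why the model can later answer questions about column names, it helps to know that the loader turns every CSV row into its own text document made of `column: value` lines. The sketch below imitates that behaviour with the standard library only. It is an approximation for illustration, not LangChain's implementation, and the sample rows and the `name` column are made up (only the `tempo` column is known from this notebook).

```python
import csv
import io


def rows_to_documents(csv_text, source="tracklist.csv"):
    """Turn each CSV row into one text 'document' of 'column: value' lines."""
    reader = csv.DictReader(io.StringIO(csv_text))
    documents = []
    for row_number, row in enumerate(reader):
        content = "\n".join(f"{column}: {value}" for column, value in row.items())
        documents.append(
            {"page_content": content, "metadata": {"source": source, "row": row_number}}
        )
    return documents


# Hypothetical sample data; the real tracklist.csv has its own columns and rows
sample = "name,tempo\nSong A,120\nSong B,95\n"
docs = rows_to_documents(sample)
print(docs[0]["page_content"])
```

With the real loader, `loader.load()` returns the corresponding list of documents, which is a quick way to inspect what will be indexed.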

In [6]:
# Create an index using the loaded documents
index_creator = VectorstoreIndexCreator()
docsearch = index_creator.from_loaders([loader])

Using embedded DuckDB without persistence: data will be transient
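
Conceptually, the index stores one embedding vector per document and answers a query by returning the documents whose vectors are most similar to the query's vector. The sketch below shows that idea with a toy word-count embedding and a plain Python list. Both are stand-ins chosen for illustration (the real index uses a learned embedding model and a vector store), and the sample documents are made up.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, documents, k=1):
    """Return the k documents most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vector, embed(d)), reverse=True)
    return ranked[:k]


# Hypothetical documents in the 'column: value' style produced by the loader
documents = ["name: Song A\ntempo: 120", "name: Song B\ntempo: 95"]
best = top_k("tempo of Song B", documents)
```

Here `best` holds the row for Song B, because it shares the most words with the query.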


In [7]:
# Create a question-answering chain using the index
chain = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=docsearch.vectorstore.as_retriever(),
    input_key="question",
)
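
The `"stuff"` chain type places all retrieved documents into a single prompt together with the question and sends that to the model in one call. A rough sketch of the idea follows. The function `build_stuff_prompt` is hypothetical and the prompt wording is illustrative, not LangChain's exact template.

```python
def build_stuff_prompt(question, documents):
    """Concatenate the retrieved documents and the question into one prompt."""
    context = "\n\n".join(documents)
    return (
        "Use the following pieces of context to answer the question at the end.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Helpful Answer:"
    )


# Hypothetical retrieved rows; real ones come from the retriever
retrieved = ["name: Song A\ntempo: 120", "name: Song B\ntempo: 95"]
prompt = build_stuff_prompt("Do you have a column called tempo?", retrieved)
print(prompt)
```

Because every retrieved document goes into one prompt, this approach works best when the retriever returns only a handful of short rows, as it does for a CSV file.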

In [8]:
# Pass a query to the chain
query = "Do you have a column called tempo?"
response = chain({"question": query})

In [9]:
print(response['result'])

 Yes, there is a column called tempo.


In [10]:
# Wrap the chain call in a reusable function
def ask_question(query):
    response = chain({"question": query})
    return response['result']

query = "Do you have a column called tempo?"
ask_question(query)

' Yes, there is a column called tempo.'
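
Once the call is wrapped in a function, asking several questions in a row is a short loop. The sketch below assumes a hypothetical helper `ask_many` and uses a stand-in `fake_ask` so it runs without an API key. In the notebook you would pass `ask_question` instead.

```python
def ask_many(ask, questions):
    """Call `ask` on each question and collect (question, answer) pairs."""
    return [(question, ask(question)) for question in questions]


# Stand-in for ask_question so the sketch runs without calling OpenAI
def fake_ask(question):
    return f"(answer to: {question})"


results = ask_many(fake_ask, ["Do you have a column called tempo?"])
for question, answer in results:
    print(question, "->", answer)
```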

## Continue the conversation yourself!

In [None]:
# Replace the placeholder with your own question; ask_question is defined above
query = "...?"
ask_question(query)