# Chat with your CSV files using LangChain and OpenAI

In this notebook we use LangChain and OpenAI to build a question-answering system over a CSV file. The example uses the tracklist.csv file from my Spotify repository on [GitHub](https://github.com/rubentak/Spotify).



### First install langchain and openai if they are not already installed

In [13]:
# os is part of the Python standard library, so it is not installed with pip
# !pip install -q langchain openai


### Load the libraries

In [2]:
from langchain.document_loaders import CSVLoader
from langchain.indexes import VectorstoreIndexCreator
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI
import os

Get your OpenAI API key at https://platform.openai.com/account/api-keys

### Set the environment variable and load the CSV file

In [5]:
os.environ["OPENAI_API_KEY"] = "sk-..."
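
Hard-coding the key in a notebook makes it easy to leak when the notebook is shared. As a minimal sketch of one alternative, the helper below reads the key from the environment and only prompts when it is missing. The name `ensure_api_key` and its parameters are hypothetical and are not part of LangChain or OpenAI.

```python
import os
from getpass import getpass


def ensure_api_key(env_var="OPENAI_API_KEY", prompt_fn=getpass):
    """Return the key stored in env_var, prompting for it only if it is missing.

    ensure_api_key is an illustrative helper, not a library function.
    """
    key = os.environ.get(env_var)
    if not key:
        # getpass hides the typed value, so the key does not end up in the cell output
        key = prompt_fn(f"Enter {env_var}: ")
        os.environ[env_var] = key
    return key
```

Calling `ensure_api_key()` once before creating the index would leave `OPENAI_API_KEY` set for the rest of the session.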

In [6]:
# Load the documents
loader = CSVLoader(file_path='../data/tracklist.csv')
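
To make the loading step more concrete, here is a rough sketch of how a CSV file can be turned into one text document per row, with each row written as `column: value` lines. It approximates the idea behind `CSVLoader` and is not its actual implementation. The column names come from the tracklist, but the sample values and the `rows_to_documents` helper are invented for illustration.

```python
import csv
import io

# Invented sample rows; the real tracklist.csv has many more columns and rows
sample_csv = "name,artist,tempo\nSong A,Artist X,120.0\nSong B,Artist Y,98.5\n"


def rows_to_documents(text, source="sample.csv"):
    """Turn each CSV row into a dict holding its text content and some metadata."""
    reader = csv.DictReader(io.StringIO(text))
    docs = []
    for i, row in enumerate(reader):
        # One "column: value" line per field, so the column names stay in the text
        content = "\n".join(f"{col}: {val}" for col, val in row.items())
        docs.append({"page_content": content, "metadata": {"source": source, "row": i}})
    return docs


docs = rows_to_documents(sample_csv)
print(docs[0]["page_content"])
```

Because every document repeats the column names, a question about a column such as tempo can be answered from any retrieved row.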

In [7]:
# Create an index using the loaded documents
index_creator = VectorstoreIndexCreator()
docsearch = index_creator.from_loaders([loader])

Using embedded DuckDB without persistence: data will be transient
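
Conceptually, the index stores one vector per document and, at query time, returns the documents whose vectors are closest to the query vector. The toy sketch below illustrates that retrieval idea with word counts and cosine similarity. It is only an analogy: the real index uses model-generated embeddings and a vector store, and the names `embed`, `cosine`, and `retrieve` as well as the sample rows are made up for this example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (not a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


# Invented rows in the "column: value" layout
documents = [
    "name: Song A\nartist: Artist X\ntempo: 120.0",
    "name: Song B\nartist: Artist Y\ntempo: 98.5",
]
top = retrieve("Which song is by Artist Y?", documents, k=1)
print(top[0])
```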


In [8]:
# Create a question-answering chain using the index
chain = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=docsearch.vectorstore.as_retriever(),
    input_key="question",
)
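
The `"stuff"` chain type places all retrieved documents into a single prompt and sends that prompt to the LLM in one call. The sketch below shows the general shape of that step. The prompt wording and the `build_stuff_prompt` helper are assumptions for illustration, not LangChain's actual template or API.

```python
def build_stuff_prompt(question, retrieved_docs):
    """Concatenate the retrieved documents into one prompt (illustrative wording)."""
    context = "\n\n".join(retrieved_docs)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Invented rows standing in for what the retriever might return
retrieved = [
    "name: Song A\nartist: Artist X\ntempo: 120.0",
    "name: Song B\nartist: Artist Y\ntempo: 98.5",
]
prompt = build_stuff_prompt("Do you have a column called tempo?", retrieved)
print(prompt)
```

Since everything is packed into one prompt, this approach only works while the retrieved rows fit inside the model's context window.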

In [9]:
# Pass a query to the chain
query = "Do you have a column called tempo?"
response = chain({"question": query})

In [10]:
print(response['result'])

 Yes, tempo is one of the columns.


In [15]:
# Wrap the chain call in a helper function
def ask_question(query):
    response = chain({"question": query})
    return response['result']

query = "What are all the columns in this file?"
print(ask_question(query))

 added_at, id, name, popularity, uri, artist, album, release_date, duration_ms, length, danceability, acousticness, energy, instrumentalness, liveness, loudness, speechiness, tempo, time_signature, valence, mode, key, genres, and genre_group.


## Continue the conversation yourself!

In [None]:
# Replace "..." with your own question; ask_question is already defined above
query = "..."
ask_question(query)