## 1. FC Content Data Query example using langchain + Vector DB

This example shows a simple way to query FC data using the FC API together with LangChain and a vector DB.

### 1.1 Download the table and load the documents

#### a. Install the psycopg2 library if necessary

In [3]:
pip install psycopg2

Collecting psycopg2
  Downloading psycopg2-2.9.6.tar.gz (383 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: psycopg2
  Building wheel for psycopg2 (setup.py) ... done
  Created wheel for psycopg2: filename=psycopg2-2.9.6-cp38-cp38-macosx_10_15_x86_64.whl size=143470 sha256=5d4c02f2572939e8bace104df1f5b9577a7f45476030a0651e6e2719e8cb1fc5
  Stored in directory: /Users/brainco/Library/Caches/pip/wheels/b4/01/c3/2fb9798be76b52c98b2ad9e6f3101b3dcb5286e7901f754e87
Successfully built psycopg2
Installing collected packages: psycopg2
Successfully installed psycopg2-2.9.6

#### b. Download the FocusCalm 3.0 search index table as csv

In [1]:
import psycopg2
import csv

# Establish a connection to the database
db_config = {
    "host":"localhost",
    "port":5432,
    "database":"focus_calm",
    "user":"brainco",
    "password":"password"
}

# Connect to the PostgreSQL database
connection = psycopg2.connect(**db_config)
cursor = connection.cursor()

# Name of the table to download
table_name = "tb_v3_content_search"

# Fetch the table data
cursor.execute(f"SELECT * FROM {table_name};")
table_data = cursor.fetchall()

# Read the column names from the cursor so the CSV header matches the row order
column_names = [column[0] for column in cursor.description]

# Save the data as a CSV file
csv_filename = f"{table_name}.csv"
with open(csv_filename, 'w', newline='') as csvfile:
    csv_writer = csv.writer(csvfile)
    csv_writer.writerow(column_names)
    csv_writer.writerows(table_data)

# Close the database connection
cursor.close()
connection.close()

print(f"Table '{table_name}' has been downloaded as '{csv_filename}'.")


Table 'tb_v3_content_search' has been downloaded as 'tb_v3_content_search.csv'.
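One detail worth checking in an export like this is that the CSV header lines up with the order of the values in each row. A minimal sketch of one way to do that is to read the column names from the cursor's `description` right after the `SELECT`. The sketch uses an in-memory SQLite database as a stand-in for PostgreSQL, and the table `tb_demo` and its rows are hypothetical.

```python
import csv
import io
import sqlite3

# Stand-in database: an in-memory SQLite table replaces the PostgreSQL table.
# The table name, columns, and rows are hypothetical, for illustration only.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE tb_demo (internal_id TEXT, name TEXT, tags TEXT)")
cursor.executemany(
    "INSERT INTO tb_demo VALUES (?, ?, ?)",
    [("X001", "Demo Game", "Focus"), ("X002", "Demo Meditation", "Calm")],
)

# Run the query, then read the header from the same cursor so that
# the column order is the one actually used in the returned rows.
cursor.execute("SELECT * FROM tb_demo")
rows = cursor.fetchall()
header = [column[0] for column in cursor.description]

# Write to an in-memory buffer instead of a file to avoid side effects.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(rows)
csv_text = buffer.getvalue()
connection.close()

print(csv_text)
```

The DB-API `description` attribute is also available on `psycopg2` cursors, so the same pattern should carry over to the PostgreSQL connection used above.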


#### c. Load the csv document using CSVLoader
See [here](https://python.langchain.com/en/latest/modules/indexes/document_loaders/examples/csv.html) for the detailed usage of the `CSVLoader`

In [2]:
from langchain.document_loaders.csv_loader import CSVLoader

In [3]:
print(table_name)
loader = CSVLoader(file_path=f'./{table_name}.csv', source_column="internal_id")

data = loader.load()

tb_v3_content_search


In [4]:
%%capture
print(data);
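To get a feel for what `data` contains, here is a rough approximation of what a CSV loader of this kind does: each row becomes one document whose text is a list of `column: value` lines, with the `source_column` value stored as the source in the metadata. This is a sketch using only the standard library, not the `CSVLoader` implementation, and the sample rows are hypothetical.

```python
import csv
import io

# Hypothetical two-row CSV standing in for the exported table.
sample_csv = (
    "internal_id,name,tags\n"
    "X001,Demo Game,Focus\n"
    "X002,Demo Meditation,Calm\n"
)


def rows_to_documents(csv_text, source_column):
    """Approximate a CSV loader: one document per row, as 'column: value' lines."""
    documents = []
    for index, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        page_content = "\n".join(f"{key}: {value}" for key, value in row.items())
        metadata = {"source": row[source_column], "row": index}
        documents.append({"page_content": page_content, "metadata": metadata})
    return documents


documents = rows_to_documents(sample_csv, source_column="internal_id")
print(documents[0]["page_content"])
print(documents[0]["metadata"])
```

Using `internal_id` as the source is what later lets answers cite a content ID instead of a file path.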

### 1.2 Split the documents and load them into the vector DB

#### a. Split the documents using the recommended text splitter

In [5]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings
from langchain.chains import RetrievalQA
from langchain import OpenAI
from langchain.chat_models import ChatOpenAI
import os

In [9]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
# Split your docs into texts
texts = text_splitter.split_documents(data)

embeddings = OpenAIEmbeddings(openai_api_key=os.environ['OPENAI_API_KEY'])
docsearch = FAISS.from_documents(texts, embeddings)
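Conceptually, the vector store keeps one embedding vector per chunk and, at query time, returns the chunks whose vectors are closest to the query vector. The sketch below illustrates that idea with hand-written 3-dimensional vectors and cosine similarity. The vectors and chunk labels are hypothetical, and FAISS itself uses optimized indexes and may use a different distance measure depending on configuration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional "embeddings"; real embeddings have far more dimensions.
index = {
    "chunk about racing": [0.9, 0.1, 0.0],
    "chunk about meditation": [0.1, 0.9, 0.1],
    "chunk about coordination": [0.0, 0.2, 0.9],
}


def search(query_vector, k=2):
    """Return the k chunk labels most similar to the query vector."""
    scored = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in scored[:k]]


top = search([0.8, 0.2, 0.1], k=2)
print(top)
```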

### 1.3 Using `RetrievalQA` to question the documents

#### a. Creating the QA chains

In [10]:
chat_qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model_name='gpt-3.5-turbo'), chain_type="stuff", retriever=docsearch.as_retriever()
)

In [11]:
simple_llm_qa = RetrievalQA.from_chain_type(
    llm=OpenAI(), chain_type="stuff", retriever=docsearch.as_retriever()
)
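Both chains use `chain_type="stuff"`, which means all retrieved chunks are placed into a single prompt together with the question. A minimal sketch of that idea follows. The prompt wording and the sample chunks are hypothetical and are not the template LangChain actually uses.

```python
def build_stuff_prompt(question, retrieved_chunks):
    """Put every retrieved chunk into one prompt, in the spirit of a 'stuff' chain."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical chunks, shaped like the 'column: value' text of the loaded rows.
chunks = [
    "internal_id: X001\nname: Demo Game",
    "internal_id: X002\nname: Demo Meditation",
]
prompt = build_stuff_prompt("Which content is a game?", chunks)
print(prompt)
```

Because everything goes into one prompt, this approach is simple and needs only one LLM call, but it only works while the retrieved chunks fit into the model's context window.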

#### b. Asking questions

In [12]:
query1 = "If I want to practice my hand-eye coordination, what's the name of the content should I look for?"
simple_llm_qa.run(query1)

' Track Touch.'

In [13]:
query2 = "If I want to take a break and enjoy some racing, what game should I look for?"
simple_llm_qa.run(query2)

" Relaxation Race; Relax your mind to increase your car's speed!"

#### c. Asking questions, but using ChatModels

Using `ChatOpenAI` instead of `OpenAI` gives much richer output.

In [14]:
query1 = "If I want to practice my hand-eye coordination, what content should I look for?"
chat_qa.run(query1)

'You should look for the content named "Track Touch" to practice your hand-eye coordination.'

In [82]:
query2 = "If I want to take a break and enjoy some racing, what game should I look for?"
chat_qa.run(query2)

"Based on the context provided, it looks like the content with internal_id GC004 is a relaxation race game that you can play to relax your mind and increase your car's speed. So, you can look for the game with internal_id GC004."

### 1.4 Using `ConversationalRetrievalChain` to chat with history

In [15]:
# OpenAIEmbeddings and the FAISS store (docsearch) were already set up in section 1.2
from langchain.llms import OpenAI
from langchain.chains import ConversationalRetrievalChain

#### Creating the memory object

In [16]:
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

#### Initialize the chain with memory

In [17]:
qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), docsearch.as_retriever(), memory=memory)

In [18]:
query1 = "If I want to practice my hand-eye coordination, what content should I look for?"
result = qa({'question': query1})
print(result['answer'])

 Track Touch; Practice your hand-eye coordination by touching the moving shapes on screen as quickly as you can. The more relaxed you are, the easier they are to catch.


In [19]:
query2 = "If I want to take a break and enjoy some racing, what game should I look for?"
result = qa({'question': query2})
print(f'result object: {result}')
print(result['answer'])

result object: {'question': 'If I want to take a break and enjoy some racing, what game should I look for?', 'chat_history': [HumanMessage(content='If I want to practice my hand-eye coordination, what content should I look for?', additional_kwargs={}), AIMessage(content=' Track Touch; Practice your hand-eye coordination by touching the moving shapes on screen as quickly as you can. The more relaxed you are, the easier they are to catch.', additional_kwargs={}), HumanMessage(content='If I want to take a break and enjoy some racing, what game should I look for?', additional_kwargs={}), AIMessage(content=" Relaxation Race; Relax your mind to increase your car's speed!", additional_kwargs={})], 'answer': " Relaxation Race; Relax your mind to increase your car's speed!"}
 Relaxation Race; Relax your mind to increase your car's speed!
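The `chat_history` in the result object grows by two messages per turn: the question and the answer. A toy stand-in for that buffering behavior is sketched below. The class `BufferMemory` and the sample turns are hypothetical and only mimic the idea, not the `ConversationBufferMemory` implementation.

```python
class BufferMemory:
    """Toy conversation buffer: keeps every turn, in order, without trimming."""

    def __init__(self):
        self.messages = []

    def save_turn(self, question, answer):
        # Each turn adds one human message and one AI message.
        self.messages.append(("human", question))
        self.messages.append(("ai", answer))

    def as_text(self):
        return "\n".join(f"{role}: {content}" for role, content in self.messages)


memory_sketch = BufferMemory()
memory_sketch.save_turn("What should I play?", "Demo Game")
memory_sketch.save_turn("Anything calmer?", "Demo Meditation")
print(memory_sketch.as_text())
```

Since nothing is ever dropped, the history sent along with each new question keeps growing, which is worth keeping in mind for long conversations.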


In [20]:
# Run the chain 4 times (range(1, 5)) to see how much the answers vary between runs
for i in range(1,5):
    query3 = "What are some other content names that are tagged with 'Wellness', list at least 3?"
    result = qa({'question': query3})
    print(result['answer'])

 Wake-up Wellness, Awareness Meditation, and Daily Challenge.
 Wake-up Wellness, Awareness Meditation, and Daily Challenge.
 Wake-up Wellness, Evening Wind Down, and Daily Challenge.
 Wake-up Wellness, Evening Wind Down, and Daily Challenge.


#### Questions and answers with sources

In [21]:
from langchain.chains.llm import LLMChain
from langchain.chains.conversational_retrieval.prompts import CONDENSE_QUESTION_PROMPT
from langchain.chains.qa_with_sources import load_qa_with_sources_chain

In [22]:
llm = OpenAI(temperature=0)
question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
doc_chain = load_qa_with_sources_chain(llm, chain_type="map_reduce")


Note that we use `map_reduce` as the `chain_type` here. Unlike the `stuff` chain, which puts all retrieved documents into a single prompt, a `map_reduce` chain first runs the question against each retrieved document separately and then combines the per-document results into one final answer. This helps when the retrieved documents are too many or too long to fit into a single prompt. Next, we ask some questions that require the LLM to combine multiple sources to form an answer.
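The map-reduce idea can be sketched without any LLM: a "map" step produces one partial answer per chunk, and a "reduce" step merges the partial answers. In the sketch below, `answer_one` and `combine` are hypothetical stand-ins for the LLM calls, and the chunks are made up.

```python
def map_reduce_answer(question, chunks, answer_one, combine):
    # Map step: ask the question against each chunk independently.
    partial_answers = [answer_one(question, chunk) for chunk in chunks]
    # Reduce step: merge the partial answers into one final answer.
    return combine(question, partial_answers)


def answer_one(question, chunk):
    """Stand-in for a per-chunk LLM call: return the name if the chunk is tagged Wellness."""
    fields = dict(line.split(": ", 1) for line in chunk.splitlines())
    return fields["name"] if "Wellness" in fields["tags"] else None


def combine(question, partial_answers):
    """Stand-in for the final LLM call: join the non-empty partial answers."""
    return ", ".join(answer for answer in partial_answers if answer is not None)


chunks = [
    "name: Demo A\ntags: Wellness, Calm",
    "name: Demo B\ntags: Focus",
    "name: Demo C\ntags: Wellness",
]
final = map_reduce_answer("Which content is tagged Wellness?", chunks, answer_one, combine)
print(final)
```

The trade-off is cost: with real LLM calls, this pattern needs one call per chunk plus one more to combine the results.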

In [23]:
print(question_generator.prompt)

input_variables=['chat_history', 'question'] output_parser=None partial_variables={} template='Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question.\n\nChat History:\n{chat_history}\nFollow Up Input: {question}\nStandalone question:' template_format='f-string' validate_template=True
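To see what the question generator actually receives, the template printed above can be filled in by hand. The sketch below copies the template text from that output. The `Human:`/`Assistant:` prefixes used to render the history and the follow-up question are assumptions made for illustration.

```python
# Template text copied from the printed CONDENSE_QUESTION_PROMPT above.
CONDENSE_TEMPLATE = (
    "Given the following conversation and a follow up question, "
    "rephrase the follow up question to be a standalone question.\n\n"
    "Chat History:\n{chat_history}\n"
    "Follow Up Input: {question}\n"
    "Standalone question:"
)

# One earlier turn; the role prefixes are an assumption about how history is rendered.
history = [
    ("Human", "If I want to practice my hand-eye coordination, what content should I look for?"),
    ("Assistant", "Track Touch."),
]
chat_history = "\n".join(f"{role}: {text}" for role, text in history)

# Hypothetical follow-up question that only makes sense with the history.
prompt = CONDENSE_TEMPLATE.format(
    chat_history=chat_history,
    question="Is there anything similar?",
)
print(prompt)
```

The LLM's completion of this prompt is the standalone question that is then sent to the retriever, which is why follow-up questions can refer back to earlier turns.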


In [24]:
qa_with_source = ConversationalRetrievalChain(
    retriever=docsearch.as_retriever(),
    question_generator=question_generator,
    combine_docs_chain=doc_chain,
    memory=memory
)

In [25]:
query3 = "What are the content tags for Finding Productivity"
result = qa_with_source({'question': query3})

In [27]:
print(result['answer'])

 The content tags for Finding Productivity are Performance, Wellness, Mental Flexibility, Visualization, Self-awareness, Mindfulness, Calm, Consistency, Learn, and Attention Sustained.
SOURCES: MG029


Now let's ask a question that requires more extensive searching. This is where the `map_reduce` chain comes in handy.

In [30]:
query4 = "What are some other content names that are tagged with 'Wellness', list at least 3?"
result = qa_with_source({'question': query4})

In [31]:
print(result['answer'])

 Self-awareness, Calm, Consistency.
SOURCES: PG001, MG004, MG029, PG007


In [34]:
query5 = "If I want my self to be more mindful, what contents should I look for, what are the content names?"
result = qa_with_source({'question': query5})
print(result['answer'])

 Awareness Meditation, Breathwork, and Mindfulness are content names to look for if you want to become more mindful.
SOURCES: MG003, MG004, MG005, MG028
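The answers above come back as a single string with a trailing `SOURCES:` line. If the source IDs are needed separately, for example to look up the matching rows, a small helper can split them off. This is a sketch that assumes the answer always follows the format shown above, and the function name `split_answer_and_sources` is hypothetical.

```python
def split_answer_and_sources(text):
    """Split 'answer text\\nSOURCES: A, B' into the answer and a list of source IDs."""
    answer_part, separator, sources_part = text.partition("SOURCES:")
    if not separator:
        # No SOURCES marker found: treat the whole text as the answer.
        return text.strip(), []
    source_ids = [item.strip() for item in sources_part.split(",") if item.strip()]
    return answer_part.strip(), source_ids


# Answer string taken from the last output above.
raw_answer = (
    " Awareness Meditation, Breathwork, and Mindfulness are content names to look for "
    "if you want to become more mindful.\n"
    "SOURCES: MG003, MG004, MG005, MG028"
)
answer_text, source_ids = split_answer_and_sources(raw_answer)
print(answer_text)
print(source_ids)
```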
