# Project: Question-Answering on Private Documents Using LangChain, OpenAI and Pinecone - RAG

- Load Documents
- Chunk Data
- Calculate Cost
- Using Pinecone as a vector DB
- Using Chroma as a vector DB
- Asking and Getting Answers
  - Run the functions
- Adding Memory (Chat History)
- Using a Custom Prompt

In [1]:
import os
from dotenv import load_dotenv, find_dotenv
load_dotenv(find_dotenv(), override=True)

True

In [2]:
pip install pypdf -q

Note: you may need to restart the kernel to use updated packages.


In [3]:
pip install docx2txt -q

Note: you may need to restart the kernel to use updated packages.


In [4]:
pip install --upgrade -q openai langchain tiktoken

Note: you may need to restart the kernel to use updated packages.


## Load Documents

In [5]:
# PDF, Text, Doc files
def load_documents(file):
    import os
    name, extension = os.path.splitext(file)

    if extension == '.pdf':
        from langchain.document_loaders import PyPDFLoader
        print(f'Loading {file}')
        loader = PyPDFLoader(file)
    elif extension == '.docx':
        from langchain.document_loaders import Docx2txtLoader
        print(f'Loading {file}')
        loader = Docx2txtLoader(file)
    elif extension == '.txt':
        from langchain.document_loaders import TextLoader
        loader = TextLoader(file)
    else:
        print('Document format not supported!')
        return None
    
    data = loader.load()
    return data

In [6]:
pip install wikipedia -q

Note: you may need to restart the kernel to use updated packages.


In [7]:
# Wikipedia
def load_from_wikipedia(query, lang='en', load_max_docs=2):
    from langchain.document_loaders import WikipediaLoader
    loader = WikipediaLoader(query=query, lang=lang, load_max_docs=load_max_docs)
    data = loader.load()
    return data

## Chunk Data

In [8]:
def chunk_data(data, chunk_size=256):
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=0)
    chunks = text_splitter.split_documents(data)
    return chunks
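
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal plain-Python sketch of fixed-size character chunking. It is only an illustration: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs and line breaks rather than at fixed offsets, and the name `fixed_size_chunks` is hypothetical.

```python
def fixed_size_chunks(text, chunk_size=256, chunk_overlap=0):
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError('chunk_size must be positive')
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError('chunk_overlap must be in [0, chunk_size)')
    step = chunk_size - chunk_overlap  # how far the window advances each time
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


sample = 'abcdefghij'
print(fixed_size_chunks(sample, chunk_size=4))
print(fixed_size_chunks(sample, chunk_size=4, chunk_overlap=2))
```

With `chunk_overlap=0`, as in `chunk_data`, a sentence that straddles a boundary is split between two chunks; a non-zero overlap trades some extra embedding cost for more context at the edges.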

## Calculate Cost

In [9]:
def print_embedding_cost(texts):
    import tiktoken
    enc = tiktoken.encoding_for_model('text-embedding-3-small')
    total_tokens = sum([len(enc.encode(page.page_content)) for page in texts])
    print(f'Total Tokens: {total_tokens}')
    print(f'Embedding Cost in USD: {total_tokens / 1000 * 0.00002:.6f}')
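
The rate in the cost line above, 0.00002 USD per 1,000 tokens, is the same as 0.02 USD per million tokens. The sketch below restates that arithmetic with the rate as a parameter. The default is taken from the notebook's formula and may not match current pricing, and `embedding_cost_usd` is a hypothetical helper name.

```python
def embedding_cost_usd(total_tokens, usd_per_million_tokens=0.02):
    """Estimate embedding cost from a token count and a per-million-token rate."""
    return total_tokens / 1_000_000 * usd_per_million_tokens


# 250,000 tokens at the assumed rate of 0.02 USD per million tokens
print(f'{embedding_cost_usd(250_000):.6f}')
```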

## Using Pinecone as a vector DB

In [10]:
pip install -q pinecone

Note: you may need to restart the kernel to use updated packages.


In [11]:
# Delete index(es)

def delete_pinecode_index(index_name='all'):
    from pinecone import Pinecone
    pc = Pinecone()

    if index_name == 'all':
        print('Deleting all indexes...')
        
        indexes = pc.list_indexes().names()
        for i in indexes:
            pc.delete_index(i)
        print('Done!')
    else:
        print(f'Deleting index {index_name}...', end='')
        pc.delete_index(index_name)
        print('Done!')

In [12]:
# Creating and Uploading Embeddings to a vector database (Pinecone)

def insert_or_fetch_embeddings_pinecone(index_name, chunks):
    from pinecone import Pinecone, ServerlessSpec
    from langchain_openai import OpenAIEmbeddings
    from langchain_pinecone import PineconeVectorStore
    
    pc = Pinecone()
    embeddings = OpenAIEmbeddings(model='text-embedding-3-small')

    if index_name not in pc.list_indexes().names():
        print(f'Creating index: {index_name} and embeddings...', end='')
        pc.create_index(
            name=index_name,
            dimension=1536,
            metric='cosine',
            spec=ServerlessSpec(
                cloud='aws',
                region='us-east-1'
            )
            # spec=PodSpec(environment='us-west1-gcp')
        )
        vector_store = PineconeVectorStore.from_documents(
            documents=chunks,
            embedding=embeddings,
            index_name=index_name
        )
        print('Index created!')
    else:
        print(f'Index {index_name} already exists! Loading embeddings...', end='')
        vector_store = PineconeVectorStore.from_existing_index(index_name=index_name, embedding=embeddings)
        print('Done!')
    return vector_store
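
The `dimension=1536` passed to `create_index` has to match the size of the vectors the embedding model produces, otherwise the upload fails. A small lookup like the one below can keep the two in sync. The values are the default output sizes of these OpenAI models; `EMBEDDING_DIMENSIONS` and `index_dimension_for` are hypothetical names used only for illustration.

```python
# Default output dimensions of some OpenAI embedding models.
EMBEDDING_DIMENSIONS = {
    'text-embedding-3-small': 1536,
    'text-embedding-3-large': 3072,
    'text-embedding-ada-002': 1536,
}


def index_dimension_for(model):
    """Return the vector size an index needs for the given embedding model."""
    try:
        return EMBEDDING_DIMENSIONS[model]
    except KeyError:
        raise ValueError(f'Unknown embedding model: {model}') from None


print(index_dimension_for('text-embedding-3-small'))
```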

## Using Chroma as a vector DB

In [13]:
pip install -q langchain-chroma chromadbx

Note: you may need to restart the kernel to use updated packages.


In [14]:
def create_embeddings_chroma(chunks, persist_directory='./chroma_db'):
    from langchain_chroma import Chroma
    from langchain_openai import OpenAIEmbeddings

    embeddings = OpenAIEmbeddings(model='text-embedding-3-small', dimensions=1536)
    vector_store = Chroma.from_documents(chunks, embeddings, persist_directory=persist_directory)
    return vector_store

In [15]:
def load_embeddings_chroma(persist_directory='./chroma_db'):
    from langchain_chroma import Chroma
    from langchain_openai import OpenAIEmbeddings
    
    embeddings = OpenAIEmbeddings(model='text-embedding-3-small', dimensions=1536)
    vector_store = Chroma(persist_directory=persist_directory, embedding_function=embeddings)
    return vector_store

## Asking and Getting Answers

In [16]:
def ask_and_get_answer(vector_store, q):
    from langchain.chains import RetrievalQA
    from langchain_openai import ChatOpenAI
    
    llm = ChatOpenAI(model='gpt-3.5-turbo', temperature=0)
    retriever = vector_store.as_retriever(search_type='similarity', search_kwargs={'k': 3})
    qa_chain = RetrievalQA.from_chain_type(llm=llm, chain_type='stuff', retriever=retriever)

    answer = qa_chain.invoke(q)
    return answer['result']
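
Conceptually, a `similarity` retriever with `k=3` embeds the question and returns the three stored chunks whose vectors are closest to it. The sketch below shows that idea with an exhaustive cosine-similarity search over toy 2-D vectors. It assumes a cosine metric (as configured for the Pinecone index above) and is not how a vector database implements search internally; `cosine_similarity` and `top_k` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=3):
    """Return the indexes of the k vectors most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy vectors standing in for chunk embeddings
docs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
print(top_k([1.0, 0.1], docs, k=3))
```

Only the retrieved chunks are passed to the LLM, so a very generic question can retrieve chunks that say little about the document as a whole.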

## Run the functions

In [17]:
pdf_data = load_documents('./files/us_constitution.pdf')
# print(pdf_data[1].page_content)
# print(pdf_data[1].metadata)

print(f'There are {len(pdf_data[1].page_content)} characters on page 2')
print(f'You have {len(pdf_data)} pages in your file')

Loading ./files/us_constitution.pdf
There are 1500 characters on page 2
You have 41 pages in your file


In [18]:
docs_data = load_documents('./files/the_great_gatsby.docx')
# print(docs_data[0].page_content)
# print(docs_data[0].metadata)

print(f'There are {len(docs_data[0].page_content)} characters on page 1')
print(f'You have {len(docs_data)} page in your file')

Loading ./files/the_great_gatsby.docx
There are 296672 characters on page 1
You have 1 page in your file


In [19]:
txt_data = load_documents('./files/churchill_speech.txt')

print(f'There are {len(txt_data[0].page_content)} characters on page 1')
print(f'You have {len(txt_data)} page in your file')

There are 21271 characters on page 1
You have 1 page in your file


In [20]:
wiki_data = load_from_wikipedia('GPT-4')
print(wiki_data[0].page_content)

Generative Pre-trained Transformer 4 (GPT-4) is a large language model developed by OpenAI and the fourth in its series of GPT foundation models.  
GPT-4 is more capable than its predecessor GPT-3.5. GPT-4 Vision (GPT-4V) is a version of GPT-4 that can process images in addition to text. OpenAI has not revealed technical details and statistics about GPT-4, such as the precise size of the model.
An early version of GPT-4 was integrated by Microsoft into Bing Chat, launched in February 2023. GPT-4 was released in ChatGPT in March 2023, and removed in 2025. GPT-4 is still available in OpenAI's API.
GPT-4, as a generative pre-trained transformer (GPT), was first trained to predict the next token for a large amount of text (both public data and "data licensed from third-party providers"). Then, it was fine-tuned for human alignment and policy compliance, notably with reinforcement learning from human feedback (RLHF).


== Background ==
 

OpenAI introduced the first GPT model (GPT-1) in 201

In [21]:
pdf_chunks = chunk_data(pdf_data)
print(len(pdf_chunks))
print(pdf_chunks[10].page_content)

224
Maryland six, V irginia ten, North Carolina five, South Carolina five, and 
 Georgia three. 
 When vacancies happen in the Representation from any State, the 
 Executive Authority thereof shall issue W rits of Election to fill such 
 V acancies.


### Pinecone

In [22]:
delete_pinecode_index()

Deleting all indexes...
Done!


In [23]:
index_name = 'askadocument'
pdf_data = load_documents('./files/us_constitution.pdf')
pdf_data_chunks = chunk_data(pdf_data, chunk_size=256)
pinecone_vector_store = insert_or_fetch_embeddings_pinecone(index_name, pdf_data_chunks)

Loading ./files/us_constitution.pdf
Creating index: askadocument and embeddings...Index created!


In [24]:
q = 'What is the document about?'
answer = ask_and_get_answer(pinecone_vector_store, q)
print(answer)

I'm sorry, but you haven't provided any specific document or context for me to determine what it is about. Could you please provide more details or context so I can assist you better?


In [25]:
# Take questions from the user (loop)

import time
i = 1
print('Write "Quit" or "Exit" to quit')

while True:
    q = input(f'Question #{i}: ')
    i = i + 1
    if q.lower() in ['quit', 'exit']:
        print('Exiting... bye!')
        time.sleep(2)
        break
    answer = ask_and_get_answer(pinecone_vector_store, q)
    print(f'\nAnswer: {answer}')
    print(f'\n{"-" * 50} \n')

Write "Quit" or "Exit" to quit


Question #1:  What is the document about?



Answer: The document provided is the United States Constitution. It outlines the framework for the government of the United States, including the establishment of justice, defense, welfare, and the supremacy of the Constitution and federal laws.

-------------------------------------------------- 



Question #2:  What is the first amendment described in the document?



Answer: The First Amendment described in the document states: "Congress shall make no law respecting an establishment of religion, or prohibiting the free exercise thereof; or abridging the freedom of speech, or of the press; or the right of the people peaceably to assemble, and to petition the Government for a redress of grievances."

-------------------------------------------------- 



Question #3:  what about the second amendment?



Answer: The Second Amendment to the United States Constitution states: "A well regulated Militia, being necessary to the security of a free State, the right of the people to keep and bear Arms, shall not be infringed." It protects the right of individuals to keep and bear arms.

-------------------------------------------------- 



Question #4:  bye



Answer: Goodbye! If you have any more questions in the future, feel free to ask.

-------------------------------------------------- 



Question #5:  quit


Exiting... bye!


### Test creating and inserting vectors from Wikipedia data - Using Pinecone

In [26]:
delete_pinecode_index()

Deleting all indexes...
Done!


In [27]:
wiki_data = load_from_wikipedia('ChatGPT')
wiki_data_chunks = chunk_data(wiki_data)
index_name = 'chatgpt-wiki'
pinecone_wiki_vector_store = insert_or_fetch_embeddings_pinecone(index_name, wiki_data_chunks)

Creating index: chatgpt-wiki and embeddings...Index created!


In [28]:
q = 'What is chatGPT?'
answer = ask_and_get_answer(pinecone_wiki_vector_store, q)
print(answer)

ChatGPT is a conversational AI model developed by OpenAI. It is based on the GPT-3 (Generative Pre-trained Transformer 3) architecture and is designed to generate human-like responses in natural language conversations. ChatGPT can be used in various applications such as chatbots, virtual assistants, and customer service interactions.


### Chroma

In [29]:
pdf_data = load_documents('./files/rag_powered_by_google_search.pdf')
pdf_data_chunks = chunk_data(pdf_data, chunk_size=256)
chroma_vector_store = create_embeddings_chroma(pdf_data_chunks)

Loading ./files/rag_powered_by_google_search.pdf


In [30]:
q = 'What is Vertex AI Search?'
answer = ask_and_get_answer(chroma_vector_store, q)
print(answer)

Vertex AI Search is a feature within Google Cloud's Vertex AI platform that allows users to perform hybrid search capabilities. This feature enables users to search across structured and unstructured data sources, making it easier to find relevant information efficiently.


In [31]:
# Use loaded embeddings
db = load_embeddings_chroma()
q = 'How many pairs of questions and answers had the StackOverflow dataset?'
answer = ask_and_get_answer(db, q)
print(answer)

The StackOverflow dataset had 8 million pairs of questions and answers.


In [32]:
# Trying a follow-up question: this chain keeps no chat history, so "that number" cannot be resolved
q = 'Multiply that number by 2.'
answer = ask_and_get_answer(chroma_vector_store, q)
print(answer)

I'm sorry, but it seems like your message got cut off. Could you please provide more context or clarify your question?


## Adding Memory (Chat History)

In [33]:
from langchain_openai import ChatOpenAI
# from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain.memory import ChatMessageHistory, ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain

llm = ChatOpenAI(model_name='gpt-4-turbo-preview', temperature=0)
retriever = chroma_vector_store.as_retriever(search_type='similarity', search_kwargs={'k': 5})
memory = ConversationBufferMemory(memory_key='chat_history', return_messages=True)

crc = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=memory,
    chain_type='stuff',
    verbose=True
)



In [34]:
def ask_question(q, chain):
    result = chain.invoke({'question': q})
    return result

In [35]:
data = load_documents('./files/rag_powered_by_google_search.pdf')
chunks = chunk_data(data, chunk_size=256)
vector_store = create_embeddings_chroma(chunks)

Loading ./files/rag_powered_by_google_search.pdf


In [36]:
q = 'How many pairs of questions and answers had the StackOverflow dataset?'
result = ask_question(q, crc)
print(result)



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
System: Use the following pieces of context to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
----------------
simple similarity search was highly e ective because the dataset had 8
million pairs of questions and answers. However, datasets do not
usually contain pre-existing question-and-answer or query-and-

simple similarity search was highly e ective because the dataset had 8
million pairs of questions and answers. However, datasets do not
usually contain pre-existing question-and-answer or query-and-

simple similarity search was highly e ective because the dataset had 8
million pairs of questions and answers. However, datasets do not
usually contain pre-existing question-and-answer or query-and-

simple similarity search was highly e ective because the dataset had 8
million pa

In [37]:
print(result['answer'])

The dataset had 8 million pairs of questions and answers.
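
The verbose prompt above contains the same chunk several times. That pattern is consistent with the same PDF having been embedded into `./chroma_db` more than once (`create_embeddings_chroma` is called in several cells with the same default `persist_directory`), although the notebook does not confirm the cause. One possible mitigation is to derive a deterministic ID from each chunk's text and skip repeats before inserting. The plain-Python sketch below shows only that deduplication step; `content_id` and `unique_by_content` are hypothetical names, and wiring the IDs into the vector store is left out.

```python
import hashlib


def content_id(text):
    """Deterministic ID derived from the chunk text."""
    return hashlib.sha256(text.encode('utf-8')).hexdigest()


def unique_by_content(texts):
    """Return (id, text) pairs, keeping only the first copy of each text."""
    seen = set()
    unique = []
    for text in texts:
        cid = content_id(text)
        if cid not in seen:
            seen.add(cid)
            unique.append((cid, text))
    return unique


chunks_text = ['chunk A', 'chunk B', 'chunk A']
print([text for _, text in unique_by_content(chunks_text)])
```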


In [38]:
q = 'Multiply that number by 2'
answer = ask_question(q, crc)
print(answer)



> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:

Human: How many pairs of questions and answers had the StackOverflow dataset?
Assistant: The dataset had 8 million pairs of questions and answers.
Follow Up Input: Multiply that number by 2
Standalone question:

> Finished chain.


> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
System: Use the following pieces of context to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
----------------
simple similarity search was highly e ective because the dataset had 8
million pairs of questions and answers. However, datasets do not
usually contain pre-existing question-and-answer or query-an

In [39]:
print(answer['answer'])

The result of multiplying the number of pairs of questions and answers in the dataset, which is 8 million, by 2 would be 16 million.


In [40]:
q = 'Divide the result by 200'
answer = ask_question(q, crc)
print(answer['answer'])



> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:

Human: How many pairs of questions and answers had the StackOverflow dataset?
Assistant: The dataset had 8 million pairs of questions and answers.
Human: Multiply that number by 2
Assistant: The result of multiplying the number of pairs of questions and answers in the dataset, which is 8 million, by 2 would be 16 million.
Follow Up Input: Divide the result by 200
Standalone question:

> Finished chain.


> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
System: Use the following pieces of context to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
----------------
e ciency of an advan

In [41]:
for item in result['chat_history']:
    print(item)

content='How many pairs of questions and answers had the StackOverflow dataset?' additional_kwargs={} response_metadata={}
content='The dataset had 8 million pairs of questions and answers.' additional_kwargs={} response_metadata={}
content='Multiply that number by 2' additional_kwargs={} response_metadata={}
content='The result of multiplying the number of pairs of questions and answers in the dataset, which is 8 million, by 2 would be 16 million.' additional_kwargs={} response_metadata={}
content='Divide the result by 200' additional_kwargs={} response_metadata={}
content='To find the result of dividing 16 million by 200, you can perform the calculation as follows:\n\n16,000,000 ÷ 200 = 80,000\n\nTherefore, the result is 80,000.' additional_kwargs={} response_metadata={}
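
The verbose traces show how the chain uses memory: before retrieval, it asks the LLM to rewrite the follow-up as a standalone question, using the accumulated chat history. The sketch below rebuilds that prompt as a plain string to make the mechanism visible. The template wording is modelled on the trace above, and `build_condense_prompt` is a hypothetical helper, not part of LangChain.

```python
CONDENSE_TEMPLATE = (
    'Given the following conversation and a follow up question, '
    'rephrase the follow up question to be a standalone question, '
    'in its original language.\n\n'
    'Chat History:\n{chat_history}\n'
    'Follow Up Input: {question}\n'
    'Standalone question:'
)


def build_condense_prompt(history, question):
    """Format (human, assistant) turns and a follow-up into one prompt string."""
    lines = []
    for human, assistant in history:
        lines.append(f'Human: {human}')
        lines.append(f'Assistant: {assistant}')
    return CONDENSE_TEMPLATE.format(chat_history='\n'.join(lines), question=question)


history = [
    ('How many pairs of questions and answers had the StackOverflow dataset?',
     'The dataset had 8 million pairs of questions and answers.'),
]
print(build_condense_prompt(history, 'Multiply that number by 2'))
```

Because every earlier turn is included, the prompt grows with each question, which is visible in the later traces of this notebook.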


### Loop for asking questions

In [42]:
while True:
    q = input('Your question: ')
    if q.strip().lower() in ('exit', 'quit', 'bye'):
        print('Bye bye!')
        break
    result = ask_question(q, crc)
    print(result['answer'])
    print('-' * 100)

Your question:  Tell me about Google Search technologies as described in the document.




> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:

Human: How many pairs of questions and answers had the StackOverflow dataset?
Assistant: The dataset had 8 million pairs of questions and answers.
Human: Multiply that number by 2
Assistant: The result of multiplying the number of pairs of questions and answers in the dataset, which is 8 million, by 2 would be 16 million.
Human: Divide the result by 200
Assistant: To find the result of dividing 16 million by 200, you can perform the calculation as follows:

16,000,000 ÷ 200 = 80,000

Therefore, the result is 80,000.
Follow Up Input: Tell me about Google Search technologies as described in the document.
Standalone question:

> Finished chain.


> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...

Your question:   Is Vertex AI Search a fully-managed platform?




> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:

Human: How many pairs of questions and answers had the StackOverflow dataset?
Assistant: The dataset had 8 million pairs of questions and answers.
Human: Multiply that number by 2
Assistant: The result of multiplying the number of pairs of questions and answers in the dataset, which is 8 million, by 2 would be 16 million.
Human: Divide the result by 200
Assistant: To find the result of dividing 16 million by 200, you can perform the calculation as follows:

16,000,000 ÷ 200 = 80,000

Therefore, the result is 80,000.
Human: Tell me about Google Search technologies as described in the document.
Assistant: The document describes a product that enhances search experiences for public or internal websites, mobile applications, and various enterprise sea

Your question:  What advancements have been made in semantic search?




> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:

Human: How many pairs of questions and answers had the StackOverflow dataset?
Assistant: The dataset had 8 million pairs of questions and answers.
Human: Multiply that number by 2
Assistant: The result of multiplying the number of pairs of questions and answers in the dataset, which is 8 million, by 2 would be 16 million.
Human: Divide the result by 200
Assistant: To find the result of dividing 16 million by 200, you can perform the calculation as follows:

16,000,000 ÷ 200 = 80,000

Therefore, the result is 80,000.
Human: Tell me about Google Search technologies as described in the document.
Assistant: The document describes a product that enhances search experiences for public or internal websites, mobile applications, and various enterprise sea

Your question:  bye


Bye bye!
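
An exit check in a loop like this is easy to get subtly wrong: `in` applied to a string tests for substrings, while `in` applied to a set or tuple tests for whole-item membership. The sketch below contrasts the two; `EXIT_COMMANDS` and `is_exit_command` are hypothetical names.

```python
EXIT_COMMANDS = {'exit', 'quit', 'bye'}


def is_exit_command(user_input):
    """True only when the whole input, ignoring case and padding, is an exit word."""
    return user_input.strip().lower() in EXIT_COMMANDS


print('it' in 'exit quit bye')      # substring test on a string
print(is_exit_command('it'))        # membership test on a set
print(is_exit_command('  Quit '))
```

The substring form also matches an empty input, so pressing Enter on an empty line would end the session.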


## Using a Custom Prompt

In [43]:
from langchain_openai import ChatOpenAI
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

from langchain.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate

llm = ChatOpenAI(model_name='gpt-3.5-turbo', temperature=0)
retriever = chroma_vector_store.as_retriever(search_type='similarity', search_kwargs={'k': 5})
memory = ConversationBufferMemory(memory_key='chat_history', return_messages=True)

system_template = r'''
Use the following pieces of context to answer the user's question.
Before answering translate your response.
If you don't find the answer in the provided context, just respond "I don't know."
---------------
Context: ```{context}```
'''

user_template = '''Question: ```{question}```'''
messages = [
    SystemMessagePromptTemplate.from_template(system_template),
    HumanMessagePromptTemplate.from_template(user_template)
]

qa_prompt = ChatPromptTemplate.from_messages(messages)

crc = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=memory,
    chain_type='stuff',
    combine_docs_chain_kwargs={'prompt': qa_prompt },
    verbose=True
)

In [44]:
print(qa_prompt)

input_variables=['context', 'question'] input_types={} partial_variables={} messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template='\nUse the following pieces of context to answer the user\'s question.\nBefore answering translate your response.\nIf you don\'t find the answer in the provided context, just respond "I don\'t know."\n---------------\nContext: ```{context}```\n'), additional_kwargs={}), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['question'], input_types={}, partial_variables={}, template='Question: ```{question}```'), additional_kwargs={})]


In [45]:
db = load_embeddings_chroma()
q = 'How many pairs of questions and answers had the StackOverflow dataset?'
result = ask_question(q, crc)
print(result)



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
System:
Use the following pieces of context to answer the user's question.
Before answering translate your response.
If you don't find the answer in the provided context, just respond "I don't know."
---------------
Context: ```simple similarity search was highly e ective because the dataset had 8
million pairs of questions and answers. However, datasets do not
usually contain pre-existing question-and-answer or query-and-

simple similarity search was highly e ective because the dataset had 8
million pairs of questions and answers. However, datasets do not
usually contain pre-existing question-and-answer or query-and-

simple similarity search was highly e ective because the dataset had 8
million pairs of questions and answers. However, datasets do not
usually contain pre-existing question-and-answer or query-and-

simple similarity search was highly 

In [46]:
q = 'When was Elon Musk born?'
result = ask_question(q, crc)
print(result)



> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:

Human: How many pairs of questions and answers had the StackOverflow dataset?
Assistant: Pregunta: ```¿Cuántos pares de preguntas y respuestas tenía el conjunto de datos de StackOverflow?```
Follow Up Input: When was Elon Musk born?
Standalone question:

> Finished chain.


> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
System:
Use the following pieces of context to answer the user's question.
Before answering translate your response.
If you don't find the answer in the provided context, just respond "I don't know."
---------------
Context: ```Follow us
Telecommunications
By Ankur Jain • 7-minute read
Gaming
By Patrick Smith • 8-minute read
Applicati

In [47]:
q = 'When was Bill Gates born?'
result = ask_question(q, crc)
print(result)



> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:

Human: How many pairs of questions and answers had the StackOverflow dataset?
Assistant: Pregunta: ```¿Cuántos pares de preguntas y respuestas tenía el conjunto de datos de StackOverflow?```
Human: When was Elon Musk born?
Assistant: I don't know.
Follow Up Input: When was Bill Gates born?
Standalone question:

> Finished chain.


> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
System:
Use the following pieces of context to answer the user's question.
Before answering translate your response.
If you don't find the answer in the provided context, just respond "I don't know."
---------------
Context: ```Google Cloud Google Cloud Products Privacy Terms
H

In [48]:
print(result['answer'])

No sé.
