# Chat with your Documents

We will chat with this nice article titled [Transformers without pain 🤗](https://www.linkedin.com/pulse/transformers-without-pain-ibrahim-sobh-phd/) using:

- PaLM model
- LangChain


<a href="https://colab.research.google.com/drive/1DDPhjMiffvWs4gqqD9KyNO-yAp26HavK?usp=sharing" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>



By [Ibrahim Sobh](https://www.linkedin.com/in/ibrahim-sobh-phd-8681757/)

## Install

In [None]:
!pip -q install python-dotenv langchain google-generativeai chromadb
!pip -q install transformers huggingface_hub sentence_transformers

## Import

In [None]:
import os
from dotenv import load_dotenv, find_dotenv
from langchain.llms import GooglePalm
from langchain.embeddings import GooglePalmEmbeddings
from langchain import PromptTemplate, LLMChain
from langchain.chains import RetrievalQA
from langchain.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.indexes import VectorstoreIndexCreator

## Load PaLM

In [None]:
# Read GOOGLE_API_KEY from a local .env file (never share or commit your key)
load_dotenv(find_dotenv())
api_key = os.environ["GOOGLE_API_KEY"]  # raises KeyError if the key is missing
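
Looking up `os.environ["GOOGLE_API_KEY"]` fails with a bare `KeyError` when the key is missing. A minimal sketch of a friendlier check is shown below; `require_env` and `DEMO_API_KEY` are hypothetical names used only for illustration and are not part of LangChain or python-dotenv.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it before running."
        )
    return value


# Demo with a placeholder variable set in-process (not a real key)
os.environ["DEMO_API_KEY"] = "not-a-real-key"
print(require_env("DEMO_API_KEY"))
```

In the notebook, the same helper could be called as `require_env("GOOGLE_API_KEY")` after `load_dotenv`.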

In [None]:
Palm_llm = GooglePalm(temperature=0.1, max_output_tokens=128)

In [None]:
# Quick sanity check: ask the model a question with a step-by-step prompt
template = """Question: {question}
Answer: Let's think step by step."""

prompt_open = PromptTemplate(template=template, input_variables=["question"])
open_chain = LLMChain(prompt=prompt_open, llm=Palm_llm)

question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"
print(open_chain.run(question))

Justin Bieber was born on March 1, 1994. The New England Patriots won Super Bowl XXXVI in 2002.
The final answer: New England Patriots.
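
To see what text the chain sends to the model, the template from the test cell can be rendered by hand. The sketch below uses plain `str.format` as a stand-in for `PromptTemplate`, which should behave roughly the same for a simple single-variable template like this one; `render_prompt` is a hypothetical helper, not a LangChain function.

```python
template = """Question: {question}
Answer: Let's think step by step."""


def render_prompt(template: str, **variables: str) -> str:
    """Fill the {placeholders} in the template with the given variables."""
    return template.format(**variables)


prompt_text = render_prompt(
    template,
    question="What NFL team won the Super Bowl in the year Justin Bieber was born?",
)
print(prompt_text)
```

The trailing "Let's think step by step." nudges the model to write out intermediate reasoning before its final answer.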


## Load and index your Doc(s)

In [None]:
# Load the article and build a vector index over its chunks
urls = ['https://www.linkedin.com/pulse/transformers-without-pain-ibrahim-sobh-phd/']
loader = WebBaseLoader(urls)
index = VectorstoreIndexCreator(
    embedding=GooglePalmEmbeddings(),
    text_splitter=RecursiveCharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=0,
        separators=[" ", ",", "\n"],
    ),
).from_loaders([loader])
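
The `chunk_size` and `chunk_overlap` settings control how the article is cut into pieces before embedding. The sketch below is a deliberately simplified fixed-width splitter, not the algorithm `RecursiveCharacterTextSplitter` uses (which also tries to break on the given separators); `split_text` and the sample text are hypothetical and only illustrate how the two parameters interact.

```python
def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 0) -> list[str]:
    """Cut text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each new window advances
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


sample = "Transformers use attention to model long-range dependencies. " * 50
chunks = split_text(sample, chunk_size=1000, chunk_overlap=0)
print(len(chunks), [len(c) for c in chunks])
```

With `chunk_overlap=0`, as in this notebook, the chunks tile the text without repetition; a positive overlap would repeat the tail of each chunk at the start of the next.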

In [None]:
# Retrieval QA: fetch the most relevant chunks and "stuff" them into a single prompt
qa_retriever = RetrievalQA.from_chain_type(
    llm=Palm_llm,
    chain_type="stuff",
    retriever=index.vectorstore.as_retriever(),
    input_key="question",
)
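
Conceptually, a "stuff" retrieval chain does two things: it finds the chunks most similar to the question, then pastes them into one prompt for the model. The toy sketch below illustrates that flow under loud assumptions: word counts stand in for `GooglePalmEmbeddings`, a sorted list stands in for the Chroma vector store, and the prompt wording is illustrative rather than LangChain's actual prompt. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def stuff_prompt(question: str, context_chunks: list[str]) -> str:
    """Concatenate the retrieved chunks and the question into one prompt."""
    context = "\n\n".join(context_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )


chunks = [
    "Positional encoding represents the order of words in a sequence.",
    "Self attention computes a weighted sum of the values.",
    "The huggingface/transformers library provides pretrained models.",
]
question = "What is positional encoding?"
top = retrieve(question, chunks, k=1)
prompt = stuff_prompt(question, top)
print(prompt)
```

Because every retrieved chunk goes into one prompt, "stuff" works well for short documents like this article but can exceed the model's context window on larger collections.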

## Ask your documents questions and get answers

In [None]:
qa_retriever("What are these documents about?")

{'question': 'What are these documents about?',
 'result': 'The documents are about transformers, which are a type of neural network that has been used successfully in natural language processing and computer vision tasks.'}

In [None]:
qa_retriever("What is the main idea of transformers?")

{'question': 'What is the main idea of transformers?',
 'result': 'The main idea of transformers is to use attention mechanisms to model long-range dependencies in sequences.'}

In [None]:
qa_retriever("Why are transformers better than RNNs and CNNs?")

{'question': 'Why are transformers better than RNNs and CNNs?',
 'result': 'Transformers are better than RNNs and CNNs because they are more parallelizable and require significantly less time to train.'}

In [None]:
qa_retriever("what is positional encoding?")

{'question': 'what is positional encoding?',
 'result': 'Positional encoding is a technique used to represent the order of words in a sequence.'}

In [None]:
qa_retriever("How to represent the order of words?")

{'question': 'How to represent the order of words?',
 'result': 'Positional Encoding is used to represent the order of the sequence.'}
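
The answers above say what positional encoding is for but not how it is computed. One common scheme is the sinusoidal encoding from the original Transformer paper; the sketch below assumes that scheme, which may or may not be the variant the article has in mind. The function name and the small sizes are chosen only for illustration.

```python
import math


def positional_encoding(seq_len: int, d_model: int) -> list[list[float]]:
    """Build a seq_len x d_model table of sinusoidal position vectors."""
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            # Dimensions 2k and 2k+1 share one frequency: sine on even, cosine on odd
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


pe = positional_encoding(seq_len=4, d_model=8)
for pos, row in enumerate(pe):
    print(pos, [round(x, 3) for x in row])
```

Each position gets a distinct vector, which is added to the word embedding so that the model can tell otherwise identical tokens apart by where they occur.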

In [None]:
qa_retriever("what is self attention?")

{'question': 'what is self attention?',
 'result': 'Self attention is a technique to compute a weighted sum of the values (in the encoder), dependent on another value (in the decoder).'}

In [None]:
qa_retriever("How is attention used in the encoder and decoder?")

{'question': 'How is attention used in the encoder and decoder?',
 'result': 'Attention is used in both encoder and decoder. In the encoder, attention is used to compute a weighted sum of the values (in the encoder), dependent on another value (in the decoder). In the decoder, attention is used to attend or focus on values from the encoder.'}

In [None]:
qa_retriever("what is attention used between encoder and decoder and why?")

{'question': 'what is attention used between encoder and decoder and why?',
 'result': 'Encoder-decoder attention is used to allow the decoder to attend to values from the encoder. This is done so that the decoder can learn the relationship between the input and output sequences.'}

In [None]:
qa_retriever("How are query, key, and value vectors used?")

{'question': 'How are query, key, and value vectors used?',
 'result': 'The query vector is used to compute a weighted sum of the values through the keys. Specifically: q dot product all the keys, then softmax to get weights and finally use these weights to compute a weighted sum of the values.'}
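
The three steps in that answer (dot products with the keys, softmax, weighted sum of the values) can be sketched in a few lines of plain Python. This is a minimal single-query illustration with made-up numbers; it omits the scaling by the square root of the key dimension that the original Transformer applies before the softmax, and it skips the learned projection matrices entirely.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Attention for one query: score each key, normalize, then mix the values."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return weights, output


# Made-up toy vectors: the query matches the first and third keys best
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]

weights, output = attend(query, keys, values)
print("weights:", [round(w, 3) for w in weights])
print("output:", [round(x, 3) for x in output])
```

Keys that align with the query receive larger weights, so their values dominate the output vector.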

In [None]:
qa_retriever("How to start using transformers?")

{'question': 'How to start using transformers?',
 'result': 'To start using transformers, you can use the huggingface/transformers library. This library provides thousands of pretrained models to perform tasks on texts such as classification, information extraction, question answering, summarization, translation, text generation, etc in 100+ languages.'}