### [Question Answering over text files with Falcon 7B and LangChain](https://medium.com/@salimsahrane/question-answering-over-text-files-with-falcon-7b-and-langchain-4d32a661dd48)
#### What is LangChain?
LangChain is a framework designed to simplify building applications on top of large language models.

#### Here are the steps we will follow to build our QnA program:

1. Load text and split it into chunks.
2. Create embeddings from text chunks.
3. Load the Falcon-7B-instruct LLM.
4. Create a question answering chain.
5. Run the chain with a query.

In [1]:
!pip install langchain


In [169]:
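from langchain.text_splitter import RecursiveCharacterTextSplitter

# Read the plain-text export of the paper.
with open("/content/1706.03762.txt", encoding="utf-8") as f:
    text_file = f.read()

# Split into chunks of at most 500 characters; neighbouring chunks share up to 50 characters.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)

chunks = text_splitter.split_text(text_file)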
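
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of a fixed-width splitter. It is an illustration only: `split_with_overlap` is a hypothetical helper, and it does not reproduce how `RecursiveCharacterTextSplitter` works, since that class tries to break on separators such as paragraphs and newlines before falling back to smaller pieces.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> List[str]:
    """Cut text into fixed-width windows; consecutive windows share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Toy example with a small window so the overlap is easy to see.
demo_chunks = split_with_overlap("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
print(demo_chunks)
```

With the toy input, each chunk starts with the last three characters of the previous one, which is what keeps a sentence that straddles a boundary visible in both chunks.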

In [170]:
# !pip install sentence_transformers

In [171]:
# !pip install faiss-gpu

In [172]:
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

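# Embed every chunk with the default sentence-transformers model,
# then index the vectors in an in-memory FAISS store.
embeddings = HuggingFaceEmbeddings()
vectorStore = FAISS.from_texts(chunks, embeddings)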
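
Conceptually, the vector store answers one question: which stored chunk vectors lie closest to the query vector? The sketch below shows that idea as a brute-force search over toy two-dimensional vectors. The helper names and the toy data are hypothetical, and the use of Euclidean (L2) distance is an assumption about the default index; FAISS itself uses optimized index structures rather than a Python loop.

```python
import math
from typing import List, Sequence, Tuple


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_chunks(
    query_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
    texts: Sequence[str],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k texts whose vectors are closest to query_vec, nearest first."""
    scored = [(text, l2_distance(query_vec, vec)) for text, vec in zip(texts, chunk_vecs)]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Toy "embeddings": real ones have hundreds of dimensions.
toy_texts = ["attention", "feed-forward", "optimizer"]
toy_vecs = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]
top = nearest_chunks([0.9, 0.1], toy_vecs, toy_texts, k=2)
print(top)
```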

In [201]:
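import os
from getpass import getpass
from langchain import HuggingFaceHub

# Ask for the Hugging Face token at runtime instead of hard-coding it in the notebook.
if "HUGGINGFACEHUB_API_TOKEN" not in os.environ:
    os.environ["HUGGINGFACEHUB_API_TOKEN"] = getpass("Hugging Face API token: ")

llm = HuggingFaceHub(
    repo_id="tiiuae/falcon-7b-instruct",
    model_kwargs={"temperature": 0.5, "max_new_tokens": 1000},
)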

In [202]:
from langchain.chains import RetrievalQA

# "stuff" places all retrieved chunks into a single prompt for the LLM.
chain = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=vectorStore.as_retriever())
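
The `"stuff"` chain type concatenates the retrieved chunks and sends them to the model together with the question in one prompt. A rough sketch of that assembly step is shown below. The function name and the prompt wording are hypothetical stand-ins, not LangChain's actual template.

```python
from typing import List


def build_stuff_prompt(question: str, retrieved_chunks: List[str]) -> str:
    """Join all retrieved chunks into one context block and append the question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical chunks standing in for what the retriever would return.
demo_prompt = build_stuff_prompt("What is self-attention?", ["Chunk A.", "Chunk B."])
print(demo_prompt)
```

Because every retrieved chunk lands in the same prompt, the chunk size and the number of retrieved chunks together have to fit within the model's context window.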

In [203]:
query = "Explain Transformers in one sentence??"
chain.run(query)

' The Transformer is a deep learning architecture that uses self-attention to learn representations of input and output from a single encoder-decoder pair.\n\nThe Transformer is a deep learning architecture that uses self-attention to learn representations of input and output from a single encoder-decoder pair.'
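
The answer above contains the same sentence twice, separated by a blank line. One possible way to tidy such output is to drop paragraphs that repeat verbatim; the helper below is a hypothetical post-processing sketch, not part of LangChain, and it only catches exact repeats.

```python
def drop_repeated_paragraphs(answer: str) -> str:
    """Keep the first occurrence of each paragraph and discard exact repeats."""
    seen = set()
    kept = []
    for paragraph in answer.split("\n\n"):
        cleaned = paragraph.strip()
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            kept.append(cleaned)
    return "\n\n".join(kept)


# Shortened stand-in for a model answer that repeats itself.
raw_answer = " The Transformer uses self-attention.\n\nThe Transformer uses self-attention."
print(drop_repeated_paragraphs(raw_answer))
```

It could be applied as `drop_repeated_paragraphs(chain.run(query))`.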

In [204]:
query="Who is Ashish Vaswani in this paper??"
chain.run(query)

"\nAshish Vaswani is a well-known researcher in the field of machine learning and natural language processing. He is the co-founder of Google's Brain project and has made significant contributions to the field of deep learning. He is also an associate professor at Stanford University."

In [205]:
query="Who is Niki Parmar in this paper??"
chain.run(query)

'\nNiki Parmar is a former Google employee who worked on the Google Translate team, and was one of the first employees of Google Brain.'

In [206]:
query="Explain Model architecture in 5 bullet points??"
chain.run(query)

'\n\n1. The Transformer is a model that takes as input an encoder and a decoder.\n2. The encoder and decoder are composed of stacked self-attention and point-wise, fully connected layers.\n3. The encoder and decoder are built using stacked self-attention and point-wise, fully connected layers.\n4. The training data and batch size are specified.\n5. The Transformer is trained using the encoder and decoder stacks, along with residual connections.'

In [207]:
query = 'What is conclusion of this paper??'
chain.run(query)

'\n\nThe conclusion of this paper is that the use of machine translation can lead to unintended consequences and that human translation is preferable. The paper also concludes that the current state of machine translation is not sufficient for it to be used as a primary translation tool.'

In [212]:
query = 'Name two main components of encoder??'
chain.run(query)

'\nThe two main components of an encoder are the self-attention network and the position-wise feed-forward network.'