
# Falcon
Falcon is a family of large language models created by the Technology Innovation Institute (TII) in Abu Dhabi and released under the Apache 2.0 license.

### References
- https://huggingface.co/blog/falcon
- https://falconllm.tii.ae/

In [1]:
!pip install -q transformers langchain torch accelerate einops

In [2]:
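import torch
import transformers
from transformers import AutoTokenizer

# Hugging Face model ID; swap in "tiiuae/falcon-40b-instruct" for the larger model
model = "tiiuae/falcon-7b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,  # load weights in 16-bit to halve memory vs. float32
    trust_remote_code=True,      # the checkpoint ships its own modeling code
    device_map="auto",           # let accelerate place weights on available devices
    max_length=200,              # cap on prompt + generated tokens combined
    do_sample=True,              # sample instead of greedy decoding
    num_return_sequences=1,
    top_k=10,                    # sample only from the 10 most likely next tokens
    eos_token_id=tokenizer.eos_token_id,
)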

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Xformers is not installed correctly. If you want to use memory_efficient_attention to accelerate training use the following command to install Xformers
pip install xformers.
The model 'RWForCausalLM' is not supported for text-generation. Supported models are ['BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM', 'BigBirdPegasusForCausalLM', 'BioGptForCausalLM', 'BlenderbotForCausalLM', 'BlenderbotSmallForCausalLM', 'BloomForCausalLM', 'CamembertForCausalLM', 'CodeGenForCausalLM', 'CpmAntForCausalLM', 'CTRLLMHeadModel', 'Data2VecTextForCausalLM', 'ElectraForCausalLM', 'ErnieForCausalLM', 'GitForCausalLM', 'GPT2LMHeadModel', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTNeoForCausalLM', 'GPTNeoXForCausalLM', 'GPTNeoXJapaneseForCausalLM', 'GPTJForCausalLM', 'LlamaForCausalLM', 'MarianForCausalLM', 'MBartForCausalLM', 'MegaForCausalLM', 'MegatronBertForCausalLM', 'MvpForCausalLM', 'OpenLlamaForCausalLM', 'OpenAIGPTLMHeadModel', 'OPTForCausalLM', 'PegasusFor
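
The cell above loads the weights with `torch_dtype=torch.bfloat16`. As a rough guide to why that matters, the sketch below estimates the memory needed for the weights alone. It assumes the nominal parameter counts implied by the model names (7 billion and 40 billion), and it ignores activations and the generation cache, so treat the numbers as a lower bound.

```python
# Rough estimate of the memory needed just to hold the model weights.
# Assumption: parameter counts are the nominal figures implied by the
# model names (7B and 40B), not exact counts.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the approximate weight footprint in GiB for a given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


for name, params in [("falcon-7b-instruct", 7e9), ("falcon-40b-instruct", 40e9)]:
    for dtype in ("float32", "bfloat16"):
        print(f"{name:>20} {dtype:>9}: {weight_memory_gib(params, dtype):6.1f} GiB")
```

Under these assumptions the 7B model needs about 13 GiB in bfloat16 and twice that in float32.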

In [13]:
from langchain import HuggingFacePipeline, LLMChain, PromptTemplate

llm = HuggingFacePipeline(pipeline=pipeline, model_kwargs={"temperature": 0})

template = """
The following is a conversation between a highly knowledgeable and intelligent AI assistant, called Falcon, and a human user, called User.
In the following interactions, User and Falcon will converse in natural language, and Falcon will answer User's questions.
Falcon was built to be respectful, polite and inclusive.
Falcon will never decline to answer a question, and always attempts to give an answer that User would be satisfied with.
It knows a lot, and always tells the truth.
The conversation begins.

Question: {question}
Answer:"""
prompt = PromptTemplate(template=template, input_variables=["question"])

llm_chain = LLMChain(prompt=prompt, llm=llm)
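
To see what text the model actually receives, it can help to render the template by hand. The sketch below uses plain `str.format` on a shortened stand-in for the template above. It assumes the `PromptTemplate` keeps its default f-string-style formatting, in which case `prompt.format(question=...)` should give the equivalent result for the full template. The `render` helper is hypothetical and only for illustration.

```python
# Shortened stand-in for the notebook's template; only the placeholder matters.
template = """The conversation begins.

Question: {question}
Answer:"""


def render(template: str, **variables: str) -> str:
    """Fill {name} placeholders using Python's built-in str.format."""
    return template.format(**variables)


rendered = render(template, question="Please explain blockchain in simple terms")
print(rendered)
```

The rendered prompt ends with `Answer:`, so the model continues the text from that point.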

In [18]:
question = "Can you write a short paragraph about NLP?"

response = llm_chain.run(question)
response

Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.


' Yes. Natural Language Processing (NLP) is a branch of Artificial Intelligence (AI) that focuses on how machines and computers can understand and interact with human language. It enables computers to interpret, analyze and generate text-based input and output, as well as understand and interact with people. NLP enables computers to process language and generate natural language responses. It also allows computers to learn and'
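
The response above stops mid-sentence. A likely cause is the `max_length=200` setting: for a decoder-only model it counts the prompt tokens as well as the generated ones, so a long prompt leaves less room for the answer. The sketch below shows the arithmetic. The prompt token counts are hypothetical, since the real count depends on the tokenizer.

```python
def remaining_new_tokens(prompt_tokens: int, max_length: int) -> int:
    """Tokens left for generation when max_length covers prompt + output."""
    return max(0, max_length - prompt_tokens)


# Hypothetical prompt sizes, for illustration only.
for prompt_tokens in (50, 120, 250):
    budget = remaining_new_tokens(prompt_tokens, max_length=200)
    print(f"prompt={prompt_tokens:>3} tokens -> up to {budget} new tokens")
```

If the goal is to bound only the generated text, `max_new_tokens` can be passed to the pipeline in place of `max_length`.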

In [19]:
import langchain
langchain.debug = True

question = "Please explain blockchain in simple terms"
response = llm_chain.run(question)
response

Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.


[32;1m[1;3m[chain/start][0m [1m[1:chain:LLMChain] Entering Chain run with input:
[0m{
  "question": "Please explain blockchain in simple terms"
}
[32;1m[1;3m[llm/start][0m [1m[1:chain:LLMChain > 2:llm:HuggingFacePipeline] Entering LLM run with input:
[0m{
  "prompts": [
    "The following is a conversation between a highly knowledgeable and intelligent AI assistant, called Falcon, and a human user, called User.\nIn the following interactions, User and Falcon will converse in natural language, and Falcon will answer User's questions.\nFalcon was built to be respectful, polite and inclusive.\nFalcon will never decline to answer a question, and always attempts to give an answer that User would be satisfied with.\nIt knows a lot, and always tells the truth.\nThe conversation begins.\n\nQuestion: Please explain blockchain in simple terms\nAnswer:"
  ]
}
[36;1m[1;3m[llm/end][0m [1m[1:chain:LLMChain > 2:llm:HuggingFacePipeline] [87.81s] Exiting LLM run with output:
[0m{
  "gene

'\nA blockchain is a decentralized database. This means that it is a distributed system where the information in the database is held on multiple computers instead of in one central location. The data is stored in blocks, which are linked and secured together so that the whole system can stay safe and secure. It is like an electronic record-keeping system that allows you to trust the data you are seeing on your screen'