<a href="https://colab.research.google.com/github/shivanikush/LLMs_Handson/blob/main/langchain_huggingface.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install langchain-huggingface
# calling apis
!pip install huggingface_hub
!pip install transformers
!pip install accelerate
!pip install bitsandbytes
!pip install langchain



In [2]:
from google.colab import userdata
sec_key = userdata.get("HF_TOKEN")
#print(sec_key)

# **HUGGING FACE ENDPOINT**

**How to access Hugging Face models through the API**

Specify the model with the `repo_id` parameter. There are two levels of access:

1.   Endpoints addressed by `repo_id` use the serverless API, which is particularly beneficial for users with Pro accounts or Enterprise Hub.
2.   Regular users still get a fair number of requests by making their HF token available in the environment where the code runs.
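
As a rough illustration of the token lookup described above, the sketch below resolves a token from the environment and builds the bearer header an HTTP client would send with it. The helper names `resolve_hf_token` and `auth_header` are hypothetical (they are not part of `huggingface_hub` or LangChain); the two variable names are the ones this notebook uses.

```python
import os
from typing import Mapping, Optional

# Environment variable names used in this notebook, in lookup order.
TOKEN_ENV_VARS = ("HUGGINGFACEHUB_API_TOKEN", "HF_TOKEN")


def resolve_hf_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first non-empty token found, or None (hypothetical helper)."""
    env = os.environ if env is None else env
    for name in TOKEN_ENV_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    return None


def auth_header(token: Optional[str]) -> dict:
    """Build a bearer Authorization header; empty when no token is set."""
    return {"Authorization": f"Bearer {token}"} if token else {}


# Usage with a fake environment, so no real secret is needed.
fake_env = {"HF_TOKEN": "hf_example"}
print(auth_header(resolve_hf_token(fake_env)))
```

Passing a mapping instead of reading `os.environ` directly keeps the lookup easy to try out without touching real credentials.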



In [3]:
from langchain_huggingface import HuggingFaceEndpoint

In [4]:
from google.colab import userdata
sec_key = userdata.get("HUGGINGFACEHUB_API_TOKEN")


In [5]:
import os
os.environ["HUGGINGFACEHUB_API_TOKEN"]= sec_key

Prepare the repo ID, which identifies the model to call.

In [6]:
repo_id = "mistralai/Mistral-7B-Instruct-v0.3"
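# max_length and token are not declared fields of HuggingFaceEndpoint, so they end up in
# model_kwargs (see the warning below). The declared fields are max_new_tokens (output
# length cap) and huggingfacehub_api_token (credentials).
llm = HuggingFaceEndpoint(repo_id=repo_id, max_length=128, temperature=0.7, token=sec_key)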

                    max_length was transferred to model_kwargs.
                    Please make sure that max_length is what you intended.
                    token was transferred to model_kwargs.
                    Please make sure that token is what you intended.


The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: fineGrained).
Your token has been saved to /root/.cache/huggingface/token
Login successful


In [7]:
llm

HuggingFaceEndpoint(repo_id='mistralai/Mistral-7B-Instruct-v0.3', temperature=0.7, model_kwargs={'max_length': 128, 'token': '<redacted>'}, model='mistralai/Mistral-7B-Instruct-v0.3', client=<InferenceClient(model='mistralai/Mistral-7B-Instruct-v0.3', timeout=120)>, async_client=<InferenceClient(model='mistralai/Mistral-7B-Instruct-v0.3', timeout=120)>)

In [8]:
llm.invoke("Tell me the recipe to cook chocolate cake")

' with yogurt. Chocolate Yogurt Cake Recipe:\n\nIngredients:\n- 1 1/2 cups all-purpose flour\n- 1 cup sugar\n- 1/2 cup unsweetened cocoa powder\n- 1 teaspoon baking soda\n- 1/2 teaspoon salt\n- 2 eggs\n- 1 cup plain yogurt\n- 1/2 cup vegetable oil\n- 1 teaspoon vanilla extract\n- 1 cup hot coffee\n- For the frosting:\n- 1 cup unsalted butter, softened\n- 1/2 cup unsweetened cocoa powder\n- 6 cups powdered sugar\n- 1/2 cup milk\n- 2 teaspoons vanilla extract\n\nInstructions:\n\n1. Preheat the oven to 350°F (175°C). Grease and flour a 9-inch round cake pan.\n\n2. In a large bowl, combine flour, sugar, cocoa powder, baking soda, and salt.\n\n3. In a separate bowl, whisk together eggs, yogurt, oil, and vanilla extract.\n\n4. Gradually add the wet ingredients to the dry ingredients, stirring just until combined.\n\n5. Slowly pour in the hot coffee, stirring continuously. The batter will be thin.\n\n6. Pour the batter into the prepared cake pan and bake for 40-45 minutes, or until a toothpic

In [9]:
from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate
question="Which team won the ICC T20 World cup in the year 2011"
template = """Question: {question}
Answer: Let's think step by step """
prompt = PromptTemplate(template=template, input_variables=["question"])
print(prompt)

input_variables=['question'] template="Question: {question}\nAnswer: Let's think step by step "
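
To make the substitution concrete, here is a plain-Python sketch of the string this template produces once `{question}` is filled in. It uses `str.format` as a stand-in for the template rendering, which is an assumption made for illustration and not LangChain's implementation.

```python
template = """Question: {question}
Answer: Let's think step by step """

question = "Which team won the ICC T20 World cup in the year 2011"

# Fill the single placeholder, as the prompt template would before the model is called.
rendered = template.format(question=question)
print(rendered)
```

The rendered string is what the model actually receives, so the trailing "Let's think step by step" cue is part of every request.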


In [10]:
llm_chain = LLMChain(llm=llm, prompt=prompt)
#print(llm_chain.run(question))
print(llm_chain.invoke(question))

{'question': 'Which team won the ICC T20 World cup in the year 2011', 'text': '2011 ICC T20 World Cup was won by the team "West Indies"'}




# **Hugging Face Pipeline**

Within Transformers, the pipeline is the most versatile tool in the Hugging Face toolbox. Because LangChain is designed primarily for RAG and agent use cases, the pipeline's scope is reduced here to the following text-centric tasks: "text-generation", "text2text-generation", "summarization", and "translation". Models can be loaded directly with the `from_model_id` method.
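
Since only a handful of tasks are in scope, it can help to check the task name before loading a model. The sketch below is a hypothetical guard (`check_task` and `SUPPORTED_TASKS` are illustrative names, not library API) built from the task list in the paragraph above.

```python
# Task names taken from the paragraph above.
SUPPORTED_TASKS = (
    "text-generation",
    "text2text-generation",
    "summarization",
    "translation",
)


def check_task(task: str) -> str:
    """Return the task unchanged if it is in scope, else raise ValueError."""
    if task not in SUPPORTED_TASKS:
        raise ValueError(
            f"Unsupported task {task!r}; expected one of {SUPPORTED_TASKS}"
        )
    return task


print(check_task("text-generation"))
```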

In [11]:
from langchain_huggingface import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

In [12]:
model_id = "gpt2"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

In [13]:
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=100)
hf = HuggingFacePipeline(pipeline=pipe)

Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


In [14]:
hf

HuggingFacePipeline(pipeline=<transformers.pipelines.text_generation.TextGenerationPipeline object at 0x7b351ee078e0>)

In [15]:
hf.invoke("write a poem for me")

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


'write a poem for me, saying something about what I\'ve seen. I like writing about me. I think it makes me more interesting to read."\n\nI ask her if she would like to direct this story as a poem for another generation. If so, how far would you like to go toward that goal?\n\n"I\'ll go into that further, it wouldn\'t matter if I don\'t. I\'m writing now after being too busy to write. If I wanted to do that as an act, then'
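
The output above begins with the prompt itself, followed by the model's continuation. A minimal sketch for keeping only the continuation is shown below; `strip_prompt` is a hypothetical helper, and the sample text is shortened from the output above.

```python
def strip_prompt(prompt: str, generated: str) -> str:
    """Drop the echoed prompt from the start of the generated text, if present."""
    if generated.startswith(prompt):
        return generated[len(prompt):]
    return generated


prompt = "write a poem for me"
generated = "write a poem for me, saying something about what I've seen."
print(strip_prompt(prompt, generated))
```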

In [16]:
## Use the Hugging Face pipeline with a GPU
gpu_llm = HuggingFacePipeline.from_model_id(
    model_id="gpt2",
    task="text-generation",
    device=0,  # index of the first CUDA device
    pipeline_kwargs={"max_new_tokens": 100},
)
from langchain_core.prompts import PromptTemplate

template = """Question: {question}
Answer: Let's think step by step """
prompt = PromptTemplate.from_template(template)
print(prompt)

input_variables=['question'] template="Question: {question}\nAnswer: Let's think step by step "


In [17]:
chain=prompt|gpu_llm
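
The `|` operator above chains the prompt into the model: the output of the left side becomes the input of the right side. The toy sketch below shows that idea in plain Python. The `Step` class and `fake_llm` are hypothetical stand-ins, not LangChain classes, and no real model is called.

```python
from typing import Any, Callable


class Step:
    """Hypothetical stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that runs a, then feeds its result to b.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


template = "Question: {question}\nAnswer: Let's think step by step "
format_prompt = Step(lambda inputs: template.format(**inputs))
fake_llm = Step(lambda text: text + "[model output]")  # placeholder, no real model

toy_chain = format_prompt | fake_llm
print(toy_chain.invoke({"question": "Which team won the Cricket World cup in the year 2011"}))
```

Calling `invoke` with a dictionary mirrors how the real chain is invoked in the next cell: the dictionary fills the template, and the formatted text is passed on to the model.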

In [18]:
question="Which team won the Cricket World cup in the year 2011"
chain.invoke({"question":question})

"Question: Which team won the Cricket World cup in the year 2011\nAnswer: Let's think step by step \xa0from that... 1. United United were winning 4 ODIs - 3 for 3 (by 6 wickets) 2. Middlesex are winning 6 ODIs - 3 for 5 (by 8 wickets) 3. Somerset and Leeds United are winning 5 ODIs - 9 for 8 (by 12 wickets) 4. Airtricity and St. Andrews all were winning 8 ODIs - 4 for 2 (by 8 wickets) 5. Birmingham City are winning 18 ODIs - 3"