# Hugging Face with LangChain

In [None]:
# Required libraries
!pip install langchain-huggingface
# For API calls
!pip install huggingface_hub   # client for the Hub, the repository of models
!pip install transformers
!pip install accelerate
!pip install langchain

Collecting langchain-huggingface
  Downloading langchain_huggingface-0.0.3-py3-none-any.whl.metadata (1.2 kB)
Collecting langchain-core<0.3,>=0.1.52 (from langchain-huggingface)
  Downloading langchain_core-0.2.36-py3-none-any.whl.metadata (6.2 kB)
Collecting sentence-transformers>=2.6.0 (from langchain-huggingface)
  Downloading sentence_transformers-3.0.1-py3-none-any.whl.metadata (10 kB)
Collecting jsonpatch<2.0,>=1.33 (from langchain-core<0.3,>=0.1.52->langchain-huggingface)
  Downloading jsonpatch-1.33-py2.py3-none-any.whl.metadata (3.0 kB)
Collecting langsmith<0.2.0,>=0.1.75 (from langchain-core<0.3,>=0.1.52->langchain-huggingface)
  Downloading langsmith-0.1.106-py3-none-any.whl.metadata (13 kB)
Collecting tenacity!=8.4.0,<9.0.0,>=8.1.0 (from langchain-core<0.3,>=0.1.52->langchain-huggingface)
  Downloading tenacity-8.5.0-py3-none-any.whl.metadata (1.2 kB)
Collecting jsonpointer>=1.9 (from jsonpatch<2.0,>=1.33->langchain-core<0.3,>=0.1.52->langchain-huggingface)
  Downloading js

In [None]:
# Load the secret key from the Colab environment
from google.colab import userdata
sec_key = userdata.get("HF_TOKEN")
print(sec_key)

hf_<redacted>


## HuggingFaceEndpoint

### How to access Hugging Face models through the API

There are two ways to use this class. The first is to specify the model with the repo_id parameter. Such endpoints use the serverless API, which is particularly beneficial to people with Pro accounts or the Enterprise Hub. Regular users still get a fair number of requests by connecting with their HF token in the environment where they run the code. The second is to pass an endpoint_url that points to a dedicated endpoint instead of a repo_id.

In [None]:
from langchain_huggingface import HuggingFaceEndpoint

In [None]:
from google.colab import userdata
sec_key = userdata.get("HUGGINGFACEHUB_API_TOKEN")
print(sec_key)

hf_<redacted>


In [None]:
import os
os.environ["HUGGINGFACEHUB_API_TOKEN"] = sec_key
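Printing a token in full leaves it in the saved notebook output. As a minimal sketch, a small helper can confirm that a token was loaded while showing only its first and last characters. The name `mask_token` and the example value are hypothetical and used only for illustration.

```python
def mask_token(token, visible=4):
    """Return the token with only its first and last `visible` characters shown."""
    if not token:
        return "<missing>"
    if len(token) <= 2 * visible:
        # Too short to reveal anything safely: hide every character.
        return "*" * len(token)
    return f"{token[:visible]}...{token[-visible:]}"


# Placeholder value for illustration; in the notebook this would be sec_key.
example_token = "hf_exampleexampleexample1234"
print(mask_token(example_token))
```

In the cells above, `print(mask_token(sec_key))` could stand in for `print(sec_key)`.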

In [None]:
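repo_id = "mistralai/Mistral-7B-Instruct-v0.3"    # model to call through the API
# max_length and token are not declared fields of HuggingFaceEndpoint,
# so they are moved into model_kwargs (see the warning in the output).
llm = HuggingFaceEndpoint(repo_id=repo_id, max_length=128, temperature=0.7, token=sec_key)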

                    max_length was transferred to model_kwargs.
                    Please make sure that max_length is what you intended.
                    token was transferred to model_kwargs.
                    Please make sure that token is what you intended.


The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to /root/.cache/huggingface/token
Login successful
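The warning in the output indicates that `max_length` and `token` were not recognised as fields and ended up in `model_kwargs`. A minimal sketch of the same call using the declared fields `max_new_tokens` and `huggingfacehub_api_token` might look as follows. The helper name `build_endpoint` is hypothetical, and the import is done lazily so the block can be read and loaded without contacting the Hub.

```python
import os

# Keyword arguments that use field names declared on HuggingFaceEndpoint,
# so nothing should be moved into model_kwargs.
ENDPOINT_KWARGS = {
    "repo_id": "mistralai/Mistral-7B-Instruct-v0.3",
    "max_new_tokens": 128,   # upper bound on generated tokens
    "temperature": 0.7,
}


def build_endpoint(token=None):
    """Hypothetical helper: build the endpoint with an explicit or environment token."""
    # Imported here so that defining the helper does not require the package.
    from langchain_huggingface import HuggingFaceEndpoint

    token = token or os.environ.get("HUGGINGFACEHUB_API_TOKEN")
    return HuggingFaceEndpoint(huggingfacehub_api_token=token, **ENDPOINT_KWARGS)
```

Calling `llm = build_endpoint(sec_key)` would then replace the constructor call above.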


In [None]:
llm

HuggingFaceEndpoint(repo_id='mistralai/Mistral-7B-Instruct-v0.3', temperature=0.7, model_kwargs={'max_length': 128, 'token': 'hf_<redacted>'}, model='mistralai/Mistral-7B-Instruct-v0.3', client=<InferenceClient(model='mistralai/Mistral-7B-Instruct-v0.3', timeout=120)>, async_client=<InferenceClient(model='mistralai/Mistral-7B-Instruct-v0.3', timeout=120)>)

In [None]:
llm.invoke("what is machine learning")

'?\n\nMachine learning is a subset of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. It focuses on the development of computer programs that can access data and use it to learn for themselves.\n\nMachine learning is a way of achieving artificial intelligence by giving a computer the ability to learn from data, without being explicitly programmed. This is done by feeding the machine large amounts of data and using statistical techniques to enable the machine to learn patterns in the data and make decisions based on them.\n\nMachine learning can be used for a variety of tasks, such as image recognition, speech recognition, natural language processing, and prediction. It is used in a wide range of industries, including finance, healthcare, retail, and technology.\n\nThere are three main types of machine learning: supervised learning, unsupervised learning, and reinforcement learning.\n\

In [None]:
from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate

question = "who won the cricket worldcup in the year of 2011?"
template = """Question: {question}
Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])
print(prompt)

input_variables=['question'] template="Question: {question}\nAnswer: Let's think step by step."


In [None]:
llm_chain = LLMChain(prompt=prompt, llm=llm)   # chain the prompt and the model so the query runs end to end
print(llm_chain.invoke(question))

{'question': 'who won the cricket worldcup in the year of 2011?', 'text': ' The cricket world cup is a major international tournament that takes place every four years. In 2011, the tournament was held in India, Sri Lanka, and Bangladesh. The final match was played on April 2, 2011, where India faced Sri Lanka. Unfortunately, India lost to Sri Lanka, and Sri Lanka became the 2011 Cricket World Cup champions. So, Sri Lanka won the cricket world cup in the year of 2011.'}
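To make the shape of that result easier to follow, here is a plain-Python sketch of what the chain does conceptually: fill the template, send the rendered prompt to the model, and return the input together with the generated text. It is an illustration only, not LangChain's implementation. `fake_llm` and `run_chain` are hypothetical names, and the canned completion stands in for a real model call.

```python
template = """Question: {question}
Answer: Let's think step by step."""


def fake_llm(prompt_text):
    """Stand-in for the real model: returns a canned completion."""
    return " (model completion would appear here)"


def run_chain(question, llm=fake_llm):
    # Step 1: fill the {question} placeholder in the template.
    prompt_text = template.format(question=question)
    # Step 2: send the rendered prompt to the model.
    completion = llm(prompt_text)
    # Step 3: return the input alongside the generated text.
    return {"question": question, "text": completion}


question = "who won the cricket worldcup in the year of 2011?"
rendered = template.format(question=question)
print(rendered)

result = run_chain(question)
print(result)
```

The chain returns whatever the model generates and does not check it. In the recorded output above the model's conclusion is wrong: India beat Sri Lanka in the 2011 final.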


## Hugging Face Pipeline

Within Transformers, the pipeline is the most versatile tool in the Hugging Face toolbox. Because LangChain is designed primarily for RAG and agent use cases, the scope of the pipeline here is limited to the following text-centric tasks: "text-generation", "text2text-generation", "summarization", and "translation". Models can be loaded directly with the from_model_id method.

In [None]:
from langchain_huggingface import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

In [None]:
model_id = "gpt2"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)


In [None]:
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=100)
hf_llm = HuggingFacePipeline(pipeline=pipe)

In [None]:
hf_llm

HuggingFacePipeline(pipeline=<transformers.pipelines.text_generation.TextGenerationPipeline object at 0x7ff3824df460>)

In [None]:
hf_llm.invoke("langchain is a company")

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


"langchain is a company that creates open source code libraries which simplify business development with some simple rules of thumb.\n\nThey've built up a good community which has reached into many projects which all are on different levels of sophistication:\n\nThey also put their trust and loyalty out there.\n\nThey have a lot of high powered people who are willing to help them out if any of their projects don't go well... so we asked them to share more.\n\nThis led to some interesting conversations with the developers."

In [None]:
# Use the Hugging Face pipeline with a GPU
# gpu_llm = HuggingFacePipeline.from_model_id(
#     model_id="gpt2",
#     task="text-generation",
#     device=0,  # replace with device_map="auto" to use the accelerate library
#     pipeline_kwargs={"max_new_tokens": 100},
# )

# Left commented out because no GPU is available in this runtime.
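Rather than commenting the cell out, one option is to choose the device at run time. The sketch below assumes that `device=0` selects the first CUDA GPU and `device=-1` selects the CPU, as in the Transformers pipeline convention. `pick_device` and `build_gpt2_llm` are hypothetical helper names, and the model is only loaded if `build_gpt2_llm()` is actually called.

```python
def pick_device():
    """Return 0 for the first CUDA GPU if one is visible, otherwise -1 (CPU)."""
    try:
        import torch
    except ImportError:
        # Without torch there is no way to detect a GPU, so fall back to CPU.
        return -1
    return 0 if torch.cuda.is_available() else -1


def build_gpt2_llm():
    """Hypothetical helper mirroring the from_model_id call in the cell above."""
    # Imported lazily so that defining the helper does not require the package.
    from langchain_huggingface import HuggingFacePipeline

    return HuggingFacePipeline.from_model_id(
        model_id="gpt2",
        task="text-generation",
        device=pick_device(),
        pipeline_kwargs={"max_new_tokens": 100},
    )


print(pick_device())
```

With this in place, `gpu_llm = build_gpt2_llm()` should run on either kind of runtime, using the GPU only when one is present.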