# Hugging Face with LangChain

Announcement Link: https://huggingface.co/blog/langchain


In [1]:
## Libraries Required
!pip install langchain-huggingface
## For API Calls
!pip install huggingface_hub
!pip install transformers
!pip install accelerate
!pip install bitsandbytes
!pip install langchain


Collecting langchain-huggingface
  Downloading langchain_huggingface-0.1.2-py3-none-any.whl.metadata (1.3 kB)
Downloading langchain_huggingface-0.1.2-py3-none-any.whl (21 kB)
Installing collected packages: langchain-huggingface
Successfully installed langchain-huggingface-0.1.2
Collecting bitsandbytes
  Downloading bitsandbytes-0.45.0-py3-none-manylinux_2_24_x86_64.whl.metadata (2.9 kB)
Downloading bitsandbytes-0.45.0-py3-none-manylinux_2_24_x86_64.whl (69.1 MB)
Installing collected packages: bitsandbytes
Successfully installed bitsandbytes-0.45.0


In [2]:
## Environment secret keys
from google.colab import userdata
sec_key=userdata.get("HF_TOKEN")
# Confirm the secret loaded without printing its value into the notebook output
print("HF_TOKEN loaded:", sec_key is not None)

HF_TOKEN loaded: True


## HuggingFaceEndpoint
## How to Access Hugging Face Models via the API
There are two ways to use this class. You can specify a model with the repo_id parameter, as in the cells below, or point the class at a specific endpoint with the endpoint_url parameter. Models specified by repo_id are served through the serverless API, which is particularly beneficial to users with Pro accounts or an Enterprise Hub subscription. Regular users still get a fair number of requests by making their HF token available in the environment where the code runs.


In [3]:
from langchain_huggingface import HuggingFaceEndpoint

In [4]:
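from google.colab import userdata
sec_key=userdata.get("HUGGINGFACEHUB_API_TOKEN")
# Confirm the secret loaded without printing its value into the notebook output
print("HUGGINGFACEHUB_API_TOKEN loaded:", sec_key is not None)

HUGGINGFACEHUB_API_TOKEN loaded: True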


In [5]:
import os
os.environ["HUGGINGFACEHUB_API_TOKEN"]=sec_key
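Printing a token in a notebook leaves it in the saved output. Below is a minimal sketch of a safer pattern, assuming the token lives in the `HUGGINGFACEHUB_API_TOKEN` environment variable as in the cell above. `mask_token` and `load_token` are hypothetical helpers written for this illustration, not part of any library.

```python
import os


def mask_token(token: str, visible: int = 4) -> str:
    """Return a display-safe version of a secret, keeping only its edges."""
    if len(token) <= 2 * visible:
        # Too short to reveal anything safely: hide it completely.
        return "*" * len(token)
    return f"{token[:visible]}...{token[-visible:]}"


def load_token(env_var: str = "HUGGINGFACEHUB_API_TOKEN") -> str:
    """Read the token from the environment and fail early if it is missing."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"{env_var} is not set")
    return token


# Placeholder value for illustration only; never hard-code a real token.
os.environ.setdefault("HUGGINGFACEHUB_API_TOKEN", "hf_exampleexampleexample")
print(mask_token(load_token()))
```

Because `setdefault` only writes the placeholder when the variable is unset, a real token already exported in the session is left untouched.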

In [14]:
repo_id="mistralai/Mistral-7B-Instruct-v0.2"
# Pass the token as huggingfacehub_api_token; an unrecognized `token=` argument is moved into model_kwargs with a warning
llm=HuggingFaceEndpoint(repo_id=repo_id,temperature=0.7,huggingfacehub_api_token=sec_key)


In [15]:
llm.invoke("Tell something about Al Kabir college Jamshedpur?")

'\n\nAl Kabir College Jamshedpur is a renowned educational institution located in the steel city of Jamshedpur, Jharkhand, India. Established in the year 2001, the college is affiliated with the Birla Institute of Technology, Mesra (BIT Mesra), and is recognized by the University Grants Commission (UGC). Al Kabir College offers undergraduate programs in various disciplines, including Arts, Science, and Commerce. The college is known for its excellent academic record, state-of-the-art infrastructure, and dedicated faculty members. Al Kabir College aims to provide quality education to its students and prepare them for successful careers in their chosen fields. The college also offers various extracurricular activities, including cultural events, sports, and social service programs, to help students develop holistically. Al Kabir College has a strong student community and a vibrant campus life, making it an ideal place for students to learn and grow.'

In [16]:
repo_id="mistralai/Mistral-7B-Instruct-v0.3"
# huggingfacehub_api_token is the recognized parameter name for the HF token
llm=HuggingFaceEndpoint(repo_id=repo_id,temperature=0.7,huggingfacehub_api_token=sec_key)


In [17]:
llm.invoke("What is Genertaive AI")

'?\n\nGenerative AI is a type of artificial intelligence that is capable of creating new content, such as images, text, or music, by learning patterns from existing data. It works by generating outputs that are similar to the training data, but with some variation to make the output unique.\n\nGenerative AI models are typically trained on large datasets and use various techniques, such as deep learning, to learn the underlying patterns in the data. Once trained, the models can generate new content that is similar to the training data, but with some variations. This makes them useful for a variety of applications, such as creating art, writing stories, and composing music.\n\nOne example of a popular generative AI model is the Generative Adversarial Network (GAN), which consists of two neural networks that compete with each other to generate more realistic outputs. The generator network tries to create new content that is indistinguishable from the real data, while the discriminator net

In [18]:
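# Import from the current module paths; the top-level `langchain` re-exports are deprecated
from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate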

question="Who won the Cricket World Cup in the year 2011?"
template = """Question: {question}
Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])
print(prompt)

input_variables=['question'] input_types={} partial_variables={} template="Question: {question}\nAnswer: Let's think step by step."
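For a template like this one, with a single `{question}` placeholder and no other braces, the text sent to the model matches what Python's own `str.format` would produce. Here is a rough sketch of that rendering step using only the standard library; it illustrates the substitution and is not LangChain's internal implementation.

```python
template = """Question: {question}
Answer: Let's think step by step."""

question = "Who won the Cricket World Cup in the year 2011?"

# Substitute the placeholder the same way an f-string-style template would.
rendered = template.format(question=question)
print(rendered)
```

Inspecting the rendered string before calling the model is a cheap way to catch a missing or misspelled variable name.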


In [19]:
# LLMChain is deprecated in recent LangChain releases; `prompt | llm` is the suggested replacement
llm_chain=LLMChain(llm=llm,prompt=prompt)
print(llm_chain.invoke(question))


{'question': 'Who won the Cricket World Cup in the year 2011?', 'text': "\nThe Cricket World Cup is a tournament that takes place every four years. So, if we subtract 4 from 2011, we get the year of the previous tournament, which was 2007.\nSince the World Cup doesn't happen in the same year twice, the Cricket World Cup in 2011 can't be the one won by the team that won in 2007.\nSo, let's find out who won in 2007.\nIndia, Sri Lanka, and Australia made it to the finals.\nSri Lanka won the semi-final against New Zealand, and India won the other semi-final against Pakistan.\nSo, the final match was between India and Sri Lanka.\nIndia won the match, so India won the Cricket World Cup in 2011.\n\nFinal Answer: India won the Cricket World Cup in the year 2011."}


## HuggingFacePipeline
Within transformers, the Pipeline is the most versatile tool in the Hugging Face toolbox. Because LangChain is designed primarily for RAG and agent use cases, the pipeline's scope here is limited to the following text-centric tasks: "text-generation", "text2text-generation", "summarization", and "translation".
Models can be loaded directly with the from_model_id method, or a transformers pipeline can be built first and passed to the class.
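Since only a handful of tasks are in scope, it can help to validate the task name before downloading a model. The sketch below takes its task list from the paragraph above; `SUPPORTED_TASKS` and `check_task` are hypothetical names for this illustration, not the library's own API.

```python
# Task names taken from the text above; treat this list as an assumption.
SUPPORTED_TASKS = (
    "text-generation",
    "text2text-generation",
    "summarization",
    "translation",
)


def check_task(task: str) -> str:
    """Return the task unchanged if it is in scope, otherwise raise ValueError."""
    if task not in SUPPORTED_TASKS:
        raise ValueError(
            f"Unsupported task {task!r}; expected one of {', '.join(SUPPORTED_TASKS)}"
        )
    return task


print(check_task("text-generation"))
```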


In [20]:
from langchain_huggingface import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

In [21]:
model_id="gpt2"
model=AutoModelForCausalLM.from_pretrained(model_id)
tokenizer=AutoTokenizer.from_pretrained(model_id)



In [22]:
pipe=pipeline("text-generation",model=model,tokenizer=tokenizer,max_new_tokens=100)
hf=HuggingFacePipeline(pipeline=pipe)

Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


In [None]:
hf

In [24]:
hf.invoke("What is machine learning")

Setting `pad_token_id` to `eos_token_id`:None for open-end generation.


'What is machine learning?" in particular with those trying to make decisions. Why should you think machine learning should be considered a "scientific discipline?" They would be happy to be able to learn from the hard work they put in. But the problem is so widespread that it is not at all fair to assume that AI won\'t happen by 2020. A long time ago, a paper called "Longevity among machines" (by J. Craig Roberts, Stanford University) recommended that research is looking at the long-term viability of'

In [25]:
## Use HuggingFacePipeline with a GPU
gpu_llm = HuggingFacePipeline.from_model_id(
    model_id="gpt2",
    task="text-generation",
    device=0,  # replace with device_map="auto" to use the accelerate library.
    pipeline_kwargs={"max_new_tokens": 100},
)

In [26]:
from langchain_core.prompts import PromptTemplate

template = """Question: {question}

Answer: Let's think step by step."""
prompt = PromptTemplate.from_template(template)

In [27]:
chain=prompt|gpu_llm

In [28]:
question="What is artificial intelligence?"
chain.invoke({"question":question})

'Question: What is artificial intelligence?\n\nAnswer: Let\'s think step by step. One of the first things we do when looking at AI is actually ask whether or not all intelligence is being used right now. The question is, "Can the algorithm answer what I want?" What do you want to find out? There are several theories of the question. Well, the idea behind this is that, first, how we can build a computer program that has learned what I want. Let\'s say we have a new product coming from Xerox. We get this question, "What\'s'
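The `prompt|gpu_llm` expression works because the `|` operator composes two steps into one, so the output of the prompt becomes the input of the model. The toy sketch below shows that idea in plain Python. `Step`, `prompt_step`, and `fake_llm` are hypothetical stand-ins and do not reflect how LangChain implements its runnables.

```python
from typing import Callable


class Step:
    """Wrap a function so steps can be chained with the | operator."""

    def __init__(self, func: Callable):
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # Chaining builds a new step that feeds this step's output into the next.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value):
        return self.func(value)


# Hypothetical stand-ins for the prompt template and the model.
prompt_step = Step(
    lambda inputs: f"Question: {inputs['question']}\n\nAnswer: Let's think step by step."
)
fake_llm = Step(lambda text: text + " [generated text would follow]")

toy_chain = prompt_step | fake_llm
print(toy_chain.invoke({"question": "What is artificial intelligence?"}))
```

Like the gpt2 output above, the toy model returns the prompt followed by its continuation, which is why the question is echoed at the start of the result.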