<a href="https://colab.research.google.com/github/gabriel1628/LangChain-Tutorials/blob/main/LangChain_with_HuggingFace.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Source: https://huggingface.co/blog/langchain

In [None]:
!pip install -q -U bitsandbytes transformers langchain-huggingface


# The LLMs

## HuggingFacePipeline

In [None]:
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)
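
To give a rough sense of what 4-bit loading buys, the sketch below estimates the memory taken by the model weights alone at different precisions. The helper `weight_memory_gib` and the parameter count are hypothetical and only illustrate the arithmetic; the estimate ignores quantization constants, activations, and the KV cache, so real usage will be higher.

```python
def weight_memory_gib(num_params: int, bits_per_param: float) -> float:
    """Approximate memory used by the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


# Hypothetical parameter count, chosen only to illustrate the arithmetic.
num_params = 1_000_000_000

bf16_gib = weight_memory_gib(num_params, 16)  # bfloat16: 16 bits per weight
nf4_gib = weight_memory_gib(num_params, 4)    # 4-bit NF4: 4 bits per weight

print(f"bf16 weights : {bf16_gib:.2f} GiB")
print(f"4-bit weights: {nf4_gib:.2f} GiB")
```

Under these assumptions the 4-bit weights take about a quarter of the bfloat16 footprint.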

In [None]:
import torch  # needed for torch.bfloat16 below
from transformers import AutoModelForCausalLM, AutoTokenizer

# model_id = "microsoft/Phi-3-mini-4k-instruct"
model_id = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    # quantization_config=bnb_config,  # uncomment to load the 4-bit quantized model
    torch_dtype=torch.bfloat16,
    # attn_implementation="flash_attention_2",  # if you have an Ampere GPU
)


In [None]:
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=100,
    top_k=50,
    temperature=0.1
)

Device set to use cpu


In [None]:
from langchain_huggingface import HuggingFacePipeline

llm = HuggingFacePipeline(pipeline=pipe)
text = llm.invoke("Hugging Face is")
print(text)

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Hugging Face is a company that provides a platform for developers to build and deploy machine learning models. The company was founded in 2018 by Alexis Concha and David Robinson. The company has raised $50 million in funding from investors such as Andreessen Horowitz, Index Ventures, and Y Combinator.
Hugging Face is a company that provides a platform for developers to build and deploy machine learning models. The company was founded in 2018 by Alexis Concha and David Robinson. The company has raised $


In [18]:
text

'Hugging Face is a company that provides a platform for developers to build and deploy machine learning models. The company was founded in 2018 by Alexis Concha and David Robinson. The company has raised $50 million in funding from investors such as Andreessen Horowitz, Index Ventures, and Y Combinator.\nHugging Face is a company that provides a platform for developers to build and deploy machine learning models. The company was founded in 2018 by Alexis Concha and David Robinson. The company has raised $'

## HuggingFaceEndpoint

In [21]:
from langchain_huggingface import HuggingFaceEndpoint

llm = HuggingFaceEndpoint(
    repo_id="meta-llama/Meta-Llama-3-8B-Instruct",
    task="text-generation",
    max_new_tokens=100,
    do_sample=False,
)
response = llm.invoke("Hugging Face is")
print(response)

 a popular open-source library for natural language processing (NLP) in Python. It provides a wide range of pre-trained models and tools for building NLP applications. Here are some of the key features of Hugging Face:

1. Pre-trained Models: Hugging Face provides a wide range of pre-trained models for various NLP tasks such as language translation, sentiment analysis, text classification, and more.
2. Transformers: Hugging Face provides a set of pre-trained transformer models, including BERT


## ChatHuggingFace

In [45]:
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from pprint import pprint

llm = HuggingFaceEndpoint(
    repo_id="meta-llama/Meta-Llama-3-8B-Instruct",
    task="text-generation",
    max_new_tokens=1024,
    do_sample=False,
)
llm_engine_hf = ChatHuggingFace(llm=llm)
response = llm_engine_hf.invoke("Hugging Face is")

pprint(response.response_metadata)
print()
response.pretty_print()

{'finish_reason': 'stop',
 'model': '',
 'token_usage': ChatCompletionOutputUsage(completion_tokens=338,
                                          prompt_tokens=14,
                                          total_tokens=352)}

A popular Artificial Intelligence (AI) company!

Hugging Face is a privately held AI firm founded in 2018 by Clément Delangue, Thibaut Lamy, and Julien Chaumond. The company is headquartered in New York City, with an office in Paris, France.

Hugging Face is known for developing and maintaining a range of AI-related projects and tools, particularly in the areas of natural language processing (NLP) and transformer-based models. Some of their notable projects include:

1. **Transformers**: A library for the Hugging Face ecosystem that provides pre-trained models, such as BERT, RoBERTa, and XLNet, for NLP tasks like language translation, sentiment analysis, and question answering.
2. **Trainer**: A PyTorch Lightning library that allows developers to easily train and

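`ChatHuggingFace` wraps the input in the model's chat template before sending it to the underlying `llm`, so the code above is equivalent to: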
```python
# with mistralai/Mistral-7B-Instruct-v0.2
llm.invoke("<s>[INST] Hugging Face is [/INST]")

# with meta-llama/Meta-Llama-3-8B-Instruct
llm.invoke("""<|begin_of_text|><|start_header_id|>user<|end_header_id|>Hugging Face is<|eot_id|><|start_header_id|>assistant<|end_header_id|>""")

```
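
As a minimal sketch of what that wrapping amounts to, the two hypothetical helpers below (`wrap_mistral` and `wrap_llama3`, not part of any library) rebuild the prompt strings shown above for a single user message. They are simplified on purpose: the real templates ship with each model's tokenizer and may differ in whitespace or support extra roles, so `tokenizer.apply_chat_template` remains the authoritative way to format a conversation.

```python
def wrap_mistral(user_message: str) -> str:
    """Single-turn prompt in the Mistral instruct format shown above."""
    return f"<s>[INST] {user_message} [/INST]"


def wrap_llama3(user_message: str) -> str:
    """Single-turn prompt in the Llama 3 instruct format shown above."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>user<|end_header_id|>"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>"
    )


print(wrap_mistral("Hugging Face is"))
print(wrap_llama3("Hugging Face is"))
```

Passing either string to `llm.invoke(...)` would mirror the calls in the block above, provided the string matches the model served by the endpoint.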

# The Embeddings

## HuggingFaceEmbeddings

In [54]:
from langchain_huggingface.embeddings import HuggingFaceEmbeddings

model_name = "mixedbread-ai/mxbai-embed-large-v1"
hf_embeddings = HuggingFaceEmbeddings(
    model_name=model_name,
)

texts = ["Hello, world!", "How are you?"]  # length 2
embeddings = hf_embeddings.embed_documents(texts)
len(embeddings), len(embeddings[0]), len(embeddings[1])

(2, 1024, 1024)
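
Embeddings like these are typically compared with cosine similarity. The sketch below is a plain-Python illustration using small toy vectors rather than real model output; the helper `cosine_similarity` is written here for the example and is not a LangChain function.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for real 1024-dimensional embeddings.
v_same_1 = [1.0, 0.0, 1.0]
v_same_2 = [2.0, 0.0, 2.0]     # same direction, different length
v_orthogonal = [0.0, 1.0, 0.0]

print(cosine_similarity(v_same_1, v_same_2))      # close to 1: same direction
print(cosine_similarity(v_same_1, v_orthogonal))  # 0: no overlap
```

With the vectors computed in the cell above, `cosine_similarity(embeddings[0], embeddings[1])` would score how close the two sentences are in embedding space.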

## HuggingFaceEndpointEmbeddings

In [56]:
from langchain_huggingface.embeddings import HuggingFaceEndpointEmbeddings

hf_embeddings = HuggingFaceEndpointEmbeddings(
    model="mixedbread-ai/mxbai-embed-large-v1",
    task="feature-extraction",
    # huggingfacehub_api_token="<HF_TOKEN>",
)

texts = ["Hello, world!", "How are you?"]
embeddings = hf_embeddings.embed_documents(texts)
len(embeddings), len(embeddings[0]), len(embeddings[1])

(2, 1024, 1024)