In [None]:
# !pip install langchain-huggingface
# !pip install huggingface-hub
# !pip install transformers accelerate bitsandbytes langchain

## Using HuggingFaceEndpoint for API calls

In [None]:
from google.colab import userdata
HUGGINGFACE_API_KEY = userdata.get("HUGGINGFACE_API_KEY")

In [None]:
from langchain_huggingface import HuggingFaceEndpoint
import os

In [None]:
os.environ["HUGGINGFACEHUB_API_TOKEN"] = HUGGINGFACE_API_KEY
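
Outside Colab, `userdata.get` is not available, so the token has to come from somewhere else. The following is a minimal sketch, not part of the original notebook: `resolve_hf_token` is a hypothetical helper that looks the token up under the two variable names that appear in this notebook (`HUGGINGFACEHUB_API_TOKEN` and `HF_TOKEN`). The lookup order is an assumption.

```python
import os


def resolve_hf_token(env=None, names=("HUGGINGFACEHUB_API_TOKEN", "HF_TOKEN")):
    """Return the first non-empty token found under the given names, else None.

    Hypothetical helper for illustration; the order of `names` is an assumption.
    """
    env = os.environ if env is None else env
    for name in names:
        value = env.get(name, "").strip()
        if value:
            return value
    return None


# Use an explicit mapping so the example does not depend on the real environment.
example_env = {"HF_TOKEN": "hf_example_token"}
print(resolve_hf_token(example_env))
```

Calling `resolve_hf_token()` with no arguments reads from `os.environ`; a `None` result means no token was found and one still has to be supplied.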

In [None]:
repo_id = "mistralai/Mistral-7B-Instruct-v0.3"

# Pass the token by its declared field name so it is not moved into model_kwargs.
llm = HuggingFaceEndpoint(repo_id=repo_id, temperature=0.1, max_new_tokens=128, huggingfacehub_api_token=HUGGINGFACE_API_KEY)


The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to /root/.cache/huggingface/token
Login successful


In [None]:
print(llm.invoke("Can you help me write a hello world program in python"))

?

Sure! Here's a simple "Hello, World!" program in Python:

```python
print("Hello, World!")
```

To run this program, save it to a file with a `.py` extension (for example, `hello_world.py`), then open a terminal or command prompt, navigate to the directory where you saved the file, and run the command `python hello_world.py`. You should see the text "Hello, World!" printed to the console.

If you're using a Python environment like Anaconda, you


In [None]:
from langchain_core.prompts import PromptTemplate

question = "Who won the cricket world cup in the year 2011?"

template = """
Question: {question}

Answer: Let's think step by step.
"""

prompt = PromptTemplate(template=template, input_variables=["question"])

chain = prompt | llm

# Pass inputs as a dict keyed by the template's input variable.
print(chain.invoke({"question": question}))


1. We are asked who won the cricket world cup in the year 2011.
2. The cricket world cup is a tournament held every four years, so it's unlikely that there would be multiple world cups in a single year.
3. The 2011 Cricket World Cup was won by India.

Final answer: India won the cricket world cup in the year 2011.
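
To see what text the model receives from the chain above, the template can be filled in by hand. This is a minimal sketch using plain `str.format`, on the assumption that `PromptTemplate`'s default f-string style substitutes a single simple placeholder the same way. It does not call LangChain or the model.

```python
# Same template and question as in the cell above.
template = """
Question: {question}

Answer: Let's think step by step.
"""
question = "Who won the cricket world cup in the year 2011?"

# Plain string formatting stands in for the prompt rendering step (assumption).
rendered = template.format(question=question)
print(rendered)
```

The rendered text ends with "Let's think step by step.", which is why the model's reply starts as a numbered list of reasoning steps.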


## Using HuggingFacePipeline

In [None]:
from langchain_huggingface import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

In [None]:
model_id = "openai-community/gpt2"

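# Note: despite its name, `pipe` holds the causal language model itself; the pipeline is built in a later cell.
pipe = AutoModelForCausalLM.from_pretrained(model_id)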
tokenizer = AutoTokenizer.from_pretrained(model_id)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [None]:
# Use a distinct name so the imported `pipeline` factory is not shadowed.
text_gen = pipeline("text-generation", model=pipe, tokenizer=tokenizer, max_new_tokens=128)

In [None]:
gpt2 = HuggingFacePipeline(pipeline=text_gen)

In [None]:
print(gpt2.invoke("what is machine learning?"))

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


what is machine learning?

The primary reason why we have never heard of machine learning is that there are not very many studies to support it. We know that computers are doing it for you and your family, you see, but there are so many issues around the field that nobody knows much about.

I've spent two hours talking with a lot of people who aren't trained in machine learning, about machine learning, and you can see what a lot of these are, and you can see what their response is. But what are the problems you see when people are learning the same data?

I've had people just start talking about it.
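
The output above begins by repeating the prompt, because the text-generation pipeline returns the prompt plus its continuation. The following is a minimal sketch for keeping only the continuation. `strip_prompt` is a hypothetical helper, not a LangChain or Transformers function. The Transformers pipeline also accepts a `return_full_text` argument that may serve the same purpose; that has not been checked against the versions used here.

```python
def strip_prompt(prompt: str, generated: str) -> str:
    """Remove a leading copy of `prompt` from `generated`, if present.

    Hypothetical helper for illustration only.
    """
    if generated.startswith(prompt):
        # Drop the echoed prompt and any whitespace that follows it.
        return generated[len(prompt):].lstrip()
    return generated


prompt_text = "what is machine learning?"
# Shortened stand-in for a model response that echoes the prompt.
generated_text = "what is machine learning?\n\nThe primary reason why we have never heard of machine learning"
print(strip_prompt(prompt_text, generated_text))
```

If the generated text does not start with the prompt, the helper returns it unchanged.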


## Using GPU

In [None]:
# HuggingFacePipeline with GPU (device=0 selects the first CUDA device)
gpt2_gpu = HuggingFacePipeline.from_model_id(
    model_id=model_id,
    task="text-generation",
    device=0,
    pipeline_kwargs={"max_new_tokens": 128},
)
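
Hard-coding `device=0` fails on a runtime without a GPU. The following is a minimal sketch for choosing the device index at run time. `pick_device` is a hypothetical helper, and it assumes that `-1` selects the CPU, which is the convention for the `device` argument in Transformers pipelines.

```python
def pick_device() -> int:
    """Return 0 when a CUDA GPU is visible to PyTorch, otherwise -1 (CPU).

    Hypothetical helper; falls back to CPU if PyTorch is not installed.
    """
    try:
        import torch
    except ImportError:
        return -1
    return 0 if torch.cuda.is_available() else -1


device = pick_device()
print(f"Selected device index: {device}")
```

The result could then be passed as `device=device` in the `from_model_id` call above.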

In [None]:
chain = prompt | gpt2_gpu

In [None]:
print(chain.invoke({"question": "Who won the cricket world cup in the year 2011?"}))


Question: Who won the cricket world cup in the year 2011?

Answer: Let's think step by step.

All three of them had been on the World Cup stage.

What happened.

They were all invited.

How were they chosen?

They arrived on a first-class list.

How did they win?

They were part of a group.

How came they are the only one to win the World Cup?

Their group selection had been conducted over the course of seven years and the following summer the Sri Lankan side received the international cricket award.

A little bit of the story goes that the first time Sri Lankan cricket went to the World Cup at the World
