# Large Language Models

Large language models (LLMs) have changed the way we work with and analyse language. Trained on very large collections of text, they can interpret, generate, and translate human language with a fluency that earlier systems could not match. This notebook uses LangChain, a popular framework for building applications on top of LLMs, chosen for its efficiency and flexibility.

---
## 1.&nbsp; Installations and Settings 🛠️

LangChain is a framework that simplifies the development of applications powered by large language models (LLMs). Here we install its HuggingFace integration package, `langchain-huggingface`, as we'll be using open-source models hosted on HuggingFace.

In [None]:
!pip install -qqq -U langchain-huggingface


To use the LLMs, you'll need to create a HuggingFace access token for this project.
1. Sign up for an account at [HuggingFace](https://huggingface.co/)
2. Go into your account and click `edit profile`
3. Go to `Access Tokens` and create a `New Token`
4. The `Type` of the new token should be set to `Read`

We've saved ours as a Colab secret named `HF_TOKEN`. This way we can use it in multiple notebooks without having to type it out or reveal it.

In [None]:
import os
from google.colab import userdata

# Set the token as an environment variable
os.environ["HUGGINGFACEHUB_API_TOKEN"] = userdata.get('HF_TOKEN')
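
`google.colab` is only available inside Colab. A minimal sketch of a fallback for other environments, assuming the token is either already set as an environment variable or typed in at a hidden prompt. `resolve_hf_token` is a hypothetical helper written for this example, not part of LangChain or HuggingFace.

```python
import os
from getpass import getpass


def resolve_hf_token(env=os.environ, prompt=getpass):
    """Return a HuggingFace token from the environment, or ask for one."""
    # Check both variable names used in this notebook
    token = env.get("HUGGINGFACEHUB_API_TOKEN") or env.get("HF_TOKEN")
    if not token:
        # Hidden prompt, so the token never appears in the notebook output
        token = prompt("HuggingFace token: ")
    return token.strip()
```

You could then set the variable with `os.environ["HUGGINGFACEHUB_API_TOKEN"] = resolve_hf_token()`.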

---
## 2.&nbsp; Setting up your LLM 🧠

A HuggingFace Endpoint is a service that lets you deploy machine learning models, specifically those from the HuggingFace Hub, for use in real-world applications. It provides the infrastructure and tools to turn a model into a usable API. You can set up your own Endpoint and pay by the minute for the compute resources it uses. However, HuggingFace also lets us test smaller LLMs for free on Endpoints it has already set up.

There's a limit on the size of model you can use for free. The exact free-tier limit isn't publicly disclosed, but models larger than 10GB are likely to be inaccessible.

On the free tier, HuggingFace also prioritises fair use and may throttle heavy users. Here's what they say on their [FAQ page](https://huggingface.co/docs/api-inference/faq):

> **Rate limits:**
> The free Inference API may be rate limited for heavy use cases. We try to balance the loads evenly between all our available resources, and favoring steady flows of requests. If your account suddenly sends 10k requests then you’re likely to receive 503 errors saying models are loading. In order to prevent that, you should instead try to start running queries smoothly from 0 to 10k over the course of a few minutes.
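
One common way to cope with throttling is to retry failed calls with increasing pauses. The following is a minimal sketch of exponential backoff, not anything HuggingFace or LangChain prescribes. `invoke_with_backoff` is a hypothetical helper, and it catches every exception for brevity; in practice you would narrow this to the rate-limit or 503 error your client actually raises.

```python
import random
import time


def invoke_with_backoff(call, prompt, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `call(prompt)`, retrying with exponential backoff on failure."""
    for attempt in range(max_attempts):
        try:
            return call(prompt)
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the last error
            # Wait 1x, 2x, 4x, ... the base delay, plus a little random jitter
            delay = base_delay * 2 ** attempt
            sleep(delay + random.uniform(0, 0.1 * delay))
```

Once the `llm` object from the next cell exists, it could be used as `invoke_with_backoff(llm.invoke, "Which animals live at the north pole?")`.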

In [None]:
from langchain_huggingface import HuggingFaceEndpoint

# The repo id is shown at the top of each HuggingFace model page
hf_model = "mistralai/Mistral-7B-Instruct-v0.3"

llm = HuggingFaceEndpoint(
    repo_id=hf_model,
    # max_new_tokens=512,  # maximum number of tokens to generate per response
    temperature=0.01,
    top_p=0.95,
    repetition_penalty=1.03,
    # huggingfacehub_api_token="your_hf_token",  # alternative to setting the environment variable above
)

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to /root/.cache/huggingface/token
Login successful


Here's a brief overview of some of the parameters:
* **repo_id:** The path to the HuggingFace model that will be used for generating text.
* **max_new_tokens:** The maximum number of tokens that the model should generate in its response.
* **temperature:** Controls the randomness of the model's generation; values between 0 and 1 are typical. A lower temperature results in more predictable, constrained output, while a higher temperature yields more creative and diverse text.
* **top_p:** A value between 0 and 1 that controls the diversity of the model's predictions. At each step the model samples only from the smallest set of most probable tokens whose combined probability reaches top_p. A lower top_p value restricts the model to the most probable tokens, while a higher top_p value lets it explore a wider range of possibilities.
* **repetition_penalty:** Discourages repetitive output by penalising tokens that have already been generated, making the model less likely to use them again. Values above 1 apply a penalty, which helps produce more varied text.
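
To make temperature and top_p more concrete, here is a toy illustration in plain Python. The scores in `logits` are made up, and `apply_temperature` and `top_p_filter` are hypothetical helpers written for this sketch; real inference backends implement sampling differently and more efficiently.

```python
import math


def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most probable tokens until their cumulative mass reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return sorted(kept)


logits = [2.0, 1.0, 0.5, 0.1]  # made-up scores for four candidate tokens
sharp = apply_temperature(logits, 0.1)  # nearly all mass on the top token
flat = apply_temperature(logits, 1.0)   # mass spread across the tokens

print([round(p, 3) for p in sharp])
print([round(p, 3) for p in flat])
# A low top_p keeps fewer candidate tokens than a high one
print(top_p_filter(flat, 0.5), top_p_filter(flat, 0.95))
```

Note that this sketch divides by the temperature, so it does not accept a temperature of exactly 0.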

There are many more parameters you can play with. Check out the [Docs](https://api.python.langchain.com/en/latest/llms/langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint.html).

There are also [usage examples](https://python.langchain.com/v0.2/docs/integrations/llms/huggingface_endpoint/) on LangChain's website.

---
## 3.&nbsp; Asking your LLM questions 🤖
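Play around and note how small changes to a prompt can make a big difference to the answer. Some of the responses below stop mid-sentence, most likely because generation reached the output token limit.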

In [None]:
answer_1 = llm.invoke("Which animals live at the north pole?")
print(answer_1)



The Arctic is home to a variety of animals that have adapted to the cold, harsh environment. Here are some of the most common animals found in the Arctic:

1. Polar Bears: The polar bear is the largest land carnivore and is well-adapted to life in the Arctic. They have a thick layer of body fat and a waterproof coat of fur that helps them stay warm and buoyant in the icy waters.

2. Arctic Fox: The Arctic fox is a small, agile predator that is well-adapted to the cold. They have thick fur that changes color with the seasons, and they are excellent hunters of small mammals and birds.

3. Reindeer: Reindeer, also known as caribou in North America, are large, hoofed mammals that are well-adapted to the Arctic. They have a thick coat of fur and a strong, curved antler that helps them navigate through the snow.

4. Beluga Whale: The beluga whale is a white, toothed whale that lives in the Arctic waters. They are well-adapted to the cold and can dive to depths of up to 1,000 feet.

5. Walr

In [None]:
answer_2 = llm.invoke("Write a poem about animals that live at the north pole.")
print(answer_2)



In the realm where the sun seldom shines,
A world of ice and snowy lines,
Lives a cast of creatures, so divine,
The North Pole's enchanting designs.

Polar bears, white as the snow they roam,
With fur so thick, they're never cold,
Their eyes, like stars, in the icy dome,
Guard the Arctic's ancient hold.

Seals and walruses, in the ocean deep,
Swimming with grace, their tales unfurled,
Their voices echo, a haunting peep,
In this frozen, icy world.

Arctic foxes, with coats so bright,
Dance upon the snowy plains,
Their agility, a sight to ignite,
A spectacle of nature's reigns.

Reindeer, strong and sure on their feet,
Through the snow, they pull Santa's sleigh,
Their antlers, a testament to meet,
The challenges of winter's day.

Penguins, though not native to this land,
Have found a home on the ice floes,
Their waddle and slide, a joyous band,
In the North Pole's icy repose.

Each creature, in its own special way,
Adapts to life in this cold domain,
A testament to survival's sway,
In 

In [None]:
answer_3 = llm.invoke("Explain the central limit theorem like I'm 5 years old.")
print(answer_3)



Alright, let's imagine you have a big bag of candies. Each candy has a different weight, but let's say the average weight of all candies is 10 grams and the weights are spread out around that average.

Now, if you pick 5 candies at random from the bag, the total weight of those 5 candies might not be exactly 50 grams (10 grams per candy). It could be more or less. But if you do this many times, the average total weight of the candies you picked will usually be pretty close to 50 grams.

The central limit theorem is like a rule that says this will happen, no matter what the individual candies weigh. As long as you have a lot of candies and you're picking a large enough group at a time, the average total weight of the candies you pick will be close to the average weight of an individual candy, no matter what that weight is.

So, even if you have candies that weigh 1 gram or 100 grams, if you pick enough of them at a time, the average total weight will still be close to the average weig

The answers from this 7B model may not seem as impressive as those from the latest OpenAI or Google models, but given how much smaller it is, it performs very well. Models of this size may not have the most extensive knowledge base, but for our purposes we only need them to generate coherent English. We'll then supply them with specialised knowledge on a topic of your choice, resulting in a local, specialised model that can function offline.

---
## 4.&nbsp; Challenge 😀
Play around with this, and other, LLMs. Keep a record of your findings:
1. Pose different questions to the model, each subtly different from the last. Observe the resulting outputs. Smaller models tend to be highly sensitive to minor changes in language and grammar.
2. Experiment with the parameters, one at a time, to assess their impact on the output.
3. Attempt to load different models. **Remember**: the free tier appears to be limited to models under roughly 10GB. This means most 7B or 8B models should work, but models closer to 11B or 13B are unlikely to function on the free tier of Endpoints. Explore the [models page on HuggingFace](https://huggingface.co/models): in the left-hand menu, find `Text Generation` under `Natural Language Processing`. When you find a model you like, copy the repo id from the top of the model card and use it to load the endpoint.
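
One possible way to keep a record of your findings is to collect each prompt and answer alongside a label for the settings used. This is only a sketch: `run_experiments` and `to_csv` are hypothetical helpers, and `generate` stands for any function that takes a prompt and returns text, such as `llm.invoke`.

```python
import csv
import io


def run_experiments(generate, prompts, settings_label):
    """Run each prompt through `generate` and collect one record per prompt."""
    records = []
    for prompt in prompts:
        records.append({
            "settings": settings_label,
            "prompt": prompt,
            "answer": generate(prompt).strip(),
        })
    return records


def to_csv(records):
    """Serialise the records to CSV text so they can be saved or compared later."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["settings", "prompt", "answer"])
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

For example, `run_experiments(llm.invoke, ["Which animals live at the north pole?", "Which animals live in the Arctic?"], "temperature=0.01")` would give you two records to compare side by side.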


![](https://drive.google.com/uc?export=view&id=1hm_UJRaelxR1L4WRBfPJZYQyy7OS4Bj4)
