In [1]:
from dotenv import load_dotenv

load_dotenv()

True
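
The later cells rely on a Hugging Face token being present in the environment once `load_dotenv()` has run. A minimal sketch for confirming that without printing the secret is shown below. `describe_token` is a hypothetical helper, and the variable name `HF_TOKEN` is taken from the comments and log messages in this notebook.

```python
import os


def describe_token(env, name="HF_TOKEN"):
    """Return a short status string for a secret without revealing it."""
    value = env.get(name, "")
    if not value:
        return f"{name} is not set"
    # Show only the last four characters so the token never lands in notebook output.
    return f"{name} is set (ends with ...{value[-4:]})"


# os.environ behaves like a dict, so it can be passed in directly.
print(describe_token(os.environ))
```

If the status reads "not set", check that the `.env` file sits in the working directory (or a parent directory) before re-running `load_dotenv()`.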

#### Accessing Hugging Face Models using API

In [3]:
import os
from langchain_huggingface import HuggingFaceEndpoint

repo_id= "mistralai/Mistral-7B-Instruct-v0.2"
# HF_TOKEN is the access token used to reach the model on the Hugging Face Hub; it is read from the .env file.
# To get one, create a Hugging Face account, open Settings, go to the access tokens page, copy a token, and paste it into .env.
# Note: `max_length` is not a declared field of HuggingFaceEndpoint, so it is passed through model_kwargs (see the warning in the output); `max_new_tokens` is the declared field for limiting output length.
llm= HuggingFaceEndpoint(repo_id= repo_id,
                         max_length= 128,
                         temperature= 0.7)

                    max_length was transferred to model_kwargs.
                    Please make sure that max_length is what you intended.
Note: Environment variable`HF_TOKEN` is set and is the current active token independently from the token you've just configured.
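
The warning above reports that `max_length` was moved into `model_kwargs`. One way to avoid it, sketched here, is to pass `max_new_tokens`, which `HuggingFaceEndpoint` declares as a field for capping the generated length. `endpoint_kwargs` and `build_endpoint` are hypothetical helpers, and the import is deferred so the argument-building part can be inspected without the package or a network connection.

```python
def endpoint_kwargs(repo_id, max_new_tokens=128, temperature=0.7):
    """Collect constructor arguments, using max_new_tokens rather than max_length."""
    return {
        "repo_id": repo_id,
        "max_new_tokens": max_new_tokens,
        "temperature": temperature,
    }


def build_endpoint(repo_id):
    """Create the endpoint; requires langchain_huggingface and a valid token."""
    from langchain_huggingface import HuggingFaceEndpoint  # deferred import

    return HuggingFaceEndpoint(**endpoint_kwargs(repo_id))


print(endpoint_kwargs("mistralai/Mistral-7B-Instruct-v0.2"))
```

Calling `build_endpoint(repo_id)` could then stand in for the direct constructor call; how strictly the cap is honored may depend on the serving backend.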


In [4]:
llm.invoke("What is machine learning?")

'\n\nMachine learning is a type of artificial intelligence (AI) that allows computer systems to automatically improve from experience without being explicitly programmed. It focuses on the development of computer programs that can access data and use it to learn for themselves. The process of learning begins with observations or data, such as examples, direct experience, or instruction, in order to look for patterns in data and make better decisions in the future based on the examples that we provide. The primary aim is to allow the computers to learn automatically without human intervention or assistance and adjust actions accordingly.\n\nHow does machine learning work?\n\nMachine learning algorithms build a mathematical model based on input data, and then use that model to make predictions or decisions without being explicitly programmed to perform the task. There are three types of machine learning: supervised learning, unsupervised learning, and reinforcement learning.\n\n1. Superv

#### Now, let's try a different model

In [5]:
repo_id= "mistralai/Mistral-7B-Instruct-v0.3"

llm= HuggingFaceEndpoint(repo_id= repo_id,
                         max_length= 128,
                         temperature= 0.7)

                    max_length was transferred to model_kwargs.
                    Please make sure that max_length is what you intended.
Note: Environment variable`HF_TOKEN` is set and is the current active token independently from the token you've just configured.


In [6]:
llm.invoke("What is machine learning?")

' Machine learning is a subset of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. It focuses on the development of computer programs that can access data and use it to learn for themselves.\n\nMachine learning is a method of data analysis that automates analytical model building. It is a branch of artificial intelligence (AI) based on the idea that systems can learn from data, identify patterns and make decisions with minimal human intervention.\n\nMachine learning algorithms are used to perform a variety of tasks, including:\n\n1. Classification: Classifying data into different categories based on features. For example, classifying emails as spam or not spam.\n2. Regression: Predicting a continuous value based on input features. For example, predicting the price of a house based on features such as its size, location, and number of bedrooms.\n3. Clustering: Grouping similar data poin

#### Let's create a PromptTemplate

In [7]:
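# Import from the specific modules; the top-level `from langchain import ...` form is deprecated.
from langchain_core.prompts import PromptTemplate
from langchain.chains import LLMChain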

template= """ Question: {question}
Answer: Let's think step by step."""

prompt= PromptTemplate(template= template, input_variables= ["question"])
prompt

PromptTemplate(input_variables=['question'], input_types={}, partial_variables={}, template=" Question: {question}\nAnswer: Let's think step by step.")
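
With the default f-string style template, filling in the prompt is essentially Python string formatting. The sketch below reproduces the rendered text with plain `str.format`, so you can see exactly what is sent to the model. It is an illustration only and does not use LangChain itself.

```python
template = """ Question: {question}
Answer: Let's think step by step."""

# Substitute the placeholder the same way the prompt is filled at call time.
rendered = template.format(question="What is machine learning?")
print(rendered)
```

Note that the leading space before "Question" is part of the template and is sent to the model as written.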

In [9]:
llm_chain= LLMChain(llm= llm, prompt= prompt)

question= "How transformer works in LLMS?"
llm_chain.invoke(question)

{'question': 'How transformer works in LLMS?',
 'text': ' In Language Modeling, the goal is to predict the next word in a sequence given the context of the previous words. Transformers are a type of deep learning model that have been very successful in tasks like this.\n\nIn the context of Longformer, a Long Distance Attention Transformer for Text Classification, and Performer, a Linear Attention Transformer for Long Context, the transformer works as follows:\n\n1. Tokenization: The input text is first tokenized into subwords or words. Each token is assigned a unique integer ID.\n\n2. Position Encoding: Since the transformer only considers tokens that are close in the sequence, position encoding is used to provide information about the relative position of each token in the sequence.\n\n3. Attention Mechanism: The transformer uses an attention mechanism to weigh the importance of each token when predicting the next word. For each token, it computes a weighted sum of all other tokens, w

#### Let's create the HuggingFace Pipeline

In [10]:
# Note: this model is downloaded and run locally rather than through the hosted API.
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")
model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")

In [11]:
from langchain_huggingface import HuggingFacePipeline
from transformers import pipeline

pipe = pipeline('text-generation', model=model, tokenizer=tokenizer, max_new_tokens=100)
hf= HuggingFacePipeline(pipeline= pipe)

Device set to use cuda:0


In [12]:
hf.invoke("Langchain is a")

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


"Langchain is a multi-party blockchain library that provides a tool to connect and store assets. A simple view of your assets is now displayed:\n\n\nThe following code demonstrates how you can use the system's library to view your entire portfolio:\n\n\nI can't say I've tried the above system before, but I am confident any time you see those little images in your Twitter feed the right thing in the right place to visualize your portfolio.\n\nWhat I also found was that it would be really useful to"

#### Now let's use the Hugging Face pipeline with a GPU
###### Note: This machine has a GPU, and the pipeline used it by default, which is why the previous cell already ran on `cuda:0`.

In [13]:
# Note: the CUDA version on this machine must be compatible with the one your PyTorch build targets.
# For example, to install a PyTorch build for CUDA 12.1, run this in a terminal:
# pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
import torch

# torch.cuda.is_available() returns True if a GPU is available, False if not
device = 0 if torch.cuda.is_available() else -1  # Use GPU (device=0) if available, otherwise use CPU (device=-1)
device

0
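
The device choice above can also be wrapped so that it falls back to the CPU when PyTorch is not installed at all. This is only a sketch: `pick_device` and `detect_device` are hypothetical names, and the `0` / `-1` convention is the one used in the cell above.

```python
def pick_device(cuda_available: bool) -> int:
    """Map GPU availability to a pipeline device index (0 = first GPU, -1 = CPU)."""
    return 0 if cuda_available else -1


def detect_device() -> int:
    """Return the device index, treating a missing torch install as CPU-only."""
    try:
        import torch
    except ImportError:
        return -1
    return pick_device(torch.cuda.is_available())


print(detect_device())
```

Keeping the mapping in `pick_device` separate from the detection makes it easy to force CPU execution for debugging by calling `pick_device(False)`.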

In [14]:
gpu_llm= HuggingFacePipeline.from_model_id(
    model_id= "gpt2",
    task= "text-generation",
    device=device,  # Pass the selected device (0 for GPU or -1 for CPU)
    pipeline_kwargs= {"max_new_tokens": 100},
)

Device set to use cuda:0


In [15]:
# Build the same kind of chain with the pipe operator (LCEL) instead of LLMChain.

chain= prompt | gpu_llm  # Reuses the prompt and gpu_llm created above.
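
To build intuition for what `prompt | gpu_llm` does, here is a toy illustration of operator-based composition: the left step runs first and its result is passed to the right step. This is not LangChain's implementation; `Step`, `fill_prompt`, and `fake_llm` are hypothetical names, and the "model" is a stand-in that just appends a marker.

```python
class Step:
    """Wrap a function so steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # Build a new step that runs self first, then feeds the result to other.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Stand-in for the prompt template: dict in, formatted string out.
fill_prompt = Step(
    lambda d: f" Question: {d['question']}\nAnswer: Let's think step by step."
)
# Stand-in for the model: string in, string out.
fake_llm = Step(lambda text: text + " [model output would follow]")

toy_chain = fill_prompt | fake_llm
print(toy_chain.invoke({"question": "What is machine learning?"}))
```

The real chain is invoked the same way, with a dictionary whose keys match the template's input variables.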

In [16]:
question= "What is machine learning?"
chain.invoke({"question": question})

" Question: What is machine learning?\nAnswer: Let's think step by step.\nLet's see how machine learning works in three basic areas:\nThe first is machine learning, applied to the input-output pipeline (or the data source in the previous lesson)\nand the other two are machine learning in general, where neural networks are used to learn more information relative to the input dataset\nand neural networks are used to learn more information relative to the dataset Learning about the input data to create a neural network (as we said in my previous example of learning about the data)"

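Several of the outputs above show the model hallucinating. For example, GPT-2 describes LangChain as a multi-party blockchain library, and its answer to the machine learning question is repetitive and incoherent.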
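
The output of `chain.invoke` above also begins with the rendered prompt itself, since the local pipeline returned the prompt together with the continuation. If only the continuation is wanted, a small post-processing step can remove the echoed prefix. The sketch below uses a hypothetical `strip_prompt` helper and a shortened copy of the output shown above.

```python
def strip_prompt(prompt_text, generated):
    """Remove the echoed prompt from the start of the generated text, if present."""
    if generated.startswith(prompt_text):
        return generated[len(prompt_text):].lstrip()
    return generated


prompt_text = " Question: What is machine learning?\nAnswer: Let's think step by step."
generated = prompt_text + "\nLet's see how machine learning works in three basic areas:"
print(strip_prompt(prompt_text, generated))
```

Text that does not start with the prompt is returned unchanged, so the helper is safe to apply to responses that contain only the continuation.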