# Using AutoModel

In [None]:
# !pip install transformers

In [None]:
# Import necessary libraries
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import warnings
warnings.filterwarnings("ignore")

In [None]:
# Set the random seed for reproducibility
torch.random.manual_seed(0)

<torch._C.Generator at 0x785ba0783db0>

In [None]:
# Load the pre-trained Phi-3 Mini instruct model from the Hugging Face Hub
# AutoModelForCausalLM attaches a language-modeling head for next-token generation
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    device_map="cuda",       # Comment out this line to load the model on CPU
    torch_dtype="auto",      # Use the data type the checkpoint was saved with
    trust_remote_code=True,  # Allow running custom model code shipped in the repository
)

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
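
The load call above hardcodes `device_map="cuda"`, which fails on a machine without a GPU. A minimal sketch of choosing the device at runtime follows; `pick_device` is a hypothetical helper introduced here for illustration, not part of the original notebook.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to place on a GPU
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

The returned string could then be passed as `device_map=device` in the `from_pretrained` call.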

In [None]:
# Load the tokenizer that matches the model
# The tokenizer converts text into the token IDs the model expects
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

In [None]:
# Define the conversation as a list of dictionaries, one per turn, each with a role and its content
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Can you provide ways to eat combinations of bananas and dragonfruits?"},
    {"role": "assistant", "content": "Sure! Here are some ways to eat bananas and dragonfruits together: 1. Banana and dragonfruit smoothie: Blend bananas and dragonfruits together with some milk and honey. 2. Banana and dragonfruit salad: Mix sliced bananas and dragonfruits together with some lemon juice and honey."},
    {"role": "user", "content": "What about solving a 2x + 3 = 7 equation?"},
]
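
Before generation, role-tagged messages like these are flattened into a single prompt string by the model's chat template. The sketch below illustrates the idea in plain Python. `render_chat` is a hypothetical helper, and the `<|role|>` and `<|end|>` markers are assumptions modeled on the Phi-3 instruct format. The template shipped with the tokenizer, applied through `tokenizer.apply_chat_template`, remains the authoritative source.

```python
def render_chat(messages, add_generation_prompt=True):
    """Flatten role-tagged messages into one prompt string (illustrative only)."""
    parts = []
    for message in messages:
        # Each turn is wrapped as: <|role|>\ncontent<|end|>\n (assumed markers)
        parts.append(f"<|{message['role']}|>\n{message['content']}<|end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply
        parts.append("<|assistant|>\n")
    return "".join(parts)


demo_messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "What about solving a 2x + 3 = 7 equation?"},
]
prompt = render_chat(demo_messages)
print(prompt)
```

Ending the prompt with an open assistant turn is what makes the model answer the last user message rather than continue it.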

In [None]:
# Tokenize the last user message to inspect its token IDs and attention mask
tokenizer("What about solving a 2x + 3 = 7 equation?")

{'input_ids': [1724, 1048, 17069, 263, 29871, 29906, 29916, 718, 29871, 29941, 353, 29871, 29955, 6306, 29973], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}
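
In the output above every `attention_mask` entry is 1 because a single sequence needs no padding. The mask becomes informative when sequences of different lengths are batched together. The toy sketch below shows this with a whitespace "tokenizer"; `toy_encode` is hypothetical, uses made-up IDs rather than the Phi-3 vocabulary, and assumes right-padding, which real tokenizers may or may not use.

```python
def toy_encode(texts, pad_id=0):
    """Whitespace 'tokenizer' returning the same keys as a real one (toy example)."""
    vocab = {}
    encoded = []
    for text in texts:
        ids = []
        for word in text.split():
            # Give each unseen word the next free ID; 0 is reserved for padding
            ids.append(vocab.setdefault(word, len(vocab) + 1))
        encoded.append(ids)
    longest = max(len(ids) for ids in encoded)
    return {
        # Pad shorter sequences on the right so all rows share one length
        "input_ids": [ids + [pad_id] * (longest - len(ids)) for ids in encoded],
        # 1 marks a real token, 0 marks padding the model should ignore
        "attention_mask": [[1] * len(ids) + [0] * (longest - len(ids)) for ids in encoded],
    }


batch = toy_encode(["solve the equation", "solve it"])
print(batch["input_ids"])       # [[1, 2, 3], [1, 4, 0]]
print(batch["attention_mask"])  # [[1, 1, 1], [1, 1, 0]]
```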

In [None]:
# Create a text-generation pipeline using the loaded model and tokenizer
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)


In [None]:
# Define generation arguments that control output length and decoding
generation_args = {
    "max_new_tokens": 500,        # Upper bound on the number of generated tokens
    "return_full_text": False,    # Return only the newly generated text, not the prompt
    "temperature": 0.0,           # Only used when sampling; has no effect with do_sample=False
    "do_sample": False,           # Greedy decoding, so the output is deterministic
}
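
To see why `temperature` plays no role once `do_sample` is `False`, it may help to compare the two decoding rules on a toy score vector. This is a sketch in plain Python, not the library's implementation; `softmax`, `greedy_pick`, `sample_pick`, and `toy_logits` are names assumed for illustration.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens the peak."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(logits):
    """Greedy decoding: always take the highest-scoring token; temperature is never consulted."""
    return max(range(len(logits)), key=lambda i: logits[i])


def sample_pick(logits, temperature, rng):
    """Sampling: draw a token index at random according to the softmax probabilities."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


toy_logits = [2.0, 1.0, 0.5]
rng = random.Random(0)
print(greedy_pick(toy_logits))  # 0, on every call
print([sample_pick(toy_logits, 1.0, rng) for _ in range(10)])
```

Greedy selection returns the same index every time, while the sampled indices vary with the random state and with the temperature.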

In [None]:
# Generate a reply to the last user message; the pipeline applies the chat template and tokenizes the messages
output = pipe(messages, **generation_args)
print(output[0]['generated_text'])

 To solve the equation 2x + 3 = 7, follow these steps:

1. Subtract 3 from both sides of the equation: 2x + 3 - 3 = 7 - 3
2. Simplify: 2x = 4
3. Divide both sides by 2: 2x/2 = 4/2
4. Simplify: x = 2

The solution to the equation 2x + 3 = 7 is x = 2.


In [None]:
# Replace the last user query (index 3) with a new one and generate a new response
messages[3]['content'] = "What is the capital of India?"
output = pipe(messages, **generation_args)
print(output[0]['generated_text'])

 The capital of India is New Delhi.
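
Overwriting `messages[3]` swaps the question but discards the previous exchange. For a genuine multi-turn chat, one option is to append the model's reply and the next question to the history. The sketch below assumes that approach; `add_turn` is a hypothetical helper, and the follow-up question is made up for illustration.

```python
def add_turn(history, assistant_reply, next_question):
    """Return a new message list extended with the reply and the next user question."""
    return history + [
        # Strip the leading space the model tends to emit before its answer
        {"role": "assistant", "content": assistant_reply.strip()},
        {"role": "user", "content": next_question},
    ]


history = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "What is the capital of India?"},
]
history = add_turn(history, " The capital of India is New Delhi.", "What is its population?")
print([m["role"] for m in history])  # ['system', 'user', 'assistant', 'user']
```

The extended list can be passed to `pipe` in the same way as `messages`.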


# Using pipeline

In [None]:
# Build a text2text-generation pipeline for Hindi-to-English translation
# Passing a model name lets pipeline() download both the model and its tokenizer
translation_pipe = pipeline("text2text-generation", model="snehalyelmati/mt5-hindi-to-english")

Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


In [None]:
# Translate sentences from Hindi to English
print(translation_pipe("आप कैसे है?")[0]['generated_text'])
print(translation_pipe("मैं कल तुमसे मिलने आऊंगा")[0]['generated_text'])

How are you?
I'll be coming to see you tomorrow.
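
The two `print` calls above repeat the same indexing pattern, which can be wrapped in a small helper. This is a sketch: `translate_all` is hypothetical, and `fake_pipe` is a stand-in that mimics the pipeline's return shape so the example runs without downloading a model.

```python
def translate_all(pipe, sentences):
    """Call the pipeline once per sentence and collect the generated strings."""
    return [pipe(sentence)[0]["generated_text"] for sentence in sentences]


def fake_pipe(text):
    # Stand-in for translation_pipe: same return shape, no real translation
    return [{"generated_text": f"[en] {text}"}]


results = translate_all(fake_pipe, ["आप कैसे है?", "मैं कल तुमसे मिलने आऊंगा"])
for line in results:
    print(line)
```

With the real pipeline, `translate_all(translation_pipe, sentences)` would return the English strings in input order.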
