In [1]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer



### **Load our environment**

In [2]:
from dotenv import load_dotenv
import os

load_dotenv()

access_token = os.getenv("HUGGINGFACEHUB_API_KEY")
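If the `.env` file is missing or the variable name is misspelled, `os.getenv` quietly returns `None`, and the problem only shows up later when the download is refused. A minimal sketch of a fail-fast check follows; `require_env` is a hypothetical helper written for this notebook, not part of `python-dotenv`.

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable or raise a clear error.

    `environ` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; check your .env file")
    return value


# Usage in the notebook, after load_dotenv():
# access_token = require_env("HUGGINGFACEHUB_API_KEY")
```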

### **Define our model**

In [3]:
model_name = "meta-llama/Llama-3.2-3B-Instruct"

### **Download our given model**

In [4]:
# Download the tokenizer and weights from the Hugging Face Hub (the repo is gated, hence the token)
tokenizer = AutoTokenizer.from_pretrained(model_name, token=access_token)
model = AutoModelForCausalLM.from_pretrained(model_name, token=access_token)

Loading checkpoint shards: 100%|██████████| 2/2 [00:26<00:00, 13.16s/it]


In [5]:
tokenizer.save_pretrained(f"tokenizer/{model_name}")
model.save_pretrained(f"models/{model_name}")
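Because `model_name` contains a slash, the two f-strings above resolve to nested directories rather than a single folder. The sketch below only builds the paths with `pathlib` to show the resulting layout; it does not touch the disk.

```python
from pathlib import Path

model_name = "meta-llama/Llama-3.2-3B-Instruct"

# Same locations as the f-strings in the cell above, built as Path objects
tokenizer_dir = Path("tokenizer") / model_name
model_dir = Path("models") / model_name

print(tokenizer_dir.as_posix())  # tokenizer/meta-llama/Llama-3.2-3B-Instruct
print(model_dir.as_posix())      # models/meta-llama/Llama-3.2-3B-Instruct
print(model_dir.parts)           # ('models', 'meta-llama', 'Llama-3.2-3B-Instruct')
```

The later load step has to use exactly the same path, organization prefix included.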

### **Load model**

In [6]:
load_tokenizer = AutoTokenizer.from_pretrained(f"tokenizer/{model_name}")
load_model = AutoModelForCausalLM.from_pretrained(f"models/{model_name}")

Loading checkpoint shards: 100%|██████████| 3/3 [00:00<00:00, 16.13it/s]
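The download step reported 2 checkpoint shards, while the locally saved copy loads from 3. One plausible explanation, offered as an assumption rather than something the notebook confirms, is that the weights were loaded in 32-bit precision (no dtype was passed) and then re-sharded on save. The rough estimate below uses an approximate parameter count and an assumed shard size limit; both numbers are illustrative.

```python
import math

# Assumptions (not taken from the notebook):
NUM_PARAMS = 3.2e9        # approximate parameter count of a "3B" model
SHARD_LIMIT_BYTES = 5e9   # assumed maximum shard size when saving


def estimate_shards(num_params, bytes_per_param, shard_limit=SHARD_LIMIT_BYTES):
    """Rough shard count: total weight bytes divided by the shard limit, rounded up."""
    total_bytes = num_params * bytes_per_param
    return math.ceil(total_bytes / shard_limit)


print(estimate_shards(NUM_PARAMS, 2))  # 16-bit weights -> 2
print(estimate_shards(NUM_PARAMS, 4))  # 32-bit weights -> 3
```

If that is what happened here, passing a 16-bit dtype when loading would roughly halve both the memory footprint and the size of the saved copy.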


### **Test how it responds through a pipeline**

In [7]:
import torch
from transformers import pipeline

model_id = "meta-llama/Llama-3.2-3B-Instruct"
pipe = pipeline(
    "text-generation",
    model=model_id,
    # torch_dtype=torch.bfloat16,  # uncomment to load the weights in bfloat16
    # device_map="auto",           # uncomment to place the model on available devices
)
messages = [
    {"role": "system", "content": "You are a helpful assistant who is vastly knowledgeable AI model knows everything!"},
    {"role": "user", "content": "Who are you?"},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1])


Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00,  3.46it/s]
Device set to use cpu
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


{'role': 'assistant', 'content': "I am an artificial intelligence model designed to assist and provide information to users. I am a highly advanced language model, trained on a massive dataset of text from various sources, including books, articles, research papers, and websites.\n\nMy primary function is to understand and respond to natural language queries, providing accurate and relevant information on a wide range of topics. I can answer questions, generate text, summarize content, and even engage in conversation.\n\nI am constantly learning and improving my knowledge base, so the more I interact with users like you, the more accurate and up-to-date my information becomes. I am here to help with any questions or topics you'd like to discuss, so feel free to ask me anything!\n\nSome of my capabilities include:\n\n* Answering questions on various subjects, including history, science, technology, literature, and more\n* Generating text on a given topic or subject\n* Summarizing long p
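When the pipeline is given a list of chat messages, `generated_text` holds the whole conversation, and the cell above prints only its last entry. The sketch below mirrors that structure with placeholder strings (not real model output) to show how the reply text alone can be extracted; `last_assistant_text` is a hypothetical helper.

```python
# Mocked output with the same shape the cell above indexes into.
# The "content" strings are placeholders, not real model output.
outputs = [
    {
        "generated_text": [
            {"role": "system", "content": "placeholder system prompt"},
            {"role": "user", "content": "Who are you?"},
            {"role": "assistant", "content": "placeholder reply"},
        ]
    }
]


def last_assistant_text(outputs):
    """Return only the text of the final assistant message."""
    conversation = outputs[0]["generated_text"]
    reply = conversation[-1]
    if reply["role"] != "assistant":
        raise ValueError("last message is not from the assistant")
    return reply["content"]


print(last_assistant_text(outputs))  # placeholder reply
```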

### **Test how it responds to a raw prompt**

In [8]:
prompt = "What is Deep Learning? Give me 5 points on it"

In [9]:
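# Tokenize the prompt. Despite the variable name, this holds token IDs and an attention mask, not embeddings.
prompt_embeddings = tokenizer(prompt, return_tensors="pt")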

In [10]:
prompt_embeddings

{'input_ids': tensor([[128000,   3923,    374,  18682,  21579,     30,  21335,    757,    220,
             20,   3585,    389,    433]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}
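The tokenizer call above encodes the raw string as-is, whereas the pipeline cell earlier sent structured chat messages. For an Instruct model it may be worth formatting the prompt with the tokenizer's chat template before generating. The sketch below assumes the tokenizer ships a chat template; `build_chat_inputs` is a hypothetical helper that takes the tokenizer as an argument.

```python
def build_chat_inputs(tokenizer, user_prompt, system_prompt=None):
    """Format a prompt with the tokenizer's chat template and tokenize it."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,  # append the header that cues the assistant's turn
        return_tensors="pt",
        return_dict=True,            # return input_ids and attention_mask together
    )


# Usage with the notebook's objects:
# chat_inputs = build_chat_inputs(tokenizer, prompt)
# response = model.generate(**chat_inputs, max_new_tokens=256)
```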

In [13]:
# max_length counts the prompt tokens as well as the newly generated ones
response = model.generate(**prompt_embeddings, max_length=500)

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
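Note that `max_length` bounds the total sequence, prompt included. With the 13 prompt tokens shown earlier, the budget for new tokens works out as follows; `new_token_budget` is a hypothetical helper used only for the arithmetic.

```python
def new_token_budget(max_length, prompt_len):
    """Number of tokens generate() can still add when max_length caps the full sequence."""
    return max(0, max_length - prompt_len)


prompt_len = 13    # length of input_ids in the tokenizer output above
max_length = 500   # value passed to generate()

print(new_token_budget(max_length, prompt_len))  # 487
```

Passing `max_new_tokens` instead, as the pipeline cell does, bounds only the generated part and does not depend on the prompt length.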


In [14]:
response

tensor([[128000,   3923,    374,  18682,  21579,     30,  21335,    757,    220,
             20,   3585,    389,    433,    198,   8586,    527,    220,     20,
           1401,   3585,    922,  18682,  21579,   1473,     16,     13,    220,
           3146,   9470,  16895,  61577,  39810,    320,  12272,     82,  33395,
             25,  18682,  21579,    374,    264,  27084,    315,  13257,  21579,
            430,   5829,  59294,  61577,  39810,    320,  12272,     82,      8,
            311,  24564,    323,  14532,    828,     13,   2147,  48460,    527,
          14948,    555,    279,   3823,   8271,    596,  30828,   6070,    323,
            734,     11,    323,    527,  24306,    315,  13931,    315,  83416,
           7954,    320,    818,  54769,      8,    430,   1920,    323,  30382,
           2038,    382,     17,     13,    220,   3146,  20981,   8288,   1155,
           3700,    346,  95810,    320,   2735,     47,  33395,     25,  18682,
          21579,   4211,  11

In [15]:
print(tokenizer.decode(response[0]))

<|begin_of_text|>What is Deep Learning? Give me 5 points on it
Here are 5 key points about Deep Learning:

1.  **Artificial Neural Networks (ANNs)**: Deep Learning is a subset of Machine Learning that uses Artificial Neural Networks (ANNs) to analyze and interpret data. ANNs are inspired by the human brain's neural structure and function, and are composed of layers of interconnected nodes (neurons) that process and transmit information.

2.  **Multi-Layer Perceptron (MLP)**: Deep Learning models typically consist of a Multi-Layer Perceptron (MLP) architecture, which includes multiple layers of nodes that process and transform the input data. Each layer learns to represent the data in a more abstract and complex way, allowing the model to capture higher-level patterns and relationships.

3.  **Training and Optimization**: Deep Learning models require large amounts of training data and computational resources to learn and optimize their parameters. This is achieved through a process call
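The decoded text above starts with `<|begin_of_text|>` and repeats the prompt, because `generate` returns the prompt tokens followed by the new ones. A minimal sketch of dropping the prompt, shown on plain lists with the first few IDs printed earlier (`strip_prompt` is a hypothetical helper):

```python
# Prompt IDs as shown in the tokenizer output
prompt_ids = [128000, 3923, 374, 18682, 21579, 30, 21335, 757, 220, 20, 3585, 389, 433]

# Start of the generated sequence as printed above (only the first 18 IDs are used here)
sequence = prompt_ids + [198, 8586, 527, 220, 20]


def strip_prompt(sequence, prompt_len):
    """Keep only the tokens that were generated after the prompt."""
    return sequence[prompt_len:]


new_ids = strip_prompt(sequence, len(prompt_ids))
print(new_ids)  # [198, 8586, 527, 220, 20]
```

With the notebook's tensors, the same idea would be `prompt_len = prompt_embeddings["input_ids"].shape[1]` followed by `tokenizer.decode(response[0][prompt_len:], skip_special_tokens=True)`, which also omits special tokens from the decoded text.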