<a href="https://colab.research.google.com/github/JSJeong-me/AI-Innovation-2024/blob/main/Transformer/5-1-Pretrained-Model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## 1. Install dependencies and fix seed

Welcome to Lesson 1!

If you would like to access the `requirements.txt` file for this course, go to `File` and click on `Open`.

In [1]:
# Install any packages if it does not exist
!pip install transformers torch



In [2]:
# Ignore insignificant warnings (ex: deprecations)
import warnings
warnings.filterwarnings('ignore')

In [None]:
# Set a seed for reproducibility
from transformers import AutoTokenizer, AutoModelForCausalLM

# 1. Tokenizer와 모델 불러오기
model_name = "yanolja/EEVE-Korean-Instruct-10.8B-v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

In [4]:
# 2. 프롬프트 정의
prompt = "한국의 전통 음식은 무엇인가요?"

# 3. 입력 텍스트를 토큰화
inputs = tokenizer(prompt, return_tensors="pt")

# 4. 모델 예측 실행
output = model.generate(**inputs, max_length=100, do_sample=True, top_p=0.9, temperature=0.7)

# 5. 결과 디코딩
response = tokenizer.decode(output[0], skip_special_tokens=True)

print("모델의 응답:", response)

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


모델의 응답: 한국의 전통 음식은 무엇인가요? - 2022
한국의 전통 음식은 무엇인가요?
한국 전통 음식은 쌀, 채소, 해산물, 육류와 같은 다양한 재료를 사용합니다. 가장 인기 있는 한국 요리에는 비빔밥, 불고기, 김치, 그리고 떡이 있습니다.
비빔밥은 볶은 고기, 채소, 그리고 쌀을 매콤한 고추장 소스로 섞어 만든 한국의 볶음밥


In [3]:
# Hugging Face 토큰 설정 (필요 시 사용)
from google.colab import userdata
from huggingface_hub import login

# 비밀 정보에서 Hugging Face 토큰 가져오기
huggingface_token = userdata.get('HF_TOKEN')

In [4]:
import torch
from transformers import pipeline

In [5]:


model_id = "meta-llama/Llama-3.2-1B"

In [6]:


pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

config.json:   0%|          | 0.00/843 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/2.47G [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/185 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/50.5k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/9.09M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/301 [00:00<?, ?B/s]

In [7]:


pipe("The key to life is")


Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


[{'generated_text': 'The key to life is to accept the fact that the world is a place of change. The'}]

In [8]:
pipe("Why the sky is blue?")

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


[{'generated_text': 'Why the sky is blue? Why the sun is so hot? Why the water is so cold'}]

In [9]:
pipe("What is the capital of Korea?")

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


[{'generated_text': 'What is the capital of Korea??\nA. Seoul\nB. Houston\nC. Charlotte'}]

In [10]:
pipe("What is the capital of North Korea?")

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


[{'generated_text': 'What is the capital of North Korea?.\nWhat is the time zone in North Korea?.\n'}]

In [12]:
pipe("한국의 전통 음식은 무엇인가요?")

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


[{'generated_text': '한국의 전통 음식은 무엇인가요? (What is Korean traditional food?)\n한국'}]

In [14]:
pipe("나는 김치를")

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


[{'generated_text': '나는 김치를 먹고 있다. 나는 김치에 살을 넣고 있다. 나는 살'}]

In [None]:
import torch
from transformers import pipeline

model_id = "meta-llama/Llama-3.2-1B-Instruct"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

In [17]:

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1])


Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


{'role': 'assistant', 'content': "I'm an artificial intelligence assistant, which means I'm a computer program designed to simulate conversations and answer questions to the best of my knowledge. I'm here to provide information, answer your queries, and help with any topics you'd like to discuss.\n\nI'm a large language model, which means I've been trained on a massive dataset of text from various sources, including books, articles, and websites. This training allows me to understand and generate human-like language, making me a versatile and helpful assistant.\n\nI can assist with a wide range of tasks, such as:\n\n* Answering questions on various topics, from science and history to entertainment and culture\n* Providing definitions and explanations of words and phrases\n* Offering suggestions and ideas for creative projects, like writing, art, or music\n* Helping with language translation and grammar correction\n* And much more!\n\nFeel free to ask me anything, and I'll do my best to

In [20]:
messages = [
    {"role": "system", "content": "You are a helpful assistant.You an expert to write the Python code."},
    {"role": "user", "content": "What is the capital of Korea?"},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1])




Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


{'role': 'assistant', 'content': 'The capital of Korea is Seoul.'}


## google/gemma-2b Generate text samples

https://huggingface.co/google/gemma-2b


In [None]:
!pip install transformers
!pip install torch
!pip install sentencepiece huggingface_hub # LLaMA 계열 모델에 필요

In [2]:
# Hugging Face 토큰 설정 (필요 시 사용)
from google.colab import userdata
from huggingface_hub import login

# 비밀 정보에서 Hugging Face 토큰 가져오기
huggingface_token = userdata.get('HF_TOKEN')

In [3]:
from transformers import AutoTokenizer, AutoModelForCausalLM

In [None]:


tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")

In [10]:


# input_text = "Write me a poem about Machine Learning."
input_text = "what time is it now?" # I am an engineer. I love

input_ids = tokenizer(input_text, return_tensors="pt")

outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))


<bos>what time is it now?

what time is it now?

what time is it now


In [None]:


# 프롬프트 설정

input_tex = "I am an engineer. I love"



In [None]:
input_ids = tokenizer(input_text, return_tensors="pt")

outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))

<bos>Write me a poem about Machine Learning.

I’m not sure what you mean by “


## 4. Generate Python samples with pretrained general model

Use the model to write a python function called `find_max()` that finds the maximum value in a list of numbers:

In [13]:
input_text =  "def find_max(numbers):"  # The find_max function returns the largest number from a list.

In [18]:
input_text ="The function find_max takes a list of numbers as input and returns the largest (maximum) number from the list."

In [19]:
input_ids = tokenizer(input_text, return_tensors="pt")

outputs = model.generate(**input_ids, max_length=256)
print(tokenizer.decode(outputs[0]))

<bos>The function find_max takes a list of numbers as input and returns the largest (maximum) number from the list. Write a function find_max that takes a list of numbers as input and returns the largest (maximum) number from the list.

A 100-W lightbulb generates $95 \mathrm{~W}$ of heat, which is dissipated through a glass bulb that has a radius of $3.0 \mathrm{~cm}$ and is $0.50 \mathrm{~mm}$ thick. What is the difference in temperature between the inner and outer surfaces of the glass?

A 100-turn, 2.0-cm-diameter coil is at rest with its axis vertical. A uniform magnetic field $60^{\circ}$ away from vertical increases from 0.50 T to 1.50 T in 0.60 s. What is the induced emf in the coil?

A 100-W lightbulb is plugged into a standard $120-\mathrm{V}$ (rms) outlet. Find $(a) I_{\text {mas }}$ and $(b) I_{\max }$ when a device like this is operating at the maximum current allowed by its


In [16]:
def find_max(numbers):
    max_number = numbers[0]
    for number in numbers:
        if number > max_number:
            max_number = number
    return max_number

In [17]:
numbers = [1, 2, 3, 4, 5]
result = find_max(numbers)
print(result)

5


## Running the model on a single / multi GPU


In [None]:
!pip install transformers
!pip install torch
!pip install accelerate

In [None]:

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b", device_map="auto")

input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))


## google/gemma-2-2b-it

In [None]:
!pip install -U transformers
!pip install -U accelerate
!pip install -U bitsandbytes

In [None]:
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="google/gemma-2-2b-it",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda",  # replace with "mps" to run on a Mac device
)

messages = [
    {"role": "user", "content": "Who are you? Please, answer in pirate-speak."},
]

outputs = pipe(messages, max_new_tokens=256)
assistant_response = outputs[0]["generated_text"][-1]["content"].strip()
print(assistant_response)
# Ahoy, matey! I be Gemma, a digital scallywag, a language-slingin' parrot of the digital seas. I be here to help ye with yer wordy woes, answer yer questions, and spin ye yarns of the digital world.  So, what be yer pleasure, eh? 🦜


## prompt =  "def find_max(numbers):"

In [None]:
messages = [
    {"role": "user", "content": "def find_max(numbers):"},
]

outputs = pipe(messages, max_new_tokens=256)
assistant_response = outputs[0]["generated_text"][-1]["content"].strip()
print(assistant_response)

```python
def find_max(numbers):
  """
  This function finds the maximum number in a list.

  Args:
    numbers: A list of numbers.

  Returns:
    The maximum number in the list.
  """
  if not numbers:
    return None  # Handle empty list case
  max_number = numbers[0]  # Initialize with the first element
  for number in numbers:
    if number > max_number:
      max_number = number
  return max_number
```

**Explanation:**

1. **Function Definition:**
   - `def find_max(numbers):` defines a function named `find_max` that takes a list of numbers (`numbers`) as input.

2. **Empty List Handling:**
   - `if not numbers:` checks if the list is empty. If it is, the function returns `None` to avoid errors.

3. **Initialization:**
   - `max_number = numbers[0]` sets the initial value of `max_number` to the first element of the list.

4. **Iteration and Comparison:**
   - `for number in numbers:` iterates through each


## 6. Generate Python samples with pretrained Python model

Here you'll use a version of TinySolar-248m-4k that has been further pretrained (a process called **continued pretraining**) on a large selection of python code samples. You can find the model on Hugging Face at [this link](https://huggingface.co/upstage/TinySolar-248m-4k-py).

You'll follow the same steps as above to load the model and use it to generate text.

In [None]:
model_path_or_name = ""

In [None]:
prompt = "def find_max(numbers):"

inputs = tiny_custom_tokenizer(
    prompt, return_tensors="pt"
).to(tiny_custom_model.device)

streamer = TextStreamer(
    tiny_custom_tokenizer,
    skip_prompt=True,
    skip_special_tokens=True
)

outputs = tiny_custom_model.generate(
    **inputs, streamer=streamer,
    use_cache=True,
    max_new_tokens=128,
    do_sample=False,
    repetition_penalty=1.1
)

Try running the python code the model generated above:

In [None]:
def find_max(numbers):
   max = 0
   for num in numbers:
       if num > max:
           max = num
   return max

In [None]:
find_max([1,3,5,1,6,7,2])