<a href="https://colab.research.google.com/github/cesarade/IA-Generativa-ChatGPT-Midjourney/blob/main/Base_LLM_vs_Fine_Tuned_LLM.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Base LLMs vs Fine-tuned LLMs

## Case statement

In this case study, the student is asked to implement a pre-trained base model (T5 is recommended) and compare it with the same model after fine-tuning (Flan-T5 is recommended).

# Solution to the case study

## 0. Installing external libraries

In [1]:
!pip install transformers
!pip install sentencepiece
!pip install accelerate



# 1. Selecting a pre-trained base LLM

As we saw in earlier sections, there is a wide variety of base LLMs we can use. This case study uses T5-base: [https://huggingface.co/t5-base](https://huggingface.co/t5-base)

This LLM has 220 million parameters and was pre-trained on a large number of datasets: [https://huggingface.co/google-t5/t5-base#training-details](https://huggingface.co/google-t5/t5-base#training-details)
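To see where a figure of roughly 220 million comes from, the count can be reproduced by hand. The sketch below assumes the hyperparameters of the published t5-base configuration (hidden size 768, feed-forward size 3072, 12 layers per stack, 12 heads of size 64, vocabulary of 32128, 32 relative-attention buckets) and the usual T5 layout with no bias terms and tied input/output embeddings. The variable names are ours, not the library's.

```python
# Hyperparameters assumed from the published t5-base configuration
d_model = 768        # hidden size
d_ff = 3072          # feed-forward inner size
n_heads, d_kv = 12, 64
n_layers = 12        # per stack (encoder and decoder)
vocab_size = 32128
rel_buckets = 32     # relative attention buckets

inner = n_heads * d_kv
attention = 4 * d_model * inner      # q, k, v, o projections, no biases
feed_forward = 2 * d_model * d_ff    # two linear maps, no biases
layer_norm = d_model                 # scale vector only

encoder_layer = attention + feed_forward + 2 * layer_norm
decoder_layer = 2 * attention + feed_forward + 3 * layer_norm  # self + cross attention
rel_bias = rel_buckets * n_heads     # one bias table per stack
final_norm = layer_norm

embeddings = vocab_size * d_model    # shared by encoder, decoder, and output head
encoder = n_layers * encoder_layer + rel_bias + final_norm
decoder = n_layers * decoder_layer + rel_bias + final_norm
total = embeddings + encoder + decoder

print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

Under these assumptions the total comes to about 223 million, which is consistent with the rounded 220 million quoted above.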


## Loading the model and tokenizer

In [2]:
import accelerate
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizador_T5 = T5Tokenizer.from_pretrained('t5-base')

model_T5 = T5ForConditionalGeneration.from_pretrained('t5-base', device_map = 'auto')


In [51]:
prompt = 'My name'

In [44]:
text = """The Second World War (also written World War II)1 was a global military \
conflict that took place between 1939 and 1945. It involved most of the world's \
nations - including all the major powers, as well as virtually all European nations \
- grouped into two opposing military alliances: the Allies on the one hand, and the \
Axis Powers on the other. It was the greatest war in history, with more than 100 \
million military personnel mobilized and a state of total war in which the major \
contenders devoted all their economic, military and scientific capabilities to the \
service of the war effort, blurring the distinction between civilian and military \
resources."""

prompt = f"Summarize: {text}"

In [53]:
prompt = "What do you think of Mars?"

In [6]:
prompt = "Translate to Spanish: 'How are you?'"

<pad> <extra_id_0> : 'How are you?' <extra_id_1> : 'How are you?' 'How are you?' 'How are you?' 'How are you?' 'How are you?' 'How are you?' 'How are you?' 'How are you?' 'How are you?' 'How are you?' 'How are you?' 'How are you?' 'How are
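One likely reason the base model only echoes the input is that T5 was pre-trained with a small set of fixed task prefixes rather than free-form instructions. The sketch below assumes the prefixes listed in the published t5-base configuration (summarization and translation from English to German, French, and Romanian); Spanish is not among them. The dictionary and the `with_prefix` helper are hypothetical names introduced for illustration.

```python
# Task prefixes assumed from the t5-base configuration (task_specific_params)
T5_PREFIXES = {
    "summarization": "summarize: ",
    "translation_en_to_de": "translate English to German: ",
    "translation_en_to_fr": "translate English to French: ",
    "translation_en_to_ro": "translate English to Romanian: ",
}


def with_prefix(task, text):
    """Prepend the prefix the base model saw during training for `task`."""
    return T5_PREFIXES[task] + text


prompt = with_prefix("translation_en_to_de", "How are you?")
print(prompt)
```

A prompt built this way can be passed to the same generation cell used throughout this section; the comparison with free-form instructions is exactly what the fine-tuned model is meant to improve.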


In [7]:
review = """Love these plugs, have a few now. We use them to plug in lights and \
set timers to turn them on and off via a phone app. Easy to use and linked to \
the internet and apps. Good value for money."""

prompt = f"Sentiment? Review: {review}"

<pad> True</s>


In [8]:
review1 = """Love these plugs, have a few now. We use them to plug in lights and \
set timers to turn them on and off via a phone app. Easy to use and linked to \
the internet and apps. Good value for money."""

review2 = """Tried and tried but could never get them to work right. Too bad \
I'm past my return date or they would have gone back."""

review3 = """A well-sized, reliable smart plug. The app is easy to use and set \
up, and works well. I used them to make several lamps. Everything works fine - \
no problems."""

review4 = """Great little product. Super easy to set up. Didn't even need to use \
the Alexa app to do so. Did it with my echo. Now I use it almost daily to turn on \
a light that was a pain to get to."""

review5 = """If I could give this zero stars I would. Plug wouldn’t connect. I \
had to keep connecting it and finally just gave up and returned it. Customer service \
was a complete waste of time."""

prompt = f"""
Review: {review1}
Sentiment: Positive

Review: {review2}
Sentiment: Negative

Review: {review3}
Sentiment: Positive

Review: {review5}
Sentiment:"""

<pad> <extra_id_0> easy to set up and use. <extra_id_1>. <extra_id_2> Positive Review: I'm past my return date or they would have gone back. <extra_id_3> : If I could give this zero stars I would. <extra_id_4> Positive Review: <extra_id_5> negative review: <extra_id_6> Positive Review: Would recommend. <extra_id_7> Positive Review: <extra_id_8> I <extra_id_9> Positive Review: Great product. Great value for money.</s>
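The few-shot prompt above is written by hand. A minimal sketch of building the same layout from a list of labelled examples follows; `build_few_shot_prompt` is a hypothetical helper, and the example reviews are shortened placeholders rather than the full texts used in the cell.

```python
def build_few_shot_prompt(examples, query):
    """Format labelled (review, sentiment) pairs followed by an unlabelled query."""
    blocks = [f"Review: {review}\nSentiment: {label}" for review, label in examples]
    # The final block leaves the label empty so the model has to complete it
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(blocks)


examples = [
    ("Love these plugs, good value for money.", "Positive"),
    ("Could never get them to work right.", "Negative"),
]
few_shot_prompt = build_few_shot_prompt(examples, "Plug wouldn't connect.")
print(few_shot_prompt)
```

Keeping the examples in a list makes it easy to vary how many demonstrations are shown and to swap the review being classified.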


In [43]:
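# Move the input ids to the device chosen by device_map='auto' (GPU if available, otherwise CPU)
prompt_tokens = tokenizador_T5(prompt, return_tensors='pt').input_ids.to(model_T5.device)

# Generate at most 100 tokens
outputs = model_T5.generate(prompt_tokens, max_length=100)

print(tokenizador_T5.decode(outputs[0]))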

<pad> Yvonne</s>
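Because the same three steps (tokenize, generate, decode) are repeated for every prompt and every model in this notebook, they can be wrapped in one function. This is a sketch, not part of the original notebook: `generate_text` is a hypothetical helper, and it assumes the model exposes a `device` attribute, as Hugging Face models do.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=100):
    """Tokenize `prompt`, generate with `model`, and return the decoded text."""
    # Place the inputs on the same device as the model instead of hardcoding 'cuda'
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
    outputs = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # skip_special_tokens drops markers such as <pad> and </s> from the text
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

With the objects loaded earlier, a call would look like `generate_text(model_T5, tokenizador_T5, prompt)`. Note that the outputs shown in this notebook were decoded without `skip_special_tokens`, which is why they include `<pad>` and `</s>`.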


# 2. Fine-tuned model

In this case study, we will use the base-size Flan-T5 model (google/flan-t5-base).

These models start from pre-trained T5 (Raffel et al., 2020) and have been fine-tuned to improve their performance on more than 1,000 additional tasks and to support several languages: [https://huggingface.co/google/flan-t5-base#training-details](https://huggingface.co/google/flan-t5-base#training-details)


## Loading the model and tokenizer

In [46]:
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizador_FT5 = T5Tokenizer.from_pretrained('google/flan-t5-base')

model_FT5 = T5ForConditionalGeneration.from_pretrained('google/flan-t5-base', device_map = 'auto')

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


In [48]:
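# Move the input ids to the device chosen by device_map='auto' (GPU if available, otherwise CPU)
prompt_tokens = tokenizador_FT5(prompt, return_tensors='pt').input_ids.to(model_FT5.device)

# Generate at most 200 tokens
outputs = model_FT5.generate(prompt_tokens, max_length=200)

print(tokenizador_FT5.decode(outputs[0]))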

<pad> The Second World War is a war in which the world's military and economic resources were largely devoted to the fight against the Nazis.</s>


# 3. Selecting a larger fine-tuned LLM (close to 1 billion parameters)

In this last section we use Flan-T5-Large, which has roughly 780 million parameters: [https://huggingface.co/google/flan-t5-large](https://huggingface.co/google/flan-t5-large)


## Loading the model and tokenizer

In [49]:
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizador_FT5 = T5Tokenizer.from_pretrained('google/flan-t5-large')

model_FT5 = T5ForConditionalGeneration.from_pretrained('google/flan-t5-large', device_map = 'auto')

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


## Generating text with the model

In [54]:
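# Move the input ids to the device chosen by device_map='auto' (GPU if available, otherwise CPU)
prompt_tokens = tokenizador_FT5(prompt, return_tensors='pt').input_ids.to(model_FT5.device)

# Generate at most 200 tokens
outputs = model_FT5.generate(prompt_tokens, max_length=200)

print(tokenizador_FT5.decode(outputs[0]))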

<pad> it is a planet</s>
