# Tutorial 7: Introduction to Large Language Models (LLMs)

### Teaching Staff

- Instructors: [Andrés Abeliuk](https://aabeliuk.github.io/), [Felipe Villena](https://fabianvillena.cl/).
- Teaching Assistant: María José Zambrano


### Tutorial Objectives

- Understand what LLMs are.
- Learn how to use the OpenAI API.
- Understand what zero-shot learning is and how to use it.
- Understand what few-shot learning is and how to use it.
- Get to know the `scikit-llm` library.

### Introduction

The power of Large Language Models has become the new talking point in the AI community. Early adopters have rushed to the various generative AI solutions, such as GPT 3.5, GPT 4, and Bard, for different use cases. They have been used for question answering, creative writing, and critical analysis. Because these models are trained on tasks such as next-token prediction over a wide variety of corpora, they are expected to be very good at generating text.

Transformer-based neural networks also let the model adapt to language-based machine learning tasks such as classification, translation, prediction, and entity recognition. As a result, data scientists can now leverage generative AI platforms for more practical, industrial language-based ML use cases simply by giving appropriate instructions. In this tutorial, our goal is to show how simple it is to use generative LLMs for common language-based ML tasks through prompting, and to critically examine the benefits and limitations of prompting in zero-shot and few-shot settings.

### What Is Prompting?

Let us start by defining what LLMs are. An LLM is a deep learning system built from multiple layers of transformers and feed-forward neural networks, containing hundreds of millions to billions of parameters. LLMs are trained on massive datasets from different sources and are designed to understand and generate text. Example applications include language translation, text summarization, question answering, and content generation. There are different types of LLMs: encoder-only (BERT), encoder-decoder (BART, T5), and decoder-only (PaLM, GPT, etc.). LLMs with a decoder component are called generative LLMs; this is the case for most modern LLMs.

If you tell a generative LLM to perform a task, it will generate the corresponding text. But how do we tell a generative LLM to perform a specific task? It is simple: we give it a written instruction. LLMs are designed to respond to end users based on instructions, also known as prompts. If you have interacted with an LLM such as ChatGPT, you have used prompts. Prompting is about packaging our intent into a natural-language query that will cause the model to return the desired response.

<img src="https://av-eks-lekhak.s3.amazonaws.com/media/__sized__/article_images/Screenshot_2023-08-26_at_11.29.48_AM-thumbnail_webp-600x300.webp">

There are two main types of prompting techniques, which we examine in the following sections: zero-shot and few-shot. We will go over their details along with some basic examples.
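As a rough illustration of what "packaging our intent into a natural-language query" can look like, the sketch below assembles a prompt from a task name, a list of classes, and the text to classify. The function name `build_zero_shot_prompt` and the wording of the template are assumptions made for this example; they do not come from any library.

```python
def build_zero_shot_prompt(task: str, classes: list[str], text: str) -> str:
    """Assemble a plain-text prompt: task, allowed classes, then the input text."""
    class_list = ", ".join(classes)
    return (
        f"Task: {task}\n"
        f"Classes: {class_list}\n"
        f"Text: {text}\n"
        "Answer with exactly one of the classes."
    )


# The example sentence is invented for illustration.
prompt = build_zero_shot_prompt(
    task="Sentiment classification",
    classes=["positive", "neutral", "negative"],
    text="The flight was on time and the crew was friendly.",
)
print(prompt)
```

No labeled examples appear in the prompt, which is what makes it zero-shot; a few-shot prompt would add a handful of labeled examples before the text to classify.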

### First, install the OpenAI library

In [1]:
!pip uninstall httpx
!pip uninstall openai

Found existing installation: httpx 0.28.0
Uninstalling httpx-0.28.0:
  Would remove:
    /usr/local/bin/httpx
    /usr/local/lib/python3.10/dist-packages/httpx-0.28.0.dist-info/*
    /usr/local/lib/python3.10/dist-packages/httpx/*
Proceed (Y/n)? y
  Successfully uninstalled httpx-0.28.0
Found existing installation: openai 1.54.5
Uninstalling openai-1.54.5:
  Would remove:
    /usr/local/bin/openai
    /usr/local/lib/python3.10/dist-packages/openai-1.54.5.dist-info/*
    /usr/local/lib/python3.10/dist-packages/openai/*
Proceed (Y/n)? y
  Successfully uninstalled openai-1.54.5


In [2]:
!pip install openai==1.55.3 httpx==0.27.2

Collecting openai==1.55.3
  Downloading openai-1.55.3-py3-none-any.whl.metadata (24 kB)
Collecting httpx==0.27.2
  Downloading httpx-0.27.2-py3-none-any.whl.metadata (7.1 kB)
Downloading openai-1.55.3-py3-none-any.whl (389 kB)
Downloading httpx-0.27.2-py3-none-any.whl (76 kB)
Installing collected packages: httpx, openai
Successfully installed httpx-0.27.2 openai-1.55.3


## Import the libraries

In [3]:
import pandas as pd
from openai import OpenAI

## Load the dataset

In [4]:
!wget https://raw.githubusercontent.com/fvillena/dcc-ia-nlp/master/tutoriales/data/ag_news/train.csv

--2024-12-10 21:54:11--  https://raw.githubusercontent.com/fvillena/dcc-ia-nlp/master/tutoriales/data/ag_news/train.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 28946992 (28M) [text/plain]
Saving to: ‘train.csv’


2024-12-10 21:54:12 (123 MB/s) - ‘train.csv’ saved [28946992/28946992]



In [5]:
train_df = pd.read_csv('train.csv')

In [6]:
train_df.head()

Unnamed: 0,Class Index,Title,Description
0,3,Wall St. Bears Claw Back Into the Black (Reuters),"Reuters - Short-sellers, Wall Street's dwindli..."
1,3,Carlyle Looks Toward Commercial Aerospace (Reu...,Reuters - Private investment firm Carlyle Grou...
2,3,Oil and Economy Cloud Stocks' Outlook (Reuters),Reuters - Soaring crude prices plus worries\ab...
3,3,Iraq Halts Oil Exports from Main Southern Pipe...,Reuters - Authorities have halted oil export\f...
4,3,"Oil prices soar to all-time record, posing new...","AFP - Tearaway world oil prices, toppling reco..."


In [7]:
labels = ['World','Sports', 'Business', 'Sci/Tech']

In [8]:
def map_label(class_index):
    return labels[class_index - 1]

In [9]:
train_df['Class Name'] = train_df['Class Index'].apply(map_label)

In [10]:
train_df.sample(5)

Unnamed: 0,Class Index,Title,Description,Class Name
97883,4,Consumers could be the losers in battle for ne...,"AFP - As videocassettes fade into history, a f...",Sci/Tech
89403,4,"Interview: Dell's Rollins lays out a road map,...","After eight years at the company, Kevin Rollin...",Sci/Tech
97949,2,Winky Wright Now Looking to Cash In (AP),AP - Shane Mosley gave Winky Wright his big ch...,Sports
37783,2,"Shockey unhappy with himself, not coach",Jeremy Shockey yesterday reiterated his commen...,Sports
83853,1,Thai premier visits deaths town,Thai Prime Minister Thaksin Shinawatra is visi...,World


## Using the OpenAI API to interact with Llama

In [11]:
from openai import OpenAI

client = OpenAI(
    api_key="",
    base_url="https://openrouter.ai/api/v1"
)
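The cell above leaves `api_key` empty so that no secret is published with the notebook. One common way to supply the key without writing it into the code is to read it from an environment variable. The sketch below shows that pattern; the variable name `OPENROUTER_API_KEY` and the helper `load_api_key` are assumptions chosen for this example.

```python
import os


def load_api_key(var_name: str = "OPENROUTER_API_KEY") -> str:
    """Return the API key stored in an environment variable, or fail loudly."""
    key = os.environ.get(var_name, "")
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key


# Usage (commented out so the sketch runs without a key or network access):
# client = OpenAI(api_key=load_api_key(), base_url="https://openrouter.ai/api/v1")
```

Failing early with a clear message tends to be easier to debug than an authentication error returned by the first request.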



### Zero Shot Learning

Zero-shot prompting is a specific zero-shot learning scenario unique to generative LLMs. In the zero-shot setting, we give the model no labeled data and expect it to work on a completely new problem. An example is using ChatGPT for zero-shot prompting on new tasks by providing suitable instructions. LLMs can adapt to unseen problems because they have learned from content from many sources. Let us look at some examples.

Here is an example query for classifying text into positive, neutral, and negative sentiment classes.

<img src="https://av-eks-lekhak.s3.amazonaws.com/media/__sized__/article_images/Screenshot_2023-08-26_at_3.02.54_AM-thumbnail_webp-600x300.webp">

#### Tweet Examples

The tweet examples come from the Twitter US Airline Sentiment dataset. The dataset consists of feedback tweets directed at different airlines, labeled as positive, neutral, or negative. In Figure 2 (source: ChatGPT), we provide the task name (Sentiment Classification), the classes (positive, neutral, and negative), the text, and the prompt asking for the classification. The airline feedback in Figure 2 is positive and praises the flight experience with the airline. ChatGPT correctly classified the sentiment of the review as positive, showing its ability to generalize to a new task.

<img src="https://av-eks-lekhak.s3.amazonaws.com/media/__sized__/article_images/Screenshot_2023-08-26_at_3.00.55_AM-thumbnail_webp-600x300.webp">

Figure 3 above shows ChatGPT with a zero-shot approach on another example, this time with negative sentiment. ChatGPT again predicts the tweet's sentiment correctly. Although we have shown two examples where the model successfully classifies the review text, there are several edge cases where even state-of-the-art LLMs fail. In one such case, a user complains about the quality of the airline's food, and ChatGPT incorrectly identifies the sentiment as neutral.

In [12]:
content1 = f"I will provide you with a dataset of news articles, which you must classify correctly. The labels are {labels[0]}, {labels[1]}, {labels[2]}, and {labels[3]}, and please list the article and its the label in your answer."
content1

'I will provide you with a dataset of news articles, which you must classify correctly. The labels are World, Sports, Business, and Sci/Tech, and please list the article and its the label in your answer.'

In [13]:
new_examples = train_df.sample(5)

In [14]:
descriptions = list(new_examples['Description'])

content2 = ''

for idx, new in enumerate(descriptions):
  content2 += f'New {idx + 1}: {new}\n'


In [15]:
print(content2)

New 1: But he admits he never expected to be on the brink of breaking the NCAA career receptions record when he first arrived in West Lafayette five years ago.
New 2: Sunday night's game against the Browns marks Kyle Boller's 17th career start for the Ravens. With the extra experience come more expectations for the quarterback.
New 3: Royal Dutch/Shell has agreed to pay \$120 million to settle fraud charges with the S.E.C. involving overstated oil reserves, the U.S. securities regulator said today.
New 4: Industrial output in the US rose for a 16th straight month in September,  ahead of analysts' forecasts.
New 5: AP - Marion Jones should be stripped of her five Olympic medals if allegations that she used banned drugs before the 2000 Sydney Games are true, World Anti-Doping Agency chief Dick Pound said Friday.



In [16]:
chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": content1,
        },
        {
            'role':'user',
            'content': content2
        }
    ],
    model="meta-llama/llama-3.3-70b-instruct",
)

In [17]:
print(chat_completion)

ChatCompletion(id='gen-1733867654-RATYw9zKv1OLX9QbuDAQ', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Here are the classifications for each news article:\n\n1. New 1: Sports\n2. New 2: Sports\n3. New 3: Business\n4. New 4: Business\n5. New 5: Sports\n\nLet me know if I got them all correct!', refusal='', role='assistant', audio=None, function_call=None, tool_calls=None))], created=1733867654, model='meta-llama/llama-3.3-70b-instruct', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=59, prompt_tokens=239, total_tokens=298, completion_tokens_details=None, prompt_tokens_details=None), provider='DeepInfra')


In [18]:
print(chat_completion.choices[0].message.content)

Here are the classifications for each news article:

1. New 1: Sports
2. New 2: Sports
3. New 3: Business
4. New 4: Business
5. New 5: Sports

Let me know if I got them all correct!


In [19]:
new_examples

Unnamed: 0,Class Index,Title,Description,Class Name
95558,2,Stubblefield eyes NCAA receptions mark,But he admits he never expected to be on the b...,Sports
82969,2,Boller Expects More,Sunday night's game against the Browns marks K...,Sports
8941,3,Shell Settles Oil Reserve Case for \$120 Million,Royal Dutch/Shell has agreed to pay \$120 mill...,Business
46641,3,US industrial output still rising,Industrial output in the US rose for a 16th st...,Business
108718,2,Pound: Jones Should Lose Olympic Medals (AP),AP - Marion Jones should be stripped of her fi...,Sports
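Comparing the reply with the table by eye works for five articles, but not for a larger sample. A minimal sketch of how the free-text reply could be parsed and scored is shown below. It assumes the model keeps the `New <n>: <label>` pattern seen in the reply above, which is not guaranteed, and the helper name `parse_labels` is hypothetical.

```python
import re


def parse_labels(reply: str, allowed: list[str]) -> dict[int, str]:
    """Map each item number to the label found on a line like 'New 3: Business'."""
    pattern = re.compile(r"New\s+(\d+):\s*(.+)")
    parsed = {}
    for line in reply.splitlines():
        match = pattern.search(line)
        # Keep only lines whose label is one of the allowed classes.
        if match and match.group(2).strip() in allowed:
            parsed[int(match.group(1))] = match.group(2).strip()
    return parsed


# The reply and the true labels are copied from the cells above.
reply = (
    "Here are the classifications for each news article:\n\n"
    "1. New 1: Sports\n2. New 2: Sports\n3. New 3: Business\n"
    "4. New 4: Business\n5. New 5: Sports\n\n"
    "Let me know if I got them all correct!"
)
labels = ["World", "Sports", "Business", "Sci/Tech"]
true_labels = ["Sports", "Sports", "Business", "Business", "Sports"]

predicted = parse_labels(reply, labels)
correct = sum(predicted.get(i + 1) == y for i, y in enumerate(true_labels))
print(f"{correct}/{len(true_labels)} correct")  # prints "5/5 correct"
```

Items the model skips or labels with an unknown class are simply absent from `predicted`, so they count as errors rather than raising an exception.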


### Few Shot Learning

Few-shot learning refers to the ability of LLMs to adapt to and perform specific natural language processing tasks with a very limited number of training examples. Unlike zero-shot learning, where no task-specific training examples are used, few-shot learning provides a handful of examples to guide the model on the new task. This approach takes advantage of the rich language representation that LLMs develop during extensive training on diverse text data, allowing them to generalize to new tasks with minimal additional information.

LLMs can perform few-shot learning effectively because they have been trained on text prediction tasks, which lets them capture complex linguistic and semantic patterns. Providing input-output examples at inference time guides them toward the desired responses or predictions. This method is especially useful when data is limited or when labeling data is impractical because of cost or time constraints, extending the usefulness of LLMs beyond what zero-shot learning can achieve.
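The idea of "providing input-output examples at inference time" can be sketched as a small prompt builder: labeled examples come first, followed by the unlabeled texts to classify. The function name `build_few_shot_prompt`, the layout of the prompt, and the example texts are all assumptions made for this illustration.

```python
def build_few_shot_prompt(instruction, examples, queries):
    """Join an instruction, labeled examples, and unlabeled queries into one prompt."""
    lines = [instruction, "Examples:"]
    for text, label in examples:
        lines.append(f"Article: {text}")
        lines.append(f"Class Name: {label}")
    lines.append("Now classify:")
    for idx, text in enumerate(queries, start=1):
        lines.append(f"Article {idx}: {text}")
    return "\n".join(lines)


# Invented placeholder texts, not rows from the dataset.
prompt = build_few_shot_prompt(
    instruction="Classify each article as World, Sports, Business, or Sci/Tech.",
    examples=[
        ("The central bank raised interest rates again.", "Business"),
        ("The home team won the final in overtime.", "Sports"),
    ],
    queries=["A new telescope captured images of a distant galaxy."],
)
print(prompt)
```

The labeled pairs give the model the expected output format as well as the label set, which is often as useful as the examples themselves.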

In [20]:
content1 = f"I will provide you with a dataset of news articles, which you must classify correctly. The labels are {labels[0]}, {labels[1]}, {labels[2]}, and {labels[3]}, and please list the article and its the label in your answer. \n"

content1 += f'Additionally, I will provide you a few examples:\n'



In [21]:
few_shot_examples = train_df.sample(5)
few_shot_examples

Unnamed: 0,Class Index,Title,Description,Class Name
49496,4,Sun loses \$1 billion lawsuit to Kodak,Let #39;s say you #39;re a technology company ...,Sci/Tech
83279,4,China launches another satellite,China successfully launched another satellite ...,Sci/Tech
95059,3,XM Satellite convertible debt priced,XM Satellite Radio Holdings Inc. (XMSR.O: Quot...,Business
68002,4,Space probe aims to slam into comet,and possibly help future scientists determine ...,Sci/Tech
3528,2,Billie Jean King Retires as U.S. Fed Cup Coach,NEW YORK (Reuters) - Billie Jean King cut her...,Sports


In [22]:

descriptions = list(few_shot_examples['Description'])
classes = list(few_shot_examples['Class Name'])

for idx, data in enumerate(zip(descriptions, classes)):
  content1 += f'Article {idx + 1}: {data[0]}\n'
  content1 += f'Class Name: {data[1]}\n'


In [23]:
print(content1)

I will provide you with a dataset of news articles, which you must classify correctly. The labels are World, Sports, Business, and Sci/Tech, and please list the article and its the label in your answer. 
Additionally, I will provide you a few examples:
Article 1: Let #39;s say you #39;re a technology company that #39;s now struggling to cope with technological changes that threaten the very premise of your business... Waitaminute, maybe that particular hypothetical isn #39;t such 
Class Name: Sci/Tech
Article 2: China successfully launched another satellite Saturday, its 40th consecutive successful liftoff, the government-run Xinhua news agency reported.
Class Name: Sci/Tech
Article 3: XM Satellite Radio Holdings Inc. (XMSR.O: Quote, Profile, Research) on Thursday said a \$300 million offering of senior notes convertible into common stock has been priced at \$50 a share.
Class Name: Business
Article 4: and possibly help future scientists determine how to keep such space rocks from coll

In [24]:
articule_examples = train_df.sample(5)
articule_examples

Unnamed: 0,Class Index,Title,Description,Class Name
50816,1,US: UN Gaza Resolution Not Balanced,"UNITED NATIONS Oct. 5, 2004 - The United State...",World
117344,3,"Oil Tops \$44, Heating Supply a Concern",Oil prices surged nearly 6 percent on Wednesda...,Business
38353,3,Boeing CEO: Jet Market Recovery Slower,Boeing Co. Chief Executive Harry Stonecipher s...,Business
18935,1,Russian Town Mourns As Putin Vows Fight,"BESLAN, Russia - Wails of mourning echoed thro...",World
110289,4,"Online Music Store Wippit Signs Universal, War...",Reuters - European digital music service Wippi...,Sci/Tech


In [25]:
descriptions = list(articule_examples['Description'])

content2 = ''

for idx, new in enumerate(descriptions):
  content2 += f'Articule {idx + 1}: {new}\n'

content2 += "Please, in your answer show the article and its label. But it just shows this one."

In [26]:
print(content2)

Articule 1: UNITED NATIONS Oct. 5, 2004 - The United States said Tuesday a draft Security Council resolution by Arab states demanding and end to a large-scale Israeli offensive in Gaza was not balanced and vowed that the measure will not go forward.
Articule 2: Oil prices surged nearly 6 percent on Wednesday as the US government reported a fall in heating oil stocks in the midst of the season #39;s first arctic cold blast in the Northeast.
Articule 3: Boeing Co. Chief Executive Harry Stonecipher said on Friday the US aircraft maker #39;s archrival Airbus was exaggerating the speed of recovery in the commercial airplane market.
Articule 4: BESLAN, Russia - Wails of mourning echoed through the streets of this southern Russian town on Sunday, and the region's top police officer reportedly resigned in the wake of the school hostage-taking that left more than 380 people dead - nearly half of them children.    A shaken President Vladimir Putin went on national television Saturday to make a r

In [27]:
chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": content1,
        },
        {
            'role':'user',
            'content': content2
        }
    ],
    model="gpt-4",
)

In [28]:
print(chat_completion.choices[0].message.content)

Articule 1: UNITED NATIONS Oct. 5, 2004 - The United States said Tuesday a draft Security Council resolution by Arab states demanding and end to a large-scale Israeli offensive in Gaza was not balanced and vowed that the measure will not go forward.
Class Name: World
Articule 2: Oil prices surged nearly 6 percent on Wednesday as the US government reported a fall in heating oil stocks in the midst of the season #39;s first arctic cold blast in the Northeast.
Class Name: Business
Articule 3: Boeing Co. Chief Executive Harry Stonecipher said on Friday the US aircraft maker #39;s archrival Airbus was exaggerating the speed of recovery in the commercial airplane market.
Class Name: Business
Articule 4: BESLAN, Russia - Wails of mourning echoed through the streets of this southern Russian town on Sunday, and the region's top police officer reportedly resigned in the wake of the school hostage-taking that left more than 380 people dead - nearly half of them children.    A shaken President Vla

In [29]:
articule_examples

Unnamed: 0,Class Index,Title,Description,Class Name
50816,1,US: UN Gaza Resolution Not Balanced,"UNITED NATIONS Oct. 5, 2004 - The United State...",World
117344,3,"Oil Tops \$44, Heating Supply a Concern",Oil prices surged nearly 6 percent on Wednesda...,Business
38353,3,Boeing CEO: Jet Market Recovery Slower,Boeing Co. Chief Executive Harry Stonecipher s...,Business
18935,1,Russian Town Mourns As Putin Vows Fight,"BESLAN, Russia - Wails of mourning echoed thro...",World
110289,4,"Online Music Store Wippit Signs Universal, War...",Reuters - European digital music service Wippi...,Sci/Tech



## Using the `scikit-llm` library

`scikit-llm` is a Python library that wraps interaction with the OpenAI API behind the familiar `scikit-learn` interface. It also implements the main prompting techniques for working with an LLM, such as zero-shot and few-shot learning.


For more usage examples, see its [repository](https://github.com/iryna-kondr/scikit-llm).

#### Installation

In [30]:
!pip install scikit-llm

Collecting scikit-llm
  Downloading scikit_llm-1.4.1-py3-none-any.whl.metadata (4.3 kB)
Downloading scikit_llm-1.4.1-py3-none-any.whl (51 kB)
Installing collected packages: scikit-llm
Successfully installed scikit-llm-1.4.1


#### Import the main classifiers

In [31]:
from skllm.config import SKLLMConfig
from skllm.datasets import get_classification_dataset

from skllm.models.gpt.classification.zero_shot import ZeroShotGPTClassifier
from skllm.models.gpt.classification.few_shot import FewShotGPTClassifier

from sklearn.metrics import confusion_matrix, classification_report

#### Configure the OpenAI API key

In [32]:
SKLLMConfig.set_openai_key("")

#### Zero-shot learning example

In [33]:
X, y_true = get_classification_dataset()

clf = ZeroShotGPTClassifier(openai_model="gpt-3.5-turbo")
clf.fit(None, ["positive", "negative", "neutral"])
y_pred = clf.predict(X)

100%|██████████| 30/30 [00:23<00:00,  1.26it/s]


In [34]:
# classification_report lists classes in sorted order (negative, neutral, positive),
# so no target_names are passed; a differently ordered list would mislabel the rows.
print(classification_report(y_true, y_pred))

              precision    recall  f1-score   support

    negative       0.71      1.00      0.83        10
     neutral       1.00      0.60      0.75        10
    positive       1.00      1.00      1.00        10

    accuracy                           0.87        30
   macro avg       0.90      0.87      0.86        30
weighted avg       0.90      0.87      0.86        30



#### Few-shot learning example

In [35]:
X, y_true = get_classification_dataset()

clf = FewShotGPTClassifier(openai_model="gpt-3.5-turbo")
clf.fit(X, y_true)
y_pred = clf.predict(X)

100%|██████████| 30/30 [00:18<00:00,  1.65it/s]


In [36]:
print(classification_report(y_true, y_pred))

              precision    recall  f1-score   support

    negative       1.00      1.00      1.00        10
     neutral       1.00      1.00      1.00        10
    positive       1.00      1.00      1.00        10

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30
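Note that the few-shot classifier above is fitted and evaluated on the same 30 texts. If, as its documentation describes, the classifier adds the training samples to the prompt, then every text being classified also appears there with its label, which would help explain the perfect score. A fairer estimate holds out a few examples per class as shots and evaluates on the rest. The sketch below shows one way to do that with the standard library only; the helper `stratified_split` and the toy data are assumptions for illustration.

```python
import random
from collections import defaultdict


def stratified_split(texts, labels, n_shots_per_class, seed=0):
    """Reserve `n_shots_per_class` examples per label as shots; the rest is for evaluation."""
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)
    rng = random.Random(seed)
    shots, evaluation = [], []
    for label, items in sorted(by_label.items()):
        items = items[:]  # copy so the caller's data is not reordered
        rng.shuffle(items)
        shots += [(t, label) for t in items[:n_shots_per_class]]
        evaluation += [(t, label) for t in items[n_shots_per_class:]]
    return shots, evaluation


# Toy data standing in for the 30-text dataset (10 texts per class).
texts = [f"text {i}" for i in range(30)]
y = ["positive"] * 10 + ["negative"] * 10 + ["neutral"] * 10
shots, evaluation = stratified_split(texts, y, n_shots_per_class=3)
print(len(shots), len(evaluation))  # prints "9 21"
```

With such a split, the texts and labels in `shots` would go to `clf.fit`, and only the texts in `evaluation` would go to `clf.predict`.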



### Exercise

Try a classification exercise using the zero-shot and few-shot learning techniques on the `ag_news` dataset, covering all of its text documents.
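As a possible starting point, the full dataset will not fit into a single prompt, so the descriptions need to be sent in batches. The sketch below splits a list into fixed-size batches and numbers each batch the same way the `content2` string is built earlier in the tutorial. The helper names `batched` and `build_batch_prompt` and the batch size are assumptions, and the placeholder texts stand in for the `Description` column.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements each."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def build_batch_prompt(descriptions):
    """Number each description within its batch, starting again from 1."""
    return "".join(f"New {idx + 1}: {text}\n" for idx, text in enumerate(descriptions))


# Placeholder texts standing in for the `Description` column of `train_df`.
descriptions = [f"description {i}" for i in range(12)]
prompts = [build_batch_prompt(batch) for batch in batched(descriptions, batch_size=5)]
print(len(prompts))  # prints "3"
```

Because numbering restarts in every batch, the position of each batch has to be tracked separately in order to map the model's answers back to rows of the dataset.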