In [1]:
pip install openai azure-storage-blob azure-identity


Collecting openai
  Using cached openai-1.59.7-py3-none-any.whl.metadata (27 kB)
Collecting azure-storage-blob
  Using cached azure_storage_blob-12.24.0-py3-none-any.whl.metadata (26 kB)
Collecting azure-identity
  Using cached azure_identity-1.19.0-py3-none-any.whl.metadata (80 kB)
Collecting anyio<5,>=3.5.0 (from openai)
  Using cached anyio-4.8.0-py3-none-any.whl.metadata (4.6 kB)
Collecting distro<2,>=1.7.0 (from openai)
  Using cached distro-1.9.0-py3-none-any.whl.metadata (6.8 kB)
Collecting httpx<1,>=0.23.0 (from openai)
  Using cached httpx-0.28.1-py3-none-any.whl.metadata (7.1 kB)
Collecting jiter<1,>=0.4.0 (from openai)
  Using cached jiter-0.8.2-cp312-cp312-win_amd64.whl.metadata (5.3 kB)
Collecting pydantic<3,>=1.9.0 (from openai)
  Using cached pydantic-2.10.5-py3-none-any.whl.metadata (30 kB)
Collecting sniffio (from openai)
  Using cached sniffio-1.3.1-py3-none-any.whl.metadata (3.9 kB)
Collecting tqdm>4 (from openai)
  Using cached tqdm-4.67.1-py3-none-any.whl.metadata 

In [3]:
import os
import tiktoken
import openai
import numpy as np
import pandas as pd
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from tenacity import retry, wait_random_exponential, stop_after_attempt

# Load environment variables
load_dotenv()

# Option 1 - Use Azure AD authentication with az cli (use az login in terminal)
# default_credential = DefaultAzureCredential()
# token = default_credential.get_token("https://cognitiveservices.azure.com/.default")
# openai.api_type = "azure_ad"
# openai.api_base = os.environ.get("OPENAI_API_BASE")
# openai.api_key = token.token
# openai.api_version = "2022-12-01"


# Option 2 - Use an API key read from the environment (loaded from .env above)
api_key = os.environ.get("AZURE_OPENAI_API_KEY")

# Define embedding model and encoding
EMBEDDING_MODEL = 'text-embedding-ada-002'
COMPLETION_MODEL = 'text-davinci-003'
encoding = tiktoken.get_encoding('cl100k_base')


Test if completions work:

In [5]:
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://openai-tajamar-1.openai.azure.com/",
    api_key=api_key,
    api_version="2024-02-01",
)

# The deployment name is the custom name chosen when the model was deployed.
deployment_name = "gpt-4o-mini"

response = client.chat.completions.create(
    model=deployment_name,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "How are you?"},
    ],
)

print(response.choices[0].message.content)



I'm just a computer program, so I don't have feelings, but I'm here and ready to help you! How can I assist you today?


Do it in a streaming fashion:
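
A minimal sketch of consuming a streamed chat completion, assuming the `client` and `deployment_name` defined above. `collect_stream` and `fake_chunk` are hypothetical helper names, not part of the OpenAI SDK. The helper only relies on each chunk exposing `choices[0].delta.content`, so it is exercised here with stand-in chunk objects rather than a live call.

```python
from types import SimpleNamespace


def collect_stream(chunks, echo=True):
    """Print streamed text as it arrives and return the full reply.

    `chunks` is any iterable of objects shaped like chat completion
    chunks (hypothetical helper, not an SDK function).
    """
    parts = []
    for chunk in chunks:
        # Some chunks carry no choices (Azure may send such a chunk first).
        if not chunk.choices:
            continue
        # The delta content can be None, for example on the final chunk.
        piece = chunk.choices[0].delta.content or ""
        if echo:
            print(piece, end="", flush=True)
        parts.append(piece)
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in chunk for offline use (assumption: mirrors the SDK shape)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo_chunks = [
    SimpleNamespace(choices=[]),
    fake_chunk("Hello"),
    fake_chunk(None),
    fake_chunk(", world"),
]
full_text = collect_stream(demo_chunks, echo=False)
```

With a live deployment, the same helper would be called as `collect_stream(client.chat.completions.create(model=deployment_name, messages=[...], stream=True))`.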

Test if embeddings work:

In [7]:
response = client.embeddings.create(
    input="Hola mundo, soy Sergio",
    model=EMBEDDING_MODEL,  # on Azure, pass the name of your embedding deployment
)

print(response.model_dump_json(indent=2))

{
  "data": [
    {
      "embedding": [
        0.0021280688233673573,
        -0.011602435261011124,
        -0.018362557515501976,
        -0.02084742859005928,
        -0.007282582577317953,
        0.026964033022522926,
        -0.022631438449025154,
        0.016553062945604324,
        -0.02783055230975151,
        -0.024950651451945305,
        0.016553062945604324,
        0.0039280070923268795,
        -0.01269832719117403,
        0.010551143437623978,
        0.013571217656135559,
        -0.0121376384049654,
        0.01850273087620735,
        -0.02265692502260208,
        0.03884044289588928,
        0.020223025232553482,
        -0.011627920903265476,
        -0.0037527920212596655,
        0.035986024886369705,
        -0.006298191845417023,
        -0.005393443629145622,
        0.000798822205979377,
        0.010570257902145386,
        -0.007881500758230686,
        0.022733381018042564,
        -0.03249446302652359,
        0.021204231306910515,
        -0.00573431

Test if tokenizer works:

In [8]:
tokens = encoding.encode("Hello world!")
print(tokens)
print(len(tokens))

[9906, 1917, 0]
3


# Exercise 1: Text generation with different prompts
# Objective: Experiment with different prompts to observe how the model's response varies.


In [9]:
prompts = [
    "Write a poem about the sea",
    "Summarize the latest advancements in AI",
    "Explain quantum physics in simple terms",
    "Generate a list of creative startup ideas"
]

for prompt in prompts:
    response = client.chat.completions.create(
        model=deployment_name,
        messages=[
            {"role": "system", "content": "You are a knowledgeable assistant."},
            {"role": "user", "content": prompt}
        ]
    )
    print(f"Prompt: {prompt}")
    print(f"Response: {response.choices[0].message.content}\n")

Prompt: Write a poem about the sea
Response: **Whispers of the Sea**

In the hush of dawn’s soft light,  
The sea awakens, deep and bright,  
A tapestry of azure hue,  
Where dreams are born, and hearts renew.  

Gentle waves kiss sandy shores,  
Their rhythmic dance, a song of yore,  
Each crest a tale, each trough a sigh,  
A timeless story, where spirits fly.  

The salty breeze, like whispered lore,  
Calls to the wanderers, forevermore,  
With shells that echo the ocean’s song,  
Reminding us where we belong.  

Beneath the surface, mysteries sleep,  
Where shadows play and secrets keep,  
Coral gardens, vibrant and brave,  
A world untouched, the ocean's cave.  

At sunset, where the sky ignites,  
The horizon blurs in fiery flights,  
The sea, a mirror of the soul,  
Reflects our wishes, our hopes made whole.  

In stormy tempests, fierce and wild,  
The sea’s embrace can be beguiled,  
Yet in its fury, we oft find grace,  
A power profound, a sacred space.  

So here I stand, w

# Exercise 2: Comparing text similarity using embeddings
# Objective: Compute the cosine similarity between two text fragments.
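
As a reference for what the metric computes, cosine similarity is the dot product of two vectors divided by the product of their norms. A minimal NumPy sketch follows; `cosine_sim` is an illustrative helper name, and the toy vectors stand in for real embeddings.

```python
import numpy as np


def cosine_sim(a, b):
    """Cosine similarity between two 1-D vectors: (a . b) / (|a| * |b|)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy vectors standing in for embeddings (illustrative values only).
same_direction = cosine_sim([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_sim([1.0, 0.0], [0.0, 1.0])
opposite = cosine_sim([1.0, 0.0], [-1.0, 0.0])
print(same_direction, orthogonal, opposite)
```

Vectors pointing the same way score close to 1, perpendicular vectors score 0, and opposite vectors score close to -1, regardless of their lengths.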

In [11]:
from sklearn.metrics.pairwise import cosine_similarity


In [18]:
text1 = "Machine learning is fascinating."
text2 = "Artificial intelligence drives machine learning."
text3 = "I'm an AI engineer."

embedding1 = client.embeddings.create(input=text1, model=EMBEDDING_MODEL).data[0].embedding
embedding2 = client.embeddings.create(input=text2, model=EMBEDDING_MODEL).data[0].embedding
embedding3 = client.embeddings.create(input=text3, model=EMBEDDING_MODEL).data[0].embedding


# cosine_similarity expects 2-D arrays, so reshape each embedding to shape (1, n_dims)

embedding1 = np.array(embedding1).reshape(1, -1)  
embedding2 = np.array(embedding2).reshape(1, -1)  
embedding3 = np.array(embedding3).reshape(1, -1)  

similarity_1_2 = cosine_similarity(embedding1, embedding2)[0][0]  
similarity_1_3 = cosine_similarity(embedding1, embedding3)[0][0]  

print(f"Similarity between text1 and text2: {similarity_1_2:.2f}")  
print(f"Similarity between text1 and text3: {similarity_1_3:.2f}")  

Similarity between text1 and text2: 0.88
Similarity between text1 and text3: 0.83


# Exercise 3: Tokenizing complex sentences
# Objective: Practice tokenizing longer sentences and analyze the number of tokens produced.

In [19]:
sentences = [
    "The quick brown fox jumps over the lazy dog.",
    "Exploring the universe through the lens of scientific discovery is an unending journey of knowledge.",
    "AI is transforming industries at an unprecedented pace."
]

for sentence in sentences:
    tokens = encoding.encode(sentence)
    print(f"Sentence: {sentence}")
    print(f"Tokens: {tokens}")
    print(f"Number of tokens: {len(tokens)}\n")

Sentence: The quick brown fox jumps over the lazy dog.
Tokens: [791, 4062, 14198, 39935, 35308, 927, 279, 16053, 5679, 13]
Number of tokens: 10

Sentence: Exploring the universe through the lens of scientific discovery is an unending journey of knowledge.
Tokens: [45053, 5620, 279, 15861, 1555, 279, 18848, 315, 12624, 18841, 374, 459, 653, 2518, 11879, 315, 6677, 13]
Number of tokens: 18

Sentence: AI is transforming industries at an unprecedented pace.
Tokens: [15836, 374, 46890, 19647, 520, 459, 31069, 18338, 13]
Number of tokens: 9



# Exercise 4: Streaming text generation
# Objective: Implement text generation that displays the response in real time (streaming).

In [20]:
import time

stream_response = client.chat.completions.create(
    model=deployment_name,
    messages=[
        {"role": "system", "content": "You are a storytelling assistant."},
        {"role": "user", "content": "Tell me a story about a brave knight."}
    ],
    stream=True
)

for chunk in stream_response:
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)
        time.sleep(0.05)

print("\n\n--- End of Story ---")

Once upon a time, in a realm cloaked in mist and legend, there existed a kingdom known as Eldoria. This kingdom flourished under the benevolent rule of Queen Seraphina, a wise and just monarch. However, peace in Eldoria was threatened by a malevolent dragon named Drakthar, who soared through the skies, casting a shadow of fear over the land. The dragon demanded tribute from the villagers, hoarding gold, livestock, and even the bravest of souls who dared to defy him.

Amidst the despair, there emerged a knight of unparalleled valor named Sir Cedric. He was known throughout the land not only for his skill with a sword but also for his unwavering sense of justice and compassion for the downtrodden. Sir Cedric had heard the cries of the villagers and witnessed their suffering, and he could no longer stand idly by.

One fateful morning, clad in gleaming armor and wielding his enchanted sword, Elysium, Sir Cedric set out on a quest to confront Drakthar. Before leaving, he visited the small v

## Exercise 2: Comparing semantic similarity between two paragraphs

In this exercise, you will use OpenAI embeddings to compute the semantic similarity between two paragraphs, measuring how close the texts are with the cosine similarity metric.

Steps:

1. Use OpenAI's text-embedding-ada-002 embedding model.
2. Compute the cosine similarity between two text fragments.
3. Print the similarity value between the paragraphs.

Expected result:

Similarity between text1 and text2: 0.94
Similarity between text1 and text3: 0.15

Explanation: Students will learn how to use embeddings to evaluate the semantic similarity between different text fragments. Here, text1 and text2 both discuss artificial intelligence, so their similarity is high (0.94), while text3 is about outdoor activities, so its similarity to the other texts is lower. The figures above are illustrative: with text-embedding-ada-002, unrelated texts often still score well above 0.5 (the run below gives 0.73), so compare the relative ordering rather than the absolute values.



In [21]:
# Example sentences
text1 = "Artificial intelligence is transforming the world in every sector."
text2 = "AI is revolutionizing industries globally, changing the way we work."
text3 = "I love hiking and spending time in nature."

embedding1 = client.embeddings.create(input=text1, model=EMBEDDING_MODEL).data[0].embedding
embedding2 = client.embeddings.create(input=text2, model=EMBEDDING_MODEL).data[0].embedding
embedding3 = client.embeddings.create(input=text3, model=EMBEDDING_MODEL).data[0].embedding


# Compute the cosine similarity
similarity_1_2 = cosine_similarity([embedding1], [embedding2])[0][0]
similarity_1_3 = cosine_similarity([embedding1], [embedding3])[0][0]

print(f"Similarity between text1 and text2: {similarity_1_2:.2f}")
print(f"Similarity between text1 and text3: {similarity_1_3:.2f}")


Similarity between text1 and text2: 0.93
Similarity between text1 and text3: 0.73


## Exercise 3: Extracting keywords from a text using embeddings

This exercise shows how to use OpenAI to extract relevant keywords from a text based on embedding similarity.

Steps:

1. Choose a text fragment.
2. Generate an embedding for each word.
3. Evaluate the similarity of each word to the full text to identify the most relevant words.

Expected result:

Word: artificial - Similarity: 0.92
Word: intelligence - Similarity: 0.88
Word: healthcare - Similarity: 0.83
Word: finance - Similarity: 0.80
Word: education - Similarity: 0.78
Word: revolutionize - Similarity: 0.70

Explanation: The embedding model is used to evaluate how similar each keyword is to the overall context of the text. In the expected result, words such as "artificial" and "intelligence" score highest against the full text, while others, such as "revolutionize", score lower. These scores are illustrative; actual values and ordering depend on the model (in the run below, "healthcare" ranks first and all scores fall between 0.79 and 0.83).
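
The ranking step can be sketched independently of the API. The function below, `rank_by_similarity` (an illustrative name), takes a text vector and a mapping of word to vector and returns the words sorted by cosine similarity. The three-dimensional vectors are made up for the demonstration; real embeddings from text-embedding-ada-002 have 1536 dimensions.

```python
import numpy as np


def rank_by_similarity(text_vec, word_vecs):
    """Return (word, similarity) pairs sorted from most to least similar."""
    text_vec = np.asarray(text_vec, dtype=float)
    text_norm = np.linalg.norm(text_vec)
    scores = {}
    for word, vec in word_vecs.items():
        vec = np.asarray(vec, dtype=float)
        # Cosine similarity between the full-text vector and this word's vector.
        scores[word] = float(np.dot(text_vec, vec) / (text_norm * np.linalg.norm(vec)))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up vectors for illustration only; they are not real embeddings.
toy_text = [1.0, 1.0, 0.0]
toy_words = {
    "close": [1.0, 0.9, 0.1],
    "partial": [1.0, 0.0, 0.0],
    "unrelated": [0.0, 0.0, 1.0],
}
ranking = rank_by_similarity(toy_text, toy_words)
for word, score in ranking:
    print(f"Word: {word} - Similarity: {score:.2f}")
```

Keeping the ranking separate from the embedding calls makes it possible to fetch all vectors first and then rank them in one pass.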


In [22]:


# Text fragment
text = "Artificial intelligence has the potential to revolutionize multiple sectors including healthcare, finance, and education."

# Generate the embedding for the full text
text_embedding = client.embeddings.create(input=text, model=EMBEDDING_MODEL).data[0].embedding

# List of keywords to evaluate
keywords = ["artificial", "intelligence", "healthcare", "revolutionize", "finance", "education"]

# Compute the embedding for each word and compare it with the text
word_similarities = {}
for word in keywords:
    word_embedding = client.embeddings.create(input=word, model=EMBEDDING_MODEL).data[0].embedding
    similarity = cosine_similarity([text_embedding], [word_embedding])[0][0]
    word_similarities[word] = similarity

# Show the keywords sorted by relevance
sorted_keywords = sorted(word_similarities.items(), key=lambda x: x[1], reverse=True)

for word, similarity in sorted_keywords:
    print(f"Word: {word} - Similarity: {similarity:.2f}")


Word: healthcare - Similarity: 0.83
Word: intelligence - Similarity: 0.81
Word: education - Similarity: 0.80
Word: revolutionize - Similarity: 0.79
Word: finance - Similarity: 0.79
Word: artificial - Similarity: 0.79


## Exercise 4: Generating summaries with OpenAI
Task:
In this exercise, the goal is to generate a brief summary of a long text using the OpenAI model.

Steps:

1. Take a long text and generate a summary of it using a suitable prompt.
2. Configure the request to OpenAI so that it returns a concise summary.

Expected result:

Generated summary:
Artificial intelligence (AI) is the intelligence demonstrated by machines, contrasting with natural intelligence. AI is defined as the study of intelligent agents, which are devices that perceive their environment and take actions to maximize their goals. The field of AI aims to mimic human cognitive functions, such as learning and problem-solving, though tasks considered to require intelligence are often excluded from AI as machines improve.

Explanation: This exercise shows how to use OpenAI to generate summaries of long texts. The model uses the given context to condense the information without losing the intended meaning.
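
When a request caps the reply length with `max_tokens`, a summary can be cut off mid-sentence. A minimal sketch of one way to detect that, assuming the response follows the chat completions shape, where `choices[0].finish_reason` is `"length"` when the token limit was hit. `is_truncated` is a hypothetical helper name, and the stand-in response objects exist only so the sketch runs without a live call.

```python
from types import SimpleNamespace


def is_truncated(response):
    """True when the first choice stopped because it hit the token limit."""
    return response.choices[0].finish_reason == "length"


def fake_response(finish_reason):
    """Build a stand-in response (assumption: mirrors the SDK response shape)."""
    return SimpleNamespace(choices=[SimpleNamespace(finish_reason=finish_reason)])


cut_off = fake_response("length")
complete = fake_response("stop")

for name, response in [("cut_off", cut_off), ("complete", complete)]:
    if is_truncated(response):
        print(f"{name}: summary was cut off; consider raising max_tokens.")
    else:
        print(f"{name}: summary finished normally.")
```

Asking for a bounded length in the prompt itself (for example, "in at most three sentences") tends to be a gentler control than a hard token cap, since the model can then finish its last sentence.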


In [23]:


# Long text
long_text = """
Artificial intelligence (AI) is intelligence demonstrated by machines, in contrast to the natural intelligence displayed by humans and animals. Leading AI textbooks define the field as the study of "intelligent agents": any device that perceives its environment and takes actions that maximize its chance of successfully achieving its goals. Colloquially, the term "artificial intelligence" is often used to describe machines (or computers) that mimic "cognitive" functions that humans associate with the human mind, such as "learning" and "problem-solving". As machines become increasingly capable, tasks considered to require "intelligence" are often removed from the definition of AI, a phenomenon known as the AI effect. AI research has been defined as the study of "intelligent agents".
"""

prompt = f"Provide a brief summary of the following text:\n{long_text}"

# Request the summary
response = client.chat.completions.create(
    model=deployment_name,
    messages=[
        {"role": "system", "content": "You are a knowledgeable assistant."},
        {"role": "user", "content": prompt},
    ],
    max_tokens=100,
)

# Show the summary
print(f"Generated summary:\n{response.choices[0].message.content}")


Generated summary:
Artificial intelligence (AI) refers to the intelligence exhibited by machines, differing from the natural intelligence of humans and animals. It is often described as the study of "intelligent agents"—devices that perceive their environments and act to achieve specific goals. The term commonly encompasses machines that emulate cognitive functions typical of human thought, such as learning and problem-solving. As machines advance in capability, tasks previously deemed requiring "intelligence" are removed from AI's definition, a phenomenon known as the AI effect.


## Exercise 5: Generating answers based on specific knowledge
Task:
This exercise will help you generate detailed, useful answers from a specific context or body of knowledge, for example, the history of computing.

Steps:

1. Create a dataset containing basic information about a topic.
2. Generate an answer to a specific question using that context as a reference.

Expected result:

Charles Babbage is known as the father of computing.

Explanation: This exercise demonstrates how OpenAI models can generate useful answers based on specific information provided as context. Students will learn how to use this capability to obtain accurate answers within a given body of knowledge.
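
One way to keep the answer tied to the supplied context is to build the prompt with an explicit instruction about what to do when the context does not contain the answer. A minimal sketch, where `build_grounded_messages` and the fallback wording are illustrative choices rather than anything prescribed by the API:

```python
def build_grounded_messages(context, question):
    """Build chat messages that ask the model to answer only from `context`."""
    system = (
        "Answer using only the information provided. "
        "If the answer is not in the information, reply 'I don't know.'"
    )
    user = (
        f"Information:\n{context.strip()}\n\n"
        f"Question: {question.strip()}\nAnswer:"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]


messages = build_grounded_messages(
    "Charles Babbage is considered the father of computing.",
    "Who is known as the father of computing?",
)
for message in messages:
    print(f"[{message['role']}] {message['content']}\n")
```

The resulting list can be passed as the `messages` argument of `client.chat.completions.create`.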


In [24]:

# Context about the history of computing
context = """
The history of computer science dates back to the early 19th century, with key figures such as Charles Babbage, who is considered the "father of computing" for his invention of the Analytical Engine. In the mid-20th century, computers became more practical, with the development of the first programmable computers and the invention of the transistor. Later, the development of the internet revolutionized the field and led to the rise of modern computing technologies.
"""

# Question
question = "Who is known as the father of computing?"

# Request an answer based on the context
response = client.chat.completions.create(
    model=deployment_name,
    messages=[
        {"role": "system", "content": "You are a knowledgeable assistant."},
        {"role": "user", "content": f"Based on the following information, answer the question:\n{context}\n\nQuestion: {question}\nAnswer:"},
    ],
    max_tokens=100,
)

# Show the answer
print(f"Generated answer:\n{response.choices[0].message.content}")


Generated answer:
Charles Babbage is known as the father of computing.



## Exercise 6: Emotion analysis of a text

In this exercise, the goal is to perform a basic emotion analysis of a text using OpenAI models. The model should identify the main emotion expressed in a text.

Steps:

1. Write several texts that express different emotions.
2. Use OpenAI to classify the main emotion of each text.

Expected result:

Text: I am so happy! Today was the best day of my life.
Emotion: Happiness

Text: I feel really sad and upset after hearing the news.
Emotion: Sadness

Text: The project is going great, I'm feeling confident and motivated.
Emotion: Confidence

Explanation: Students will learn to use OpenAI to perform emotion analysis on texts and understand how models can identify emotions based on the content provided.
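
Free-form replies are hard to compare against a fixed set of emotions. A minimal sketch of one approach: ask for a single label from a closed list, then map the reply onto that list. The label set, `build_emotion_messages`, and `normalize_emotion` are assumptions made for this illustration.

```python
# Illustrative label set; adjust it to the emotions you want to detect.
EMOTION_LABELS = ["Happiness", "Sadness", "Confidence", "Anger", "Fear"]


def build_emotion_messages(text, labels=EMOTION_LABELS):
    """Build chat messages that request exactly one label from `labels`."""
    system = (
        "You classify the main emotion of a text. "
        f"Reply with exactly one word from this list: {', '.join(labels)}."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": f"Text: {text}\nEmotion:"},
    ]


def normalize_emotion(reply, labels=EMOTION_LABELS):
    """Map a model reply to the first known label it mentions, else 'Unknown'."""
    lowered = reply.lower()
    for label in labels:
        if label.lower() in lowered:
            return label
    return "Unknown"


print(normalize_emotion("sadness."))
print(normalize_emotion("The emotion expressed in the text is sadness and upset"))
print(normalize_emotion("overwhelmingly positive"))
```

A reply that mentions none of the labels maps to `"Unknown"`, which makes it easy to spot cases where the prompt or the label set needs adjusting.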


In [25]:

# Example texts expressing emotions
texts = [
    "I am so happy! Today was the best day of my life.",
    "I feel really sad and upset after hearing the news.",
    "The project is going great, I'm feeling confident and motivated.",
]

# Request the emotion analysis
for text in texts:
    response = client.chat.completions.create(
        model=deployment_name,
        messages=[
            {"role": "system", "content": "You are a knowledgeable assistant that analyzes sentiment."},
            {"role": "user", "content": f"Analyze the emotion in the following text:\n{text}\nEmotion:"},
        ],
        max_tokens=10,
    )
    print(f"Text: {text}\nEmotion: {response.choices[0].message.content.strip()}\n")


Text: I am so happy! Today was the best day of my life.
Emotion: The emotion expressed in the text is overwhelmingly positive.

Text: I feel really sad and upset after hearing the news.
Emotion: The emotion expressed in the text is sadness and upset

Text: The project is going great, I'm feeling confident and motivated.
Emotion: The emotions expressed in the text are positive and optimistic

