# M-Shots Learning

In this notebook, we'll explore small prompt engineering techniques and recommendations that will help us elicit responses from the models that are better suited to our needs.

In [1]:
from openai import OpenAI
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

OPENAI_API_KEY  = os.getenv('OPENAI_API_KEY')

# Formatting the answer with Few Shot Samples.

To obtain the model's response in a specific format, we have various options, but one of the most convenient is to use Few-Shot Samples. This involves presenting the model with pairs of user queries and example responses.

Large models like GPT-3.5 respond well to the examples provided, adapting their response to the specified format.

Depending on the number of examples given, this technique can be referred to as:
* Zero-Shot.
* One-Shot.
* Few-Shots.

One shot is usually enough, and it is recommended to use no more than six. Keep in mind that the examples are sent with every query and take up space in the input prompt.
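To get a feel for how much space the shots take up in each request, here is a minimal sketch that estimates their size. The helper name `estimate_shot_overhead` is hypothetical, and the ratio of roughly four characters per token is only a rule of thumb assumed for illustration, not an exact tokenizer count.

```python
def estimate_shot_overhead(shots, chars_per_token=4):
    """Roughly estimate how many tokens the example shots add to every request.

    `shots` is a list of (question, answer) pairs. The default of 4 characters
    per token is an assumed rule of thumb, not an exact tokenizer count.
    """
    total_chars = sum(len(question) + len(answer) for question, answer in shots)
    return total_chars // chars_per_token


shots = [
    ("Who won the 2010 f1 championship?",
     "Driver: Sebastian Vettel.\nTeam: Red Bull Renault."),
    ("Who won the 2009 f1 championship?",
     "Driver: Jenson Button.\nTeam: BrawnGP."),
]
print(estimate_shot_overhead(shots))
```

Because this overhead is paid on every call, adding shots beyond what the format needs makes each request larger without improving the answer.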



In [2]:
# Function to call the model.
def return_OAIResponse(user_message, context):
    client = OpenAI(
        # This is the default and can be omitted
        api_key=OPENAI_API_KEY,
    )

    # Copy the context so the caller's list is not modified.
    newcontext = context.copy()
    newcontext.append({'role':'user', 'content':"question: " + user_message})

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=newcontext,
        temperature=1,
    )

    return response.choices[0].message.content

With this zero-shot prompt we obtain a correct response, but without any formatting, because the model includes whatever information it chooses.

In [3]:
#zero-shot
context_user = [
    {'role':'system', 'content':'You are an expert in F1.'}
]
print(return_OAIResponse("Who won the F1 2010?", context_user))

The 2010 Formula 1 World Championship was won by Sebastian Vettel, driving for Red Bull Racing.


For a model as large and capable as GPT-3.5, a single shot is enough for it to pick up the output format we expect.


In [4]:
#one-shot
context_user = [
    {'role':'system', 'content':
     """You are an expert in F1.

     Who won the 2000 f1 championship?
     Driver: Michael Schumacher.
     Team: Ferrari."""}
]
print(return_OAIResponse("Who won the F1 2011?", context_user))

Driver: Sebastian Vettel.
Team: Red Bull Racing.


Smaller models, or more complicated formats, may require more than one shot. Here is a sample with two shots.

In [5]:
#Few shots
context_user = [
    {'role':'system', 'content':
     """You are an expert in F1.

     Who won the 2010 f1 championship?
     Driver: Sebastian Vettel.
     Team: Red Bull Renault.

     Who won the 2009 f1 championship?
     Driver: Jenson Button.
     Team: BrawnGP."""}
]
print(return_OAIResponse("Who won the F1 2006?", context_user))

The 2006 F1 championship was won by Fernando Alonso, driving for Renault.


In [6]:
print(return_OAIResponse("Who won the F1 2019?", context_user))

The 2019 F1 Championship was won by Lewis Hamilton driving for Mercedes.


We've been creating the prompt without using OpenAI's roles, and as we've seen, it worked correctly.

However, the proper way to do this is by using these roles to construct the prompt, making the model's learning process even more effective.

By passing the examples as a conversation instead of packing the entire prompt into the system message, we let the model learn from a dialogue, which is more realistic for it.

In [7]:
# Recommended solution
context_user = [
    {'role':'system', 'content':'You are an expert in f1.\n\n'},
    {'role':'user', 'content':'Who won the 2010 f1 championship?'},
    {'role':'assistant', 'content':"""Driver: Sebastian Vettel. \nTeam: Red Bull. \nPoints: 256. """},
    {'role':'user', 'content':'Who won the 2009 f1 championship?'},
    {'role':'assistant', 'content':"""Driver: Jenson Button. \nTeam: BrawnGP. \nPoints: 95. """},
]

print(return_OAIResponse("Who won the F1 2019?", context_user))

Driver: Lewis Hamilton. 
Team: Mercedes. 
Points: 413.
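The role-based context above can also be assembled from a list of example pairs. The following is a minimal sketch; the helper name `build_few_shot_context` and its `max_shots` parameter are hypothetical, and the default of six simply mirrors the recommendation given earlier in this notebook.

```python
def build_few_shot_context(system_prompt, shots, max_shots=6):
    """Build a chat `messages` list with each shot as a user/assistant pair.

    `shots` is a list of (question, answer) tuples. Only the first
    `max_shots` pairs are used, to keep the prompt small.
    """
    context = [{'role': 'system', 'content': system_prompt}]
    for question, answer in shots[:max_shots]:
        context.append({'role': 'user', 'content': question})
        context.append({'role': 'assistant', 'content': answer})
    return context


context_user = build_few_shot_context(
    'You are an expert in f1.',
    [
        ('Who won the 2010 f1 championship?',
         'Driver: Sebastian Vettel. \nTeam: Red Bull. \nPoints: 256. '),
        ('Who won the 2009 f1 championship?',
         'Driver: Jenson Button. \nTeam: BrawnGP. \nPoints: 95. '),
    ],
)
```

The resulting list has the same shape as the hand-written `context_user`, so it can be passed to `return_OAIResponse` in the same way.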


We could also address it by using a more conventional prompt, describing what we want and how we want the format.

However, it's essential to understand that in this case the model is following instructions, whereas when we use shots it is learning from the examples in context, at inference time.

In [8]:
context_user = [
    {'role':'system', 'content':"""You are an expert in f1.
    You are going to answer the user's question by giving the name of the driver,
    the name of the team and the points of the champion, following the format:
    Driver:
    Team:
    Points: """
    }
]

print(return_OAIResponse("Who won the F1 2019?", context_user))

Driver: Lewis Hamilton
Team: Mercedes
Points: 413


In [9]:
context_user = [
    {'role':'system', 'content':
     """You are classifying .

     Who won the 2010 f1 championship?
     Driver: Sebastian Vettel.
     Team: Red Bull Renault.

     Who won the 2009 f1 championship?
     Driver: Jenson Button.
     Team: BrawnGP."""}
]
print(return_OAIResponse("Who won the F1 2006?", context_user))

Driver: Fernando Alonso.
Team: Renault.


Few Shots for classification.


In [10]:
context_user = [
    {'role':'system', 'content':
     """You are an expert in reviewing product opinions and classifying them as positive, negative, or neutral.

     It fulfilled its function perfectly, I think the price is fair, I would buy it again.
     Sentiment: Positive

     It didn't work bad, but I wouldn't buy it again, maybe it's a bit expensive for what it does.
     Sentiment: Negative.

     I wouldn't know what to say, my son uses it, but he doesn't love it.
     Sentiment: Neutral
     """}
]
print(return_OAIResponse("I'm not going to return it, but I don't plan to buy it again.", context_user))

Sentiment: Negative
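When the classification result feeds into other code, the label has to be extracted from the text. This is a minimal sketch, assuming the model answers with a `Sentiment: <label>` line as in the examples; the names `parse_sentiment` and `VALID_LABELS` are hypothetical.

```python
import re

# Labels used in the few-shot examples of this notebook.
VALID_LABELS = {"positive", "negative", "neutral"}


def parse_sentiment(response):
    """Extract the label from a 'Sentiment: <label>' line, or return None."""
    match = re.search(r"sentiment\s*:\s*([A-Za-z]+)", response, flags=re.IGNORECASE)
    if match is None:
        return None
    label = match.group(1).lower()
    # Reject labels that were never shown in the examples.
    return label if label in VALID_LABELS else None


print(parse_sentiment("Sentiment: Negative"))  # prints: negative
```

Returning `None` for anything outside the known labels makes it easy to spot responses where the model drifted away from the format.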


# Exercise
 - Complete the prompts similar to what we did in class. 
     - Try at least 3 versions
     - Be creative
 - Write a one page report summarizing your findings.
     - Were there variations that didn't work well? i.e., where GPT either hallucinated or was wrong
 - What did you learn?

## Version 1

Trying IPL cricket in India as the first version.

In [11]:
# Zero Shot
context_user = [
    {'role':'system', 'content':'You are an expert in Cricket.'}
]
print(return_OAIResponse("Who won the IPL 2015?", context_user))

Mumbai Indians won the IPL 2015 by defeating Chennai Super Kings in the final.


In [21]:
#one-shot
context_user = [
    {'role':'system', 'content':
     """You are an expert in IPL.

     Who won the 2011 IPL match?
     Team: Chennai Super Kings.
     Won by: 58 Runs.
     Defeated: Royal Challengers Bangalore."""}
]
print(return_OAIResponse("Who won the IPL 2012?", context_user))

The Kolkata Knight Riders won the IPL 2012 by defeating the Chennai Super Kings.


In [None]:
# Few-shot
context_user = [
    {'role':'system', 'content':'You are an expert in IPL.\n\n'},
    {'role':'user', 'content':'Who won the 2010 IPL Match?'},
    {'role':'assistant', 'content':"""Team: Chennai Super Kings. \nWon by: 22 runs. \nDefeated: Mumbai Indians. """},
    {'role':'user', 'content':'Who won the 2009 IPL Match?'},
    {'role':'assistant', 'content':"""Team: Deccan Chargers. \nWon by: 6 runs. \nDefeated: Royal Challengers Bangalore. """},
]

print(return_OAIResponse("Who won the IPL 2016?", context_user))

Team: Sunrisers Hyderabad 
Won by: 8 runs
Defeated: Royal Challengers Bangalore


# Observations:
* Zero-shot: The response is correct but unstructured; the model answered in a free-form sentence and added information of its own choosing.
* One-shot: The response is correct but not in the expected format. It came back as a free-form sentence and left out the "Won by" field. I learned that one-shot prompting may be insufficient to fix a format.
* Few-shot: The response is correct and follows the expected format. I learned that few-shot prompting improves the model's ability to keep to a format.

Issues and Hallucinations: 
* The one-shot prompt’s failure to include the “Won by” field suggests that GPT-3.5 struggled to generalize the format from a single example. No hallucinations were observed, as all responses were factually correct based on IPL records.

Key Finding: 
* The few-shot approach was more effective for ensuring the desired format, highlighting the need for multiple examples when precise output structure is required.
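Judging whether a response "follows the format" can be automated. Below is a minimal sketch of such a check, assuming the expected format is one `Field: value` line per field, in order; the function name `follows_format` is hypothetical. The two sample strings are the one-shot and few-shot outputs recorded above.

```python
def follows_format(response, fields):
    """Return True if the response has exactly one 'Field: value' line per field, in order."""
    lines = [line.strip() for line in response.strip().splitlines() if line.strip()]
    if len(lines) != len(fields):
        return False
    for line, field in zip(lines, fields):
        name, sep, value = line.partition(":")
        # The line must contain a colon, the expected field name, and a non-empty value.
        if not sep or name.strip().lower() != field.lower() or not value.strip():
            return False
    return True


ipl_fields = ["Team", "Won by", "Defeated"]
one_shot = "The Kolkata Knight Riders won the IPL 2012 by defeating the Chennai Super Kings."
few_shot = "Team: Sunrisers Hyderabad \nWon by: 8 runs\nDefeated: Royal Challengers Bangalore"

print(follows_format(one_shot, ipl_fields))  # prints: False
print(follows_format(few_shot, ipl_fields))  # prints: True
```

This only checks structure. Whether the values themselves are correct still has to be verified against a trusted source.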

## Version 2

Largest cities in Germany

In [22]:
# Zero Shot
context_user = [
    {'role':'system', 'content':'You are an expert in Geography.'}
]
print(return_OAIResponse("Which is the largest city in Germany?", context_user))

The largest city in Germany is Berlin.


In [33]:
#one-shot
context_user = [
    {'role':'system', 'content':
     """You are an expert in Geography.

     Which is the largest city in Germany?
     City: Berlin.
     Land Area: 891.3 km2.
     State: Berlin."""}
]
print(return_OAIResponse("Which is the largest city in Germany?", context_user))

The largest city in Germany is Berlin.


In [34]:
# Recommended solution
context_user = [
    {'role':'system', 'content':'You are an expert in Geography.\n\n'},
    {'role':'user', 'content':'Which is the first largest city in Germany?'},
    {'role':'assistant', 'content':"""City: Berlin. \nLand Area: 891.3 km2. \nState: Berlin. """},
    {'role':'user', 'content':'Which is the second largest city in Germany?'},
    {'role':'assistant', 'content':"""City: Hamburg. \nLand Area: 755.2 km2. \nState: Hamburg. """},
]

print(return_OAIResponse("Which is the third largest city in Germany?", context_user))

City: Munich. 
Land Area: 310.4 km2. 
State: Bavaria.


# Observations:
* Zero-shot: The response is correct and contains no additional information, but it is unstructured.
* One-shot: The response is correct but not in the expected format, as it did not include the "Land Area" and "State" fields. I learned that one-shot prompting may be insufficient to fix a format.
* Few-shot: The response is correct and follows the expected format. I learned that few-shot prompting improves the model's ability to keep to a format. I also tried giving a wrong state for Berlin in the example, and the model still returned the expected output.

Issues and Hallucinations: 
* The one-shot prompt’s failure to include the “Land Area” and “State” fields suggests that GPT-3.5 struggled to generalize the format from a single example. The model’s ability to provide a correct state for Munich in the few-shot case suggests it relied on its internal knowledge rather than strictly following the flawed example. No hallucinations were observed, as all responses were factually correct based on geographical records.

Key Finding: 
* The few-shot approach was more effective for ensuring the desired format, highlighting the need for multiple examples when precise output structure is required.

## Version 3

AI Models by Release Year

In [None]:
# Zero Shot
context_user = [
    {'role':'system', 'content':'You are an expert in artificial intelligence and its history.'}
]
print(return_OAIResponse("When was the Machine learning concept introduced?", context_user))

The concept of machine learning was introduced in the 1950s with the development of the first artificial neural network by Frank Rosenblatt in 1958, called the Perceptron. This marked the beginning of the formal study and research into computational models that can learn from data and improve over time, laying the foundation for modern machine learning algorithms and techniques.


In [30]:
#one-shot
context_user = [
    {'role':'system', 'content':
     """You are an expert in artificial intelligence and its history.

     Provide details about Machine Learning?
     Description: mathematical representations of algorithms trained on data to make predictions or classifications.
     Models: Supervised, unsupervised, Reinforcement learning.
     Algorithms: Regression, Classification."""}
]
print(return_OAIResponse("Provide details about Machine Learning?", context_user))

Machine learning is a domain of artificial intelligence that involves the development of algorithms and statistical models that enable computers to improve their performance on a specific task through the analysis of data, without being explicitly programmed. In other words, machine learning algorithms use statistical techniques to enable machines to learn from and make predictions or decisions based on data.

Key components of machine learning include:

1. Data: Machine learning models are trained on large amounts of data, which are used to identify patterns and relationships that can be used to make predictions or classifications.

2. Algorithms: Machine learning algorithms are the mathematical representations of the models that are trained on the data. These algorithms can be classified into various categories, such as regression algorithms for predicting continuous values and classification algorithms for identifying categories or classes.

3. Models: Machine learning models are th

In [32]:
# Recommended solution
context_user = [
    {'role':'system', 'content':'You are an expert in artificial intelligence and its history.\n\n'},
    {'role':'user', 'content':'Can you provide details about Machine Learning?'},
    {'role':'assistant', 'content':"""Description: mathematical representations of algorithms trained on data to make predictions or classifications. \nModels: Supervised, unsupervised, Reinforcement learning. \nApplications: Image Recognition, Speech Recognition, Social Media Features, Cybersecurity, Medical Diagnosis, NLP. """},
    {'role':'user', 'content':'Can you provide details about Deep Learning?'},
    {'role':'assistant', 'content':"""Description: mathematical representations of algorithms trained on data to make predictions or classifications. \nModels: CNN, RNN, LSTMs, GANs, Transformers, Autoencoders, DBNs, DQNs. \nApplications: Computer Vision, NLP, Speech Recognition, Robotics, Game Playing, Predictive Analytics. """},
]

print(return_OAIResponse("Can you provide details about Machine Learning", context_user))

Machine learning is a branch of artificial intelligence that involves developing algorithms and models that allow computers to learn and make predictions or decisions based on data. Instead of being explicitly programmed, the computer learns from patterns in the data provided to it. There are several types of machine learning approaches including supervised learning, unsupervised learning, and reinforcement learning.

In supervised learning, the model is trained on labeled data, with input-output pairs used for learning. The model learns to map inputs to outputs based on the training data. Regression and classification are common tasks in supervised learning.

Unsupervised learning, on the other hand, involves training the model on unlabeled data and the model is tasked with finding patterns or structure in the data without explicit guidance. Clustering and dimensionality reduction are common unsupervised learning tasks.

Reinforcement learning is a type of machine learning where an ag

# Observations:
* Zero-shot: The prompt asked when the machine learning concept was introduced, and the model gave a detailed historical response (1950s, Perceptron by Frank Rosenblatt). The response was verbose and unformatted, which is consistent with zero-shot behavior.
* One-shot: The example in this version uses a structured format, but the response was a long, unstructured explanation that covered some of the requested components (models, algorithms) and ignored the concise format. I learned that one-shot prompting may be insufficient to fix a format.
* Few-shot: Using examples for machine learning and deep learning, the model gave a detailed but unstructured response for machine learning. It covered description, models, and applications but did not adhere to the example format. The response was accurate but verbose, suggesting that even with multiple examples, the model prioritized its default response style over the specified structure.

Issues and Hallucinations: 
* No hallucinations were observed, as the responses were factually correct. However, the model’s tendency to produce verbose, unstructured text despite examples indicates a limitation in enforcing strict formatting for abstract or broad topics like AI.

Key Finding: 
* Complex or abstract queries may require more explicit instructions or additional examples to achieve the desired format, as the model tends to revert to its natural response style.
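One way to act on this finding is to combine both approaches: state the format explicitly in the system message and also provide the examples as user/assistant pairs. The sketch below only builds such a context; the helper name `build_strict_context` and the wording of the instruction are assumptions, and whether this is enough to make GPT-3.5 keep the format for abstract topics was not tested in this notebook.

```python
def build_strict_context(role_description, fields, shots):
    """Combine an explicit format instruction with user/assistant example pairs."""
    # One 'Field: <value>' template line per expected field.
    template = "\n".join(f"{field}: <value>" for field in fields)
    system_prompt = (
        f"{role_description}\n"
        "Answer using only the following lines, with no extra text:\n"
        f"{template}"
    )
    context = [{'role': 'system', 'content': system_prompt}]
    for question, answer in shots:
        context.append({'role': 'user', 'content': question})
        context.append({'role': 'assistant', 'content': answer})
    return context


context_user = build_strict_context(
    'You are an expert in artificial intelligence and its history.',
    ['Description', 'Models', 'Applications'],
    [
        ('Can you provide details about Deep Learning?',
         'Description: ... \nModels: CNN, RNN, LSTMs. \nApplications: Computer Vision, NLP. '),
    ],
)
print(context_user[0]['content'])
```

The `...` in the example answer is a placeholder to be replaced with a real description before use.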

## Learnings
* Example Quality Matters: Accurate and clear examples are critical for effective few-shot learning. Errors in examples (e.g., Bremen for Berlin) can confuse the model or be ignored, depending on its internal knowledge.
* Few-Shot Superiority: Providing multiple examples significantly improves the model’s ability to adhere to a desired format, especially for structured outputs like sports or geographical data.
* Domain Complexity: Simple, fact-based domains (IPL, cities) respond better to few-shot prompting than abstract topics (AI), where the model may prioritize its default response style.
* Prompt Design: Using conversation-style prompts with clear user-assistant roles (as in the recommended solutions) enhances learning compared to embedding examples in the system prompt. However, explicit instructions may be needed for complex topics to enforce strict formatting.
* Model Limitations: GPT-3.5’s performance in one-shot scenarios suggests that smaller models may require more examples or explicit instructions to achieve consistent formatting, especially for nuanced or complex queries.
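Because the helper in this notebook calls the model with `temperature=1`, a single run per prompt variant can be misleading. The sketch below repeats each variant several times and reports how often the answer matched the expected fields. The names `format_adherence` and `fake_ask` are hypothetical, and `fake_ask` is a stand-in for a real model call such as `return_OAIResponse`, so the numbers it produces say nothing about GPT-3.5.

```python
def format_adherence(ask, question, contexts, fields, runs=3):
    """For each named context, return the fraction of runs whose answer matches the format.

    `ask(question, context)` is any callable that returns the model's text.
    """
    def matches(response):
        lines = [line.strip() for line in response.strip().splitlines() if line.strip()]
        return len(lines) == len(fields) and all(
            line.lower().startswith(field.lower() + ":")
            for line, field in zip(lines, fields)
        )

    return {
        name: sum(matches(ask(question, context)) for _ in range(runs)) / runs
        for name, context in contexts.items()
    }


def fake_ask(question, context):
    # Stand-in for a real model call: structured only when examples are present.
    if len(context) > 1:
        return "City: Munich.\nState: Bavaria."
    return "Munich is in Bavaria."


system = {'role': 'system', 'content': 'You are an expert in Geography.'}
contexts = {
    "zero-shot": [system],
    "few-shot": [
        system,
        {'role': 'user', 'content': 'Which is the largest city in Germany?'},
        {'role': 'assistant', 'content': 'City: Berlin.\nState: Berlin.'},
    ],
}
scores = format_adherence(fake_ask, "Which is the third largest city in Germany?",
                          contexts, ["City", "State"])
print(scores)
```

Replacing `fake_ask` with the real helper turns the qualitative observations above into a small, repeatable measurement.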