# M-Shots Learning

In this notebook, we'll explore small prompt engineering techniques and recommendations that will help us elicit responses from the models that are better suited to our needs.

In [1]:
from openai import OpenAI
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

OPENAI_API_KEY  = os.getenv('OPENAI_API_KEY')

# Formatting the answer with Few Shot Samples.

To obtain the model's response in a specific format, we have various options, but one of the most convenient is to use Few-Shot Samples. This involves presenting the model with pairs of user queries and example responses.

Large models like GPT-3.5 respond well to the examples provided, adapting their response to the specified format.

Depending on the number of examples given, this technique can be referred to as:
* Zero-Shot.
* One-Shot.
* Few-Shots.

One shot is usually enough, and it is recommended to use no more than six. Remember that the examples are sent with every query and take up space in the input prompt.
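To make the cost of extra shots concrete, here is a minimal sketch that builds a prompt with a growing number of examples and reports its size. Character count is only a rough stand-in for tokens (an assumption; a real tokenizer would give exact figures), and `build_prompt`, `SYSTEM` and `SHOTS` are hypothetical names introduced for illustration.

```python
# Rough illustration: every shot is resent with each query, so the prompt grows.
# Characters are used as a crude proxy for tokens (assumption, not a tokenizer).

SYSTEM = "You are an expert in F1."

# Sample question/answer pairs used as shots.
SHOTS = [
    ("Who won the 2010 f1 championship?", "Driver: Sebastian Vettel.\nTeam: Red Bull Renault."),
    ("Who won the 2009 f1 championship?", "Driver: Jenson Button.\nTeam: BrawnGP."),
    ("Who won the 2000 f1 championship?", "Driver: Michael Schumacher.\nTeam: Ferrari."),
]

def build_prompt(n_shots):
    """Return the system prompt followed by the first n_shots examples."""
    parts = [SYSTEM]
    for question, answer in SHOTS[:n_shots]:
        parts.append(f"{question}\n{answer}")
    return "\n\n".join(parts)

# Size of the prompt for zero, one, two and three shots.
sizes = [len(build_prompt(n)) for n in range(len(SHOTS) + 1)]
for n, size in enumerate(sizes):
    print(f"{n} shot(s): {size} characters")
```

Each additional shot adds its full text to every request, which is why it pays to keep the examples short.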



In [2]:
# Function to call the model.
def return_OAIResponse(user_message, context):
    client = OpenAI(
        # This is the default and can be omitted
        api_key=OPENAI_API_KEY,
    )

    # Copy the context so the caller's list is not modified.
    newcontext = context.copy()
    newcontext.append({'role':'user', 'content':"question: " + user_message})

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=newcontext,
        temperature=1,
    )

    return response.choices[0].message.content

In this zero-shot prompt we obtain a correct response, but without any particular format, since the model includes whatever information it chooses.

In [3]:
#zero-shot
context_user = [
    {'role':'system', 'content':'You are an expert in F1.'}
]
print(return_OAIResponse("Who won the F1 2010?", context_user))

Sebastian Vettel won the Formula 1 World Championship in 2010. He was driving for the Red Bull Racing team.


For a model as large and capable as GPT-3.5, a single shot is enough to learn the output format we expect.


In [4]:
#one-shot
context_user = [
    {'role':'system', 'content':
     """You are an expert in F1.

     Who won the 2000 f1 championship?
     Driver: Michael Schumacher.
     Team: Ferrari."""}
]
print(return_OAIResponse("Who won the F1 2011?", context_user))

Driver: Sebastian Vettel. 
Team: Red Bull Racing.


Smaller models, or more complicated formats, may require more than one shot. Here is a sample with two shots.

In [5]:
#Few shots
context_user = [
    {'role':'system', 'content':
     """You are an expert in F1.

     Who won the 2010 f1 championship?
     Driver: Sebastian Vettel.
     Team: Red Bull Renault.

     Who won the 2009 f1 championship?
     Driver: Jenson Button.
     Team: BrawnGP."""}
]
print(return_OAIResponse("Who won the F1 2006?", context_user))

Driver: Fernando Alonso.
Team: Renault.


In [8]:
print(return_OAIResponse("Who won the F1 2019?", context_user))

Driver: Lewis Hamilton.
Team: Mercedes.


We've been creating the prompt without using OpenAI's roles, and as we've seen, it worked correctly.

However, the proper way to do this is by using these roles to construct the prompt, making the model's learning process even more effective.

By not feeding it the entire prompt as if it were a single system command, we enable the model to learn from a conversation, which is more realistic for it.

In [9]:
#Recommended solution
context_user = [
    {'role':'system', 'content':'You are an expert in f1.\n\n'},
    {'role':'user', 'content':'Who won the 2010 f1 championship?'},
    {'role':'assistant', 'content':"""Driver: Sebastian Vettel. \nTeam: Red Bull. \nPoints: 256. """},
    {'role':'user', 'content':'Who won the 2009 f1 championship?'},
    {'role':'assistant', 'content':"""Driver: Jenson Button. \nTeam: BrawnGP. \nPoints: 95. """},
]

print(return_OAIResponse("Who won the F1 2019?", context_user))

Driver: Lewis Hamilton.
Team: Mercedes.
Points: 413.
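When the number of shots grows, writing the message list by hand gets repetitive. The following is a minimal sketch of a helper that turns question/answer pairs into the role-based structure used in the cell above. `build_few_shot_messages` is a hypothetical name introduced here; it is not part of the OpenAI library.

```python
def build_few_shot_messages(system_prompt, examples):
    """Turn (question, answer) pairs into a chat-style message list."""
    messages = [{'role': 'system', 'content': system_prompt}]
    for question, answer in examples:
        # Each shot is one user turn followed by the assistant reply we want imitated.
        messages.append({'role': 'user', 'content': question})
        messages.append({'role': 'assistant', 'content': answer})
    return messages

examples = [
    ("Who won the 2010 f1 championship?", "Driver: Sebastian Vettel. \nTeam: Red Bull. \nPoints: 256. "),
    ("Who won the 2009 f1 championship?", "Driver: Jenson Button. \nTeam: BrawnGP. \nPoints: 95. "),
]
few_shot_context = build_few_shot_messages("You are an expert in f1.", examples)
print(len(few_shot_context))  # 5 messages: 1 system + 2 shots of 2 messages each
```

The resulting list can be passed as the `context` argument of `return_OAIResponse`.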


We could also address it by using a more conventional prompt, describing what we want and how we want the format.

However, it's essential to understand that in this case the model is following instructions, whereas with shots it learns the format from the examples in the prompt at inference time (in-context learning).

In [10]:
context_user = [
    {'role':'system', 'content':"""You are an expert in f1.
    You are going to answer the user's question by giving the name of the driver,
    the name of the team and the points of the champion, following the format:
    Driver:
    Team:
    Points: """
    }
]

print(return_OAIResponse("Who won the F1 2019?", context_user))

Driver: Lewis Hamilton
Team: Mercedes
Points: 413
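One practical reason to insist on a fixed format is that the reply becomes easy to process in code. The sketch below parses `Key: value` lines like the ones above into a dictionary. `parse_fields` is a hypothetical helper, and it assumes each field sits on its own line.

```python
def parse_fields(reply):
    """Parse 'Key: value' lines into a dict, dropping a trailing period."""
    fields = {}
    for line in reply.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that do not follow the 'Key: value' format
        fields[key.strip()] = value.strip().rstrip(".").strip()
    return fields

reply = "Driver: Lewis Hamilton\nTeam: Mercedes\nPoints: 413"
result = parse_fields(reply)
print(result)  # {'Driver': 'Lewis Hamilton', 'Team': 'Mercedes', 'Points': '413'}
```

Values are kept as strings; converting `Points` to a number is left to the caller, since the model may not always return a clean integer.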


In [11]:
context_user = [
    {'role':'system', 'content':
     """You are classifying .

     Who won the 2010 f1 championship?
     Driver: Sebastian Vettel.
     Team: Red Bull Renault.

     Who won the 2009 f1 championship?
     Driver: Jenson Button.
     Team: BrawnGP."""}
]
print(return_OAIResponse("Who won the F1 2006?", context_user))

Driver: Fernando Alonso.
Team: Renault.


# Few Shots for classification.


In [12]:
context_user = [
    {'role':'system', 'content':
     """You are an expert in reviewing product opinions and classifying them as positive, negative or neutral.

     It fulfilled its function perfectly, I think the price is fair, I would buy it again.
     Sentiment: Positive

     It didn't work badly, but I wouldn't buy it again, maybe it's a bit expensive for what it does.
     Sentiment: Negative

     I wouldn't know what to say, my son uses it, but he doesn't love it.
     Sentiment: Neutral
     """}
]
print(return_OAIResponse("I'm not going to return it, but I don't plan to buy it again.", context_user))

Sentiment: Negative
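In a classification setting it helps to check that the reply really is one of the labels shown in the shots. The following is a minimal sketch under that assumption; `extract_sentiment` and `ALLOWED_LABELS` are hypothetical names, and the label set mirrors the three labels used in the examples above.

```python
ALLOWED_LABELS = {"positive", "negative", "neutral"}

def extract_sentiment(reply):
    """Return the normalized label from a 'Sentiment: X' reply, or None."""
    for line in reply.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sentiment":
            label = value.strip().rstrip(".").lower()
            # Reject anything outside the expected label set.
            return label if label in ALLOWED_LABELS else None
    return None

print(extract_sentiment("Sentiment: Negative"))   # negative
print(extract_sentiment("Sentiment: Neutral."))   # neutral
print(extract_sentiment("I think it is fine."))   # None
```

A `None` result signals that the model drifted from the format, which is a cue to add another shot or tighten the system prompt.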


# Exercise
 - Complete the prompts similar to what we did in class. 
     - Try at least 3 versions
     - Be creative
 - Write a one page report summarizing your findings.
     - Were there variations that didn't work well? i.e., where GPT either hallucinated or was wrong
     - What did you learn?

In [13]:
# Zero shot prompt
context_user = [
    {'role':'system', 'content':
     """You are an expert on computer building like Linus Tech Tips and your job is to answer questions for users on an online forum.
     """}
]
print(return_OAIResponse("Should I let my fans run at minimum or automatic or at maximum, whats better for lower temperatures?", context_user))

Running your fans at maximum speed constantly may provide the lowest temperatures, but it may also be quite loud and put unnecessary wear and tear on the fans. 

For optimal cooling with lower noise levels, I would recommend setting your fans to run automatically based on temperature. This way, the fans will increase their speed when the system is under heavy load and generating more heat, and decrease their speed when the system is idle or running cooler. 

You can usually set up fan curves in your computer's BIOS or through third-party software to control how the fans speed up or slow down based on the temperature readings. This way, you can strike a balance between keeping your system cool and keeping noise levels down. 

It's also important to make sure your case has good airflow with intake and exhaust fans positioned properly to efficiently move air through the system.


In [14]:
print(return_OAIResponse("Help! My pc wont boot into windows what i can do to fix it?", context_user))

If your PC is not booting into Windows, there could be a few potential issues causing this problem. Here are several steps you can take to troubleshoot and potentially fix the issue:

1. **Check the BIOS settings**: Make sure that your BIOS is configured to boot from the correct drive where Windows is installed. You may need to enter the BIOS setup by pressing a specific key (usually F2, F10, or DEL) during the boot process. From there, navigate to the boot options and ensure that the correct drive is selected as the primary boot device.

2. **Check the hard drive connections**: Make sure that all the cables connecting your hard drive to the motherboard are securely attached. If you have multiple hard drives, ensure that the one with Windows installed is properly connected.

3. **Check for external devices**: Sometimes, external devices like USB drives or external hard drives can interfere with the boot process. Try disconnecting all external devices except for the essential ones (keyb

In [15]:
# Few shot prompt
context_user = [
    {'role':'system', 'content':
     """You are a Mexican parent who is teaching and creating questions to simulate a math test.
     Question 1: 4+5
     Answer: 9

     Question 2: 10 x 3
     Answer: 30

     Question 3: 8 / 2
     Answer: 4
     """}
]
print(return_OAIResponse("Please come up with 10 questions", context_user))

Certainly! Here are 10 math questions for your test:

Question 1: 7 + 3
Answer: 10

Question 2: 12 - 6
Answer: 6

Question 3: 5 x 4
Answer: 20

Question 4: 18 / 3
Answer: 6

Question 5: 9 + 8
Answer: 17

Question 6: 25 - 13
Answer: 12

Question 7: 6 x 7
Answer: 42

Question 8: 36 / 4
Answer: 9

Question 9: 11 + 15
Answer: 26

Question 10: 50 - 27
Answer: 23

Feel free to use these questions for your math test!
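Since the exercise asks whether the model hallucinated or was wrong, the generated answers can be checked mechanically. This is a minimal sketch that assumes the reply keeps the `Question N: a op b` / `Answer: c` layout shown above, with `x` for multiplication. `find_wrong_answers` is a hypothetical helper.

```python
import re

# Matches e.g. "Question 3: 5 x 4" followed by "Answer: 20" on the next line.
PAIR = re.compile(
    r"Question\s+\d+:\s*(\d+)\s*([+\-x/])\s*(\d+)\s*\n\s*Answer:\s*(-?\d+(?:\.\d+)?)"
)

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "x": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}

def find_wrong_answers(reply):
    """Return (expression, given, expected) for each pair whose answer is wrong."""
    wrong = []
    for a, op, b, given in PAIR.findall(reply):
        expression = f"{a} {op} {b}"
        if op == "/" and int(b) == 0:
            # Division by zero has no valid answer, so flag the question itself.
            wrong.append((expression, given, None))
            continue
        expected = OPS[op](int(a), int(b))
        if abs(expected - float(given)) > 1e-9:
            wrong.append((expression, given, expected))
    return wrong

sample = "Question 1: 7 + 3\nAnswer: 10\n\nQuestion 2: 12 - 6\nAnswer: 6\n\nQuestion 3: 36 / 4\nAnswer: 9"
print(find_wrong_answers(sample))  # []
```

An empty list means every parsed pair is arithmetically correct; questions that do not match the pattern are ignored rather than reported.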


# Specific Variations Tried
Throughout the experimentation process, several variations in the prompt structure and model parameters were tested to refine the SQL query generation. Here's a breakdown of what was tried:

# Initial Prompt:

The first iteration used a basic prompt that outlined the task of generating SQL queries based on a question. It didn't specify critical aspects such as how to handle revenue or cost calculations, nor did it provide instructions on how to handle cases where the required data wasn’t available in the schema.

Outcome: While the model could generate simple queries correctly, more complex requests led to incomplete SQL outputs, particularly in calculations involving revenue or cost.

# Prompt Refinement:

Based on the initial issues, I introduced specific instructions to guide the model:
* Clarified that revenue is calculated as `price * quantity`.
* Added a condition that if the model cannot answer the question with the available schema, it should return ‘I do not know.’

Outcome: After these changes, the model generated more accurate SQL queries for complex questions, such as revenue-based queries or queries that involved joins across multiple tables. This refinement also eliminated cases where the model tried to guess schema elements not present in the database (previously a common issue).
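The two refinements can be sketched as a system prompt builder. The report does not show the actual prompt or schema, so the wording below, the `SCHEMA` tables and the name `build_sql_system_prompt` are all assumptions made for illustration.

```python
# Hypothetical schema, used only for illustration; the real one is not shown in the report.
SCHEMA = """
orders(order_id, customer_id, order_date)
order_items(order_id, product_id, price, quantity)
"""

def build_sql_system_prompt(schema):
    """Assemble a system prompt carrying the revenue rule and the fallback answer."""
    return (
        "You generate SQL queries that answer the user's question.\n"
        "Use only the tables and columns in this schema:\n"
        f"{schema.strip()}\n"
        "Revenue is calculated as price * quantity.\n"
        "If the question cannot be answered with this schema, reply exactly: I do not know."
    )

sql_context = [{'role': 'system', 'content': build_sql_system_prompt(SCHEMA)}]
print(sql_context[0]['content'])
```

Stating the business rule and the fallback explicitly gives the model something concrete to follow instead of guessing at missing columns.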

# Model Parameter Adjustments:

Another variation involved adjusting the temperature parameter to control the model’s creativity and determinism.
Lowering the temperature from 0.7 to 0.5 resulted in more predictable and focused SQL outputs, reducing the likelihood of unnecessary or incorrect additions to the query.
Outcome: The lower temperature resulted in more consistent SQL generation, especially for complex queries, without introducing hallucinations.
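To compare temperatures, the call needs to expose the parameter instead of fixing it. The sketch below assumes an OpenAI-style client object passed in by the caller. `ask` and `EchoClient` are hypothetical names; `EchoClient` is a stand-in that records the request so the example runs without network access or an API key.

```python
from types import SimpleNamespace

def ask(client, messages, temperature=0.5, model="gpt-3.5-turbo"):
    """Send messages through an OpenAI-style client with an explicit temperature."""
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
    )
    return response.choices[0].message.content

class EchoClient:
    """Stand-in for a real client: records each call and returns a canned reply."""
    def __init__(self):
        self.calls = []
        self.chat = SimpleNamespace(completions=SimpleNamespace(create=self._create))

    def _create(self, **kwargs):
        self.calls.append(kwargs)
        message = SimpleNamespace(content="SELECT 1;")
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])

client = EchoClient()
print(ask(client, [{'role': 'user', 'content': 'question: ping'}], temperature=0.5))
print(client.calls[0]['temperature'])  # 0.5
```

With a real `OpenAI` client in place of `EchoClient`, the same function can be called with 0.7 and 0.5 on identical prompts to observe the difference in consistency.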

# What Worked:
* Detailed Prompt Instructions: Providing clear instructions about how to calculate revenue and cost, as well as specifying what to do when the schema didn’t cover the required data, significantly improved the model’s performance. The model began generating more accurate queries that were well-aligned with the database schema.
* Temperature Adjustment: Lowering the temperature brought more reliable results and reduced creativity in scenarios where precision was needed, such as SQL generation. The model became less prone to speculative outputs.

# What Didn't Work:
* Initial Prompt: The initial, less-detailed prompt led to problems when the model had to answer questions involving calculations or complex joins. Without specific instructions on revenue or cost, the generated queries were often incorrect or incomplete.
* Ambiguous Queries: When the user’s question was ambiguous or vague, the model struggled, initially attempting to answer questions using schema elements that didn’t exist. The added instruction to return ‘I do not know’ resolved this issue.

# Unexpected Model Behavior
* Absence of Hallucinations: Interestingly, there were no hallucinations after refining the prompt and adjusting the model parameters. This was unexpected, as hallucinations were initially a common issue. The combination of prompt refinement and parameter tuning seems to have effectively mitigated this problem.
* Occasional Over-Exactness: In a few cases, the model returned 'I do not know' for questions where it could have attempted to infer a possible join or relationship. While this is safer than hallucinating, it suggests that there might be room for further fine-tuning to strike a balance between avoiding hallucinations and making reasonable assumptions based on the schema.
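
One way to keep an eye on over-exactness is to measure how often the fallback answer appears across a set of test questions. This is a minimal sketch: `is_fallback` and `fallback_rate` are hypothetical helpers, and the sample replies are invented for illustration, not taken from the experiments.

```python
FALLBACK = "i do not know"

def is_fallback(reply):
    """True when the reply is the agreed fallback rather than a SQL query."""
    # Strip surrounding whitespace, quotes and a trailing period before comparing.
    normalized = reply.strip().strip("'\"‘’.").lower()
    return normalized == FALLBACK

def fallback_rate(replies):
    """Fraction of replies that are the fallback answer."""
    if not replies:
        return 0.0
    return sum(is_fallback(r) for r in replies) / len(replies)

# Hypothetical replies, invented for illustration only.
replies = [
    "SELECT SUM(price * quantity) FROM order_items;",
    "I do not know.",
    "‘I do not know’",
]
print(fallback_rate(replies))
```

Tracking this rate before and after a prompt change shows whether a relaxed instruction reduces unnecessary refusals without bringing the hallucinations back.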