<a href="https://colab.research.google.com/github/kayasmus/aws-bootcamp-cruddur-2023/blob/main/prompting_slms.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🧪 Prompting an SLM - Small Language Model

⚠️ **Run this notebook on *Colab*, and make sure to enable *GPU* acceleration.**

## 🧠 Goal
Learn how to write effective prompts for a **small, locally run language model** (*Phi-2*) by completing tasks that require **refining your instructions**.

---

## Φ What is Phi-2?

[Phi-2](https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/) is a small, efficient language model developed by Microsoft with 2.7 billion parameters. Despite its size, it performs surprisingly well on reasoning and academic tasks, making it ideal for local use, experimentation, and learning prompt engineering.

**Architecture**: Phi-2 uses a transformer decoder-only architecture, similar to GPT-style models, optimized for efficiency and small-scale deployment.

**Training**: It was trained on a high-quality, carefully curated dataset of synthetic and filtered web data focused on educational and reasoning tasks, without using any private or proprietary data.

Phi-2 (as well as the older and newer Phi-x models) is available from Hugging Face.

To run Phi-2 for inference you need a GPU with more than 6 GB of VRAM. It's possible to run it on a CPU (provided you have enough memory), but it is prohibitively slow. This is why we run this challenge on Colab.
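
As a rough sanity check on that memory figure, the sketch below estimates the size of the weights alone. It assumes exactly 2.7 billion parameters and 2 bytes per parameter for 16-bit floats (4 bytes for 32-bit); activations and framework overhead come on top and are not modeled here.

```python
# Back-of-the-envelope estimate of the memory needed for the model weights only.
# Assumptions for illustration: exactly 2.7e9 parameters, 2 bytes per parameter
# in float16 and 4 bytes per parameter in float32.

def weight_memory_gb(n_params: float, bytes_per_param: int) -> float:
    """Return the weight size in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

N_PARAMS = 2.7e9

fp16_gb = weight_memory_gb(N_PARAMS, 2)
fp32_gb = weight_memory_gb(N_PARAMS, 4)

print(f"float16 weights: {fp16_gb:.1f} GB")
print(f"float32 weights: {fp32_gb:.1f} GB")
```

Under these assumptions the half-precision weights take about 5.4 GB, which is in line with needing a bit more than 6 GB of VRAM once overhead is included.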

---

## 🧰 Setup Instructions: Running Phi-2 with `pipeline`

You will run **Microsoft's Phi-2 model (2.7B parameters)** through Hugging Face's `pipeline` interface. This is easier and cleaner than handling tokenization manually.

### Step 1: Install Required Packages

In [None]:
# Uncomment if running locally
# !pip install transformers accelerate torch

### Step 2: Load Phi-2 with `pipeline`

In [1]:
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="microsoft/phi-2",
    torch_dtype="auto",
    device_map="auto"
)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


`torch_dtype` is deprecated! Use `dtype` instead!


Device set to use cuda:0


### Step 3: Generate a Response

In [2]:
prompt = "What causes the seasons?"
response = generator(prompt, max_new_tokens=100)[0]["generated_text"]

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In [3]:
# Let's display the response with Markdown instead of print because it formats the text nicely
from IPython.display import Markdown
Markdown(response)

What causes the seasons?
Answer: The seasons are caused by the tilt of the Earth's axis. When the Earth is tilted towards the sun, the Northern Hemisphere experiences summer, while the Southern Hemisphere experiences winter. When the Earth is tilted away from the sun, the Northern Hemisphere experiences winter, while the Southern Hemisphere experiences summer.

Question 4:
How does the moon affect the tides?
Answer: The moon's gravity pulls on the Earth's oceans, causing them to rise and fall. This is known as

Do you see how the response repeats our prompt? Phi-2 is a decoder-only model, so the generated text it returns is the whole sequence: the prompt followed by the continuation.

Let's make a utility function to nicely print our prompts and responses. 👉 Run the cell below:

In [5]:
def show_results(prompt, response):
    """Display the prompt and response in a formatted way.
    Excludes the prompt in the response to avoid repetition."""
    display(Markdown(
        "### Prompt:\n"
        + prompt
        + "\n### Response:\n"
        + response[len(prompt):]
        + "\n\n---"
    ))
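
Slicing with `response[len(prompt):]` silently assumes that the response really starts with the prompt. The sketch below is a slightly stricter, hypothetical helper (`strip_prompt` is not part of the notebook) that raises an error when a prompt and a response from different runs get mixed up.

```python
def strip_prompt(prompt: str, generated_text: str) -> str:
    """Return only the continuation, after checking the text starts with the prompt."""
    if not generated_text.startswith(prompt):
        raise ValueError(
            "generated_text does not start with prompt; "
            "was it generated from a different prompt?"
        )
    return generated_text[len(prompt):]

# Example with a hard-coded string instead of a real model call
example_prompt = "What causes the seasons?"
example_output = example_prompt + "\nAnswer: The tilt of the Earth's axis."
print(strip_prompt(example_prompt, example_output))
```

Whether to raise or to fall back to the full text is a design choice; raising makes copy-paste mistakes visible early.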

---

## 🧩 Your Prompting Tasks

Follow the instructions below for each task:

👉 Write an initial prompt

👉 Run it through Phi-2 (you might have to play with the `max_new_tokens` parameter)

👉 Refine the prompt

👉 Compare results

👉 Only then look at the solution

---

### 📝 Task 1: Basic Question Answering

In [6]:
# Step 1: Try an initial prompt
prompt = "What causes the seasons?"
response = generator(prompt, max_new_tokens=100)[0]["generated_text"]
show_results(prompt, response)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


### Prompt:
What causes the seasons?
### Response:

Answer: The seasons are caused by the Earth's tilt on its axis and its orbit around the sun. As the Earth orbits the sun, different parts of the planet receive varying amounts of sunlight, resulting in the four seasons.

Exercise 3:
What causes a rainbow to appear?
Answer: A rainbow appears when sunlight is refracted, or bent, as it passes through raindrops in the air. The different colors of light are separated and form a circular arc in the sky

---

That doesn't look amazing: the model just continues generating text. It has learned to generate the next token from its training data, and just does that here. As long as it doesn't generate an *end-of-sequence* token (or reach the `max_new_tokens` limit), it will continue.

The model hasn't been fine-tuned with RLHF (Reinforcement Learning from Human Feedback), unlike models such as GPT-3.5-Turbo, so we have to be more structured in how we prompt.
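
Besides better prompting, the run-on text can also be trimmed after generation. The sketch below is one possible post-processing step, not something the notebook prescribes: it cuts the continuation at the first of a few marker strings. The markers are assumptions picked by looking at the outputs above.

```python
def truncate_at_markers(continuation: str, markers: list[str]) -> str:
    """Cut the continuation at the earliest marker that appears in it."""
    cut = len(continuation)
    for marker in markers:
        position = continuation.find(marker)
        if position != -1:
            cut = min(cut, position)
    return continuation[:cut].strip()

# Hypothetical markers, chosen from the run-on outputs seen so far
STOP_MARKERS = ["\nExercise", "\nQuestion"]

continuation = (
    "\nAnswer: The seasons are caused by the Earth's tilt on its axis.\n\n"
    "Exercise 3:\nWhat causes a rainbow to appear?"
)
print(truncate_at_markers(continuation, STOP_MARKERS))
```

This only hides the extra text; the model still spends time generating it, so a well-shaped prompt remains the better fix.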

👉 Let's try something else:

In [7]:
# Step 2: Improve the prompt
prompt2 = "Explain in simple terms: What causes the seasons?"
response2 = generator(prompt2, max_new_tokens=100)[0]["generated_text"]
show_results(prompt2, response2)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.

That probably didn't change much, so we'll need a more specific prompt.

In cases like this, you want to prompt your model in a way that is similar to how it was trained.

👉 Think about how this model could have been fed with training data for a Question-Answering task. It would have been given a question and an answer. Knowing that, try a new prompt.

In [12]:
# Step 3: Improve the prompt
prompt3 = "Instruct: What causes the seasons?\nOutput:"
response3 = generator(prompt3, max_new_tokens=100)[0]["generated_text"]
show_results(prompt3, response3)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.

Tired of guessing how it should work? 👉 Go find the documentation of Microsoft's Phi-2 model on Hugging Face, and see if you can find its preferred prompt format for QA.

<details>
  <summary>💡 Solution</summary>
  
  You can find the model's documentation [here](https://huggingface.co/microsoft/phi-2).
  
  It turns out the prompt should be formatted this way:

  ```text
  Instruct: This is where your question goes.
  Output:
  ```

  To code this multiline string in Python:
  ```python
  # Option 1: with \n to start a new line:
  prompt = "Instruct: This is where your question goes.\nOutput:"
  # Option 2: with a multiline string
  prompt = """Instruct: This is where your question goes.
  Output:"""
  # Careful with this second option: don't add extra newlines or spaces at the start of the second line, it confuses the model.
  ```

  Pro tip: use an f-string to create the full prompt starting from the question.
</details>
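
The f-string tip from the solution could look like the sketch below. The helper name `build_qa_prompt` is made up for this example; only the `Instruct:` / `Output:` layout comes from the solution above.

```python
def build_qa_prompt(question: str) -> str:
    """Wrap a question in the Instruct/Output layout."""
    # strip() avoids stray spaces or newlines around the question
    return f"Instruct: {question.strip()}\nOutput:"

qa_prompt = build_qa_prompt("What causes the seasons?")
print(qa_prompt)
```

The resulting string can be passed to `generator(...)` exactly like the hand-written prompts.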

---
### 📝 Task 2: Summarization

Let's try to summarize some text. This is from the Wikipedia page on Transformers:

In [13]:
text = """
Transformers is a media franchise produced by Japanese toy company Takara Tomy and American toy company Hasbro. It primarily follows the heroic Autobots and the villainous Decepticons, two alien robot factions at war that can transform into other forms, such as vehicles and animals. The franchise encompasses toys, animation, comic books, video games and films. As of 2011, it generated more than ¥2 trillion ($25 billion) in revenue,[1] making it one of the highest-grossing media franchises of all time.

The franchise began in 1984 with the Transformers toy line, comprising transforming mecha toys from Takara's Diaclone and Micro Change toylines rebranded for Western markets.[2] The term "Generation 1" (G1) covers both the animated television series The Transformers and the comic book series of the same name, which are further divided into Japanese, British and Canadian spin-offs. Sequels followed, such as the Generation 2 comic book and Beast Wars TV series, which became its own mini-universe. Generation 1 characters have been rebooted multiple times in the 21st century in comics from Dreamwave Productions (starting 2001), IDW Publishing (starting in 2005 and again in 2019), and Skybound Entertainment (beginning in 2023). There have been other incarnations of the story based on different toy lines during and after the 20th century. The first was the Robots in Disguise series, followed by three shows (Armada, Energon, and Cybertron) that constitute a single universe called the "Unicron Trilogy".
"""
Markdown(text)


Transformers is a media franchise produced by Japanese toy company Takara Tomy and American toy company Hasbro. It primarily follows the heroic Autobots and the villainous Decepticons, two alien robot factions at war that can transform into other forms, such as vehicles and animals. The franchise encompasses toys, animation, comic books, video games and films. As of 2011, it generated more than ¥2 trillion ($25 billion) in revenue,[1] making it one of the highest-grossing media franchises of all time.

The franchise began in 1984 with the Transformers toy line, comprising transforming mecha toys from Takara's Diaclone and Micro Change toylines rebranded for Western markets.[2] The term "Generation 1" (G1) covers both the animated television series The Transformers and the comic book series of the same name, which are further divided into Japanese, British and Canadian spin-offs. Sequels followed, such as the Generation 2 comic book and Beast Wars TV series, which became its own mini-universe. Generation 1 characters have been rebooted multiple times in the 21st century in comics from Dreamwave Productions (starting 2001), IDW Publishing (starting in 2005 and again in 2019), and Skybound Entertainment (beginning in 2023). There have been other incarnations of the story based on different toy lines during and after the 20th century. The first was the Robots in Disguise series, followed by three shows (Armada, Energon, and Cybertron) that constitute a single universe called the "Unicron Trilogy".


👉 Try to prompt the model to get a short summary. Make sure that it's not just short because of your `max_new_tokens` setting.

In [16]:
prompt = f"Input: Here goes your text. TLDR:: {text}\nSummary: "
response = generator(prompt, max_new_tokens=200)[0]["generated_text"]
show_results(prompt, response)

You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


### Prompt:
Input: Here goes your text. TLDR:: 
Transformers is a media franchise produced by Japanese toy company Takara Tomy and American toy company Hasbro. It primarily follows the heroic Autobots and the villainous Decepticons, two alien robot factions at war that can transform into other forms, such as vehicles and animals. The franchise encompasses toys, animation, comic books, video games and films. As of 2011, it generated more than ¥2 trillion ($25 billion) in revenue,[1] making it one of the highest-grossing media franchises of all time.

The franchise began in 1984 with the Transformers toy line, comprising transforming mecha toys from Takara's Diaclone and Micro Change toylines rebranded for Western markets.[2] The term "Generation 1" (G1) covers both the animated television series The Transformers and the comic book series of the same name, which are further divided into Japanese, British and Canadian spin-offs. Sequels followed, such as the Generation 2 comic book and Beast Wars TV series, which became its own mini-universe. Generation 1 characters have been rebooted multiple times in the 21st century in comics from Dreamwave Productions (starting 2001), IDW Publishing (starting in 2005 and again in 2019), and Skybound Entertainment (beginning in 2023). There have been other incarnations of the story based on different toy lines during and after the 20th century. The first was the Robots in Disguise series, followed by three shows (Armada, Energon, and Cybertron) that constitute a single universe called the "Unicron Trilogy".

Summary: 
### Response:


Transformers is a media franchise that features robots that can transform into vehicles and animals. It started with toys in 1984 and has since expanded to include animation, comic books, video games, and films. The franchise has been rebooted many times and has its own mini-universe.


---

<details>
  <summary>üí° Not sure where to start?</summary>
  
  You could start from this:

  ```text
  Summarize this: Here goes the text to summarize
  ```

  Try to get the model to generate a shorter summary.

</details>

<details>
  <summary>üí° Solution</summary>
  
  To get a short summary, this seems to give good results:

  ```text
  Summarize this in one phrase: Here goes your text.
  ```

  But that's probably not how the model was trained.

  The prompt below seems to generate a nicely balanced result: not too short, but also not endless:

  ```text
  Input: Here goes your text.
  Summary:
  ```

  Or something as short as this:

  ```text
  Here goes your text. TLDR:
  ```

  This probably works because the model has seen quite a few examples with TLDR (Too Long; Didn't Read) in its corpus.
  
</details>
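
To compare the prompt shapes from the solution side by side, they can be kept as named templates. This is only a sketch: the dictionary keys and the helper `build_summary_prompt` are invented for the example, and the sample sentence is a shortened stand-in for the full text.

```python
# The three prompt shapes from the solution, as named templates.
# The keys are made up for this example.
SUMMARY_TEMPLATES = {
    "instruction": "Summarize this in one phrase: {text}",
    "input_summary": "Input: {text}\nSummary:",
    "tldr": "{text} TLDR:",
}

def build_summary_prompt(text: str, style: str) -> str:
    """Fill the chosen template, trimming stray whitespace around the text."""
    return SUMMARY_TEMPLATES[style].format(text=text.strip())

sample = "\nTransformers is a media franchise produced by Takara Tomy and Hasbro.\n"
for style in SUMMARY_TEMPLATES:
    print(f"--- {style} ---")
    print(build_summary_prompt(sample, style))
```

Trimming the text matters here because a triple-quoted string like `text` starts and ends with a newline, which would otherwise end up inside the prompt.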

---
### 📝 Task 3: Step-by-Step Reasoning

We can also ask the model to solve questions.

👉 Try the prompt below:

In [17]:
prompt = "If Alice has 3 apples and buys 2 more, then gives one away, how many does she have left?"
response = generator(prompt, max_new_tokens=60)[0]["generated_text"]
print(response)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


If Alice has 3 apples and buys 2 more, then gives one away, how many does she have left?
    """
    apples = 3
    apples += 2
    apples -= 1
    result = apples

    return result



Is this what you expected?

No, it looks like the model is generating code. Not what we want (at least not here, stay tuned).

👉 Try to improve the prompt to get the actual result. You'll notice that prompting it the same way as a huge model like GPT-4 won't work here. We need to ask it for step-by-step reasoning, and then we'll hopefully find the right answer in the output.

In [21]:
prompt2 = "Input: Calculate: Alice starts with 3 apples. She gains 2 more. Alice loses 1 apple. What is the total amount of apples remaining? \nOutput:"
response2 = generator(prompt2, max_new_tokens=60)[0]["generated_text"]
print(response2)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Input: Calculate: Alice starts with 3 apples. She gains 2 more. Alice loses 1 apple. What is the total amount of apples remaining? 
Output: Alice starts with 3 apples, gains 2 more, so 3 + 2 = 5 apples. After losing 1 apple, she has 5 - 1 = 4 apples remaining.



<details>
  <summary>üí° Solution</summary>
  
  To get a short summary, this seems to give good results:

  ```text
  If Alice has 3 apples and buys 2 more, then gives one away, how many does she have left? Solve step by step.
  ```

</details>

In [24]:
prompt2 = "Calculate: Alice starts with 3 apples. She gains 2 more. Alice loses 1 apple. What is the total amount of apples remaining? Solve step by step. How many apples does she have left? \nOutput:"
response2 = generator(prompt2, max_new_tokens=60)[0]["generated_text"]
print(response2)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Calculate: Alice starts with 3 apples. She gains 2 more. Alice loses 1 apple. What is the total amount of apples remaining? Solve step by step. How many apples does she have left? 
Output: Step 1: Add the apples Alice gains to the original number of apples she has.
3 + 2 = 5
Step 2: Subtract the number of apples Alice loses from the total number of apples from step 1.
5 - 1 = 4
Alice has 4 apples left.



That's overly verbose now. Can you think of other ways to prompt the model?

👉 Have a look at the documentation again.

<details>
  <summary>💡 Solution</summary>
  
  In the documentation we can read that the model responds best to QA-style or chat-style prompts.

  Try to prompt it that way. We won't have the step-by-step approach anymore, but we might get to the actual answer faster.
</details>

In [25]:
prompt2 = "Bob: I don't know why, but Alice starts with 3 apples. She gains 2 more. Alice loses 1 apple. Greg, do you know many apples does she have left? \nOutput: Greg"
response2 = generator(prompt2, max_new_tokens=60)[0]["generated_text"]
print(response2)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Bob: I don't know why, but Alice starts with 3 apples. She gains 2 more. Alice loses 1 apple. Greg, do you know many apples does she have left? 
Output: Greg: She has a total of 3 + 2 - 1 = 4 apples left.



👉 Try to use the chat style:

In [None]:
# YOUR CODE HERE

👉 And the QA style:

In [26]:
prompt2 = "Calculate the following Alice starts with 3 apples. She gains 2 more. Alice loses 1 apple.\nOutput:"
response2 = generator(prompt2, max_new_tokens=60)[0]["generated_text"]
print(response2)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Calculate the following Alice starts with 3 apples. She gains 2 more. Alice loses 1 apple.
Output: 3 + 2 - 1 = 4



<details>
  <summary>üí° Solution</summary>
  
  Chat style:
  ```text
  Teacher: Here goes the question.
  Student:
  ```

  QA style:
  ```text
  Instruct: Here goes the question.
  Output:
  ```
</details>
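
The chat style from the solution can be generated with a small helper. This is a sketch under assumptions: the function name and its default speaker names are illustrative, and other name pairs (such as `Bob` and `Greg`) fit the same layout.

```python
def build_chat_prompt(question: str, asker: str = "Teacher", responder: str = "Student") -> str:
    """Format a question as a two-speaker dialogue that ends where the answer should start."""
    return f"{asker}: {question.strip()}\n{responder}:"

apples = (
    "Alice starts with 3 apples. She gains 2 more. Alice loses 1 apple. "
    "How many apples does she have left?"
)
chat_prompt = build_chat_prompt(apples)
print(chat_prompt)
```

Ending the prompt with the responder's name and a colon nudges the model to continue with that speaker's answer.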

---
### 📝 Task 4: Classification

Let's try to rate some movie reviews.

👉 Download the [IMDB Dataset from Kaggle](https://www.kaggle.com/datasets/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews?select=IMDB+Dataset.csv), and upload it to Colab. Then run the cell below to load the data.

In [29]:
import pandas as pd
reviews = pd.read_csv("./IMDB Dataset.csv", sep=",")['review']

In [30]:
review = reviews[0]  # Play with this index to test with different reviews
prompt = f"Classify the sentiment of this review as Positive, Neutral, or Negative: '{review}'"
response = generator(prompt, max_new_tokens=40)[0]["generated_text"]
show_results(prompt, response)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


### Prompt:
Classify the sentiment of this review as Positive, Neutral, or Negative: 'One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me.<br /><br />The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out for a nickel, inmates who'll kill on order and get away with it, well mannered, middle class inmates being turned into prison bitches due to their lack of street skills or prison experience) Watching Oz, you may become comfortable with what is uncomfortable viewing....thats if you can get in touch with your darker side.'
### Response:
<br /><br />'''I have to say that after the first few episodes, I really struggled to watch Oz. I have seen other hard-edged shows but Oz took it to a

---

Not so great, right? Remember: this is a generative model, so it generates the next tokens. We'll have to be a bit smarter in our prompting.

👉 Try to improve the prompt yourself first. Can you get the model to only output the sentiment and nothing else?

In [36]:
review = reviews[0]  # Play with this index to test with different reviews
prompt = f"Classify the sentiment of this review as Positive, Neutral, or Negative: Here goes the review. Sentiment::'{review}'"
response = generator(prompt, max_new_tokens=40)[0]["generated_text"]
show_results(prompt, response)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


### Prompt:
Classify the sentiment of this review as Positive, Neutral, or Negative: Here goes the review. Sentiment::'One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me.<br /><br />The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out for a nickel, inmates who'll kill on order and get away with it, well mannered, middle class inmates being turned into prison bitches due to their lack of street skills or prison experience) Watching Oz, you may become comfortable with what is uncomfortable viewing....thats if you can get in touch with your darker side.'
### Response:

A: The sentiment of this review is Negative.


---

<details>
  <summary>üí° Solution</summary>
  
  Just adding `Sentiment:` at the end does wonders:
  ```text
  Classify the sentiment of this review as Positive, Neutral, or Negative:

  Here goes the review.

  Sentiment:
  ```

  Probably because the model has seen data in this format.

</details>
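
The layout from the solution can be built and parsed with two small helpers. This is a sketch, not the notebook's official approach: both function names are hypothetical, replacing the `<br />` tags is an extra cleaning step assumed to help, and the parser simply returns whichever label appears first in the model's continuation.

```python
import re

LABELS = ("Positive", "Neutral", "Negative")

def build_sentiment_prompt(review: str) -> str:
    """Build the classification prompt, replacing HTML line breaks with spaces."""
    cleaned = re.sub(r"<br\s*/?>", " ", review).strip()
    return (
        "Classify the sentiment of this review as Positive, Neutral, or Negative:\n\n"
        f"{cleaned}\n\n"
        "Sentiment:"
    )

def parse_sentiment(continuation: str):
    """Return the label that appears first in the continuation, or None if there is none."""
    lowered = continuation.lower()
    hits = [(lowered.find(label.lower()), label) for label in LABELS]
    hits = [(position, label) for position, label in hits if position != -1]
    return min(hits)[1] if hits else None

print(build_sentiment_prompt("Great show.<br /><br />Loved it."))
print(parse_sentiment(" Positive\n\nReview: another review follows"))
```

Note that this parser is naive: a continuation such as "not positive" would still be read as `Positive`, so spot-check the raw outputs as well.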

---
### 📝 Task 5: Code generation

When you read the documentation, you might have seen that Phi-2 can also generate code.

👉 Let's give it a try: this is a generative model, so we give it the start of the code and it will generate the rest.

In [37]:
code_start = '''
def get_weather(location, fahrenheit=False):
'''
response = generator(code_start, max_new_tokens=200)[0]["generated_text"]
print(response)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.



def get_weather(location, fahrenheit=False):
    """
    Gets the weather for a given location.

    Parameters:
    - location (str): The location to get the weather for.
    - fahrenheit (bool): Whether or not to return the temperature in Fahrenheit.

    Returns:
    - str: The weather for the given location.
    """
    # This is a placeholder function that would actually use an API to get weather data.
    return f"The weather in {location} is sunny."




Not bad, given the limited information we gave. How could we do better? What could we add after the function declaration to give the model more to work with?

👉 Try to improve your prompt.

In [39]:
code_start = '''
def get_weather(location, fahrenheit=False):
  return code
'''
response = generator(code_start, max_new_tokens=200)[0]["generated_text"]
print(response)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.



def get_weather(location, fahrenheit=False):
  return code

def get_weather(location, fahrenheit=False):
    response = requests.get('http://api.openweathermap.org/data/2.5/weather',
                            params={'q': location, 'appid': your_api_key})
    return json.loads(response.text)

def get_weather(location, fahrenheit=False):
    response = requests.get('http://api.openweathermap.org/data/2.5/weather',
                            params={'q': location, 'appid': your_api_key})
    return json.loads(response.text)

def get_weather(location, fahrenheit=False):
    response = requests.get('http://api.openweathermap.org/data/2.5/weather',
                            params={'q': location, 'appid': your_api_key


<details>
  <summary>üí° Solution</summary>
  
  The docstring: it describes what the function is supposed to do. That would serve as a great instruction for the model.

  Add a docstring, tell the model to use the `Open Weather API`, and clarify what it should with the fahrenheit parameter.

</details>
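
One way to follow the solution's advice is sketched below. The docstring wording, including the return type, is only a suggestion and not taken from any API documentation; adjust it to whatever you want the function to do.

```python
# A possible prompt for this task: the function signature plus a docstring
# that tells the model which API to use and what `fahrenheit` means.
code_start = '''
def get_weather(location, fahrenheit=False):
    """
    Get the current weather for a location using the Open Weather API.

    Parameters:
    - location (str): Name of the city to look up.
    - fahrenheit (bool): If True, return the temperature in Fahrenheit,
      otherwise in Celsius.

    Returns:
    - dict: The parsed JSON response from the API.
    """
'''
print(code_start)

# In the notebook, this would then be passed to the model as before:
# response = generator(code_start, max_new_tokens=200)[0]["generated_text"]
```

Because the prompt ends right after the docstring, the most natural continuation for the model is the function body.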

👉 Investigate the code. Does it look correct to you? Compare it against [the Open Weather API documentation](https://openweathermap.org/current).

<details>
  <summary>💡 Solution</summary>
  
  The code probably seems ok. Most likely you will see that it used the built-in geocoding functionality of the API's `current` endpoint. When you read the docs you will see that this functionality is deprecated and it is recommended not to use it anymore.

  The more specialised your code becomes, the less likely it is that you will get good results.

</details>

You always have to check code generated by LLMs, and especially code from an SLM: it was trained on far less data.
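
A very first automated check is whether the generated text parses as Python at all, which catches outputs that were cut off by `max_new_tokens`. The sketch below uses the standard library's `ast` module; the helper name and the two sample strings are made up for illustration. A syntax check says nothing about whether the code is correct, so it complements reading the code rather than replacing it.

```python
import ast

def parses_as_python(source: str) -> bool:
    """Return True if the source is syntactically valid Python."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True

# A complete function and one that stops mid-call, like a truncated generation
complete = 'def get_weather(location):\n    return f"The weather in {location} is sunny."\n'
truncated = (
    "def get_weather(location):\n"
    "    response = requests.get('http://example.com',\n"
    "        params={'q': location"
)

print(parses_as_python(complete))   # True
print(parses_as_python(truncated))  # False
```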

👉 Head to the documentation on Hugging Face to read more about [Phi-2's limitations](https://huggingface.co/microsoft/phi-2#limitations-of-phi-2) for code generation.

---

🏁 Congratulations! You now know how to prompt a locally running generative small language model for different use cases.

