# Introduction

**Prompt**: A prompt is the input text given to the model to generate a response: the question, command, or instruction that you provide to guide the model's output.

## Elements of a Prompt

* Instruction: The main text of the prompt, which tells the model what to do.
* Context: External information that the model should consider to generate a more relevant and accurate response.
* Input: The question or input that the model must answer or respond to.
* Response Format: The structure that the model must follow when generating a response. The format is typically specified as a JSON object or a Pydantic model.

Source: [Prompt Engineering Guide](https://www.promptingguide.ai/introduction/elements)

### Example

In [2]:
prompt_template = """
You are a helpful assistant that provides accurate and concise answers to questions based on current news articles.

### Context:
{context}

### Input:
{input}

### Response Format:
{response_format}
"""

context = "Yesterday (on 12/01/2011), the stock market crashed due to the ongoing trade war between the US and China. The Dow Jones Industrial Average dropped by 800 points, the worst single-day drop of the year. Investors are concerned about the impact of the trade war on the global economy. The US President has announced new tariffs on Chinese goods, and China has responded with retaliatory measures. The trade tensions have escalated, leading to fears of a prolonged economic downturn."
question = "What caused the stock market crash yesterday?"

response_format = {
    "response": "",
    "date": "",
}

prompt = prompt_template.format(context=context, input=question, response_format=response_format)
print(prompt)



You are a helpful assistant that provides accurate and concise answers to questions based on current news articles.

### Context:
Yesterday (on 12/01/2011), the stock market crashed due to the ongoing trade war between the US and China. The Dow Jones Industrial Average dropped by 800 points, the worst single-day drop of the year. Investors are concerned about the impact of the trade war on the global economy. The US President has announced new tariffs on Chinese goods, and China has responded with retaliatory measures. The trade tensions have escalated, leading to fears of a prolonged economic downturn.

### Input:
What caused the stock market crash yesterday?

### Response Format:
{'response': '', 'date': ''}
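The response format in the printed prompt is rendered as a Python dict repr with single quotes, which is not valid JSON, and a model tends to mirror whatever format it is shown. A minimal sketch of the difference, using only the standard library (the variable names `as_repr` and `as_json` are illustrative):

```python
import json

response_format = {"response": "", "date": ""}

# str.format() renders a dict with str(), i.e. the Python repr with single quotes.
as_repr = str(response_format)
# json.dumps() renders the same dict as valid JSON with double quotes.
as_json = json.dumps(response_format)

print(as_repr)  # {'response': '', 'date': ''}
print(as_json)  # {"response": "", "date": ""}
```

Passing `response_format=json.dumps(response_format)` to `prompt_template.format(...)` would put the JSON form into the prompt instead.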



In [7]:
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv("../.env")

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": prompt}
    ]
)

response = response.choices[0].message.content
response

"{'response': 'The stock market crash yesterday was caused by escalating trade tensions between the US and China, including the announcement of new tariffs by the US and retaliatory measures from China, which raised concerns about the global economy.', 'date': '12/01/2011'}"
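The reply above is a string, not a dict, and because it uses single quotes `json.loads` would reject it. One possible way to turn such a reply into a dict is sketched below. The helper name `parse_reply` is hypothetical, and the sample reply is a shortened stand-in for the real one.

```python
import ast
import json

def parse_reply(text):
    """Parse a model reply into a dict, accepting JSON or a Python dict literal."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # literal_eval only evaluates Python literals, so it is safe for untrusted text.
        parsed = ast.literal_eval(text)
    if not isinstance(parsed, dict):
        raise ValueError("expected a dict-like reply")
    return parsed

# Shortened sample in the same single-quoted shape as the reply above.
reply = "{'response': 'Escalating trade tensions between the US and China.', 'date': '12/01/2011'}"
print(parse_reply(reply)["date"])  # 12/01/2011
```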

## LLM Settings
Large Language Models (LLMs) have various settings that control their output behavior. Understanding these settings can help us generate more appropriate responses.

### Temperature
This setting controls how random or predictable the model’s outputs are. A lower temperature makes the model more predictable, so it more often picks the most likely next token. This suits tasks such as answering factual questions. A higher temperature adds randomness, which can make outputs more creative or varied and is useful for tasks such as creative writing. In short, use lower temperatures for consistent, factual responses and higher temperatures for creativity.

**Effect:** 
- **Low Temperature:** Predictable and repetitive responses.
- **High Temperature:** More varied and creative responses.

In [14]:
prompt = "Write a short description (less than 15 words) about the ocean"

def complete(prompt, temperature):
    return client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=temperature,
        messages=[
            {"role": "user", "content": prompt}
        ]
    ).choices[0].message.content

print("Temp 0: ",complete(prompt, 0))

print("Temp 1.5: ",complete(prompt, 1.5))

Temp 0:  The ocean is a vast, mysterious expanse teeming with diverse life and ecosystems.
Temp 1.5:  The ocean is a vast, fluid expanse, teeming with diverse life and mysteries.
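The mechanism behind temperature can be sketched without calling the API: the model's raw scores (logits) are divided by the temperature before being turned into probabilities. The function name and the three scores below are made up for illustration. How the API itself handles a temperature of exactly 0 is outside this sketch.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0.2, 1.0, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the low temperature almost all probability goes to the top-scoring token. At the higher one the distribution flattens, so less likely tokens are sampled more often.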


### Maximum Tokens
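This caps the number of tokens (word pieces, not whole words) that the model may generate. Setting a limit helps you avoid overly long responses and keeps costs down. The limit is a hard cutoff: the model does not shorten its answer to fit, so a low value can stop a response mid-sentence.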

**Effect:** 
- **Low Maximum Tokens:** Very short responses.
- **High Maximum Tokens:** Longer, more detailed responses.

In [18]:
prompt = "Explain photosynthesis."

def complete(prompt, max_tokens):
    return client.chat.completions.create(
        model="gpt-4o-mini",
        max_tokens=max_tokens,
        messages=[
            {"role": "user", "content": prompt}
        ]
    ).choices[0].message.content

print("Max tokens 10: ",complete(prompt, 10))
print("Max tokens 30:",complete(prompt, 30))

Max tokens 10:  Photosynthesis is the biochemical process by which green plants
Max tokens 30: Photosynthesis is a biochemical process through which green plants, algae, and certain bacteria convert light energy, usually from the sun, into chemical energy in the
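Both outputs above stop mid-sentence. The chat completions API reports why generation ended in the `finish_reason` field of each choice, with the value `"length"` when the token limit was reached. A small sketch of checking for that follows. The helper `was_truncated` is hypothetical, and `SimpleNamespace` objects stand in for real response choices so the example runs without an API call.

```python
from types import SimpleNamespace

def was_truncated(choice):
    """Return True when generation stopped because the token limit was reached."""
    return choice.finish_reason == "length"

# Stand-ins shaped like the `choices[0]` object of a chat completion response.
cut_off = SimpleNamespace(finish_reason="length")
finished = SimpleNamespace(finish_reason="stop")

print(was_truncated(cut_off))   # True
print(was_truncated(finished))  # False
```

With a real response, the check would be `was_truncated(response.choices[0])` before reading `.message.content`.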


### Top P (Nucleus Sampling)
This is another setting that influences randomness. The model samples only from the smallest set of most likely tokens whose combined probability reaches Top P. If Top P is low, only the highest-probability tokens are considered. If Top P is higher, the model considers a wider range of tokens, making the responses more diverse. Generally, adjust either Temperature or Top P, not both.

**Effect:** 
- **Low Top P:** Only very high-probability words are considered, resulting in more predictable responses.
- **High Top P:** More diversity and creativity in responses by including lower-probability words.

In [22]:
prompt = "Describe a Sunset in less than 20 words."

def complete(prompt, top_p):
    return client.chat.completions.create(
        model="gpt-4o-mini",
        top_p=top_p,
        messages=[
            {"role": "user", "content": prompt}
        ]
    ).choices[0].message.content

print("Top p 0.1: ",complete(prompt, 0.1))
print("Top p 0.9: ",complete(prompt, 0.9))

Top p 0.1:  The sun dipped below the horizon, painting the sky in hues of orange, pink, and purple, whispering day’s farewell.
Top p 0.9:  The sky blazes in fiery oranges and pinks, as the sun dips below the horizon, casting a warm, golden glow.
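The filtering step behind Top P can be sketched in plain Python: sort candidate tokens by probability, keep them until the running total reaches the threshold, then renormalize. The function name `nucleus` and the toy distribution are illustrative, not taken from any library.

```python
def nucleus(probabilities, top_p):
    """Keep the smallest set of top tokens whose cumulative probability reaches top_p."""
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = {}, 0.0
    for token, p in ranked:
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}  # renormalize

probs = {"sun": 0.5, "sky": 0.3, "glow": 0.15, "ember": 0.05}  # made-up distribution
print(nucleus(probs, 0.1))  # only "sun" survives
print(nucleus(probs, 0.9))  # "sun", "sky", and "glow" survive
```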


### Frequency Penalty
This reduces the chance of words repeating by applying a penalty based on how often they’ve already appeared. A higher frequency penalty means the model repeats words less, keeping responses varied.

**Effect:** 
- **Low Frequency Penalty:** More repetitions.
- **High Frequency Penalty:** Less repetition, encourages use of varied words.

In [23]:
prompt = "Tell me about dogs in less than 20 words."

def complete(prompt, frequency_penalty):
    return client.chat.completions.create(
        model="gpt-4o-mini",
        frequency_penalty=frequency_penalty,
        messages=[
            {"role": "user", "content": prompt}
        ]
    ).choices[0].message.content

print("Freq penalty -1: ",complete(prompt, -1))
print("Freq penalty 1.5: ",complete(prompt, 1.5))

Freq penalty -1:  Dogs are loyal companions, known for their intelligence, friendliness, and diverse breeds, serving roles in work and love.
Freq penalty 1.5:  Dogs are loyal, social animals known for companionship, intelligence, and diverse breeds. They bond closely with humans.
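The idea of a count-based penalty can be sketched as follows: each candidate token's score is lowered by the penalty multiplied by how many times that token has already been generated. This is a simplified illustration with made-up scores and a hypothetical function name, not the API's internal code.

```python
from collections import Counter

def apply_frequency_penalty(logits, generated_tokens, penalty):
    """Lower each token's score by penalty times its count in the text so far."""
    counts = Counter(generated_tokens)
    return {token: score - penalty * counts[token] for token, score in logits.items()}

logits = {"dogs": 3.0, "loyal": 2.0, "cats": 1.0}  # made-up scores
so_far = ["dogs", "are", "loyal", "dogs"]

print(apply_frequency_penalty(logits, so_far, 1.5))
# {'dogs': 0.0, 'loyal': 0.5, 'cats': 1.0}
```

A negative penalty flips the sign of the adjustment, raising the scores of tokens that have already appeared and making repetition more likely.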


### Presence Penalty
This also discourages repetition but treats all repeats the same, whether a word has appeared twice or ten times. A higher presence penalty makes the output more varied, while a lower one keeps it more focused. Use higher values for creative text and lower ones for more consistent, on-topic responses. Adjust either Frequency Penalty or Presence Penalty, not both.

**Effect:** 
- **Low Presence Penalty:** Sticks closely to the main topic.
- **High Presence Penalty:** Introduces new and varied topics.

In [25]:
prompt = "Discuss space exploration in less than 50 words."

def complete(prompt, presence_penalty):
    return client.chat.completions.create(
        model="gpt-4o-mini",
        presence_penalty=presence_penalty,
        messages=[
            {"role": "user", "content": prompt}
        ]
    ).choices[0].message.content

print("Presence penalty -1: ",complete(prompt, -1))
print("Presence penalty 1.5: ",complete(prompt, 1.5))

Presence penalty -1:  Space exploration involves the investigation of celestial bodies, using technology to gather data and understand the universe. It enhances our knowledge of planetary systems, potential extraterrestrial life, and Earth's origins. Notable missions include the Apollo moon landings, Mars rovers, and the Hubble Space Telescope, shaping humanity's view of space.
Presence penalty 1.5:  Space exploration involves studying celestial bodies and phenomena beyond Earth, advancing our understanding of the universe. It encompasses robotic missions, crewed spaceflights, and telescopic observations, aiming to uncover cosmic origins, search for extraterrestrial life, and foster technological innovations, ultimately expanding humanity's presence in space.
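The flat, once-only nature of this penalty can be sketched like this: a token's score drops by the same amount whether it has appeared once or many times. The scores and the function name are made up for illustration.

```python
def apply_presence_penalty(logits, generated_tokens, penalty):
    """Lower a token's score by a flat penalty if it has appeared at all."""
    seen = set(generated_tokens)
    return {
        token: score - (penalty if token in seen else 0.0)
        for token, score in logits.items()
    }

logits = {"space": 3.0, "mission": 2.0, "ocean": 1.0}  # made-up scores
so_far = ["space", "space", "space", "mission"]

print(apply_presence_penalty(logits, so_far, 1.5))
# {'space': 1.5, 'mission': 0.5, 'ocean': 1.0}
```

Here "space" appears three times and "mission" once, yet both lose exactly 1.5.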


### Effect on Response When Each Setting is Decreased or Increased

| Setting               | Decrease                                         | Increase                                 |
|-----------------------|--------------------------------------------------|------------------------------------------|
| **Temperature**       | More predictable and repetitive output.          | More creative, diverse, and risk-taking. |
| **Maximum Tokens**    | Shorter, less detailed responses.                | Longer, more detailed responses.         |
| **Top P**             | Limited to high-probability words; safer output. | Broader word choice; more varied output. |
| **Frequency Penalty** | More repetition of words or phrases.             | Less repetition; more varied vocabulary. |
| **Presence Penalty**  | Sticks closely to current topics.                | More new topics and ideas introduced.    |
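The settings in the table can be gathered into one keyword dictionary and passed to a single call. The sketch below uses a hypothetical helper, `build_settings`, with basic range checks. The ranges (temperature 0 to 2, Top P 0 to 1, penalties -2 to 2) are assumptions based on the OpenAI API reference and should be checked against the current documentation.

```python
def build_settings(temperature=None, top_p=None, max_tokens=None,
                   frequency_penalty=None, presence_penalty=None):
    """Collect only the settings that were given, after basic range checks."""
    # Assumed ranges; verify against the API reference before relying on them.
    limits = {
        "temperature": (temperature, 0.0, 2.0),
        "top_p": (top_p, 0.0, 1.0),
        "frequency_penalty": (frequency_penalty, -2.0, 2.0),
        "presence_penalty": (presence_penalty, -2.0, 2.0),
    }
    settings = {}
    for name, (value, low, high) in limits.items():
        if value is None:
            continue  # leave unset so the API default applies
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
        settings[name] = value
    if max_tokens is not None:
        if max_tokens < 1:
            raise ValueError("max_tokens must be at least 1")
        settings["max_tokens"] = max_tokens
    return settings

print(build_settings(temperature=0.2, max_tokens=100))
# {'temperature': 0.2, 'max_tokens': 100}
```

The result could then be unpacked into the call used throughout this notebook, for example `client.chat.completions.create(model="gpt-4o-mini", messages=[...], **build_settings(temperature=0.2))`.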

## General Best Practices
Prompt engineering tactics recommended by [OpenAI](https://platform.openai.com/docs/guides/prompt-engineering/strategy-write-clear-instructions)