# Prompt Basics

A prompt is a piece of text that tells an LLM what the user wants. This can take many forms, for example:

- Asking a question
- Giving an instruction

The key components of a prompt are:
1. Instruction: describes the task you want performed
2. Context: additional information that helps the model perform better
3. Input data: data the model has not seen, supplied to illustrate what you need
4. Output indicator: text that primes the model to produce the output you want. For example, [adding "Let's think step by step" at the end of a prompt can boost reasoning performance](https://arxiv.org/pdf/2201.11903.pdf). You can also write "Output:" to prime the model to write only the output and nothing else.
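
As a minimal sketch of how the four components fit together, the hypothetical helper `build_prompt` below joins them into a single string. The function name, the blank-line separator, and the sample strings are assumptions made for illustration; they are not part of any library.

```python
def build_prompt(instruction, input_data, context="", output_indicator=""):
    """Join the four prompt components, skipping any that are empty."""
    # Order used in this sketch: context, instruction, input data, output indicator.
    parts = [context, instruction, input_data, output_indicator]
    return "\n\n".join(part.strip() for part in parts if part.strip())


example = build_prompt(
    instruction="Classify the sentiment of this review:",
    input_data="The battery lasts all day and the screen is sharp.",
    context="You are a sentiment classifier. Answer with one word.",
    output_indicator="Output:",
)
print(example)
```

Only the instruction and the input data are required here; the context and the output indicator are optional refinements.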

[Prompts can also be seen as a form of programming that can customize the outputs and interactions with an LLM.](https://ar5iv.labs.arxiv.org/html/2302.11382#:~:text=prompts%20are%20also%20a%20form%20of%20programming%20that%20can%20customize%20the%20outputs%20and%20interactions%20with%20an%20llm.)

In [16]:
from IPython.display import Markdown

In [9]:
instruction = "Extract the links from this text:"

In [14]:
input_data = """
In the evolving field of artificial intelligence, recent studies have highlighted the profound impact of large language models (LLMs) on natural language processing capabilities. Smith et al. (2023) in their groundbreaking research, presented in the Journal of AI Research, discussed how the integration of LLMs has revolutionized machine translation, making it significantly more accurate and context-aware (https://www.journalofairesearch.org/integration-of-llms). This leap in technology underscores the need for continuous innovation and ethical considerations in AI development. Additionally, the work by Davis and O'Neil (2024) sheds light on the ethical implications of LLMs in content generation, revealing potential biases that could perpetuate misinformation if not addressed properly (https://www.ethicsinaijournal.org/llm-content-generation-implications).

Conversely, initiatives to make LLMs more transparent and accountable have been gaining traction. The OpenAI team's recent publication (2025) in the AI Transparency Review highlights the successful implementation of new algorithms that enhance the interpretability of LLM decisions, thereby making them more reliable and trustworthy (https://www.aitransparencyreview.org/enhancing-llm-interpretability). According to Anderson and Yamamoto (2025), these advancements not only contribute to the reliability of AI systems but also foster a deeper understanding among users, facilitating a more informed and responsible use of AI technologies (https://www.aiuserinsights.org/understanding-ai-decisions). This body of work underscores the dynamic nature of AI research and the collaborative effort required to harness the full potential of LLMs while mitigating associated risks.
"""

Markdown(input_data)


In the evolving field of artificial intelligence, recent studies have highlighted the profound impact of large language models (LLMs) on natural language processing capabilities. Smith et al. (2023) in their groundbreaking research, presented in the Journal of AI Research, discussed how the integration of LLMs has revolutionized machine translation, making it significantly more accurate and context-aware (https://www.journalofairesearch.org/integration-of-llms). This leap in technology underscores the need for continuous innovation and ethical considerations in AI development. Additionally, the work by Davis and O'Neil (2024) sheds light on the ethical implications of LLMs in content generation, revealing potential biases that could perpetuate misinformation if not addressed properly (https://www.ethicsinaijournal.org/llm-content-generation-implications).

Conversely, initiatives to make LLMs more transparent and accountable have been gaining traction. The OpenAI team's recent publication (2025) in the AI Transparency Review highlights the successful implementation of new algorithms that enhance the interpretability of LLM decisions, thereby making them more reliable and trustworthy (https://www.aitransparencyreview.org/enhancing-llm-interpretability). According to Anderson and Yamamoto (2025), these advancements not only contribute to the reliability of AI systems but also foster a deeper understanding among users, facilitating a more informed and responsible use of AI technologies (https://www.aiuserinsights.org/understanding-ai-decisions). This body of work underscores the dynamic nature of AI research and the collaborative effort required to harness the full potential of LLMs while mitigating associated risks.


In [3]:
from openai import OpenAI
client = OpenAI()

def get_response(prompt_question):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo-0125",
        messages=[{"role": "system", "content": "You are a helpful research and programming assistant"},
                  {"role": "user", "content": prompt_question}]
    )
    
    return response.choices[0].message.content

In [18]:
prompt = instruction + input_data

output = get_response(prompt)

Markdown(output)

Here are the links extracted from the text:

1. https://www.journalofairesearch.org/integration-of-llms
2. https://www.ethicsinaijournal.org/llm-content-generation-implications
3. https://www.aitransparencyreview.org/enhancing-llm-interpretability
4. https://www.aiuserinsights.org/understanding-ai-decisions

## Input Data & Context Information

In [19]:
context = "You are a url extraction engine. You will be fed text and return ONLY a list containing all the urls in the text."

In [20]:
prompt = context + "\n\n" + instruction + "\n" + input_data

Markdown(prompt) 

You are a url extraction engine. You will be fed text and return ONLY a list containing all the urls in the text.

Extract the links from this text:

In the evolving field of artificial intelligence, recent studies have highlighted the profound impact of large language models (LLMs) on natural language processing capabilities. Smith et al. (2023) in their groundbreaking research, presented in the Journal of AI Research, discussed how the integration of LLMs has revolutionized machine translation, making it significantly more accurate and context-aware (https://www.journalofairesearch.org/integration-of-llms). This leap in technology underscores the need for continuous innovation and ethical considerations in AI development. Additionally, the work by Davis and O'Neil (2024) sheds light on the ethical implications of LLMs in content generation, revealing potential biases that could perpetuate misinformation if not addressed properly (https://www.ethicsinaijournal.org/llm-content-generation-implications).

Conversely, initiatives to make LLMs more transparent and accountable have been gaining traction. The OpenAI team's recent publication (2025) in the AI Transparency Review highlights the successful implementation of new algorithms that enhance the interpretability of LLM decisions, thereby making them more reliable and trustworthy (https://www.aitransparencyreview.org/enhancing-llm-interpretability). According to Anderson and Yamamoto (2025), these advancements not only contribute to the reliability of AI systems but also foster a deeper understanding among users, facilitating a more informed and responsible use of AI technologies (https://www.aiuserinsights.org/understanding-ai-decisions). This body of work underscores the dynamic nature of AI research and the collaborative effort required to harness the full potential of LLMs while mitigating associated risks.


In [22]:
output = get_response(prompt)

Markdown(output)

Here is the list of URLs extracted from the provided text:
1. https://www.journalofairesearch.org/integration-of-llms
2. https://www.ethicsinaijournal.org/llm-content-generation-implications
3. https://www.aitransparencyreview.org/enhancing-llm-interpretability
4. https://www.aiuserinsights.org/understanding-ai-decisions

In [23]:
output_indicator = "Output:\n"

In [24]:
prompt = context + "\n\n" + instruction + "\n" + input_data + "\n" + output_indicator

In [25]:
Markdown(prompt)

You are a url extraction engine. You will be fed text and return ONLY a list containing all the urls in the text.

Extract the links from this text:

In the evolving field of artificial intelligence, recent studies have highlighted the profound impact of large language models (LLMs) on natural language processing capabilities. Smith et al. (2023) in their groundbreaking research, presented in the Journal of AI Research, discussed how the integration of LLMs has revolutionized machine translation, making it significantly more accurate and context-aware (https://www.journalofairesearch.org/integration-of-llms). This leap in technology underscores the need for continuous innovation and ethical considerations in AI development. Additionally, the work by Davis and O'Neil (2024) sheds light on the ethical implications of LLMs in content generation, revealing potential biases that could perpetuate misinformation if not addressed properly (https://www.ethicsinaijournal.org/llm-content-generation-implications).

Conversely, initiatives to make LLMs more transparent and accountable have been gaining traction. The OpenAI team's recent publication (2025) in the AI Transparency Review highlights the successful implementation of new algorithms that enhance the interpretability of LLM decisions, thereby making them more reliable and trustworthy (https://www.aitransparencyreview.org/enhancing-llm-interpretability). According to Anderson and Yamamoto (2025), these advancements not only contribute to the reliability of AI systems but also foster a deeper understanding among users, facilitating a more informed and responsible use of AI technologies (https://www.aiuserinsights.org/understanding-ai-decisions). This body of work underscores the dynamic nature of AI research and the collaborative effort required to harness the full potential of LLMs while mitigating associated risks.

Output:


In [28]:
output = get_response(prompt)

Markdown(output)

['https://www.journalofairesearch.org/integration-of-llms', 'https://www.ethicsinaijournal.org/llm-content-generation-implications', 'https://www.aitransparencyreview.org/enhancing-llm-interpretability', 'https://www.aiuserinsights.org/understanding-ai-decisions']
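
Because the output indicator nudges the model toward a bare list literal, the reply can often be turned into a real Python list. The sketch below is one possible approach, not a guaranteed one: `parse_url_list` and `URL_PATTERN` are hypothetical names, and the regular-expression fallback is an assumption added for replies that are not a valid list literal (such as the numbered lists shown earlier).

```python
import ast
import re

# Rough URL matcher used only as a fallback; it stops at whitespace,
# quotes, brackets, parentheses, and commas.
URL_PATTERN = re.compile(r"https?://[^\s'\"\[\]()<>,]+")


def parse_url_list(model_output):
    """Return the URLs found in a model reply as a list of strings."""
    try:
        # literal_eval parses Python literals without executing code.
        parsed = ast.literal_eval(model_output.strip())
        if isinstance(parsed, list) and all(isinstance(u, str) for u in parsed):
            return parsed
    except (ValueError, SyntaxError):
        pass
    # The reply was not a list literal, so scan the raw text instead.
    return URL_PATTERN.findall(model_output)


list_reply = "['https://example.org/a', 'https://example.org/b']"
numbered_reply = "Here are the links:\n1. https://example.org/a\n2. https://example.org/b"
print(parse_url_list(list_reply))
print(parse_url_list(numbered_reply))
```

Model output is not guaranteed to follow the requested format, so the parsed result is worth validating before it is used downstream.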

In [4]:
import tiktoken

def get_num_tokens(prompt, model="gpt-4-turbo-preview"):
    """Calculates the number of tokens in a text prompt"""
    enc = tiktoken.encoding_for_model(model)
    return len(enc.encode(prompt))

get_num_tokens(instruction)

7
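
Token counts matter because the prompt and the model's reply share one context window. The sketch below checks whether a prompt leaves room for a reply. The name `fits_in_context`, the 16,385-token window, and the 500-token reservation are illustrative assumptions; check the documentation of the model you use for its actual limit.

```python
def fits_in_context(prompt_tokens, context_window=16_385, reserved_for_output=500):
    """Return True if the prompt leaves `reserved_for_output` tokens free."""
    # Default values are assumptions for illustration, not official limits.
    return prompt_tokens + reserved_for_output <= context_window


print(fits_in_context(7))
print(fits_in_context(16_000))
```

In this notebook, `prompt_tokens` would come from a call such as `get_num_tokens(prompt)`.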

# Prompt Engineering Strategies

- Strategy 1: Write clear instructions
- Strategy 2: Provide reference text
- Strategy 3: Break tasks into subtasks
- Strategy 4: Give the model time to think
- Strategy 5: Use external tools
- Strategy 6: Test changes systematically
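
Strategy 3 can be applied by sending each subtask as its own request and feeding each reply into the next prompt. The sketch below assumes a hypothetical `ask` callable that takes a prompt string and returns the model's reply (a function like `get_response` from earlier would fit). `run_subtasks`, the step templates, and `fake_ask` are illustrative names, not a library API.

```python
def run_subtasks(ask, document, steps):
    """Run prompt templates in order, passing each reply to the next step."""
    previous = document
    replies = []
    for template in steps:
        # Each template embeds the previous step's reply via {previous}.
        prompt = template.format(previous=previous)
        previous = ask(prompt)
        replies.append(previous)
    return replies


steps = [
    "List the main sections of this paper:\n{previous}",
    "Summarize each of these sections in one sentence:\n{previous}",
    "Combine these summaries into one paragraph:\n{previous}",
]


def fake_ask(prompt):
    # Stand-in for a real model call so the sketch runs offline.
    return f"reply to: {prompt.splitlines()[0]}"


replies = run_subtasks(fake_ask, "<paper text>", steps)
print(replies[-1])
```

Chaining costs one request per step, but each prompt stays short and intermediate replies can be inspected when a later step goes wrong.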

In [27]:
clear_instructions_prompt = ""

In [28]:
reference_text_prompt = ""

In [31]:
task_breakdown_prompt = """The following is a paper explaining this architecture named 'Transformer':
'''
'''
1. Break this paper down into its main sections.
2. Summarize each section in one sentence.
3. Write a paragraph combining these summaries.
4. Your output should include both the paragraph summary and the bullet-point section summaries.
"""

In [None]:
time_to_think_prompt = ""

In [None]:
external_tools_prompt = ""

In [None]:
test_changes_systematically_prompt = ""

# Why Is Prompting Hard?

1. People don't know what they want
2. People know what they want but can't explain it well to the model
3. People know what they want but don't know how to get the best performance from the model

[Principles for Prompt Engineering by Karina Nguyen](https://youtu.be/6d60zVdcCV4?list=PLcfpQ4tk2k0U3Nr2KKWJsVnjF5S9bTXY-&t=126)

# How Do We Solve This?

1. Figure out what you want -> define the task clearly

2. Establish a metric to score the output performance on your task

3. Experimentation (this is where prompt engineering comes alive)
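
The three steps above can be sketched for the URL-extraction task as a small comparison harness. Everything here is an assumption made for illustration: `score_urls` uses precision and recall as the metric, `compare_prompts` expects a hypothetical `extract(prompt, text)` callable that returns a list of URLs, and `fake_extract` stands in for a real model call.

```python
def score_urls(predicted, expected):
    """Return (precision, recall) of predicted URLs against expected ones."""
    predicted_set, expected_set = set(predicted), set(expected)
    hits = len(predicted_set & expected_set)
    precision = hits / len(predicted_set) if predicted_set else 0.0
    recall = hits / len(expected_set) if expected_set else 0.0
    return precision, recall


def compare_prompts(extract, prompts, text, expected):
    """Score every named prompt variant on the same text with the same metric."""
    return {
        name: score_urls(extract(prompt, text), expected)
        for name, prompt in prompts.items()
    }


def fake_extract(prompt, text):
    # Stand-in for a model call plus parsing, so the sketch runs offline.
    if "ONLY" in prompt:
        return ["https://example.org/a", "https://example.org/b"]
    return ["https://example.org/a"]


expected = ["https://example.org/a", "https://example.org/b"]
prompts = {
    "plain": "Extract the links:",
    "strict": "Return ONLY a list of URLs:",
}
results = compare_prompts(fake_extract, prompts, "some text", expected)
print(results)
```

Keeping the test text and the metric fixed while only the prompt changes is what makes the comparison between variants meaningful.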

# Final Notes

Prompt engineering is a trial-and-error process that also relies on your intuition.

Build domain understanding: use your domain expertise to customize the prompt for relevance.

Build model understanding: adapt the prompt to the model's strengths and weaknesses to improve quality.

Iterate and validate: define acceptance or termination criteria so that you iterate until expectations are met, but don't over-engineer the prompt to the point that it becomes less reusable.

# References

- [A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT](https://ar5iv.labs.arxiv.org/html/2302.11382)
- [Prompt-Engineering-Guide](https://github.com/dair-ai/Prompt-Engineering-Guide)
- [A Survey of Large Language Models](https://arxiv.org/pdf/2303.18223.pdf)
- [Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing](https://arxiv.org/pdf/2107.13586.pdf)
- [Prompt Engineering Guide - Zero-Shot Prompting Example](https://www.promptingguide.ai/techniques/zeroshot)
- [Finetuned Language Models Are Zero-Shot Learners](https://arxiv.org/pdf/2109.01652.pdf)
- [Prompt Engineering Guide - Few-Shot Prompting](https://www.promptingguide.ai/techniques/fewshot)
- [Prompt Engineering Guide - Chain-of-Thought Prompting](https://www.promptingguide.ai/techniques/cot)
- [Wei et al. (2022)](https://arxiv.org/abs/2201.11903)
- [Prompt Engineering Guide - Self-Consistency](https://www.promptingguide.ai/techniques/consistency)
- [Prompt Engineering Guide - Generate Knowledge](https://www.promptingguide.ai/techniques/knowledge)
- [Liu et al. (2022)](https://arxiv.org/pdf/2110.08387.pdf)
- [Prompt Engineering Guide - Tree of Thoughts (ToT)](https://www.promptingguide.ai/techniques/tot)
- [Prompt Engineering by Lilian Weng](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/)
- [Prompt Engineering vs. Blind Prompting](https://mitchellh.com/writing/prompt-engineering-vs-blind-prompting#the-demonstration-set)