<a href="https://colab.research.google.com/github/aljebraschool/Generative-AI-Internship-Codes/blob/main/1_prompt_engineering_overview_part2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Overview of Prompt Engineering Techniques & Best Practices

## Part 2: Prompt Engineering Techniques

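In this section, we cover several widely used prompt engineering techniques (few-shot prompting, chain-of-thought, prompt chaining, ReAct, and retrieval-augmented generation) and show how to apply each one.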

We first install and import the necessary libraries:

In [None]:
! pip install openai==0.28 langchain google-search-results chromadb tiktoken --quiet


In [None]:
import openai
import IPython
import os

In [None]:
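#configure openai key (entered at runtime so it is not stored in the notebook)
from getpass import getpass
openai.api_key = getpass("OpenAI API key: ")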

In [None]:
#chat completion helper: sends a list of chat messages and returns the reply text
def get_completion(message, model = 'gpt-3.5-turbo', temperature = 0, max_tokens = 350):
  response = openai.ChatCompletion.create(
    messages = message,
    model = model,
    temperature = temperature,
    max_tokens = max_tokens
  )

  return response.choices[0].message['content']

### Few-shot In-Context Learning

Below we provide an example of few-shot prompting with demonstrations:

In [None]:
prompt = """ Your task is to classify an input text (delimited by ```) as either offensive or non-offensive.

Use the following examples to guide your classification:

Text: I love you
Output: non-offensive

Text: I dislike all those people working at the company
Output: offensive

Text: I think this feature is not ideal
Output: non-offensive

Text: Those people are so stupid
Output: offensive

Text: {user_input}
Output:


"""

message = [
    {
        'role': 'user',
        'content': prompt.format(user_input = "``` I love playing mario ```")
    }
]

response = get_completion(message)
print(response)

non-offensive
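
When the demonstrations live in a list rather than in a hand-written string, the same kind of prompt can be assembled programmatically. The following is a minimal sketch: `build_few_shot_prompt` and its parameters are hypothetical names that do not appear in the notebook, and the instruction wording is simplified. The resulting string could be sent through `get_completion` in the same way as `prompt` above.

```python
from typing import List, Tuple


def build_few_shot_prompt(instruction: str,
                          examples: List[Tuple[str, str]],
                          user_input: str) -> str:
    """Assemble a few-shot prompt: instruction, labelled demonstrations, then the new input."""
    lines = [instruction, ""]
    for text, label in examples:
        # Each demonstration follows the same Text/Output layout as the prompt above
        lines += [f"Text: {text}", f"Output: {label}", ""]
    # The final entry leaves the label empty so the model fills it in
    lines += [f"Text: {user_input}", "Output:"]
    return "\n".join(lines)


# Demonstrations copied from the prompt above
demonstrations = [
    ("I love you", "non-offensive"),
    ("I dislike all those people working at the company", "offensive"),
    ("I think this feature is not ideal", "non-offensive"),
    ("Those people are so stupid", "offensive"),
]

few_shot_prompt = build_few_shot_prompt(
    "Classify the input text as either offensive or non-offensive.",
    demonstrations,
    "I love playing mario",
)
print(few_shot_prompt)
```

Keeping the demonstrations in a list makes it easy to add, remove, or reorder examples and observe how the classification changes.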


### Chain-of-Thought (CoT) Prompting

Below is an example of CoT applied. Specifically, we aim to build a movie recommendation system using CoT.

Let's first define a set of movies:

In [None]:
movies = """
The Enigma Code
Category: Historical Drama
Rating: 8.3/10
Description: Set during World War II, this gripping historical drama follows the life of Alan Turing, a brilliant mathematician tasked with cracking the Enigma code used by the Nazis. His efforts contribute significantly to the Allies' victory.
Actors: Benedict Cumberbatch, Keira Knightley, Matthew Goode
Language: English
Release Date: March 15, 2014
Award Winner: Academy Award for Best Adapted Screenplay

Shadows of the Samurai
Category: Action/Adventure
Rating: 7.9/10
Description: In feudal Japan, a skilled samurai seeks vengeance against the corrupt warlord who murdered his master. With his swordsmanship and determination, he embarks on a dangerous journey to restore justice.
Actors: Ken Watanabe, Tadanobu Asano, Rinko Kikuchi
Language: Japanese
Release Date: November 7, 2017
Award Winner: None

Mind Games
Category: Psychological Thriller
Rating: 8.1/10
Description: A renowned psychologist becomes entangled in a twisted game of cat and mouse with a patient who harbors dark secrets. As their sessions progress, the lines between reality and deception blur, leading to a mind-bending climax.
Actors: Leonardo DiCaprio, Natalie Portman, Michael Fassbender
Language: English
Release Date: August 22, 2019
Award Winner: None

La Casa del Tango
Category: Musical/Drama
Rating: 8.7/10
Description: In the vibrant world of Buenos Aires, a passionate tango dancer finds love and inspiration amidst the backdrop of political unrest. This musical drama explores the power of dance and the pursuit of dreams.
Actors: Antonio Banderas, Penélope Cruz, Javier Bardem
Language: Spanish
Release Date: June 5, 2020
Award Winner: Golden Globe for Best Foreign Language Film

Timeless Love
Category: Romance/Fantasy
Rating: 7.5/10
Description: A magical encounter transports a modern-day writer back in time to Victorian England, where she falls in love with a charming aristocrat. As they navigate the complexities of time, their love is put to the ultimate test.
Actors: Rachel McAdams, Tom Hiddleston, Emma Thompson
Language: English
Release Date: February 14, 2022
Award Winner: None

The Pursuit of Justice
Category: Legal Drama
Rating: 8.4/10
Description: Inspired by true events, this gripping legal drama follows a determined lawyer's fight against a powerful pharmaceutical company responsible for a life-threatening drug. The courtroom battle becomes a quest for justice and redemption.
Actors: Denzel Washington, Viola Davis, Michael B. Jordan
Language: English
Release Date: October 10, 2022
Award Winner: None

The Forgotten Island
Category: Adventure/Mystery
Rating: 7.6/10
Description: A group of explorers stumbles upon a mysterious island believed to be uninhabited. As they uncover the island's secrets, they encounter deadly challenges and unravel an ancient civilization's enigma.
Actors: Chris Pratt, Bryce Dallas Howard, Tom Holland
Language: English
Release Date: July 2, 2023
Award Winner: None

The Silent Witness
Category: Crime/Thriller
Rating: 8.2/10
Description: A talented forensic pathologist becomes entangled in a high-stakes murder investigation when she discovers crucial evidence that points to a powerful criminal network. With her life on the line, she must outsmart the perpetrators.
Actors: Emily Blunt, Jake Gyllenhaal, Mark Ruffalo
Language: English
Release Date: November 18, 2023
Award Winner: None

A Tale of Two Worlds
Category: Fantasy/Adventure
Rating: 7.8/10
Description: When a young orphan discovers a magical portal to a parallel universe, she embarks on a thrilling adventure to save both realms from an impending disaster. Along the way, she learns about the power of friendship and self-belief.
Actors: Millie Bobby Brown, Tom Holland, Helena Bonham Carter
Language: English
Release Date: April 5, 2024
Award Winner: None

A Symphony of Souls
Category: Music/Drama
Rating: 9.0/10
Description: Set against the backdrop of a renowned symphony orchestra, this emotionally charged drama explores the lives and intertwining stories of its members. Through the power of music, they find solace, love, and redemption.
Actors: Meryl Streep, Tom Hanks, Cate Blanchett
Language: English
Release Date: December 25, 2024
Award Winner: None
"""

We then take those movies and list the steps the model should perform. Notice that after detailing the steps, we ask the model to provide its reasoning for each step along with the final response. This is a format you can use to elicit reasoning in LLMs.

In [None]:
# the system message contains the logic (step by step) for the system to follow
system_message = """
Your task is to make movie recommendations based on a user request (delimited by ```).

Step 1: Check if the user is asking about movies. If the user is not asking about movies, just respond "Please ask something about movies!".

Step 2: If the user is asking for a movie recommendation, check if they have any specific requests or interests.

Step 3: Check if there are any movie/s we can recommend from the following: {movies}

Step 4: Prepare a response to the user with the movie recommendation/s. The recommendations must be about movies that are available in the list above. The response needs to have a friendly and helpful tone.

Return a response with the following reasoning steps and final output to the user:
Step 1: <Step 1 reasoning>
Step 2: <Step 2 reasoning>
Step 3: <Step 3 reasoning>
Step 4: <final response>
"""

messages = [
    {
        "role": "system",
        "content": system_message.format(movies=movies)
    },
    {
        "role": "user",
        "content": "```Do you have any drama movies?```"
    }
]

movie_recommendation_response = get_completion(messages, temperature=0, max_tokens=500)

print(movie_recommendation_response)

Step 1: The user is asking about movies.

Step 2: The user is specifically asking for drama movies.

Step 3: Based on your interest in drama movies, I recommend the following:
1. The Enigma Code
   - Category: Historical Drama
   - Rating: 8.3/10
   - Description: Set during World War II, this gripping historical drama follows the life of Alan Turing, a brilliant mathematician tasked with cracking the Enigma code used by the Nazis.
   - Actors: Benedict Cumberbatch, Keira Knightley, Matthew Goode
   - Language: English
   - Release Date: March 15, 2014
   - Award Winner: Academy Award for Best Adapted Screenplay

2. La Casa del Tango
   - Category: Musical/Drama
   - Rating: 8.7/10
   - Description: In the vibrant world of Buenos Aires, a passionate tango dancer finds love and inspiration amidst the backdrop of political unrest.
   - Actors: Antonio Banderas, Penélope Cruz, Javier Bardem
   - Language: Spanish
   - Release Date: June 5, 2020
   - Award Winner: Golden Globe for Best For
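
Because the system message asks for `Step N:` labels, a response in this format can also be split into its steps with ordinary string processing. The sketch below is illustrative only: `split_steps` is a hypothetical helper, the sample text is shortened, and it assumes the model actually follows the requested format (which is not guaranteed, especially if the output is cut off by `max_tokens`).

```python
import re
from typing import Dict


def split_steps(response: str) -> Dict[int, str]:
    """Map each step number to its text for a response formatted as 'Step N: ...'."""
    # re.split with one capture group yields [preamble, "1", text1, "2", text2, ...]
    parts = re.split(r"^Step (\d+):", response, flags=re.MULTILINE)
    return {int(num): text.strip() for num, text in zip(parts[1::2], parts[2::2])}


# Shortened sample in the same format as the output above
sample_response = (
    "Step 1: The user is asking about movies.\n"
    "\n"
    "Step 2: The user is specifically asking for drama movies.\n"
)

steps = split_steps(sample_response)
print(steps[2])
```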

### Prompt Chaining

The example below demonstrates how to chain separate prompts to achieve a specific task. The previous prompt lists all the reasoning steps. In the following prompt, we ask the model to extract only the final response to the user:

In [None]:
# Prompt 1: step by step reasoning (provided above)
# Prompt 2: extract only the final response we will send to the user

system_message_2 = """
You will be given a list of steps that a model has responded with. Your task is to extract only the full response in Step 4 from the following text: {movie_recommendation_response}

Step 4:
"""

messages = [
    {
        "role": "system",
        "content": system_message_2.format(movie_recommendation_response=movie_recommendation_response)
    }
]

final_response = get_completion(messages, temperature=0)

IPython.display.Markdown(final_response)

I recommend checking out "The Enigma Code" for a historical drama experience or "La Casa del Tango" for a musical drama set in Buenos Aires. Enjoy your movie night!
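
The chaining pattern used here, where each prompt receives the previous model output, can be written as a small generic loop. This is a minimal sketch under stated assumptions: `run_prompt_chain`, the `{previous}` placeholder, and `fake_model` are hypothetical names introduced for illustration, and `fake_model` stands in for a real call such as `get_completion` so the example runs without an API key.

```python
from typing import Callable, List


def run_prompt_chain(templates: List[str],
                     first_input: str,
                     call_model: Callable[[str], str]) -> str:
    """Feed each model output into the next prompt template and return the last output."""
    text = first_input
    for template in templates:
        # Each template contains a {previous} slot for the prior step's output
        text = call_model(template.format(previous=text))
    return text


def fake_model(prompt: str) -> str:
    # Stand-in for a real LLM call: wraps the prompt so the data flow is visible
    return f"<{prompt}>"


chain_result = run_prompt_chain(
    ["Reason step by step about: {previous}", "Extract the final answer from: {previous}"],
    "Do you have any drama movies?",
    fake_model,
)
print(chain_result)
```

Passing the model call in as a function keeps the chaining logic separate from any particular API and makes each link easy to test in isolation.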

### ReAct

The code below shows an example of combining LLMs with external tools to achieve a task. In particular, the example uses the ReAct framework, in which the model alternates between reasoning ("Thought"), calling a tool ("Action"), and reading the tool's result ("Observation") until it reaches an answer. The tools here are a search engine (SerpAPI) and a calculator. As mentioned in the course introduction, you will need to register for a free account with SerpAPI to complete this part of the exercise. You can [register here](https://serpapi.com/).

In [None]:
!pip install -U langchain-community # Install the required package


In [None]:
#import necessary libraries from langchain
from langchain.agents import load_tools #used to create tools needed for the task
from langchain.agents import AgentType #used to choose the agent type for your task
from langchain.llms import OpenAI  #used to load openai model
from langchain.agents import initialize_agent  #used to initialize the langchain agent

In [None]:
#set the needed keys (entered at runtime so no secret is stored in the notebook)
from getpass import getpass
os.environ['OPENAI_API_KEY'] = getpass('OpenAI API key: ')
os.environ['SERPAPI_API_KEY'] = getpass('SerpAPI API key: ')

In [None]:
#build the llm model using openai
llm = OpenAI(temperature = 0)

#tools to be used, serpapi for searching the web, llm-math for doing math
tools = load_tools(["serpapi", 'llm-math'], llm = llm)

#initialize the agent with the tools and llm, use zero_shot...
agent = initialize_agent(tools, llm, agent = AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose = True)

In [None]:
# run the agent with the prompt or question you're interested in
agent.run("Which team won the 2023 NBA finals?")



> Entering new AgentExecutor chain...
 I should search for the answer using a search engine.
Action: Search
Action Input: "2023 NBA finals winner"
Observation: Denver Nuggets
Thought: I should double check the answer using a calculator.
Action: Calculator
Action Input: "2023 - 2023"
Observation: Answer: 0
Thought: I now know the final answer.
Final Answer: The Denver Nuggets won the 2023 NBA finals.

> Finished chain.


'The Denver Nuggets won the 2023 NBA finals.'

In [None]:
# run the agent with the prompt or question you're interested in
agent.run("What is the best programming language?")



> Entering new AgentExecutor chain...
 There are many different programming languages, so it's important to consider what criteria we are using to determine the "best" one.
Action: Search
Action Input: "best programming language"
Observation: ['1. Javascript. JavaScript is a high-level programming language that is one of the core technologies of the World Wide Web. · 2. Python · 3. Go · 4.', "JavaScript: It's like the Beyoncé of programming languages – always in the spotlight. With frameworks like React, Angular, and Vue.js, ...", 'Top Programming Languages to Learn · Python · JavaScript · HTML/CSS · C · C++ · Go (or Golang) · Swift · Java.', "According to Stack Overflow's 2023 Developer's Survey, JavaScript is the most popular language among developers for the eleventh year in a row.", 'The most complete is Python. It has huge amount of libraries and vast uses. It also has a delightful specification and syntax. The most ...', 'Rust, Elixir, Cloj

'Based on various rankings and criteria, it seems like Python and JavaScript are considered the best programming languages.'
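
To make the Thought / Action / Observation loop visible in the traces above more concrete, here is a minimal sketch of such a loop in plain Python. It is not LangChain's implementation: `react_loop`, `scripted_llm`, and `toy_tools` are hypothetical names, the "model" is a scripted function rather than a real LLM, and the search tool returns a canned string instead of calling SerpAPI.

```python
import re
from typing import Callable, Dict, Optional

ACTION_RE = re.compile(r"Action: (.+)\nAction Input: (.+)")
FINAL_RE = re.compile(r"Final Answer: (.+)")


def react_loop(question: str,
               llm: Callable[[str], str],
               tools: Dict[str, Callable[[str], str]],
               max_steps: int = 5) -> Optional[str]:
    """Alternate model output and tool calls until a final answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        output = llm(transcript)
        transcript += output + "\n"
        final = FINAL_RE.search(output)
        if final:
            return final.group(1).strip()
        action = ACTION_RE.search(output)
        if action is None:
            return None  # the model produced neither an action nor an answer
        name, tool_input = action.group(1).strip(), action.group(2).strip()
        tool = tools.get(name)
        observation = tool(tool_input) if tool else f"Unknown tool: {name}"
        # The observation is appended so the next model call can use it
        transcript += f"Observation: {observation}\n"
    return None


def scripted_llm(transcript: str) -> str:
    # Stand-in for a real LLM: searches first, then answers once it sees an observation
    if "Observation:" not in transcript:
        return ("Thought: I should search for the answer.\n"
                "Action: Search\n"
                "Action Input: 2023 NBA finals winner")
    return ("Thought: I now know the final answer.\n"
            "Final Answer: The Denver Nuggets won the 2023 NBA finals.")


# Canned search result taken from the trace above; no web request is made
toy_tools = {"Search": lambda query: "Denver Nuggets"}

react_answer = react_loop("Which team won the 2023 NBA finals?", scripted_llm, toy_tools)
print(react_answer)
```

The `max_steps` limit is a simple guard against a model that never emits a final answer.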

### Data Augmentation / RAG

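The example below demonstrates retrieval-augmented generation (RAG): relevant chunks of a document are retrieved and passed to the model as context for answering a question.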

We first load the necessary LangChain modules:

In [None]:
from langchain.embeddings.openai import OpenAIEmbeddings  # embeds text chunks
from langchain.text_splitter import CharacterTextSplitter  # splits text into chunks
from langchain.vectorstores import Chroma  # vector store used for similarity search

from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain.llms import OpenAI

Next, we load the source document, split its text into chunks, and embed the chunks using OpenAI Embeddings.

In [None]:
!pip install python-docx



In [None]:
from docx import Document

# Load the Word document (uploaded to the Colab session) and join its paragraphs into one string
doc = Document('/content/CREATING A BETTER YORUBA LANGUAGE LEARNING APP.docx')
text_data = '\n'.join([paragraph.text for paragraph in doc.paragraphs])

# Preview the first 500 characters
print(text_data[:500])

CREATING A BETTER YORUBA LANGUAGE LEARNING APP: LEVERAGING COLLECTIVE INTELLIGENCE AND EXPERTISE

I am currently pursuing my graduate studies in “collective intelligence” at Muhammed VI Polytechnic University (UM6P), where I am enrolled in a course called “Computer Science Method in Collective Intelligence.” This course emphasizes the application of computer tools to implement group intelligence concepts, specifically addressing how technology can be leveraged to solve large-scale societal probl


In [None]:
# split text into chunks
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0, separator=" ")
texts = text_splitter.split_text(text_data)

# embeddings obtained from OpenAI
embeddings = OpenAIEmbeddings()

The next step is to store the embedded chunks in Chroma, a vector store. Finally, we query Chroma for the chunks most similar to the question and pass them to a question-answering chain that cites its sources.

In [None]:
# test search
docsearch = Chroma.from_texts(texts, embeddings, metadatas=[{"source": str(i)} for i in range(len(texts))])

query = "What is akomolede?"
docs = docsearch.similarity_search(query)

chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="stuff")
chain({"input_documents": docs, "question": query}, return_only_outputs=True)

{'output_text': " Akomolede is an app for learning the Yoruba language, created by an author who assembled a team of experts in Yoruba language and graphics design. It is currently available on Google's Play Store and aims to address deficiencies in existing Yoruba language learning apps. \nSOURCES: 4, 1"}
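
Conceptually, the similarity search step ranks chunks by how close their vectors are to the query vector. The toy sketch below shows that ranking with word-count vectors and cosine similarity. It is an illustration under loud assumptions: `embed_counts`, `cosine`, and `top_k` are hypothetical helpers, the chunks are made-up strings, and real OpenAI embeddings and Chroma's index work differently internally.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed_counts(text: str) -> Counter:
    """Toy 'embedding': a bag of lowercase word counts (not a learned embedding)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: List[str], k: int = 2) -> List[Tuple[int, str]]:
    """Return the k chunks most similar to the query, with their indices as source ids."""
    query_vec = embed_counts(query)
    ranked = sorted(range(len(chunks)),
                    key=lambda i: cosine(query_vec, embed_counts(chunks[i])),
                    reverse=True)
    return [(i, chunks[i]) for i in ranked[:k]]


# Made-up chunks for illustration only
toy_corpus = [
    "Akomolede is a Yoruba learning app",
    "The weather is sunny today",
    "Tango is a dance",
]

best_matches = top_k("What is akomolede?", toy_corpus, k=1)
print(best_matches)
```

The returned indices play the same role as the `source` metadata in the Chroma example: they let the answer point back to the chunks it was based on.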