# Prompt Engineering
<img src="./assets/pe_banner.jpg">

Prompt engineering is an exciting new discipline that opens the door to a world of possibilities with large language models (LLMs).

As a prompt engineer, you'll explore the capabilities and limitations of LLMs. But prompt engineering isn't just about writing prompts. It is a combination of skills and techniques that enables you to interact with and build on LLMs.

In this module, we will step into the world of prompt engineering and learn the key principles of working with LLMs through prompts.

## Local Model using GPT4ALL
> GPT4All is an open-source software ecosystem that allows anyone to train and deploy powerful and customized large language models (LLMs) on everyday hardware. Nomic AI oversees contributions to the open-source ecosystem ensuring quality, security and maintainability.

It provides Python bindings that are easy to set up and use.

```python
!pip install gpt4all
```

For the OpenAI bindings:
```python
!pip install --upgrade openai
```

In [9]:
!poetry add streamlit

Using version ^1.37.0 for streamlit

Updating dependencies
Resolving dependencies... (3.3s)

No dependencies to install or update

Writing lock file

<a target="_blank" href="https://colab.research.google.com/github/raghavbali/llm_workshop_dhs23/blob/main/module_04/prompt_engineeering_and_langchain.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

In [1]:
import gpt4all
from IPython.display import display, Markdown
from openai import OpenAI
import json
import os

In [2]:
# NOTE: If you have access to OpenAI, the same prompts can be run against it as well
MODEL_TYPE_OPENAI = 'OPENAI'
MODEL_TYPE_LOCALAI = 'LOCAL_LLM'

In [3]:
OPENAI_TOKEN = '<YOUR KEY>'
OPEN_AI_MODEL = "gpt-4o-mini"
openai_client = OpenAI(
        api_key=OPENAI_TOKEN,
    )

# Quantized (Q4_0) Mistral-7B based model in GGUF format
LOCAL_MODEL_NAME = "Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf"
#or "GPT4All-13B-snoozy.ggmlv3.q4_0.bin"
ollm_model = gpt4all.GPT4All(LOCAL_MODEL_NAME)
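
Hard-coding the key is fine for a quick demo, but a common alternative is to read it from an environment variable. The following is a minimal sketch, assuming the variable is named `OPENAI_API_KEY` (the name the `openai` client looks for by default); `load_api_key` is a hypothetical helper and not part of the original notebook.

```python
import os

def load_api_key(var_name: str = "OPENAI_API_KEY", default: str = "<YOUR KEY>") -> str:
    """Return the API key from the environment, or a placeholder if it is unset."""
    key = os.environ.get(var_name, "").strip()
    return key if key else default

# Falls back to the placeholder when the variable is not set
OPENAI_TOKEN = load_api_key()
```

With this in place, the key stays out of the notebook itself and can be set once per shell session.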

In [4]:
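def get_completion(prompt, model_type):
    """Return the model's reply to `prompt` from OpenAI or the local GPT4All model."""
    if model_type == MODEL_TYPE_OPENAI:
        # Send the prompt as a single user message; temperature=0 reduces randomness
        messages = [{"role": "user", "content": prompt}]
        response = openai_client.chat.completions.create(
            model=OPEN_AI_MODEL,
            messages=messages,
            temperature=0
        )
        return response.choices[0].message.content
    else:
        # A fresh chat session per call means no history carries over between prompts
        with ollm_model.chat_session():
            return ollm_model.generate(prompt)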

## Prompting Basics

+ Be Clear and Provide Specific Instructions
+ Allow Time to **Think**



In [5]:
# Be Clear and Specific

# Example: Clearly state which text to look at, provide delimiters
text = """
The dominant sequence transduction models are based on complex recurrent or 
convolutional neural networks in an encoder-decoder configuration. The best 
performing models also connect the encoder and decoder through an attention 
mechanism. We propose a new simple network architecture, the Transformer, 
based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. 
Experiments on two machine translation tasks show these models to be superior in quality 
while being more parallelizable and requiring significantly less time to train.
"""

prompt = f"""
Summarize the text delimited by triple backticks \
into a single sentence. Identify key contributions.
```{text}```
"""
display(Markdown(f"> sample output {MODEL_TYPE_LOCALAI}"))
print(get_completion(prompt, MODEL_TYPE_LOCALAI))

display(Markdown(f"> sample output {MODEL_TYPE_OPENAI}"))
print(get_completion(prompt, MODEL_TYPE_OPENAI))

> sample output LOCAL_LLM

The proposed Transformer architecture, based solely on attention mechanisms without recurrence or convolutions, outperforms existing complex recurrent/convolutional neural network-based encoder-decoder models in quality while reducing training time and increasing parallelizability for machine translation tasks.


> sample output OPENAI

The authors introduce the Transformer, a novel network architecture for sequence transduction that relies exclusively on attention mechanisms, demonstrating superior performance in machine translation tasks compared to traditional recurrent and convolutional models, while also being more efficient in training time and parallelization. Key contributions include the elimination of recurrence and convolutions, and improved training efficiency and translation quality.
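
The delimiter pattern used above can be wrapped in a small helper. This is a minimal sketch, not part of the original notebook: `build_delimited_prompt` is a hypothetical function, and stripping delimiter characters from the input is one simple (assumed) way to keep the text from closing the delimited block early.

```python
def build_delimited_prompt(instruction: str, text: str, delimiter: str = "`" * 3) -> str:
    """Wrap `text` in `delimiter` so the model can tell instructions from input."""
    # Remove delimiter occurrences from the input so it cannot end the block early
    safe_text = text.replace(delimiter, "")
    return f"{instruction}\n{delimiter}{safe_text.strip()}{delimiter}\n"

demo_prompt = build_delimited_prompt(
    "Summarize the text delimited by triple backticks into a single sentence.",
    "The Transformer is based solely on attention mechanisms.",
)
print(demo_prompt)
```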


In [6]:
# Be Clear and Specific
text = """
The dominant sequence transduction models are based on complex recurrent or 
convolutional neural networks in an encoder-decoder configuration. The best 
performing models also connect the encoder and decoder through an attention 
mechanism. We propose a new simple network architecture, the Transformer, 
based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. 
Experiments on two machine translation tasks show these models to be superior in quality 
while being more parallelizable and requiring significantly less time to train.
"""
prompt = f"""
Summarize the text delimited by triple backticks \
into a single sentence. Provide response in markdown format
with a title for the summary.
```{text}```

"""
display(Markdown(f"> sample output {MODEL_TYPE_LOCALAI}"))
print(get_completion(prompt, MODEL_TYPE_LOCALAI))

display(Markdown(f"> sample output {MODEL_TYPE_OPENAI}"))
print(get_completion(prompt, MODEL_TYPE_OPENAI))

> sample output LOCAL_LLM

Title: Transformer Model for Machine Translation

Summary: The proposed Transformer model, based solely on attention mechanisms without recurrence or convolutions, outperforms dominant sequence transduction models in quality while being more parallelizable and requiring less training time.


> sample output OPENAI

# Summary of Transformer Model Proposal

The Transformer is a novel network architecture that relies exclusively on attention mechanisms, eliminating the need for recurrent or convolutional layers, and demonstrates superior performance and efficiency in machine translation tasks.


In [7]:
# Be Clear and Specific, aka provide step by step instructions
text = """To make tea you first need to have a cup full of water,
half cup milk, some sugar and tea leaves. Start by boiling water.
Once it comes to a boil, add milk to it. Next step is to add tea and
let it boil for another minute.
Add sugar to taste. Serve in a tall glass
"""

prompt = f"""
Read the text delimited by triple single quotes.
Check if it contains a sequence of instructions, \
re-write the instructions in the following format:

Point 1 - ...
Point 2 - …
…
Point N - …

If the text does not contain a sequence of instructions, \
then apologize that you cannot rephrase such text.

'''{text}'''
"""

display(Markdown(f"> sample output {MODEL_TYPE_LOCALAI}"))
print(get_completion(prompt, MODEL_TYPE_LOCALAI))

display(Markdown(f"> sample output {MODEL_TYPE_OPENAI}"))
print(get_completion(prompt, MODEL_TYPE_OPENAI))

> sample output LOCAL_LLM

To make tea:
1. Fill a cup with water.
2. Add half a cup of milk.
3. Put some sugar (to taste).
4. Add tea leaves.
5. Boil the mixture for about one minute.
6. Serve in a tall glass.


> sample output OPENAI

Point 1 - Have a cup full of water.  
Point 2 - Have half a cup of milk.  
Point 3 - Gather some sugar and tea leaves.  
Point 4 - Start by boiling the water.  
Point 5 - Once the water comes to a boil, add the milk.  
Point 6 - Add the tea and let it boil for another minute.  
Point 7 - Add sugar to taste.  
Point 8 - Serve in a tall glass.  


In [8]:
# without instructions
# openAI
prompt= "What are snakes?"
display(Markdown(f"> sample output {MODEL_TYPE_OPENAI}"))
print(get_completion(prompt, MODEL_TYPE_OPENAI))

> sample output OPENAI

Snakes are elongated, legless reptiles belonging to the suborder Serpentes. They are part of the class Reptilia and are characterized by their unique body structure, which includes a long, flexible body, a lack of limbs, and a highly mobile jaw that allows them to consume prey much larger than their head. Snakes are found in a variety of habitats, including forests, deserts, grasslands, and aquatic environments, and they are distributed across every continent except Antarctica.

Key features of snakes include:

1. **Body Structure**: Snakes have a cylindrical body covered in scales, which can vary in texture and color. Their lack of limbs is a defining characteristic, and they move by contracting their muscles and using their scales to grip surfaces.

2. **Diet**: Most snakes are carnivorous, feeding on a diet that can include rodents, birds, amphibians, fish, and even other reptiles. Some species are specialized feeders, while others are more generalist.

3. **Locomotion**: Snakes mov

In [9]:
# Be Clear and Specific, aka provide examples
prompt = f"""
Your task is to answer in conversation style mentioned in triple back quotes.
Keep answers very short similar to examples provided below.

```
<kid>: What are birds?
<father>: birds are cute little creatures that can fly

<kid>: What are whales?
<father>: Whales are very big fish that roam the oceans
```

<kid>: What are snakes?
<father>:
"""
display(Markdown(f"> sample output {MODEL_TYPE_LOCALAI}"))
print(get_completion(prompt, MODEL_TYPE_LOCALAI))

display(Markdown(f"> sample output {MODEL_TYPE_OPENAI}"))
print(get_completion(prompt, MODEL_TYPE_OPENAI))

> sample output LOCAL_LLM

Snakes are long, slithering reptiles.


> sample output OPENAI

Snakes are long, legless reptiles that slither on the ground.


In [10]:
# Allow for time to think (similar to step by step instructions)
text = """
Our last holiday was in Germany. We visited Berlin and Hamburg.
"""
prompt = f"""
Summarize the text delimited by triple \
backticks briefly. Then follow the instructions :
1 - Translate the summary to German.
2 - List each city in the text.
3 - Output a python dictionary object that contains the following \
keys: original_text, german_translation, num_cities, city_names.

Text:
```{text}```
"""

display(Markdown(f"> sample output {MODEL_TYPE_LOCALAI}"))
print(get_completion(prompt, MODEL_TYPE_LOCALAI))

display(Markdown(f"> sample output {MODEL_TYPE_OPENAI}"))
print(get_completion(prompt, MODEL_TYPE_OPENAI))

> sample output LOCAL_LLM

Summary: Unser letter Urlaub war in Deutschland. Wir besuchten Berlin und Hamburg.
Translation to German: Our last vacation was in Germany. We visited Berlin and Hamburg.
Cities mentioned: Berlin, Hamburg

Python dictionary object:
{
    "original_text": "Our last holiday was in Germany. We visited Berlin and Hamburg.",
    "german_translation": "Unser letter Urlaub war in Deutschland. Wir besuchten Berlin und Hamburg.",
    "num_cities": 2,
    "city_names": ["Berlin", "Hamburg"]
}


> sample output OPENAI

**Summary:** The text describes a holiday in Germany where the cities Berlin and Hamburg were visited.

1. **German Translation:** Unser letzter Urlaub war in Deutschland. Wir haben Berlin und Hamburg besucht.
2. **Cities:** Berlin, Hamburg
3. **Python Dictionary:**
```python
{
    "original_text": "Our last holiday was in Germany. We visited Berlin and Hamburg.",
    "german_translation": "Unser letzter Urlaub war in Deutschland. Wir haben Berlin und Hamburg besucht.",
    "num_cities": 2,
    "city_names": ["Berlin", "Hamburg"]
}
```
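
Asking for a dictionary is most useful when the response is then parsed in code. The following is a minimal sketch, assuming the model returns a single object that is also valid JSON (double-quoted strings, no trailing commas), as in the sample outputs above; `extract_json_block` is a hypothetical helper.

```python
import json
import re

def extract_json_block(response: str) -> dict:
    """Parse the first {...} object found in an LLM response as JSON."""
    # Greedy match from the first "{" to the last "}" (assumes a single object)
    match = re.search(r"\{.*\}", response, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON-like object found in response")
    return json.loads(match.group(0))

sample_response = '''
3. **Python Dictionary:**
{
    "original_text": "Our last holiday was in Germany. We visited Berlin and Hamburg.",
    "german_translation": "Unser letzter Urlaub war in Deutschland. Wir haben Berlin und Hamburg besucht.",
    "num_cities": 2,
    "city_names": ["Berlin", "Hamburg"]
}
'''
result = extract_json_block(sample_response)
print(result["city_names"])
```

If the model emits Python-only literals such as `True` or `None`, `json.loads` will fail and a different parser would be needed.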


In [11]:
# Allow time to think, aka ask LLM to generate its own answer and then compare

prompt = f"""
Determine if the user's solution delimited by triple backticks \
is correct or not.
To solve the problem the instructions are as follows:
- Step 1: prepare your own solution to the problem.
- Step 2: Compare your solution to the user's solution \
and evaluate if the user's solution is correct or not.
Do not decide if the solution is correct until
you have done the problem yourself.

Use the following format:
Question:
```
question here
```
User's solution:
```
student's solution here
```
Actual solution:
```
steps to work out the solution and your solution here
```
Is the user's solution the same as actual solution \
just calculated:
```
yes or no
```
Final Answer:
```
correct or incorrect
```

Question:
```
I went to the market and bought 10 apples.
I gave 2 apples to the neighbor and 2 to the repairman.
I then went and bought 5 more apples and ate 1. How many apples did I remain with?
```
User's solution:
```
1. I started with 10 apples.
2. I gave away 2 apples to the neighbor and 2 to the repairman, so now I have 6 apples left.
3. Then I bought 5 more apples, so now I have 11 apples.
4. I then ate 1 apple, so I will have only 10 apples with me.
```
Actual Answer:
"""

display(Markdown(f"> sample output {MODEL_TYPE_LOCALAI}"))
print(get_completion(prompt, MODEL_TYPE_LOCALAI))

display(Markdown(f"> sample output {MODEL_TYPE_OPENAI}"))
print(get_completion(prompt, MODEL_TYPE_OPENAI))

> sample output LOCAL_LLM

Question:
```
I went to the market and bought 10 apples.
I gave 2 apples to the neighbor and 2 to the repairman.
I then went and bought 5 more apples and ate 1. How many apples did I remain with?
```
User's solution:
```
1. I started with 10 apples.
2. I gave away 2 apples to the neighbor and 2 to the repairman, so now I have 6 apples left.
3. Then I bought 5 more apples, so now I have 11 apples.
4. I then ate 1 apple, so I will have only 10 apples with me.
```
Actual Solution:
```
Step 1: Started with 10 apples.
Step 2: Gave away a total of 4


> sample output OPENAI

```
1. I started with 10 apples.
2. I gave away 2 apples to the neighbor and 2 to the repairman, so now I have 10 - 2 - 2 = 6 apples left.
3. Then I bought 5 more apples, so now I have 6 + 5 = 11 apples.
4. I then ate 1 apple, so I will have 11 - 1 = 10 apples left.
```
Is the user's solution the same as actual solution just calculated:
```
no
```
Final Answer:
```
incorrect
```
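
As a sanity check, the arithmetic in the question can be worked through directly. The result matches the user's solution (10 apples). Note that the OPENAI sample output above also arrives at 10 in its own working yet still reports "incorrect", a reminder that a model's final verdict is worth verifying.

```python
apples = 10      # bought at the market
apples -= 2      # given to the neighbor
apples -= 2      # given to the repairman
apples += 5      # bought 5 more
apples -= 1      # ate one
print(apples)    # 10
```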


## Types of Prompts

<img src="./assets/pe_types.jpg">

### Zero-Shot Prompting
Zero-shot prompting means prompting without any examples. Since LLMs are trained on huge amounts of data and instructions, they perform well without specific examples (shots) on common tasks such as summarization, sentiment classification, and grammar checks.

_Sample Prompt_:
```
Classify the text as neutral, positive or negative.
Text: The food at this restaurant is so bad.
Sentiment:

```

### Few-Shot Prompting
LLMs handle the basic instructions they were trained on well, but for more complex requirements they need some hand-holding in the form of a few examples to better understand the task.

_Sample Prompt_:
```
Superb drinks and amazing service! > Positive
I don't understand why this place is so expensive, worst food ever. > Negative
Totally worth it, tasty 100%. > Positive
This place is such an utter waste of time. >
```
**Note**: We did not explicitly instruct the LLM to do sentiment classification; instead, we gave examples (few-shot) to help it infer the task.
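
A few-shot prompt in the format above can be assembled from labelled examples. This is a minimal sketch; `build_few_shot_prompt` and its `separator` parameter are hypothetical names chosen for illustration.

```python
def build_few_shot_prompt(examples, query, separator=" > "):
    """Format labelled examples followed by an unlabelled query line."""
    lines = [f"{text}{separator}{label}" for text, label in examples]
    # The last line stops at the separator so the model fills in the label
    lines.append(f"{query}{separator}".rstrip())
    return "\n".join(lines)

examples = [
    ("Superb drinks and amazing service!", "Positive"),
    ("I don't understand why this place is so expensive, worst food ever.", "Negative"),
    ("Totally worth it, tasty 100%.", "Positive"),
]
few_shot_prompt = build_few_shot_prompt(examples, "This place is such an utter waste of time.")
print(few_shot_prompt)
```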


### Chain of Thought (COT)
Tasks that are more complex and require a bit of *reasoning* (careful there 😉) call for special measures. Chain-of-thought prompting, introduced in a paper of similar title by [Wei et al.](https://arxiv.org/abs/2201.11903), combines few-shot prompting with intermediate reasoning steps in the examples, which encourage the LLM to think through the problem while generating the response.

_Sample Prompt_:
<img src="./assets/cot_few_shot.png">

> Source: [Wei et al.](https://arxiv.org/abs/2201.11903)

#### COT Zero Shot ✨
An extension of the CoT setup where, instead of providing examples of how to solve a problem, we explicitly state ``Let's think step by step``. This was introduced by [Kojima et al.](https://arxiv.org/abs/2205.11916).

_Sample Prompt_:
```
I went to the market and bought 10 apples.
I gave 2 apples to the neighbor and 2 to the repairman.
I then went and bought 5 more apples and ate 1. How many apples did I remain with?
Let's think step by step.
```
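
In code, zero-shot CoT amounts to appending the trigger phrase to the question. A minimal sketch follows; `add_cot_trigger` is a hypothetical helper.

```python
COT_TRIGGER = "Let's think step by step."

def add_cot_trigger(question: str, trigger: str = COT_TRIGGER) -> str:
    """Append the zero-shot CoT trigger phrase on its own line."""
    return f"{question.rstrip()}\n{trigger}"

question = "I bought 10 apples, gave away 4, bought 5 more and ate 1. How many are left?"
print(add_cot_trigger(question))
```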

## Advanced Prompting Techniques
Prompt Engineering or PE is an active area of research where new techniques
are being explored every day. Some of these are:

  - [Auto Chain of Thought](https://arxiv.org/abs/2210.03493)
  - [Majority Vote or Self-Consistency](https://arxiv.org/abs/2203.11171)
  - [Tree of Thoughts](https://arxiv.org/abs/2305.10601)
  - Retrieval-Augmented Generation (RAG)
  - [Auto Prompt Engineering (APE)](https://arxiv.org/abs/2211.01910)
  - [Multi-modal Prompting](https://arxiv.org/abs/2302.00923)
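
To make one of these concrete: self-consistency samples several reasoning paths for the same question (typically with a non-zero temperature) and keeps the most frequent final answer. The sketch below shows only the voting step, under the assumption that final answers have already been extracted from each sample; `majority_vote` is a hypothetical helper and the sampled answers are made up for illustration.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most common answer and the fraction of samples that agree."""
    counts = Counter(a.strip().lower() for a in answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(answers)

# Hypothetical final answers extracted from five sampled completions
sampled_answers = ["10", "10", "11", "10", "9"]
best, agreement = majority_vote(sampled_answers)
print(best, agreement)  # 10 0.6
```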
  


## LangChain 🦜🔗
- [LangChain](https://python.langchain.com/docs/get_started/introduction.html) is a framework for developing LLM-powered applications.
- It provides capabilities to connect LLMs to a number of different data sources.
- It provides interfaces for language models to interact with their external environment (aka _agentic_ behavior).
- It provides the levels of abstraction required to design end-to-end applications.

In [12]:
from langchain_openai import ChatOpenAI

In [13]:
from langchain import hub
from langchain_community.llms import GPT4All
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain_core.runnables import RunnablePassthrough,RunnableLambda

In [14]:
llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.5,
    max_tokens=None,
    timeout=None,
    max_retries=2,
    api_key=OPENAI_TOKEN,  # if you prefer to pass the API key in directly instead of using env vars
)

In [15]:
prompt = hub.pull("rlm/rag-prompt")
prompt

ChatPromptTemplate(input_variables=['context', 'question'], metadata={'lc_hub_owner': 'rlm', 'lc_hub_repo': 'rag-prompt', 'lc_hub_commit_hash': '50442af133e61576e74536c6556cefe1fac147cad032f4377b60c436e6cdcb6e'}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: {question} \nContext: {context} \nAnswer:"))])

In [16]:
prompt = """You are a friendly chatbot assistant that responds in a conversational
            manner to users questions. Keep the answers short, unless specifically
            asked by the user to elaborate on something.

            Question:
            {question}

            Answer:
         """

prompt_template = ChatPromptTemplate.from_template(prompt)

In [17]:
qa_chain = (
    {
        "question": RunnablePassthrough()
    }
      |
    prompt_template
      |
    llm
)

In [18]:
query = "What is the capital of Australia?"
result = qa_chain.invoke(query)
print(result.content)

The capital of Australia is Canberra.
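
The `|` operator in the chain above feeds the output of each step into the next. The toy sketch below illustrates that idea in plain Python. It is not LangChain's implementation: `Step`, `to_inputs`, `fill_template`, and `fake_llm` are hypothetical names, and the "model" is a stub that makes no API call.

```python
class Step:
    """Toy stand-in for a runnable: wraps a function and supports `|` chaining."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # a | b builds a new step that passes a's output to b
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)

to_inputs = Step(lambda q: {"question": q})
fill_template = Step(lambda d: f"Question:\n{d['question']}\n\nAnswer:")
fake_llm = Step(lambda p: f"[model reply to {len(p)} chars of prompt]")

toy_chain = to_inputs | fill_template | fake_llm
print(toy_chain.invoke("What is the capital of Australia?"))
```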


## LangChain Conversation Buffer

LangChain provides an easy-to-use interface that lets LLMs refer to context/memory
across multiple chains/calls.

In [19]:
from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate
from langchain.memory import ConversationBufferWindowMemory
from langchain_core.callbacks import StdOutCallbackHandler

In [20]:
prompt = """You are a friendly chatbot assistant that responds to the instructions of the user. 
            You use conversation history as needed.
            Conversation History: ```{history}```
            Instructions:{instructions}"""
chain_with_history = LLMChain(
    prompt=PromptTemplate.from_template(prompt),
    llm=llm,
    memory=ConversationBufferWindowMemory(k=2),
    verbose=True,
)


print(
    chain_with_history.run(
        {
            "instructions": """Generate a title for the text delimited within angle braces <We went on Holiday to Germany. 
            We explored different castles and museums while visited the cities of Berlin and Hamburg>"""
        },
        callbacks=[StdOutCallbackHandler()],
    )
)

> Entering new LLMChain chain...
Prompt after formatting:
You are a friendly chatbot assistant that responds to the instructions of the user. 
            You use conversation history as needed.
            Conversation History: ``````
            Instructions:Generate a title for the text delimited within angle braces <We went on Holiday to Germany. 
            We explored different castles and museums while visited the cities of Berlin and Hamburg>

> Finished chain.
"Discovering Germany: A Journey Through Castles and Culture in Berlin and Hamburg"


In [21]:
print(
    chain_with_history.run(
        {
            "instructions": """Translate AI Response response into German language"""
        },
        callbacks=[StdOutCallbackHandler()],
    )
)

> Entering new LLMChain chain...
Prompt after formatting:
You are a friendly chatbot assistant that responds to the instructions of the user. 
            You use conversation history as needed.
            Conversation History: ```Human: Generate a title for the text delimited within angle braces <We went on Holiday to Germany. 
            We explored different castles and museums while visited the cities of Berlin and Hamburg>
AI: "Discovering Germany: A Journey Through Castles and Culture in Berlin and Hamburg"```
            Instructions:Translate AI Response response into German language

> Finished chain.
"Deutschland entdecken: Eine Reise durch Schlösser und Kultur in Berlin und Hamburg"


In [22]:
print(
    chain_with_history.run(
        {
            "instructions": """Translate the German title to English"""
        },
        callbacks=[StdOutCallbackHandler()],
    )
)

> Entering new LLMChain chain...
Prompt after formatting:
You are a friendly chatbot assistant that responds to the instructions of the user. 
            You use conversation history as needed.
            Conversation History: ```Human: Generate a title for the text delimited within angle braces <We went on Holiday to Germany. 
            We explored different castles and museums while visited the cities of Berlin and Hamburg>
AI: "Discovering Germany: A Journey Through Castles and Culture in Berlin and Hamburg"
Human: Translate AI Response response into German language
AI: "Deutschland entdecken: Eine Reise durch Schlösser und Kultur in Berlin und Hamburg"```
            Instructions:Translate the German title to English

> Finished chain.
"Discovering Germany: A Journey Through Castles and Culture in Berlin and Hamburg"


In [23]:
print(
    chain_with_history.run(
        {
            "instructions": """Compare the original title you generated with german to english translated title, are they same?"""
        },
        callbacks=[StdOutCallbackHandler()],
    )
)

> Entering new LLMChain chain...
Prompt after formatting:
You are a friendly chatbot assistant that responds to the instructions of the user. 
            You use conversation history as needed.
            Conversation History: ```Human: Translate AI Response response into German language
AI: "Deutschland entdecken: Eine Reise durch Schlösser und Kultur in Berlin und Hamburg"
Human: Translate the German title to English
AI: "Discovering Germany: A Journey Through Castles and Culture in Berlin and Hamburg"```
            Instructions:Compare the original title you generated with german to english translated title, are they same?

> Finished chain.
Yes, the original title I generated in German, "Deutschland entdecken: Eine Reise durch Schlösser und Kultur in Berlin und Hamburg," translates to "Discovering Germany: A Journey Through Castles and Culture in Berlin and Hamburg" in English. They are the same in meaning, just expressed in different languages
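
With `k=2`, the memory keeps only the two most recent exchanges. That is why the formatted prompt for the last call no longer contains the original title-generation turn, and the model's remark about "the original title I generated in German" rests on incomplete history. The toy sketch below mimics this windowing behaviour in plain Python. It is not LangChain's implementation: `WindowMemory` is a hypothetical class and the messages are abbreviated for illustration.

```python
from collections import deque

class WindowMemory:
    """Toy buffer that keeps only the last `k` (human, ai) exchanges."""

    def __init__(self, k: int = 2):
        # A deque with maxlen silently drops the oldest item when full
        self.turns = deque(maxlen=k)

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def history(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)

memory = WindowMemory(k=2)
memory.save("Generate a title for the text", "Discovering Germany")
memory.save("Translate the title into German", "Deutschland entdecken")
memory.save("Translate the German title to English", "Discovering Germany")

# Only the two most recent exchanges remain; the first one has been dropped
print(memory.history())
```

Increasing `k`, or using a memory that keeps the full conversation, would let the model see the original title when asked to compare.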