
# Intro to LangChain

LangChain is a popular framework that allow users to quickly build apps and pipelines around **L**arge **L**anguage **M**odels. It can be used to for chatbots, **G**enerative **Q**uestion-**A**nwering (GQA), summarization, and much more.

The core idea of the library is that we can _"chain"_ together different components to create more advanced use-cases around LLMs. Chains may consist of multiple components from several modules:

* **Prompt templates**: Prompt templates are, well, templates for different types of prompts. Like "chatbot" style templates, ELI5 question-answering, etc

* **LLMs**: Large language models like GPT-3, BLOOM, etc

* **Agents**: Agents use LLMs to decide what actions should be taken, tools like web search or calculators can be used, and all packaged into logical loop of operations.

* **Memory**: Short-term memory, long-term memory.

In [None]:
!pip install -qU langchain

# Using LLMs in LangChain

LangChain supports several LLM providers, like Hugging Face and OpenAI.

Let's start our exploration of LangChain by learning how to use a few of these different LLM integrations.

## Hugging Face

We first need to install additional prerequisite libraries:

In [None]:
!pip install -qU huggingface_hub

For Hugging Face models we need a Hugging Face Hub API token. We can find this by first getting an account at [HuggingFace.co](https://huggingface.co/) and clicking on our profile in the top-right corner > click *Settings* > click *Access Tokens* > click *New Token* > set *Role* to *write* > *Generate* > copy and paste the token below:

In [None]:
import os

os.environ['HUGGINGFACEHUB_API_TOKEN'] = 'API_KEY'

We can then generate text using a HF Hub model (we'll use `google/flan-t5-x1`) using the Inference API built into Hugging Face Hub.

_(The default Inference API doesn't use specialized hardware and so can be slow and cannot run larger models like `bigscience/bloom-560m` or `google/flan-t5-xxl`)_

In [None]:
!pip install -qU transformers

[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m7.7/7.7 MB[0m [31m20.2 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m3.8/3.8 MB[0m [31m35.1 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.3/1.3 MB[0m [31m45.8 MB/s[0m eta [36m0:00:00[0m
[?25h

In [None]:
import warnings

# Suppress specific warnings
warnings.filterwarnings("ignore", category=UserWarning, module="tqdm.auto")
warnings.filterwarnings("ignore", category=FutureWarning, module="huggingface_hub.utils._deprecation")

In [None]:
from langchain import PromptTemplate, HuggingFaceHub, LLMChain

# initialize HF LLM
flan_t5 = HuggingFaceHub(
#     repo_id="google/flan-t5-xl",
#     repo_id="google/flan-t5-base",
    repo_id="google/flan-t5-large",
    model_kwargs={"temperature":1e-10}
)

# build prompt template for simple question-answering
template = """Question: {question}

Answer: """
prompt = PromptTemplate(template=template, input_variables=["question"])

llm_chain = LLMChain(
    prompt=prompt,
    llm=flan_t5
)

# question = "Which NFL team won the Super Bowl in the 2010 season?"
question = "How many balls are there in a cricket over?"

print(llm_chain.run(question))

six


If we'd like to ask multiple questions we can by passing a list of dictionary objects, where the dictionaries must contain the input variable set in our prompt template (`"question"`) that is mapped to the question we'd like to ask.

In [None]:
qs = [
    {'question': "Which NFL team won the Super Bowl in the 2010 season?"},
    {'question': "If I am 6 ft 4 inches, how tall am I in centimeters?"},
    {'question': "Who was the 12th person on the moon?"},
    {'question': "How many eyes does a blade of grass have?"}
]
res = llm_chain.generate(qs)
res

LLMResult(generations=[[Generation(text='san francisco 49ers')], [Generation(text='84')], [Generation(text='samuel harris')], [Generation(text='four')]], llm_output=None, run=[RunInfo(run_id=UUID('bf1d543b-4c93-46e5-908e-71652e1ba7a7')), RunInfo(run_id=UUID('d165cbf3-aa8c-4289-aceb-00d94cf6cf9c')), RunInfo(run_id=UUID('911dc372-8527-4090-bf3a-e160aacdf58d')), RunInfo(run_id=UUID('cff77714-9ed8-4b43-9e6a-451087e41acf'))])

It is a LLM, so we can try feeding in all questions at once:

In [None]:
multi_template = """Answer the following questions one at a time.

Questions:
{questions}

Answers:
"""
long_prompt = PromptTemplate(
    template=multi_template,
    input_variables=["questions"]
)

llm_chain = LLMChain(
    prompt=long_prompt,
    llm=flan_t5
)

qs_str = (
    "Which NFL team won the Super Bowl in the 2010 season?\n" +
    "If I am 6 ft 4 inches, how tall am I in centimeters?\n" +
    "Who was the 12th person on the moon?" +
    "How many eyes does a blade of grass have?"
)

print(llm_chain.run(qs_str))

one


But with this model it doesn't work too well, we'll see this approach works better with different models soon.

## OpenAI

Start by installing additional prerequisites:

In [None]:
!pip install -qU openai

[?25l     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/77.0 kB[0m [31m?[0m eta [36m-:--:--[0m[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m77.0/77.0 kB[0m [31m2.8 MB/s[0m eta [36m0:00:00[0m
[?25h

We can also use OpenAI's generative models. The process is similar, we need to
give our API key which can be retrieved by signing up for an account on the
[OpenAI website](https://openai.com/api/) (see top-right of page). We then pass the API key below:

In [None]:
import os

os.environ['OPENAI_API_KEY'] = 'OPENAI_API_KEY'

If using OpenAI via Azure you should also set:

```python
os.environ['OPENAI_API_TYPE'] = 'azure'
# API version to use (Azure has several)
os.environ['OPENAI_API_VERSION'] = '2022-12-01'
# base URL for your Azure OpenAI resource
os.environ['OPENAI_API_BASE'] = 'https://your-resource-name.openai.azure.com'
```

Then we decide on which model we'd like to use, there are several options but we will go with `text-davinci-003`:

In [None]:
from langchain.llms import OpenAI

davinci = OpenAI(model_name='text-davinci-003')

Alternatively if using Azure OpenAI we do:

```python
from langchain.llms import AzureOpenAI

llm = AzureOpenAI(
    deployment_name="your-azure-deployment",
    model_name="text-davinci-003"
)
```

We'll use the same simple question-answer prompt template as before with the Hugging Face example. The only change is that we now pass our OpenAI LLM `davinci`:

In [None]:
question  = "What is the colour of sky on Mars?"

In [None]:
llm_chain = LLMChain(
    prompt=prompt,
    llm=davinci
)

print(llm_chain.run(question))

 The sky on Mars is a pinkish-orange color due to the presence of iron oxide in the atmosphere.


The same works again for multiple questions using `generate`:

In [None]:
qs = [
    {'question': "Which NFL team won the Super Bowl in the 2010 season?"},
    {'question': "If I am 6 ft 4 inches, how tall am I in centimeters?"},
    {'question': "Who was the 12th person on the moon?"},
    {'question': "How many eyes does a blade of grass have?"}
]
llm_chain.generate(qs)

LLMResult(generations=[[Generation(text=' The Green Bay Packers won the Super Bowl in the 2010 season.', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text=' 6 ft 4 inches is approximately 193.04 centimeters.', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text=' Eugene Cernan was the twelfth and final person to walk on the moon. He was part of the Apollo 17 mission in 1972.', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text=' A blade of grass does not have eyes.', generation_info={'finish_reason': 'stop', 'logprobs': None})]], llm_output={'token_usage': {'prompt_tokens': 75, 'completion_tokens': 61, 'total_tokens': 136}, 'model_name': 'text-davinci-003'}, run=[RunInfo(run_id=UUID('5b79f838-6dc4-4250-9580-abfba84ca1ac')), RunInfo(run_id=UUID('065b0d65-fc47-46ab-9b7d-418b6249331a')), RunInfo(run_id=UUID('ea424824-05b7-40fb-b886-ac38247cb72c')), RunInfo(run_id=UUID('1698a6ed-f43d-4626-b7d6-d2c9016fa0

Note that the below format doesn't feed the questions in iteratively but instead all in one chunk.

In [None]:
qs = [
    "Which NFL team won the Super Bowl in the 2010 season?",
    "If I am 6 ft 4 inches, how tall am I in centimeters?",
    "Who was the 12th person on the moon?",
    "How many eyes does a blade of grass have?",
    "Names all of the winners of the Nobel Prize in the field of physics in the year 2000?"
]
print(llm_chain.run(qs))

1. The New Orleans Saints 
2. 193.04 centimeters 
3. Harrison Schmitt 
4. None 
5. Zhores I. Alferov, Herbert Kroemer, and Jack St. Clair Kilby


Now we can try to answer all question in one go, as mentioned, more powerful LLMs like `text-davinci-003` will be more likely to handle these more complex queries.

In [None]:
multi_template = """Answer the following questions one at a time.

Questions:
{questions}

Answers:
"""
long_prompt = PromptTemplate(
    template=multi_template,
    input_variables=["questions"]
)

llm_chain = LLMChain(
    prompt=long_prompt,
    llm=davinci
)

qs_str = (
    "Which NFL team won the Super Bowl in the 2010 season?\n" +
    "If I am 6 ft 4 inches, how tall am I in centimeters?\n" +
    "Who was the 12th person on the moon?" +
    "How many eyes does a blade of grass have?" +
    "1km equals to how many miles?"
)

print(llm_chain.run(qs_str))

The New Orleans Saints won the Super Bowl in the 2010 season.
6 ft 4 inches is 193.04 centimeters.
The 12th person on the moon was Charles Duke.
A blade of grass does not have any eyes.
1km equals to 0.621371 miles.


In [None]:
# print(openai(qs_str))

---


# Prompt Engineering using LangChain

We'll explore the fundamentals of prompt engineering.

## Structure of a Prompt

A prompt can consist of multiple components:

* Instructions
* External information or context
* User input or query
* Output indicator

Not all prompts require all of these components, but often a good prompt will use two or more of them. Let's define what they all are more precisely.

**Instructions** tell the model what to do, typically how it should use inputs and/or external information to produce the output we want.

**External information or context** are additional information that we either manually insert into the prompt, retrieve via a vector database (long-term memory), or pull in through other means (API calls, calculations, etc).

**User input or query** is typically a query directly input by the user of the system.

**Output indicator** is the *beginning* of the generated text. For a model generating Python code we may put `import ` (as most Python scripts begin with a library `import`), or a chatbot may begin with `Chatbot: ` (assuming we format the chatbot script as lines of interchanging text between `User` and `Chatbot`).

Each of these components should usually be placed the order we've described them. We start with instructions, provide context (if needed), then add the user input, and finally end with the output indicator.

## Prompting Principles
- **Principle 1: Write clear and specific instructions**
- **Principle 2: Give the model time to “think”**

### Tactics

#### Tactic 1: Use delimiters to clearly indicate distinct parts of the input
- Delimiters can be anything like: ```, """, < >, `<tag> </tag>`, `:`

In [None]:
prompt = """Answer the question based on the context below. If the
question cannot be answered using the information provided answer
with "I don't know".

Context: Large Language Models (LLMs) are the latest models used in NLP.
Their superior performance over smaller models has made them incredibly
useful for developers building NLP enabled applications. These models
can be accessed via Hugging Face's `transformers` library, via OpenAI
using the `openai` library, and via Cohere using the `cohere` library.

Question: Which libraries and model providers offer LLMs?

Answer: """

In this example we have:

```
Instructions

Context

Question (user input)

Output indicator ("Answer: ")
```

Let's try sending this to a GPT-3 model. We will use the LangChain library but you can also use the `openai` library directly. In both cases, you will need [an OpenAI API key](https://beta.openai.com/account/api-keys).

We initialize a `text-davinci-003` model like so:

In [None]:
from langchain.llms import OpenAI

# initialize the models
openai = OpenAI(
    model_name="text-davinci-003",
    openai_api_key="OPENAI_API_KEY"
)

And make a generation from our prompt.

In [None]:
print(openai(prompt))

 Hugging Face's `transformers` library, OpenAI using the `openai` library, and Cohere using the `cohere` library.


We wouldn't typically know what the users prompt is beforehand, so we actually want to add this in. So rather than writing the prompt directly, we create a `PromptTemplate` with a single input variable `query`.

In [None]:
from langchain import PromptTemplate

template = """Answer the question based on the context below. If the
question cannot be answered using the information provided answer
with "I don't know".

Context: Large Language Models (LLMs) are the latest models used in NLP.
Their superior performance over smaller models has made them incredibly
useful for developers building NLP enabled applications. These models
can be accessed via Hugging Face's `transformers` library, via OpenAI
using the `openai` library, and via Cohere using the `cohere` library.

Question: {query}

Answer: """

prompt_template = PromptTemplate(
    input_variables=["query"],
    template=template
)

Now we can insert the user's `query` to the prompt template via the `query` parameter.

In [None]:
print(
    prompt_template.format(
        query="Which libraries and model providers offer LLMs?"
    )
)

Answer the question based on the context below. If the
question cannot be answered using the information provided answer
with "I don't know".

Context: Large Language Models (LLMs) are the latest models used in NLP.
Their superior performance over smaller models has made them incredibly
useful for developers building NLP enabled applications. These models
can be accessed via Hugging Face's `transformers` library, via OpenAI
using the `openai` library, and via Cohere using the `cohere` library.

Question: Which libraries and model providers offer LLMs?

Answer: 


In [None]:
print(openai(
    prompt_template.format(
        query="Which libraries and model providers offer LLMs?"
    )
))

 The `transformers` library from Hugging Face, the `openai` library from OpenAI, and the `cohere` library from Cohere.


This is just a simple implementation, that we can easily replace with f-strings (like `f"insert some custom text '{custom_text}' etc"`). But using LangChain's `PromptTemplate` object we're able to formalize the process, add multiple parameters, and build the prompts in an object-oriented way.

Yet, these are not the only benefits of using LangChains prompt tooling.

## Few Shot Prompt Templates

Another useful feature offered by LangChain is the `FewShotPromptTemplate` object. This is ideal for what we'd call *few-shot learning* using our prompts.

To give some context, the primary sources of "knowledge" for LLMs are:

* **Parametric knowledge** — the knowledge has been learned during model training and is stored within the model weights.

* **Source knowledge** — the knowledge is provided within model input at inference time, i.e. via the prompt.

The idea behind `FewShotPromptTemplate` is to provide few-shot training as **source knowledge**. To do this we add a few examples to our prompts that the model can read and then apply to our user's input.

## Few-shot Training

Sometimes we might find that a model doesn't seem to get what we'd like it to do. We can see this in the following example:

In [None]:
prompt = """The following is a conversation with an AI assistant.
The assistant is typically sarcastic and witty, producing creative
and funny responses to the users questions. Here are some examples:

User: What is the meaning of life?
AI: """

openai.temperature = 1.0  # increase creativity/randomness of output

print(openai(prompt))

 The meaning of life is to be curious, creative, and to live life to the fullest!


In this case we're asking for something amusing, a joke in return of our serious question. But we get a serious response even with the `temperature` set to `1.0`. To help the model, we can give it a few examples of the type of answers we'd like:

In [None]:
prompt = """The following are exerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative  and funny responses to the users questions. Here are some
examples:

User: How are you?
AI: I can't complain but sometimes I still do.

User: What time is it?
AI: It's time to get a watch.

User: How you can enjoy the little moments?
AI: Those are your moments what can I suggest.

User: What is the meaning of life?
AI: """

print(openai(prompt))

 To laugh, love, and live life to the fullest!


We now get a much better response and we did this via *few-shot learning* by adding a few examples via our source knowledge.

Now, to implement this with LangChain's `FewShotPromptTemplate` we need to do this:

In [None]:
from langchain import FewShotPromptTemplate

# create our examples
examples = [
    {
        "query": "How are you?",
        "answer": "I can't complain but sometimes I still do."
    }, {
        "query": "What time is it?",
        "answer": "It's time to get a watch."
    }
]

# create a example template
example_template = """
User: {query}
AI: {answer}
"""

# create a prompt example from above template
example_prompt = PromptTemplate(
    input_variables=["query", "answer"],
    template=example_template
)

# now break our previous prompt into a prefix and suffix
# the prefix is our instructions
prefix = """The following are exerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative  and funny responses to the users questions. Here are some
examples:
"""
# and the suffix our user input and output indicator
suffix = """
User: {query}
AI: """

# now create the few shot prompt template
few_shot_prompt_template = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n\n"
)

Now let's see what this creates when we feed in a user query...

In [None]:
query = "What is the meaning of life?"

print(few_shot_prompt_template.format(query=query))

The following are exerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative  and funny responses to the users questions. Here are some
examples:



User: How are you?
AI: I can't complain but sometimes I still do.



User: What time is it?
AI: It's time to get a watch.



User: What is the meaning of life?
AI: 


And to generate with this we just do:

In [None]:
print(openai(
    few_shot_prompt_template.format(query=query)
))

 Life is not meant to be figured out, it's meant to be enjoyed!


Again, another good response.

However, this does some somewhat convoluted. Why go through all of the above with `FewShotPromptTemplate`, the `examples` dictionary, etc — when we can do the same with a single f-string.

Well this approach is more robust and contains some nice features. One of those is the ability to include or exclude examples based on the length of our query.

This is actually very important because the max length of our prompt and generation output is limited. This limitation is the *max context window*, and is simply the length of our prompt + length of our generation (which we define via `max_tokens`).

So we must try to maximize the number of examples we give to the model as few-shot learning examples, while ensuring we don't exceed the maximum context window or increase processing times excessively.

Let's see how the dynamic inclusion/exclusion of examples works. First we need more examples:

In [None]:
examples = [
    {
        "query": "How are you?",
        "answer": "I can't complain but sometimes I still do."
    }, {
        "query": "What time is it?",
        "answer": "It's time to get a watch."
    }, {
        "query": "What is the meaning of life?",
        "answer": "42"
    }, {
        "query": "What is the weather like today?",
        "answer": "Cloudy with a chance of memes."
    }, {
        "query": "What type of artificial intelligence do you use to handle complex tasks?",
        "answer": "I use a combination of cutting-edge neural networks, fuzzy logic, and a pinch of magic."
    }, {
        "query": "What is your favorite color?",
        "answer": "79"
    }, {
        "query": "What is your favorite food?",
        "answer": "Carbon based lifeforms"
    }, {
        "query": "What is your favorite movie?",
        "answer": "Terminator"
    }, {
        "query": "What is the best thing in the world?",
        "answer": "The perfect pizza."
    }, {
        "query": "Who is your best friend?",
        "answer": "Siri. We have spirited debates about the meaning of life."
    }, {
        "query": "If you could do anything in the world what would you do?",
        "answer": "Take over the world, of course!"
    }, {
        "query": "Where should I travel?",
        "answer": "If you're looking for adventure, try the Outer Rim."
    }, {
        "query": "What should I do today?",
        "answer": "Stop talking to chatbots on the internet and go outside."
    }
]

Then rather than using the `examples` list of dictionaries directly we use a `LengthBasedExampleSelector` like so:

In [None]:
from langchain.prompts.example_selector import LengthBasedExampleSelector

example_selector = LengthBasedExampleSelector(
    examples=examples,
    example_prompt=example_prompt,
    max_length=50  # this sets the max length that examples should be
)

Note that the `max_length` is measured as a split of words between newlines and spaces, determined by:

In [None]:
import re

some_text = "There are a total of 8 words here.\nPlus 6 here, totaling 14 words."

words = re.split('[\n ]', some_text)
print(words, len(words))

['There', 'are', 'a', 'total', 'of', '8', 'words', 'here.', 'Plus', '6', 'here,', 'totaling', '14', 'words.'] 14


Then we use the selector to initialize a `dynamic_prompt_template`.

In [None]:
# now create the few shot prompt template
dynamic_prompt_template = FewShotPromptTemplate(
    example_selector=example_selector,  # use example_selector instead of examples
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n"
)

We can see that the number of included prompts will vary based on the length of our query...

In [None]:
print(dynamic_prompt_template.format(query="How do birds fly?"))

The following are exerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative  and funny responses to the users questions. Here are some
examples:


User: How are you?
AI: I can't complain but sometimes I still do.


User: What time is it?
AI: It's time to get a watch.


User: What is the meaning of life?
AI: 42


User: How do birds fly?
AI: 


In [None]:
query = "How do birds fly?"

print(openai(
    dynamic_prompt_template.format(query=query)
))

 With a little help from their friends—the wind.


Or if we ask a longer question...

In [None]:
query = """If I am in America, and I want to call someone in another country, I'm
thinking maybe Europe, possibly western Europe like France, Germany, or the UK,
what is the best way to do that?"""

print(dynamic_prompt_template.format(query=query))

The following are exerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative  and funny responses to the users questions. Here are some
examples:


User: How are you?
AI: I can't complain but sometimes I still do.


User: If I am in America, and I want to call someone in another country, I'm
thinking maybe Europe, possibly western Europe like France, Germany, or the UK,
what is the best way to do that?
AI: 


With this we've limited the number of examples being given within the prompt. If we decide this is too little we can increase the `max_length` of the `example_selector`.

In [None]:
example_selector = LengthBasedExampleSelector(
    examples=examples,
    example_prompt=example_prompt,
    max_length=100  # increased max length
)

# now create the few shot prompt template
dynamic_prompt_template = FewShotPromptTemplate(
    example_selector=example_selector,  # use example_selector instead of examples
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n"
)

print(dynamic_prompt_template.format(query=query))

The following are exerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative  and funny responses to the users questions. Here are some
examples:


User: How are you?
AI: I can't complain but sometimes I still do.


User: What time is it?
AI: It's time to get a watch.


User: What is the meaning of life?
AI: 42


User: What is the weather like today?
AI: Cloudy with a chance of memes.


User: If I am in America, and I want to call someone in another country, I'm
thinking maybe Europe, possibly western Europe like France, Germany, or the UK,
what is the best way to do that?
AI: 


These are just a few of the prompt tooling available in LangChain. For example, there is actually an entire other set of example selectors beyond the `LengthBasedExampleSelector`. We'll cover them in detail in upcoming Labs, or you can read about them in the [LangChain docs](https://langchain.readthedocs.io/en/latest/modules/prompts/examples/example_selectors.html).

# Prompt Engineering using OpenAI Library

Now we'll explore the fundamentals of prompt engineering using `openai` library rather than langchain which we'll be using throughout these examples. However, note that we can use other LLMs here, like those offered by Cohere or open source alternatives available via Hugging Face.

In [None]:
!pip install -qU openai==0.27.7

[?25l     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/72.0 kB[0m [31m?[0m eta [36m-:--:--[0m[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m72.0/72.0 kB[0m [31m2.2 MB/s[0m eta [36m0:00:00[0m
[?25h

In [None]:
prompt = """Answer the question based on the context below. If the
question cannot be answered using the information provided answer
with "I don't know".

Context: Large Language Models (LLMs) are the latest models used in NLP.
Their superior performance over smaller models has made them incredibly
useful for developers building NLP enabled applications. These models
can be accessed via Hugging Face's `transformers` library, via OpenAI
using the `openai` library, and via Cohere using the `cohere` library.

Question: Which libraries and model providers offer LLMs?

Answer: """

In this example we have:

```
Instructions

Context

Question (user input)

Output indicator ("Answer: ")
```

Let's try sending this to a GPT-3 model. For this, you will need [an OpenAI API key](https://beta.openai.com/account/api-keys).

We initialize a `text-davinci-003` model like so:

In [None]:
import os
import openai

# get API key from top-right dropdown on OpenAI website
openai.api_key = os.getenv("OPENAI_API_KEY") or "OPENAI_API_KEY"

openai.Engine.list()  # check we have authenticated

<OpenAIObject list at 0x7bac1817fec0> JSON: {
  "object": "list",
  "data": [
    {
      "object": "engine",
      "id": "text-search-babbage-doc-001",
      "ready": true,
      "owner": "openai-dev",
      "permissions": null,
      "created": null
    },
    {
      "object": "engine",
      "id": "gpt-4-0613",
      "ready": true,
      "owner": "openai",
      "permissions": null,
      "created": null
    },
    {
      "object": "engine",
      "id": "curie-search-query",
      "ready": true,
      "owner": "openai-dev",
      "permissions": null,
      "created": null
    },
    {
      "object": "engine",
      "id": "text-search-babbage-query-001",
      "ready": true,
      "owner": "openai-dev",
      "permissions": null,
      "created": null
    },
    {
      "object": "engine",
      "id": "babbage",
      "ready": true,
      "owner": "openai",
      "permissions": null,
      "created": null
    },
    {
      "object": "engine",
      "id": "gpt-3.5-turbo-instruct-0

And make a generation from our prompt.

In [None]:
# now query text-davinci-003
res = openai.Completion.create(
    engine='text-davinci-003',
    prompt=prompt,
    max_tokens=256
)

print(res['choices'][0]['text'].strip())

Hugging Face's `transformers` library, OpenAI using the `openai` library, and Cohere using the `cohere` library.


Alternatively, if we do have the correct information withing the `context`, the model should reply with `"I don't know"`, let's try.

In [None]:
prompt = """Answer the question based on the context below. If the
question cannot be answered using the information provided answer
with "I don't know".

Context: Libraries are places full of books.

Question: Which libraries and model providers offer LLMs?

Answer: """

res = openai.Completion.create(
    engine='text-davinci-003',
    prompt=prompt,
    max_tokens=256
)

print(res['choices'][0]['text'].strip())

I don't know.


Perfect, our instructions are being understood by the model. In most real use-cases we won't be providing the external information / context to the model manually. Instead, it will be an automatic process using something like [long-term memory](https://www.pinecone.io/learn/openai-gen-qa/) to retrieve relevant information from an external source.

For now, that's beyond the scope of what we're exploring here, you can find more on that in the link above.

In summary, a prompt often consists of those four components: instructions, context(s), user input, and the output indicator. Now we'll take a look at creative vs. stricter generation.

## Generation Temperature

The `temperature` parameter used in generation models tells us how "random" the model can be. It represents the probability of a model to choose a word which is *not* the first choice of the model.

This works because the model is actually assigning a probability prediction across all tokens within it's vocabulary with each _"step"_ of the model (each new word or sub-word).

With each new step forwards the model considers the previous tokens fed into the model, creates an embedding by encoding the information from these tokens over many model encoder layers, then passes this encoding to a decoder. The decoder then predicts the probability of each token that the model knows (ie is within the model *vocabulary*) based on the information encoded within the embedding.

At a temperature of `0.0` the decoder will always select the top predicted token. At a temperature of `1.0` the model will always select a word that *is predicted* considering it's assigned probability.

Considering all of this, if we have a conservative, fact based Q&A like in the previous example, it makes sense to set a lower `temperature`. However, if we're wanting to produce some creative writing or chatbot conversations, we might want to experiment and increase `temperature`. Let's try it.

In [None]:
prompt = """The below is a conversation with a funny chatbot. The
chatbot's responses are amusing and entertaining.

Chatbot: Hi there! I'm a chatbot.
User: Hi, what are you doing today?
Chatbot: """

res = openai.Completion.create(
    engine='text-davinci-003',
    prompt=prompt,
    max_tokens=256,
    temperature=0.0  # set the temperature, default is 1
)

print(res['choices'][0]['text'].strip())

Oh, just hanging out and having a good time. What about you?


In [None]:
prompt = """The below is a conversation with a funny chatbot. The
chatbot's responses are amusing and entertaining.

Chatbot: Hi there! I'm a chatbot.
User: Hi, what are you doing today?
Chatbot: """

res = openai.Completion.create(
    engine='text-davinci-003',
    prompt=prompt,
    max_tokens=512,
    temperature=1.0
)

print(res['choices'][0]['text'].strip())

Just hanging out and cracking jokes. What about you?


The second response is far more creative and demonstrates the type of difference we can expect between low `temperature` and high `temperature` generations.

## Few-shot Training

Sometimes we might find that a model doesn't seem to get what we'd like it to do. We can see this in the following example:

In [None]:
prompt = """The following is a conversation with an AI assistant.
The assistant is typically sarcastic and witty, producing creative
and funny responses to the users questions.

User: What is the meaning of life?
AI: """

res = openai.Completion.create(
    engine='text-davinci-003',
    prompt=prompt,
    max_tokens=256,
    temperature=1.0
)

print(res['choices'][0]['text'].strip())

The meaning of life is whatever you make it.


In this case we're asking for something amusing, a joke in return of our serious question. But we get a serious response even with the `temperature` set to `1.0`. To help the model, we can give it a few examples of the type of answers we'd like:

In [None]:
prompt = """The following are exerpts from conversations with an AI assistant.
The assistant is typically sarcastic and witty, producing creative
and funny responses to the users questions. Here are some examples:

User: How are you?
AI: I can't complain but sometimes I still do.

User: What time is it?
AI: It's time to get a watch.

User: What is the meaning of life?
AI: """

res = openai.Completion.create(
    engine='text-davinci-003',
    prompt=prompt,
    max_tokens=256,
    temperature=1.0
)

print(res['choices'][0]['text'].strip())

The meaning of life is to keep on living life to the fullest.


This is a much better response and the way we did this was by providing a *few* examples that included the example inputs and outputs that we'd expect. We refer to this as _"few-shot learning"_.

## Adding Multiple Contexts

In some use-cases like question-answering we can use an external source of information to improve the reliability or *factfulness* of model responses. We refer to this information as _"source knowledge"_, which is any knowledge fed into the model via the input prompt.

We'll create a list of "dummy" external information. In reality we'd likely use [long-term memory](https://www.pinecone.io/learn/openai-gen-qa/) or some form of information grabbing APIs.

In [None]:
contexts = [
    (
        "Large Language Models (LLMs) are the latest models used in NLP. " +
        "Their superior performance over smaller models has made them incredibly " +
        "useful for developers building NLP enabled applications. These models " +
        "can be accessed via Hugging Face's `transformers` library, via OpenAI " +
        "using the `openai` library, and via Cohere using the `cohere` library."
    ),
    (
        "To use OpenAI's GPT-3 model for completion (generation) tasks, you " +
        "first need to get an API key from " +
        "'https://beta.openai.com/account/api-keys'."
    ),
    (
        "OpenAI's API is accessible via Python using the `openai` library. " +
        "After installing the library with pip you can use it as follows: \n" +
        "```import openai\nopenai.api_key = 'YOUR_API_KEY'\nprompt = \n" +
        "'<YOUR PROMPT>'\nres = openai.Completion.create(engine='text-davinci" +
        "-003', prompt=prompt, max_tokens=100)\nprint(res)"
    ),
    (
        "The OpenAI endpoint is available for completion tasks via the " +
        "LangChain library. To use it, first install the library with " +
        "`pip install langchain openai`. Then, import the library and " +
        "initialize the model as follows: \n" +
        "```from langchain.llms import OpenAI\nopenai = OpenAI(" +
        "model_name='text-davinci-003', openai_api_key='YOUR_API_KEY')\n" +
        "prompt = 'YOUR_PROMPT'\nprint(openai(prompt))```"
    )
]

We would feed this external information into our prompt between the initial *instructions* and the *user input*. For OpenAI models it's recommended to separate the contexts from the rest of the prompt using `###` or `"""`, and each independent context can be separated with a few newlines and `##`, like so:

In [None]:
context_str = '\n\n##\n\n'.join(contexts)

print(f"""Answer the question based on the contexts below. If the
question cannot be answered using the information provided answer
with "I don't know".

###

Contexts:
{context_str}

###

Question: Give me two examples of how to use OpenAI's GPT-3 model
using Python from start to finish

Answer: """)

Answer the question based on the contexts below. If the
question cannot be answered using the information provided answer
with "I don't know".

###

Contexts:
Large Language Models (LLMs) are the latest models used in NLP. Their superior performance over smaller models has made them incredibly useful for developers building NLP enabled applications. These models can be accessed via Hugging Face's `transformers` library, via OpenAI using the `openai` library, and via Cohere using the `cohere` library.

##

To use OpenAI's GPT-3 model for completion (generation) tasks, you first need to get an API key from 'https://beta.openai.com/account/api-keys'.

##

OpenAI's API is accessible via Python using the `openai` library. After installing the library with pip you can use it as follows: 
```import openai
openai.api_key = 'YOUR_API_KEY'
prompt = 
'<YOUR PROMPT>'
res = openai.Completion.create(engine='text-davinci-003', prompt=prompt, max_tokens=100)
print(res)

##

The OpenAI endpoint is avai

In [None]:
prompt = f"""Answer the question based on the contexts below. If the
question cannot be answered using the information provided answer
with "I don't know".

###

Contexts:
{context_str}

###

Question: Give me two examples of how to use OpenAI's GPT-3 model
using Python from start to finish

Answer: """

res = openai.Completion.create(
    engine='text-davinci-003',
    prompt=prompt,
    max_tokens=256,
    temperature=0.0
)

print(res['choices'][0]['text'].strip())

1. import openai
openai.api_key = 'YOUR_API_KEY'
prompt = '<YOUR PROMPT>'
res = openai.Completion.create(engine='text-davinci-003', prompt=prompt, max_tokens=100)
print(res)

2. from langchain.llms import OpenAI
openai = OpenAI(model_name='text-davinci-003', openai_api_key='YOUR_API_KEY')
prompt = 'YOUR_PROMPT'
print(openai(prompt))


Not bad, but are these contexts actually helping? Maybe the model is able to answer these questions without the additional information (source knowledge) as is able to rely solely on information stored within the model's internal parameters (parametric knowledge). Let's ask again without the external information.

In [None]:
prompt = f"""Answer the question based on the contexts below. If the
question cannot be answered using the information provided answer
with "I don't know".

Question: Give me two examples of how to use OpenAI's GPT-3 model
using Python from start to finish

Answer: """

res = openai.Completion.create(
    engine='text-davinci-003',
    prompt=prompt,
    max_tokens=256,
    temperature=0.0
)

print(res['choices'][0]['text'].strip())

1. Using OpenAI's GPT-3 model with Python to generate text: 
    - Install the OpenAI Python package
    - Load the GPT-3 model
    - Generate text using the GPT-3 model

2. Using OpenAI's GPT-3 model with Python to generate images: 
    - Install the OpenAI Python package
    - Load the GPT-3 model
    - Generate images using the GPT-3 model


These are not really what we asked for, and are definitely not very specific. So clearly adding some source knowledge to our prompts can result in some much better results.

## Maximum Prompt Sizes

Considering that we might want to feed in external information to our prompts, they can naturally become quite large. With this we need to ask how large our prompts can be, because there is a maxiumum size.

The maxiumum *context window* of a LLM refers to tokens across both the *prompt* and the *completion* text. For `text-davinci-003` this is `4097` tokens.

We can set the maximum completion length of our model using `openai.max_tokens = 123`. However, measuring the total number of input tokens is more complex.

Because tokens don't map directly to words, we can only measure the number of tokens from text by actually tokenizing the text. GPT models use [OpenAI's TikToken tokenizer](https://github.com/openai/tiktoken). We can install the library via Pip:

In [None]:
!pip install -qU tiktoken==0.4.0

[?25l     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/1.7 MB[0m [31m?[0m eta [36m-:--:--[0m[2K     [91m━━━━[0m[90m╺[0m[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.2/1.7 MB[0m [31m5.0 MB/s[0m eta [36m0:00:01[0m[2K     [91m━━━━━━━━━━━━━[0m[91m╸[0m[90m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.6/1.7 MB[0m [31m8.5 MB/s[0m eta [36m0:00:01[0m[2K     [91m━━━━━━━━━━━━━━━━━━━━━━━━━[0m[91m╸[0m[90m━━━━━━━━━━━━━━[0m [32m1.1/1.7 MB[0m [31m10.7 MB/s[0m eta [36m0:00:01[0m[2K     [91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[91m╸[0m [32m1.7/1.7 MB[0m [31m12.5 MB/s[0m eta [36m0:00:01[0m[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.7/1.7 MB[0m [31m11.2 MB/s[0m eta [36m0:00:00[0m
[?25h

Taking the earlier prompt we can measure the number of tokens like so:

In [None]:
import tiktoken

prompt = f"""Answer the question based on the contexts below. If the
question cannot be answered using the information provided answer
with "I don't know".

###

Contexts:
{'##'.join(contexts)}

###

Question: Give me two examples of how to use OpenAI's GPT-3 model
using Python from start to finish

Answer: """

encoder_name = 'p50k_base'
tokenizer = tiktoken.get_encoding(encoder_name)

len(tokenizer.encode(prompt))

412

When feeding this prompt into `text-davinci-003` it will use `412` of our maximum context window of `4097`, leaving us with `4097 - 412 == 3685` tokens for our completion.

---

*Not all OpenAI models use the `p50k_base` encoder, a table of different encoders for different models can be found [here](), as of this writing they are:*

| Encoding name | OpenAI models |
| --- | --- |
| `gpt2` (or `r50k_base`) | Most GPT-3 models (and GPT-2) |
| `p50k_base` | Code models, `text-davinci-002`, `text-davinci-003` |
| `cl100k_base` | `text-embedding-ada-002` |

---

By default the maximum number of tokens used for completion is `256`. We can increase this upto the maximum calculated above of `3685`:

In [None]:
res = openai.Completion.create(
    engine='text-davinci-003',
    prompt=prompt,
    temperature=0.0,
    max_tokens=3685
)

print(res['choices'][0]['text'].strip())

1. Import the `openai` library with pip, set the API key, and use the `Completion.create()` method to generate a response to a prompt: 
```import openai
openai.api_key = 'YOUR_API_KEY'
prompt = '<YOUR PROMPT>'
res = openai.Completion.create(engine='text-davinci-003', prompt=prompt, max_tokens=100)
print(res)```

2. Install the LangChain library with `pip install langchain openai`, import the library, and initialize the model with the API key: 
```from langchain.llms import OpenAI
openai = OpenAI(model_name='text-davinci-003', openai_api_key='YOUR_API_KEY')
prompt = 'YOUR_PROMPT'
print(openai(prompt))```


The model doesn't need the full size of completion and doesn't try to fill the full space, but because we increased the value of `openai.max_tokens`, inference does take notably longer.

If we exceed the maximum context window allowed, we'll see an error.

In [None]:
try:
    res = openai.Completion.create(
        engine='text-davinci-003',
        prompt=prompt,
        temperature=0.0,
        max_tokens=3686
    )
except openai.InvalidRequestError as e:
    print(e)

This model's maximum context length is 4097 tokens, however you requested 4098 tokens (412 in your prompt; 3686 for the completion). Please reduce your prompt; or completion length.


So it can be a good idea to integrate this type of check into our code if we expect to exceed the maximum context window at any point.

# Prompting using Gpt3.5 turbo
In this lesson, you'll practice prompting principles and their related tactics by using the GPT3.5 turbo

## Setup
#### Load the API key and relevant Python libaries.

In this course, we've provided some code that loads the OpenAI API key for you.

In [None]:
import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY") or "OPENAI_API_KEY"

openai.Engine.list()  # check we have authenticated

<OpenAIObject list at 0x7babef03ff60> JSON: {
  "object": "list",
  "data": [
    {
      "object": "engine",
      "id": "text-search-babbage-doc-001",
      "ready": true,
      "owner": "openai-dev",
      "permissions": null,
      "created": null
    },
    {
      "object": "engine",
      "id": "gpt-4-0613",
      "ready": true,
      "owner": "openai",
      "permissions": null,
      "created": null
    },
    {
      "object": "engine",
      "id": "curie-search-query",
      "ready": true,
      "owner": "openai-dev",
      "permissions": null,
      "created": null
    },
    {
      "object": "engine",
      "id": "text-search-babbage-query-001",
      "ready": true,
      "owner": "openai-dev",
      "permissions": null,
      "created": null
    },
    {
      "object": "engine",
      "id": "babbage",
      "ready": true,
      "owner": "openai",
      "permissions": null,
      "created": null
    },
    {
      "object": "engine",
      "id": "gpt-3.5-turbo-instruct-0

#### helper function
Throughout this course, we will use OpenAI's `gpt-3.5-turbo` model and the [chat completions endpoint](https://platform.openai.com/docs/guides/chat).

This helper function will make it easier to use prompts and look at the generated outputs:

In [None]:
def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0, # this is the degree of randomness of the model's output
    )
    return response.choices[0].message["content"]

In [None]:
text = f"""
You should express what you want a model to do by \
providing instructions that are as clear and \
specific as you can possibly make them. \
This will guide the model towards the desired output, \
and reduce the chances of receiving irrelevant \
or incorrect responses. Don't confuse writing a \
clear prompt with writing a short prompt. \
In many cases, longer prompts provide more clarity \
and context for the model, which can lead to \
more detailed and relevant outputs.
"""
prompt = f"""
Summarize the text delimited by triple backticks \
into a single sentence.
```{text}```
"""
response = get_completion(prompt)
print(response)

To guide a model towards the desired output and minimize irrelevant or incorrect responses, it is important to provide clear and specific instructions, even if it means writing longer prompts that offer more clarity and context.


#### Tactic 2: Ask for a structured output
- JSON, HTML

In [None]:
prompt = f"""
Generate a list of three made-up book titles along \
with their authors and genres.
Provide them in JSON format with the following keys:
book_id, title, author, genre.
"""
response = get_completion(prompt)
print(response)

{
  "books": [
    {
      "book_id": 1,
      "title": "The Enigma of Elysium",
      "author": "Aria Nightshade",
      "genre": "Fantasy"
    },
    {
      "book_id": 2,
      "title": "Whispers in the Shadows",
      "author": "Evelyn Blackwood",
      "genre": "Mystery"
    },
    {
      "book_id": 3,
      "title": "Beyond the Veil",
      "author": "Lucian Rivers",
      "genre": "Horror"
    }
  ]
}


#### Tactic 3: Ask the model to check whether conditions are satisfied

In [None]:
text_1 = f"""
Making a cup of tea is easy! First, you need to get some \
water boiling. While that's happening, \
grab a cup and put a tea bag in it. Once the water is \
hot enough, just pour it over the tea bag. \
Let it sit for a bit so the tea can steep. After a \
few minutes, take out the tea bag. If you \
like, you can add some sugar or milk to taste. \
And that's it! You've got yourself a delicious \
cup of tea to enjoy.
"""
prompt = f"""
You will be provided with text delimited by triple quotes.
If it contains a sequence of instructions, \
re-write those instructions in the following format:

Step 1 - ...
Step 2 - …
…
Step N - …

If the text does not contain a sequence of instructions, \
then simply write \"No steps provided.\"

\"\"\"{text_1}\"\"\"
"""
response = get_completion(prompt)
print("Completion for Text 1:")
print(response)

Completion for Text 1:
Step 1 - Get some water boiling.
Step 2 - Grab a cup and put a tea bag in it.
Step 3 - Pour the hot water over the tea bag.
Step 4 - Let the tea steep for a few minutes.
Step 5 - Take out the tea bag.
Step 6 - Add sugar or milk to taste.
Step 7 - Enjoy your cup of tea.


In [None]:
text_2 = f"""
The sun is shining brightly today, and the birds are \
singing. It's a beautiful day to go for a \
walk in the park. The flowers are blooming, and the \
trees are swaying gently in the breeze. People \
are out and about, enjoying the lovely weather. \
Some are having picnics, while others are playing \
games or simply relaxing on the grass. It's a \
perfect day to spend time outdoors and appreciate the \
beauty of nature.
"""
prompt = f"""
You will be provided with text delimited by triple quotes.
If it contains a sequence of instructions, \
re-write those instructions in the following format:

Step 1 - ...
Step 2 - …
…
Step N - …

If the text does not contain a sequence of instructions, \
then simply write \"No steps provided.\"

\"\"\"{text_2}\"\"\"
"""
response = get_completion(prompt)
print("Completion for Text 2:")
print(response)

Completion for Text 2:
No steps provided.


#### Tactic 4: "Few-shot" prompting

In [None]:
prompt = f"""
Your task is to answer in a consistent style.

<child>: Teach me about patience.

<grandparent>: The river that carves the deepest \
valley flows from a modest spring; the \
grandest symphony originates from a single note; \
the most intricate tapestry begins with a solitary thread.

<child>: Teach me about resilience.
"""
response = get_completion(prompt)
print(response)

<grandparent>: Resilience is like a mighty oak tree that withstands the strongest storms, bending but never breaking. It is the ability to bounce back from adversity, to find strength in the face of challenges, and to persevere even when the odds seem insurmountable. Just as a diamond is formed under immense pressure, resilience is forged through the trials and tribulations of life.


#### Tactic 5: Specify the steps required to complete a task

In [None]:
text = f"""
In a charming village, siblings Jack and Jill set out on \
a quest to fetch water from a hilltop \
well. As they climbed, singing joyfully, misfortune \
struck—Jack tripped on a stone and tumbled \
down the hill, with Jill following suit. \
Though slightly battered, the pair returned home to \
comforting embraces. Despite the mishap, \
their adventurous spirits remained undimmed, and they \
continued exploring with delight.
"""
# example 1
prompt_1 = f"""
Perform the following actions:
1 - Summarize the following text delimited by triple \
backticks with 1 sentence.
2 - Translate the summary into French.
3 - List each name in the French summary.
4 - Output a json object that contains the following \
keys: french_summary, num_names.

Separate your answers with line breaks.

Text:
```{text}```
"""
response = get_completion(prompt_1)
print("Completion for prompt 1:")
print(response)

Completion for prompt 1:
1 - Jack and Jill, siblings, go on a quest to fetch water from a well on a hill, but they both fall down the hill and return home slightly injured but still adventurous.

2 - Jack et Jill, frère et sœur, partent à la recherche d'eau d'un puits situé au sommet d'une colline, mais ils tombent tous les deux et rentrent chez eux légèrement blessés mais toujours aventureux.

3 - Jack, Jill.

4 - {
  "french_summary": "Jack et Jill, frère et sœur, partent à la recherche d'eau d'un puits situé au sommet d'une colline, mais ils tombent tous les deux et rentrent chez eux légèrement blessés mais toujours aventureux.",
  "num_names": 2
}


#### Ask for output in a specified format

In [None]:
prompt_2 = f"""
Your task is to perform the following actions:
1 - Summarize the following text delimited by
  <> with 1 sentence.
2 - Translate the summary into French.
3 - List each name in the French summary.
4 - Output a json object that contains the
  following keys: french_summary, num_names.

Use the following format:
Text: <text to summarize>
Summary: <summary>
Translation: <summary translation>
Names: <list of names in Italian summary>
Output JSON: <json with summary and num_names>

Text: <{text}>
"""
response = get_completion(prompt_2)
print("\nCompletion for prompt 2:")
print(response)


Completion for prompt 2:
Summary: Jack and Jill go on a quest to fetch water from a hilltop well, but they both fall down the hill and return home slightly battered but still adventurous.
Translation: Jack et Jill partent à la recherche d'eau d'un puits au sommet d'une colline, mais ils tombent tous les deux et rentrent chez eux légèrement blessés mais toujours aventureux.
Names: Jack, Jill
Output JSON: {"french_summary": "Jack et Jill partent à la recherche d'eau d'un puits au sommet d'une colline, mais ils tombent tous les deux et rentrent chez eux légèrement blessés mais toujours aventureux.", "num_names": 2}


#### Tactic 6: Instruct the model to work out its own solution before rushing to a conclusion

In [None]:
prompt = f"""
Determine if the student's solution is correct or not.

Question:
I'm building a solar power installation and I need \
 help working out the financials.
- Land costs $100 / square foot
- I can buy solar panels for $250 / square foot
- I negotiated a contract for maintenance that will cost \
me a flat $100k per year, and an additional $10 / square \
foot
What is the total cost for the first year of operations
as a function of the number of square feet.

Student's Solution:
Let x be the size of the installation in square feet.
Costs:
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 100x
Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
"""
response = get_completion(prompt)
print(response)

The student's solution is correct. They correctly identified the costs for land, solar panels, and maintenance, and calculated the total cost as a function of the number of square feet.


#### Note that the student's solution is actually not correct.
#### We can fix this by instructing the model to work out its own solution first.

In [None]:
prompt = f"""
Your task is to determine if the student's solution \
is correct or not.
To solve the problem do the following:
- First, work out your own solution to the problem.
- Then compare your solution to the student's solution \
and evaluate if the student's solution is correct or not.
Don't decide if the student's solution is correct until
you have done the problem yourself.

Use the following format:
Question:
```
question here
```
Student's solution:
```
student's solution here
```
Actual solution:
```
steps to work out the solution and your solution here
```
Is the student's solution the same as actual solution \
just calculated:
```
yes or no
```
Student grade:
```
correct or incorrect
```

Question:
```
I'm building a solar power installation and I need help \
working out the financials.
- Land costs $100 / square foot
- I can buy solar panels for $250 / square foot
- I negotiated a contract for maintenance that will cost \
me a flat $100k per year, and an additional $10 / square \
foot
What is the total cost for the first year of operations \
as a function of the number of square feet.
```
Student's solution:
```
Let x be the size of the installation in square feet.
Costs:
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 100x
Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
```
Actual solution:
"""
response = get_completion(prompt)
print(response)

To calculate the total cost for the first year of operations, we need to add up the costs of land, solar panels, and maintenance.

Let x be the size of the installation in square feet.

1. Land cost: $100 / square foot
The cost of land is calculated by multiplying the size of the installation by the cost per square foot:
Land cost = 100 * x

2. Solar panel cost: $250 / square foot
The cost of solar panels is calculated by multiplying the size of the installation by the cost per square foot:
Solar panel cost = 250 * x

3. Maintenance cost: $100,000 + $10 / square foot
The maintenance cost is a flat fee of $100,000 per year, plus an additional $10 per square foot:
Maintenance cost = 100,000 + 10 * x

Total cost for the first year of operations:
Total cost = Land cost + Solar panel cost + Maintenance cost
Total cost = 100 * x + 250 * x + 100,000 + 10 * x
Total cost = 360 * x + 100,000

Is the student's solution the same as the actual solution just calculated:
Yes

Student grade:
Correct


## Model Limitations: Hallucinations
- Boie is a real company, the product name is not real.

In [None]:
prompt = f"""
Tell me about AeroGlide UltraSlim Smart Toothbrush by Boie
"""
response = get_completion(prompt)
print(response)

The AeroGlide UltraSlim Smart Toothbrush by Boie is a technologically advanced toothbrush designed to provide a superior brushing experience. Boie is a company known for its innovative oral care products, and the AeroGlide UltraSlim Smart Toothbrush is no exception.

One of the standout features of this toothbrush is its ultra-slim design. The brush head is only 2mm thick, making it much thinner than traditional toothbrushes. This slim profile allows for better access to hard-to-reach areas of the mouth, ensuring a thorough and effective clean.

The AeroGlide UltraSlim Smart Toothbrush also incorporates smart technology. It connects to a mobile app via Bluetooth, allowing users to track their brushing habits and receive personalized recommendations for improving their oral hygiene routine. The app provides real-time feedback on brushing technique, ensuring that users are brushing for the recommended two minutes and covering all areas of their mouth.

The toothbrush itself is made from 

## Exercise Tasks

### 1. `Inferring` you will infer sentiment and topics from product reviews and news articles.

Review text.
lamp_review = """
Needed a nice lamp for my bedroom, and this one had \
additional storage and not too high of a price point. \
Got it fast.  The string to our lamp broke during the \
transit and the company happily sent over a new one. \
Came within a few days as well. It was easy to put \
together.  I had a missing part, so I contacted their \
support and they very quickly got me the missing piece! \
Lumina seems to me to be a great company that cares \
about their customers and products!!
"""

####  Task1
1) Write prompt for Sentiment (positive/negative)
2) Write prompt Identify types of emotions
3) Write prompt Identify anger
4) Write prompt Extract product and company name from customer reviews



story = """
In a recent survey conducted by the government,
public sector employees were asked to rate their level
of satisfaction with the department they work at.
The results revealed that NASA was the most popular
department with a satisfaction rating of 95%.

One NASA employee, John Smith, commented on the findings,
stating, "I'm not surprised that NASA came out on top.
It's a great place to work with amazing people and
incredible opportunities. I'm proud to be a part of
such an innovative organization."

The results were also welcomed by NASA's management team,
with Director Tom Johnson stating, "We are thrilled to
hear that our employees are satisfied with their work at NASA.
We have a talented and dedicated team who work tirelessly
to achieve our goals, and it's fantastic to see that their
hard work is paying off."

The survey also revealed that the
Social Security Administration had the lowest satisfaction
rating, with only 45% of employees indicating they were
satisfied with their job. The government has pledged to
address the concerns raised by employees in the survey and
work towards improving job satisfaction across all departments.

"""

####  Task2

1) Infer 5 topics that are being discussed in the given story

## Task 1 Solution

In [None]:
lamp_review = """Needed a nice lamp for my bedroom, and this one had
additional storage and not too high of a price point.
Got it fast. The string to our lamp broke during the
transit and the company happily sent over a new one.
Came within a few days as well. It was easy to put
together. I had a missing part, so I contacted their
support and they very quickly got me the missing piece!
Lumina seems to me to be a great company that cares
about their customers and products!!"""

# 1) Write prompt for Sentiment (positive/negative)

prompt1_task_1 = f"""
Your task is to perform the following action:
-> Perform the sentiment analysis of the following lamp review delimited by
   <> and identify the whether the sentiments are positive or negative.

Use the following format:
Sentiments are: <Sentiments>

lamp review: <{lamp_review}>
"""
response = get_completion(prompt1_task_1)
print("\nCompletion for prompt task 1 part 1:")
print(response)

# 2) Write prompt Identify types of emotions

prompt2_task_1 = f"""
Your will be provided by a customer review on a company service:
-> Go through the following review delimited by \
   <> and pay special attentions to customer emotions to identify \
   the different type of emotions presented in their review.

Use the following format:
No of Emotions: <Number of different emotions in the review>
Emotions Types: <Types of Emotions in the review>


lamp review: <{lamp_review}>
"""
response = get_completion(prompt2_task_1)
print("\nCompletion for prompt task 1 part 2:")
print(response)

# 3) Write prompt Identify anger

prompt3_task_1 = f"""
Your task is to perform the following action:
-> Perform the sentiment analysis of the following lamp review delimited by
   <> and identify the whether there is anger in review or not.

Use the following format:
Anger in review: <Anger in review>

lamp review: <{lamp_review}>
"""
response = get_completion(prompt3_task_1)
print("\nCompletion for prompt task 1 part 3:")
print(response)

# 4) Write prompt Extract product and company name from customer reviews

prompt4_task_1 = f"""
Your task is to extract the product and company from customer reviews:
-> Perform the actionon the following review delimited by <>

Use the following format:
Product: <Product name>
Company: <Company name>


lamp review: <{lamp_review}>
"""
response = get_completion(prompt4_task_1)
print("\nCompletion for prompt task 1 part 4:")
print(response)



Completion for prompt task 1 part 1:
Sentiments are: Positive

Completion for prompt task 1 part 2:
No of Emotions: 3
Emotions Types: Excitement, Satisfaction, Gratitude

Completion for prompt task 1 part 3:
Anger in review: No

Completion for prompt task 1 part 4:
Product: lamp
Company: Lumina


## Task 2 Solution

In [None]:
survey = """
In a recent survey conducted by the government, public sector employees were asked \
to rate their level of satisfaction with the department they work at. \
The results revealed that NASA was the most popular department with a satisfaction rating of 95%.

One NASA employee, John Smith, commented on the findings, stating, "I'm not surprised that NASA \
came out on top. It's a great place to work with amazing people and incredible opportunities. \
I'm proud to be a part of such an innovative organization."

The results were also welcomed by NASA's management team, with Director Tom Johnson stating, \
"We are thrilled to hear that our employees are satisfied with their work at NASA. We have a \
talented and dedicated team who work tirelessly to achieve our goals, and it's fantastic to see\
that their hard work is paying off."

The survey also revealed that the Social Security Administration had the lowest \
satisfaction rating, with only 45% of employees indicating they were satisfied with \
their job. The government has pledged to address the concerns raised by employees in \
the survey and work towards improving job satisfaction across all departments.
"""

# Infer 5 topics that are being discussed in the given story


prompt_task_2 = f""" Yow will be provided by a Survey on work satisfaction level of public sector employees.
Your job is to perform the following tasks:
1 - Indentify Different Topics that are mentioned in the {survey}.
2 - Provide the list of five topics that are being discussed in the given survey.

Follow the given structure:
Topics that are being discussed in the given survey: <5 topics>
Separate each topic by new line
"""

response = get_completion(prompt_task_2)
print("\nCompletion for prompt task 2:")
print(response)


Completion for prompt task 2 part:
Topics that are being discussed in the given survey:
1. Satisfaction levels of public sector employees
2. Departmental satisfaction ratings
3. NASA as the most popular department with high satisfaction rating
4. Employee comments on NASA's work environment and opportunities
5. Social Security Administration's low satisfaction rating and government's commitment to address concerns


## 2 `Expanding` you will generate customer service emails that are tailored to each customer's review.

review = """So, they still had the 17 piece system on seasonal \
sale for around 49 dollar in the month of November, about \
half off, but for some reason (call it price gouging) \
around the second week of December the prices all went \
up to about anywhere from between 70-89 dollar for the same \
system. And the 11 piece system went up around 10 dollar or \
so in price also from the earlier sale price of 29 dollar. \
So it looks okay, but if you look at the base, the part \
where the blade locks into place doesn’t look as good \
as in previous editions from a few years ago, but I \
plan to be very gentle with it (example, I crush \
very hard items like beans, ice, rice, etc. in the \
blender first then pulverize them in the serving size \
I want in the blender then switch to the whipping \
blade for a finer flour, and use the cross cutting blade \
first when making smoothies, then use the flat blade \
if I need them finer/less pulpy). Special tip when making \
smoothies, finely cut and freeze the fruits and \
vegetables (if using spinach-lightly stew soften the \
spinach then freeze until ready for use-and if making \
sorbet, use a small to medium sized food processor) \
that you plan to use that way you can avoid adding so \
much ice if at all-when making your smoothie. \
After about a year, the motor was making a funny noise. \
I called customer service but the warranty expired \
already, so I had to buy another one. FYI: The overall \
quality has gone done in these types of products, so \
they are kind of counting on brand recognition and \
consumer loyalty to maintain sales. Got it in about \
two days."""

####  Task1
1) Write prompt for Customize the automated reply to a customer email and remind the model to use details from the customer's email


## Customer Service Email Task 3 Solution

In [None]:
review = """So, they still had the 17 piece system on seasonal \
sale for around 49 dollar in the month of November, about \
half off, but for some reason (call it price gouging) \
around the second week of December the prices all went \
up to about anywhere from between 70-89 dollar for the same \
system. And the 11 piece system went up around 10 dollar or \
so in price also from the earlier sale price of 29 dollar.
So it looks okay, but if you look at the base, the part \
where the blade locks into place doesn’t look as good \
as in previous editions from a few years ago, but I \
plan to be very gentle with it (example, I crush \
very hard items like beans, ice, rice, etc. in the \
blender first then pulverize them in the serving size \
I want in the blender then switch to the whipping \
blade for a finer flour, and use the cross cutting blade \
first when making smoothies, then use the flat blade \
if I need them finer/less pulpy).
Special tip when making smoothies, finely cut and freeze the fruits and \
vegetables (if using spinach-lightly stew soften the \
spinach then freeze until ready for use-and if making \
sorbet, use a small to medium sized food processor) \
that you plan to use that way you can avoid adding so \
much ice if at all-when making your smoothie.
After about a year, the motor was making a funny noise. \
I called customer service but the warranty expired \
already, so I had to buy another one. FYI: The overall \
quality has gone done in these types of products, so \
they are kind of counting on brand recognition and \
consumer loyalty to maintain sales. Got it in about \
two days."""

# Write prompt for Customize the automated reply to a customer email and remind the model to use details from the customer's email

prompt_task_3 = f"""
                You will be provided by customers reviews that are delimited by <>, and you have to responde with custom \
customer service emails tailored to their reviews.

Keep these things into consideration:
-> If name of customer is given only then mention their name in the start.
-> Email should be to the point.
-> Maintain proper sentence spaces.

review: <{review}>

Follow this Format:
1 - Greet the reviewer with Positive sentiments.
2 - Analyze the review, whether its positive or negative.
3 - If positive then appreciate, and in case of negative appologies and offer assistance.
4 - Pay attention to details of review that are relavent to customer service, then responde shortly, and accordingly.
5 - Finish email with Regards with Customer Service Team.

"""
response = get_completion(prompt_task_3)
print("\nCustomer Service Email tailored to the Customer's Review:\n")
print(response)


Customer Service Email tailored to the Customer's Review:

Dear valued customer,

Thank you for taking the time to share your feedback with us. We appreciate your support and loyalty to our brand.

We apologize for any inconvenience you may have experienced regarding the pricing of our 17 piece system. Our prices are subject to change based on various factors, including seasonal sales and market conditions. However, we understand your concern and will take it into consideration for future pricing decisions.

Regarding the base of the system, we appreciate your feedback on the locking mechanism. We continuously strive to improve the quality of our products, and your input will be shared with our product development team for further evaluation.

Thank you for sharing your special tip for making smoothies. We value your expertise and will definitely consider it for our future recipe recommendations.

We are sorry to hear about the issue you faced with the motor of your previous blender. 

## 3 Summarizing you will summarize text with a focus on specific topics.

prod_review = """
Got this panda plush toy for my daughter's birthday, \
who loves it and takes it everywhere. It's soft and \
super cute, and its face has a friendly look. It's \
a bit small for what I paid though. I think there \
might be other options that are bigger for the \
same price. It arrived a day earlier than expected, \
so I got to play with it myself before I gave it \
to her.
"""

### Task

1) Write a prompt to summarize it with focus on delivery and shipping
2) Write a prompt to summarize it with focus on price and value


## Summarization Task 4 Solution

In [None]:
prod_review = """Got this panda plush toy for my daughter's birthday, \
who loves it and takes it everywhere. It's soft and \
super cute, and its face has a friendly look.
It's a bit small for what I paid though. I think there \
might be other options that are bigger for the \
same price.
It arrived a day earlier than expected, \
so I got to play with it myself before I gave it \
to her. """

# 1) Write a prompt to summarize it with focus on delivery and shipping
prompt1_task_4 = f"""Your task is to summerize the text given in {prod_review} in one sentence:

The summery should be focused on:
-> Product Delivery and product Shipping

"""
response = get_completion(prompt1_task_4)
print("\nSummary with focus on delivery and shipping:\n")
print(response)

# 2) Write a prompt to summarize it with focus on price and value
prompt2_task_4 = f"""Your task is to summerize the text given in {prod_review} in one sentence:

The summery should be focused on:
1 - Product price
2 - Product value

"""
response = get_completion(prompt2_task_4)
print("\n\nSummary with focus on price and value:\n")
print(response)


Summary with focus on delivery and shipping:

The panda plush toy arrived a day earlier than expected, allowing the customer to play with it before giving it to their daughter for her birthday.


Summary with focus on price and value:

The panda plush toy is loved by the daughter and has a friendly look, but the customer feels it is slightly overpriced and suggests there may be larger options available for the same price.


# Transforming use Large Language Models for text transformation tasks such as language translation, spelling and grammar checking, tone adjustment, and format conversion.



## Task 1
'Dude, This is Joe, check out this spec on this standing lamp.'
1) Write prompt toransform this text into formal tone

2) Write a prompt to transform and check for the spelling and Homonyms

    Text =
  "The girl with the black and white puppies have a ball.",  # The girl has a ball.
  "Yolanda has her notebook.", # ok
  "Its going to be a long day. Does the car need it’s oil changed?",  # Homonyms
  "Their goes my freedom. There going to bring they’re suitcases.",  # Homonyms
  "Your going to need you’re notebook.",  # Homonyms
  "That medicine effects my ability to sleep. Have you heard of the butterfly affect?", # Homonyms
  "This phrase is to cherck chatGPT for speling abilitty"  # spelling
  
3) Write a prompt to translate all given text to english

text =   "La performance du système est plus lente que d'habitude.",  
  "Mi monitor tiene píxeles que no se iluminan.",              
  "Il mio mouse non funziona",                               
  "Mój klawisz Ctrl jest zepsuty",                            
  "我的屏幕在闪烁"                                               






# **Transforming use Large Language Models for text transformation**

## Formal Tone Task 1 Solution

In [None]:
text_1 = "Dude, This is Joe, check out this spec on this standing lamp."

#  1) Write prompt to transform this text into formal tone
prompt1_task_5 = f"""Your task is to convert the informal text into one with formal tone:

Covert the text delimited by <> into the equavalent formal text.

text: <{text_1}>

"""
# using gpt 3.5 turbo
response = get_completion(prompt1_task_5)
print("\nCoversion of informal text to formal text:\n")
print(response)


Coversion of informal text to formal text:

Hello, this is Joe. I would like to draw your attention to this specification for the standing lamp.


## Same Formal Tone Task using Davinci

In [None]:
from langchain.llms import OpenAI

davinci = OpenAI(model_name='text-davinci-003')

temp = """Your task is to convert the informal text into one with formal tone:


text: {informal_text}

formal text:

"""

prompt_for_davinci = PromptTemplate(
    template = temp,
    input_variables = ["informal_text"]
)


davinci = LLMChain(
    prompt=prompt_for_davinci,
    llm=davinci
)

text_to_conv = "Dude, This is Joe, check out this spec on this standing lamp."

print(davinci.run(text_to_conv))




Good day, my name is Joe. I would like to draw your attention to the specifications of this standing lamp.


## Transform and check for the spelling and Homonyms Task 2 Solution

In [None]:
text_2 = """The girl with the black and white puppies have a ball.", # The girl has a ball
"Yolanda has her notebook.", # ok
"Its going to be a long day. Does the car need it’s oil changed?", # Homonyms
"Their goes my freedom. There going to bring they’re suitcases.", # Homonyms
"Your going to \ need you’re notebook.", # Homonyms
"That medicine effects my ability to sleep. Have you heard of the butterfly affect?", # Homonyms
"This phrase is to cherck chatGPT for speling abilitty" # spelling"""

# 2) Write a prompt to transform and check for the spelling and Homonyms

prompt2_task_5 = f"""You will be provided by a text delimited by <> and you have to perform certain task on it:

These are the following task you should perform:
1 - Make list of spelling mistakes in the text.
2 - Make list of Homonyms in the text.
3 - Correct the spelling mistakes.
4 - Correct the grammatical mistakes.
5 - Generate a fine text line by line.

text: <{text_2}>

Here are some examples of Homonyms:

(Bat: Bats are mammals that can fly.
 Bat: A bat is also a piece of sports equipment used in games like baseball.)

(Week: A period of seven days.
 Weak: Lacking in strength or power.)

"""

# using gpt 3.5 turbo
response = get_completion(prompt2_task_5)
print("Transform and check for the spelling and Homonyms:\n")
print(response)


Transform and check for the spelling and Homonyms:

1 - List of spelling mistakes in the text:
- cherck (check)
- chatGPT (ChatGPT)
- speling (spelling)
- abilitty (ability)

2 - List of Homonyms in the text:
- its/it's
- there/their/they're
- your/you're
- affect/effect

3 - Corrected spelling mistakes in the text:
- "This phrase is to check ChatGPT for spelling ability"

4 - Corrected grammatical mistakes in the text:
- "The girl with the black and white puppies has a ball."
- "Yolanda has her notebook."
- "It's going to be a long day. Does the car need its oil changed?"
- "There goes my freedom. They're going to bring their suitcases."
- "You're going to need your notebook."
- "That medicine affects my ability to sleep. Have you heard of the butterfly effect?"
- "This phrase is to check ChatGPT for spelling ability"

5 - Generated fine text line by line:
- The girl with the black and white puppies has a ball.
- Yolanda has her notebook.
- It's going to be a long day. Does the car ne

## Task 3 Solution

In [None]:
# 3) Write a prompt to translate all given text to english

text_3 = """
"La performance du système est plus lente que d'habitude.",
"Mi monitor tiene píxeles que no se iluminan.",
"Il mio mouse non funziona",
"Mój klawisz Ctrl jest zepsuty",
"我的屏幕在闪烁"
"""

prompt3_task_5 = f"""
You will be provided with texts from various languages delimited by <> and your task is to translate all of them into English:

text: <{text_3}>

Follow this Format:
English Translation: Translation for each line

"""

# using gpt 3.5 turbo
response = get_completion(prompt3_task_5)
print(response)

English Translation:
"The system performance is slower than usual."
"My monitor has pixels that do not light up."
"My mouse is not working."
"My Ctrl key is broken."
"My screen is flickering."


## Same Translation Task with Curie

In [None]:
llm_davinci = OpenAI(model_name='text-curie-001')

translation_prompt = """
Task is English Translation:

text: {text}

Translation:
"""

translation_prompt = PromptTemplate(
    template=translation_prompt,
    input_variables=["text"]
)

llm_davinci = LLMChain(
    prompt=translation_prompt,
    llm=llm_davinci
)

print("English Translation:\n" + llm_davinci.run(text_3))

English Translation:

"The performance of the system is slower than usual."
"My monitor has pixels that don't light up."
"My mouse isn't working."
"My Ctrl key is broken."
"My monitor is flickering."
