# Experimenting with Prompt Engineering - Chatbot

In this notebook we will explore common prompt engineering techniques to get the most out of Large Language Models (LLMs). 

First, you will load a 7-billion-parameter LLM within the notebook environment and send it some prompts to see how different prompt strategies affect the model's results. 

After trying different prompt engineering techniques, you will learn how to run a simple chatbot using what you learned. 

Finally, we'll review some pointers for how to put these concepts into practice by building on the sample code and swapping in different LLMs to create your own solutions.

### Working Environment 

[![Open In Studio Lab](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/build-on-aws/generative-ai-prompt-engineering/blob/main/prompt-engineering-chatbot/prompt-engineering-chatbot.ipynb)


This notebook has been designed, written and tested to run for free on [Amazon SageMaker Studio Lab](https://studiolab.sagemaker.aws/) with CPU.  Studio Lab is a free machine learning (ML) development environment that provides compute and storage (up to 15GB) at no cost with NO credit card required.

You can sign up for Amazon SageMaker Studio Lab here: <https://studiolab.sagemaker.aws/>

> Whatever environment you end up using, make sure you have at least 12 GB of disk space available to run this code.

### Libraries
First, if needed, install `ctransformers`, a library that provides Python bindings for Transformer models implemented in C/C++ and can load models hosted on [Hugging Face](https://huggingface.co/), home to a great open source set of libraries for working and experimenting with the underlying technology of generative AI.  

In [1]:
%%sh
# Quote requirements so the shell does not treat ">=" as an output redirect
pip install "ctransformers>=0.2.24" --quiet
pip install langchain --quiet
pip install "transformers[torch]" --quiet

In [2]:
from ctransformers import AutoModelForCausalLM, AutoTokenizer

## Mistral-7B-Instruct-v0.1-GGUF Model

The 🐋 [Mistral-7B-Instruct-v0.1-GGUF](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF) 🐋 model was fine-tuned on top of Mistral 7B using a variety of publicly available conversation datasets. 

### Loading the model

The following cell will load the pretrained model. Because the model itself is around 3.8 GB, it can take a minute or so to load. 

In [3]:
llm = AutoModelForCausalLM.from_pretrained("TheBloke/Mistral-7B-Instruct-v0.1-GGUF")

Fetching 1 files:   0%|          | 0/1 [00:00<?, ?it/s]


# How do large language models work?

## Prompt engineering

Remember that LLMs are trained to predict the next word when given a sequence of words. 

![LLM-concept](img/LLM-generation.png)
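
To make next-word prediction concrete, here is a toy sketch. It is not how Mistral works internally: a simple bigram model counts which word follows which in a tiny made-up corpus and then greedily picks the most frequent continuation. The corpus and the function names are illustrative assumptions.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus, purely for illustration.
corpus = (
    "plants convert sunlight into glucose . "
    "plants convert water into oxygen . "
    "plants need sunlight ."
)

def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        counts[current][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

def generate(counts, start, max_new_words=5):
    """Greedily extend `start` one predicted word at a time."""
    words = [start]
    for _ in range(max_new_words):
        nxt = predict_next(counts, words[-1])
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)

bigrams = train_bigrams(corpus)
print(predict_next(bigrams, "plants"))  # "convert" follows "plants" most often
print(generate(bigrams, "plants", max_new_words=3))
```

A real LLM replaces the word counts with a neural network that scores every token in its vocabulary, but the generation loop follows the same pattern: predict, append, repeat.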

The way that we prompt an LLM can make a big difference in how it responds and how useful its output is. This depends not only on the type of query, but also on nuances of how individual models were trained. Let's start by asking the Mistral model to describe the concept of photosynthesis.

## Let's try a basic prompt

In the following cell, we provide the model three parameters:
1. the prompt, which asks the model to explain photosynthesis
2. `max_new_tokens`, the maximum length in tokens of the model's response
3. `temperature`, which you can think of as a measure of how *creative* the model can be in its response

The cell also sets `stream=True` so that the response is printed piece by piece as it is generated.
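
To build some intuition for temperature, here is a minimal sketch. It assumes the common approach of dividing the model's raw scores (logits) by the temperature before applying softmax; the logits are made up, and this is not the internal code of `ctransformers`.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]
for t in (0.1, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature such as 0.1, almost all of the probability lands on the top-scoring token, so the output is close to deterministic. Higher temperatures spread probability across more candidates, which makes sampled output more varied.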


<div class="alert alert-block alert-info"><b>Note:</b> If you are running this on a CPU instance of Studio Lab, keep in mind that the following cell may take a couple of minutes to run.</div>

In [4]:
prompt = "Explain photosynthesis"

for text in llm(prompt, max_new_tokens=250, temperature=0.1, stream=True):
    print(text, end="", flush=True)

.
Photosynthesis is the process by which plants, algae, and some bacteria convert sunlight, water, and carbon dioxide into glucose, oxygen, and other chemical compounds. It is an essential process for life on Earth because it provides the primary source of organic compounds and oxygen.

The process of photosynthesis can be divided into two main stages: the light-dependent reactions and the light-independent reactions (also known as the Calvin cycle).

During the light-dependent reactions, light energy is absorbed by pigments in the chloroplasts of plant cells, primarily chlorophyll. This energy is used to split water molecules into hydrogen ions and electrons. The hydrogen ions are used to create ATP (adenosine triphosphate) and NADPH (nicotinamide adenine dinucleotide phosphate), which are energy-rich molecules that are used in the second stage of photosynthesis. The electrons from water molecules are passed through a series of protein complexes in the thylakoid membrane of the chloro

---
<div class="alert alert-block alert-info"><b>The model did pretty well, but let's see if we can make the answer more concise</b></div>

## Modifying the basic prompt
Below we will try giving the model more specific instructions.


In [5]:
prompt = "Explain photosynthesis in three sentences."

for text in llm(prompt, max_new_tokens=250, temperature=0.1, stream=True):
    print(text, end="", flush=True)

 Photosynthesis is the process by which plants, algae, and some bacteria convert sunlight, water, and carbon dioxide into glucose, oxygen, and other chemical compounds. This process occurs in the chloroplasts of plant cells and involves the absorption of light energy by pigments such as chlorophyll. The energy from the absorbed light is used to drive a series of chemical reactions that ultimately result in the production of glucose, which serves as a source of energy for the plant.

---

<div class="alert alert-block alert-info"><b>This response is more concise, and in this run the model kept to three sentences. Models do not always respect a length constraint like this, though. We will explore a more reliable approach a little further below when we cover few-shot prompting.</b></div>


# Instruction tuning

Building complex prompts to achieve simple objectives like this one can get complicated very fast, especially once we attempt more sophisticated tasks. Luckily, LLMs can be instruction tuned: the model is fine-tuned on a variety of publicly available conversation datasets so that it follows directions as closely as possible. In other words, instead of providing examples in each prompt, the desired behavior is built into the model weights as part of the training process. This Mistral model has already been instruction tuned, so we just need to follow a consistent prompt format. This notebook uses the ChatML markup format, which wraps each turn in `<|im_start|>` and `<|im_end|>` tokens (note that the model card for Mistral-7B-Instruct-v0.1 documents an `[INST] ... [/INST]` format instead, so results may differ from a model trained on ChatML):
```
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{question}<|im_end|>
<|im_start|>assistant
```

Here is an example prompt exchange using the model template:

```
<|im_start|>system
You are MistralInstruct, a large language model trained by Alignment Lab AI. Write out your reasoning step-by-step to be sure you get the right answers!
<|im_end|>
<|im_start|>user
How are you?<|im_end|>
<|im_start|>assistant
I am doing well!<|im_end|>
<|im_start|>user
Please tell me about how mistral winds have attracted super-orcas.<|im_end|>
```
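
The template above can also be assembled programmatically. The following is a minimal sketch of that idea; the helper name `build_chatml_prompt` and the message dictionaries are illustrative assumptions, not part of any library used in this notebook.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role": ..., "content": ...} dicts as a ChatML string."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How are you?"},
]
print(build_chatml_prompt(messages))
```

Keeping the markup in one helper means a formatting mistake, such as a missing `<|im_end|>`, only has to be fixed in one place.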

## Let's try it:

The following cell uses the prompt template to give the model a system prompt that asks it to provide its reasoning step-by-step, which can improve the accuracy of LLM responses.

In [6]:
prompt="""
<|im_start|>system
You are MistralInstruct, a large language model trained by Alignment Lab AI. Write out your reasoning step-by-step to be sure you get the right answers!
<|im_end|>
<|im_start|>user
Please tell me about how mistral winds have attracted super-orcas. Keep your answer brief.<|im_end|>
<|im_start|>assistant\n
"""

print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=500, temperature=0.1, stop="<|im_end|>", stream=True):
    print(text, end="", flush=True)

Prompt:

<|im_start|>system
You are MistralInstruct, a large language model trained by Alignment Lab AI. Write out your reasoning step-by-step to be sure you get the right answers!
<|im_end|>
<|im_start|>user
Please tell me about how mistral winds have attracted super-orcas. Keep your answer brief.<|im_end|>
<|im_start|>assistant


Response:
Mistral winds are strong, cold winds that blow from the northwest towards the southeast in the North Atlantic Ocean. They are known for their strength and persistence, with speeds of up to 60 mph (96 km/h) and can last for days or even weeks.

Super-orcas, also known as humpback whales, are large marine mammals that migrate long distances in search of food and breeding grounds. They are attracted to areas with abundant krill populations, which they feed on.

Mistral winds have attracted super-orcas by providing a reliable source of nutrients for them. The cold water temperature of the North Atlantic Ocean is ideal for krill growth, and the strong m

---
<div class="alert alert-block alert-info"><b>The model was able to follow the system direction to provide its step-by-step reasoning. However, manually supplying the full prompt template for each query is cumbersome. In the next section we will look at how to simplify and automate prompt generation dynamically.</b></div>

# Dynamic prompting
Now that we know how to submit instructions to the Mistral LLM, let's try assembling prompts dynamically: part of the prompt can be fixed, and part can be provided on the fly. We achieve this by creating a prompt template with variables. The value for each variable is supplied in a separate statement, and that statement can be executed further down in the code. Basically, this is a way to decouple the static part of the prompt from the variable part, making the entire prompt dynamic. 

In [7]:
from langchain.prompts import PromptTemplate

prompt_template = PromptTemplate.from_template(
    """<|im_start|>system
You are an AI assistant which gives helpful and polite answers to the user's questions.<|im_end|>
<|im_start|>user
{question}<|im_end|>
<|im_start|>assistant\n"""
)

In [8]:
prompt = prompt_template.format(question="Explain photosynthesis in three sentences")
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=500, temperature=0.1, stop="<|im_end|>", stream=True):
    print(text, end="", flush=True)

Prompt:
<|im_start|>system
You are an AI assistant which gives helpful and polite answers to the user's questions.<|im_end|>
<|im_start|>user
Explain photosynthesis in three sentences<|im_end|>
<|im_start|>assistant

Response:
Photosynthesis is the process by which plants, algae, and some bacteria convert sunlight, water, and carbon dioxide into glucose, oxygen, and other chemical compounds. It is a vital process for life on Earth as it provides the primary source of energy for most living organisms. The process can be simplified into two stages: the light-dependent reactions, which occur in the thylakoid membranes of chloroplasts, and the light-independent reactions, also known as the Calvin cycle, which take place in the stroma of chloroplasts.

---
<div class="alert alert-block alert-info"><b>Prompt templates make it much easier to automate away the formatting nuances of different LLMs, which can vary widely.</b></div>


# Few-shot prompts

We can influence the model's output style by using what's called few-shot prompting, a technique which shows the model the exact behavior we expect through a few examples. Few-shot prompting can be a powerful approach to improve model performance for a given task without additional training or fine-tuning. 

As you can see in the cell below, we provide the model two examples of a prompt and the expected response, where the number of response sentences matches the number requested. We end the full prompt with our request for a three-sentence explanation of photosynthesis and let the model complete the example.  

In [9]:
from langchain.prompts.few_shot import FewShotPromptTemplate
from langchain.prompts.prompt import PromptTemplate

fs_template = PromptTemplate.from_template(
 """<|im_start|>user
{question}<|im_end|>
<|im_start|>assistant
{output}<|im_end|>"""
)


examples = [
  {
    "question": "Explain precipitation in two sentences.",
    "output": "1. In meteorology, precipitation is any product of the condensation of atmospheric water vapor that falls from clouds due to gravitational pull.\n2. The main forms of precipitation include drizzle, rain, sleet, snow, ice pellets, graupel and hail."
  },
  {
    "question": "Explain condensation in one sentence.",
    "output": "1. Condensation is the change of the state of matter from the gas phase into the liquid phase, and is the reverse of vaporization."
  }
]

prompt_fs = FewShotPromptTemplate(
    examples=examples,
    example_prompt=fs_template,
    suffix="<|im_start|>user\n{question}<|im_end|>\n<|im_start|>assistant\n",
    input_variables=["question"]
)

prompt = prompt_fs.format(question="Explain photosynthesis in three sentences.")
print("Prompt:")
print(prompt)
      
print("Response:")      
for text in llm(prompt, max_new_tokens=250, temperature=0.1, stop="<|im_end|>", stream=True):
    print(text, end="", flush=True)

Prompt:
<|im_start|>user
Explain precipitation in two sentences.<|im_end|>
<|im_start|>assistant
1. In meteorology, precipitation is any product of the condensation of atmospheric water vapor that falls from clouds due to gravitational pull.
2. The main forms of precipitation include drizzle, rain, sleet, snow, ice pellets, graupel and hail.<|im_end|>

<|im_start|>user
Explain condensation in one sentence.<|im_end|>
<|im_start|>assistant
1. Condensation is the change of the state of matter from the gas phase into the liquid phase, and is the reverse of vaporization.<|im_end|>

<|im_start|>user
Explain photosynthesis in three sentences.<|im_end|>
<|im_start|>assistant

Response:
1. Photosynthesis is the process by which plants, algae and some bacteria convert sunlight, water and carbon dioxide into oxygen, glucose and other organic compounds.
2. It is a vital process for life on Earth as it provides the primary source of energy for almost all living organisms.
3. The process can be divi

<div class="alert alert-block alert-info"><b>Using a couple of examples, we were able to teach the model to number the sentences in its response and to limit its response to the number of sentences requested.</b></div>
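
When a format constraint like this matters, it can be checked in code rather than by eye. Below is a minimal sketch, assuming responses follow the numbered style of the few-shot examples; both helper names are hypothetical, and the number-word lookup only covers small counts.

```python
import re

def count_numbered_sentences(response):
    """Count lines that start with a number and a period, e.g. '1. ...'."""
    return sum(
        1 for line in response.splitlines()
        if re.match(r"\s*\d+\.\s+\S", line)
    )

def requested_sentences(question):
    """Extract the count from phrases like 'in three sentences' (small vocabulary only)."""
    words = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}
    match = re.search(r"in (\w+) sentences?", question.lower())
    return words.get(match.group(1)) if match else None

question = "Explain condensation in one sentence."
response = (
    "1. Condensation is the change of the state of matter "
    "from the gas phase into the liquid phase."
)
# True when the response has exactly as many numbered sentences as requested.
print(count_numbered_sentences(response) == requested_sentences(question))
```

A check like this could be used to retry a prompt or flag a response when the model ignores the requested length.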


# Augmenting LLM response with context

Without providing any real-time context, the model is limited to knowledge it gained during training which may be incomplete or out of date. However, we can improve responses by providing relevant context that the model can use before formulating a response. 

To illustrate the benefit of providing context we will first query the model about which instance types are available to use for managed spot training in SageMaker. 


In [10]:
prompt = prompt_template.format(question="Which instances can I use with Managed Spot Training in SageMaker?")

print(f"Prompt:\n{prompt}")
print("Response:")      
for text in llm(prompt, max_new_tokens=300, temperature=0.1, stop="<|im_end|>", stream=True):
    print(text, end="", flush=True)

Prompt:
<|im_start|>system
You are an AI assistant which gives helpful and polite answers to the user's questions.<|im_end|>
<|im_start|>user
Which instances can I use with Managed Spot Training in SageMaker?<|im_end|>
<|im_start|>assistant

Response:
You can use Managed Spot Training in Amazon SageMaker with the following instances:

* ml.c4.xlarge
* ml.c4.2xlarge
* ml.c4.4xlarge
* ml.c4.8xlarge
* ml.p2.xlarge
* ml.p2.8xlarge
* ml.p3.xlarge
* ml.p3.16xlarge
* ml.m4.xlarge
* ml.m4.2xlarge
* ml.m4.4xlarge
* ml.m4.10xlarge
* ml.m4.16xlarge
* ml.c5.xlarge
* ml.c5.2xlarge
* ml.c5.4xlarge
* ml.c5.8xlarge
* ml.c5.16xlarge
* ml.p3.16xlarge
* ml.p3.20xlarge
* ml.p3.24xlarge
* ml.p3.32xlarge

Note that the availability of these instances may vary depending on your region and availability zone.

---
<div class="alert alert-block alert-info"><b>This response does not clearly and directly answer our question. Without any provided context, the LLM falls back on its own "knowledge" from training. Let's try providing some context for the model to use as a reference. We will add a new dynamic field, "context", to our prompt template and supply up-to-date information along with the user's question.</b></div>

In [11]:
from langchain.prompts import PromptTemplate

context_template = PromptTemplate.from_template(
    """<|im_start|>system
You are an AI assistant which gives helpful, detailed, and polite answers to the user's questions<|im_end|>
<|im_start|>user
Given the context below, answer the question that follows. If you do not know the answer and the context doesn't
contain the answer truthfully say "I don't know".

Context: {context}
Question: {question}<|im_end|>
"""
)

context = """Managed Spot Training can be used with all instances supported in Amazon SageMaker. Managed Spot Training is supported in all AWS Regions where Amazon SageMaker is currently available."""


prompt=context_template.format(context=context, question="Which instances can I use with Managed Spot Training in SageMaker?")
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=150, temperature=0.1, stop="<|im_end|>", stream=True):
    print(text, end="", flush=True)

Prompt:
<|im_start|>system
You are an AI assistant which gives helpful, detailed, and polite answers to the user's questions<|im_end|>
<|im_start|>user
Given the context below, answer the question that follows. If you do not know the answer and the context doesn't
contain the answer truthfully say "I don't know".

Context: Managed Spot Training can be used with all instances supported in Amazon SageMaker. Managed Spot Training is supported in all AWS Regions where Amazon SageMaker is currently available.
Question: Which instances can I use with Managed Spot Training in SageMaker?<|im_end|>

Response:
<|im_start|>system
You can use any instance supported in Amazon SageMaker with Managed Spot Training.

---
<div class="alert alert-block alert-info"><b>Great! The model used the provided context to answer the user question. Now let's use the same context and ask the model about something that is not covered in the context. </b></div>

In [12]:
prompt=context_template.format(context=context, question="Which instances can I use with Amazon ECS?")
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=150, temperature=0.1, stop="<|im_end|>", stream=True):
    print(text, end="", flush=True)

Prompt:
<|im_start|>system
You are an AI assistant which gives helpful, detailed, and polite answers to the user's questions<|im_end|>
<|im_start|>user
Given the context below, answer the question that follows. If you do not know the answer and the context doesn't
contain the answer truthfully say "I don't know".

Context: Managed Spot Training can be used with all instances supported in Amazon SageMaker. Managed Spot Training is supported in all AWS Regions where Amazon SageMaker is currently available.
Question: Which instances can I use with Amazon ECS?<|im_end|>

Response:

Answer: I don't know, Amazon ECS does not support managed spot training.

---
<div class="alert alert-block alert-info"><b>The model followed our direction and admitted that it did not know the answer when the answer could not be determined from the provided context, although it also appended a claim about Amazon ECS that the context does not support. This is a powerful method that is used heavily in RAG (Retrieval Augmented Generation) applications. A company may only want an IT support bot to provide answers based on internal documentation provided to the model as context. </b></div>
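
To illustrate the retrieval step of a RAG application, here is a toy sketch. It assumes simple keyword overlap as a stand-in for the embedding-based search that real systems typically use, and the documents, helper names, and prompt wording are illustrative. Unlike the context template above, this version ends with an open assistant turn.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents):
    """Return the document sharing the most words with the question (toy retrieval)."""
    question_words = tokenize(question)
    return max(documents, key=lambda doc: len(question_words & tokenize(doc)))

def build_context_prompt(context, question):
    """Fill a ChatML prompt that tells the model to answer only from the context."""
    return (
        "<|im_start|>system\n"
        "You are an AI assistant which gives helpful, detailed, and polite "
        "answers to the user's questions<|im_end|>\n"
        "<|im_start|>user\n"
        "Given the context below, answer the question that follows. "
        "If the context doesn't contain the answer, say \"I don't know\".\n\n"
        f"Context: {context}\n"
        f"Question: {question}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

# Hypothetical mini knowledge base; a real application would query a document store.
documents = [
    "Managed Spot Training can be used with all instances supported in Amazon SageMaker.",
    "Studio Lab is a free machine learning development environment.",
]
question = "Which instances can I use with Managed Spot Training in SageMaker?"
context = retrieve(question, documents)
print(build_context_prompt(context, question))
```

The resulting string can be passed to the model in the same way as the other prompts in this notebook.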

# Hallucinations

Hallucination is when a model makes up information and presents it as if it were true. Leading questions can often cause hallucinations. Let's see what happens if we ask about a non-existent new launch at re:Invent. 

In [13]:
prompt = prompt_template.format(question="What type of new chemical solution AWS announced at re:Invent this year?")
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=250, temperature=0.1, stop="<|im_end|>", stream=True):
    print(text, end="", flush=True)

Prompt:
<|im_start|>system
You are an AI assistant which gives helpful and polite answers to the user's questions.<|im_end|>
<|im_start|>user
What type of new chemical solution AWS announced at re:Invent this year?<|im_end|>
<|im_start|>assistant

Response:
AWS announced a new chemical solution called "AWS Polymer" at re:Invent this year. AWS Polymer is a cloud-based platform that enables users to design, simulate, and optimize complex chemical processes using machine learning algorithms. The platform can be used for drug discovery, materials science, and other applications in the chemical industry.

---
<div class="alert alert-block alert-info"><b>The model incorrectly states that AWS announced a new chemical solution called "AWS Polymer" at re:Invent. One approach to combatting LLM hallucination is to give the model an "out" by giving it explicit instructions not to make up information and to admit when it doesn't know something. Let's try the same question again while also providing new system instructions in our prompt. </b></div>

In [14]:
prompt_template2 = PromptTemplate.from_template(
    """<|im_start|>system
{system}<|im_end|>
<|im_start|>user
{question}<|im_end|>
<|im_start|>assistant\n"""
)

system = "You are an AI assistant which answers user questions politely and in great details. Do not make up facts. Say I don't know if you have no information about something."
question = "What type of new chemical solution AWS announced at reInvent this year?"

prompt = prompt_template2.format(system=system, question=question)
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=250, temperature=0.1, stop="<|im_end|>", stream=True):
    print(text, end="", flush=True)

Prompt:
<|im_start|>system
You are an AI assistant which answers user questions politely and in great details. Do not make up facts. Say I don't know if you have no information about something.<|im_end|>
<|im_start|>user
What type of new chemical solution AWS announced at reInvent this year?<|im_end|>
<|im_start|>assistant

Response:
AWS announced a new chemical solution called "AWS Elemental X" at reInvent this year. AWS Elemental X is a high-performance, low-cost, and scalable solution that enables customers to run complex chemical simulations and modeling workloads on the cloud. The solution leverages AWS's advanced computing resources, including GPUs, FPGAs, and TPUs, as well as its deep expertise in molecular simulation and modeling. With AWS Elemental X, customers can accelerate their research and development efforts, reduce costs, and improve the accuracy of their simulations and models.

---
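<div class="alert alert-block alert-info"><b>Giving the model clear instructions and an out often helps, but it is not a guarantee: in the run above the model still made up a product ("AWS Elemental X") instead of admitting that it didn't know. When accuracy matters, combine this technique with grounding context, as in the previous section.</b></div>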

# Chatbot

Let's make a simple chatbot. There is no special library to include and no setting to apply to the LLM. All we need is prompt engineering!

Let's use what we know about prompts and in-context learning to create a simple chatbot. 

In [15]:
question = "Who is Jeff Bezos?"

prompt = prompt_template2.format(system=system, question=question)
print(f"Prompt:\n{prompt}")
print("Response:")
resp = llm(prompt, max_new_tokens=250, temperature=0.1, stop="<|im_end|>")
print(resp)

Prompt:
<|im_start|>system
You are an AI assistant which answers user questions politely and in great details. Do not make up facts. Say I don't know if you have no information about something.<|im_end|>
<|im_start|>user
Who is Jeff Bezos?<|im_end|>
<|im_start|>assistant

Response:
Jeff Bezos is an American entrepreneur and business magnate. He was born on January 28, 1964 in Seattle, Washington. He is best known as the founder and CEO of Amazon, one of the world's largest online retailers. He has also founded Blue Origin, a space exploration company, and The Washington Post, a daily newspaper. Bezos has been ranked as one of the richest people in the world by Forbes magazine.


### Follow-up questions
Now if we ask the LLM a follow-up question without providing context from the conversation so far, it will not be able to provide an accurate answer.

In [16]:
question = "How old is he?"

prompt = prompt_template2.format(system=system, question=question)
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=300, temperature=0.1, stop="<|im_end|>", stream=True):
    print(text, end="", flush=True)

Prompt:
<|im_start|>system
You are an AI assistant which answers user questions politely and in great details. Do not make up facts. Say I don't know if you have no information about something.<|im_end|>
<|im_start|>user
How old is he?<|im_end|>
<|im_start|>assistant

Response:
I am sorry, but I do not have any information about who you are referring to. Could you please provide me with more details or context so that I can assist you better?

---
<div class="alert alert-block alert-info"><b>The model doesn't know who "he" is. Without the chat history context it is not able to answer.</b></div>

### Providing chat context
Let's use the context template we created earlier and feed in the previous response about Jeff Bezos. 

In [17]:
prompt = context_template.format(context=resp, question=question)
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=300, temperature=0.1, stop="<|im_end|>", stream=True):
    print(text, end="", flush=True)

Prompt:
<|im_start|>system
You are an AI assistant which gives helpful, detailed, and polite answers to the user's questions<|im_end|>
<|im_start|>user
Given the context below, answer the question that follows. If you do not know the answer and the context doesn't
contain the answer truthfully say "I don't know".

Context: Jeff Bezos is an American entrepreneur and business magnate. He was born on January 28, 1964 in Seattle, Washington. He is best known as the founder and CEO of Amazon, one of the world's largest online retailers. He has also founded Blue Origin, a space exploration company, and The Washington Post, a daily newspaper. Bezos has been ranked as one of the richest people in the world by Forbes magazine.
Question: How old is he?<|im_end|>

Response:

Answer: Jeff Bezos was born on January 28, 1964, so he is currently 57 years old.

---
<div class="alert alert-block alert-info"><b>This is a simplified example, but by including the past chat context, we can support a back-and-forth chatbot interaction with the LLM.</b></div>
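
Extending this idea to a multi-turn conversation could look like the sketch below. It is a minimal illustration rather than a finished chatbot: the helper names are hypothetical, and `fake_generate` stands in for the real model so that the example runs anywhere.

```python
def build_chat_prompt(system, history, question):
    """Render the system message, all previous turns, and the new question as ChatML."""
    parts = [f"<|im_start|>system\n{system}<|im_end|>\n"]
    for user_text, assistant_text in history:
        parts.append(f"<|im_start|>user\n{user_text}<|im_end|>\n")
        parts.append(f"<|im_start|>assistant\n{assistant_text}<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{question}<|im_end|>\n<|im_start|>assistant\n")
    return "".join(parts)

def chat_turn(generate, system, history, question):
    """Ask one question, record the exchange in `history`, and return the reply."""
    reply = generate(build_chat_prompt(system, history, question)).strip()
    history.append((question, reply))
    return reply

# Stand-in for the real model. In the notebook you could instead pass something
# like: lambda p: llm(p, max_new_tokens=250, temperature=0.1, stop="<|im_end|>")
def fake_generate(prompt):
    turns = prompt.count("<|im_start|>user")
    return f"(reply to a prompt with {turns} user turn(s))"

history = []
system = "You are an AI assistant which answers user questions politely."
print(chat_turn(fake_generate, system, history, "Who is Jeff Bezos?"))
print(chat_turn(fake_generate, system, history, "How old is he?"))
```

Because every previous turn is replayed in each prompt, the prompt grows with the conversation. A longer chat would eventually need to trim or summarize old turns to stay within the model's context window.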

# More examples of LLM use cases:

## The following are optional prompts you can try running to keep exploring.


For simpler questions/prompts we can often get away without strict instruction formatting:

In [18]:
prompt = """Rewrite the following sentence in better English:
I think I want to apply for this position but don't know how, can you help?
"""

for text in llm(prompt, max_new_tokens=150, temperature=0.1, stop="<|im_end|>", stream=True):
    print(text, end="", flush=True)

I am considering applying for this position, but I am unsure of the process, could you provide some guidance?

In [20]:
prompt = """I am getting the following error when attempting to run this Python code, can you explain why?
  File "/home/studio-lab-user/sagemaker-studiolab-notebooks/ha.py", line 2, in <module>
    import zip
ModuleNotFoundError: No module named 'zip'
"""

for text in llm(prompt, max_new_tokens=150, temperature=0.1, stop="<|im_end|>", stream=True):
    print(text, end="", flush=True)

```
This error occurs because the `zip` module is not installed on your system. The `zip` module provides functionality for working with archives and packing/unpacking files. It is a built-in module in Python, so it should be available by default.

To fix this error, you can try installing the `zipfile` package using pip:
```
pip install zipfile
```
If you are running this code on a Jupyter Notebook, you can also try installing the `zipfile` package using the following command in your notebook cell:
```
!pip install zipfile
```
After installing the `zipfile` package

In [21]:
prompt = """Reduct PII from the following paragraph. Replace any PII with ###.
Paragraph:
Jeff Bezos lives at 1 Main St. Miami, FL 39812. His phone number is 111-123-4567.
"""

for text in llm(prompt, max_new_tokens=250, temperature=0.1, stop="<|im_end|>", stream=True):
    print(text, end="", flush=True)


Reduced paragraph:
Jeff Bezos lives at ### Main St., ###, FL ###. His phone number is ###-###-####.

---
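The last two examples didn't quite work: the explanation of the `zip` error contradicts itself and recommends installing `zipfile`, which is already part of Python's standard library, and the redaction left the name and parts of the address in place. Perhaps the model is not strong enough for this type of task. 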

In [22]:
prompt = """Create multiple choice question to test student's understanding of photosynthesis. The question must have at least three distractors. Indicate the correct answer.
"""

for text in llm(prompt, max_new_tokens=250, temperature=0.1, stop="<|im_end|>", stream=True):
    print(text, end="", flush=True)

Answer: C) Photosynthesis is the process by which plants use sunlight, carbon dioxide, and water to produce glucose, oxygen, and other chemical compounds.

---
The above didn't work at all: the model returned only an answer, with no question or answer options. Let's try with better prompt formatting:

In [None]:
prompt = """<|im_start|>system
You are an AI assistant which creates assessments to help educators evaluate students knowledge<|im_end|>
<|im_start|>user
Create multiple choice question to test student's understanding of photosynthesis. The question must have at least three distractors. Indicate the correct answer.<|im_end|>
<|im_start|>assistant"""

for text in llm(prompt, max_new_tokens=250, temperature=0.1, stop="<|im_end|>", stream=True):
    print(text, end="", flush=True)


Here are five multiple-choice questions on photosynthesis:

1. What is the process by which plants convert light energy into chemical energy?
a) Respiration
b) Photosynthesis
c) Cell division
d) Fermentation

2. Which of the following substances is not a product of photosynthesis?
a) Oxygen
b) Carbon dioxide
c) Water
d) Nitrogen

3. What is the name of the pigment responsible for absorbing light energy during photosynthesis?
a) Chlorophyll
b) Hemoglobin
c) Melanin
d) Myoglobin

4. Which of the following statements about photosynthesis is true?
a) It occurs in the presence of oxygen.
b) It requires light energy to occur.
c) It produces carbon dioxide as a

## Next steps: Want to use a different model with this notebook?

Hugging Face hosts many models that you can drop into code like this. We encourage you to play with the prompts above and with other models. You may need to modify the code above depending on the model you choose.
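
One modification you will often need when swapping models is the prompt format. The sketch below keeps the formats in a lookup table so the rest of the code stays unchanged. The two templates are illustrative assumptions: ChatML as used in this notebook, and an `[INST] ... [/INST]` style with the system text simply prepended to the question. Always confirm the exact template on the model card of the model you choose.

```python
# Illustrative templates only; confirm the exact format on each model card.
PROMPT_FORMATS = {
    "chatml": (
        "<|im_start|>system\n{system}<|im_end|>\n"
        "<|im_start|>user\n{question}<|im_end|>\n"
        "<|im_start|>assistant\n"
    ),
    "mistral-instruct": "[INST] {system}\n\n{question} [/INST]",
}

def format_prompt(style, system, question):
    """Fill the template registered under `style`; raise a clear error for unknown styles."""
    try:
        template = PROMPT_FORMATS[style]
    except KeyError:
        raise ValueError(f"Unknown prompt style: {style!r}") from None
    return template.format(system=system, question=question)

print(format_prompt("mistral-instruct", "Answer briefly.", "Explain photosynthesis."))
```

Supporting another model then only requires adding one entry to `PROMPT_FORMATS`.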

