# Experimenting with Prompt Engineering - Chatbot

In this notebook we will explore common prompt engineering techniques to get the most out of Large Language Models (LLMs). 

First, you will load a 7-billion-parameter LLM within the notebook environment and send it some prompts to see how different prompt strategies affect the model's results.

After trying different prompt engineering techniques, you will learn how to run a simple chatbot using what you learned. 

Finally, we'll review some pointers for how to put these concepts into practice by building on the sample code and swapping in different LLMs to create your own solutions.

### Working Environment 

[![Open In Studio Lab](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/build-on-aws/generative-ai-prompt-engineering/blob/main/prompt-engineering-chatbot/prompt-engineering-chatbot.ipynb)


This notebook has been designed, written and tested to run for free on [Amazon SageMaker Studio Lab](https://studiolab.sagemaker.aws/) with CPU.  Studio Lab is a free machine learning (ML) development environment that provides compute and storage (up to 15GB) at no cost with NO credit card required.

You can sign up for Amazon SageMaker Studio Lab here: [https://studiolab.sagemaker.aws/](https://studiolab.sagemaker.aws/)

> Whatever environment you end up using, make sure you have at least 12 GB of disk space available to run this code.
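
If you want to check the disk space requirement before downloading anything, a minimal sketch using only the Python standard library could look like the following. The helper name `free_disk_gb` and the choice to check the current directory are assumptions made for illustration; point it at whichever filesystem will hold the model cache in your environment.

```python
import shutil

REQUIRED_GB = 12  # figure quoted in the note above


def free_disk_gb(path="."):
    """Return the free space, in GB (10**9 bytes), on the filesystem holding `path`."""
    return shutil.disk_usage(path).free / 1e9


free_gb = free_disk_gb()
status = "OK" if free_gb >= REQUIRED_GB else "too little"
print(f"Free disk space: {free_gb:.1f} GB ({status}, need {REQUIRED_GB} GB)")
```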

### Libraries
First, if needed, install `ctransformers`, a library that provides Python bindings for transformer models implemented in C/C++, with an interface modeled on `transformers` from [Hugging Face](https://huggingface.co/), a great open source set of libraries for working and experimenting with the underlying technology of generative AI.

In [1]:
%%sh
# Quote the requirements so the shell does not treat ">" as a redirect or "[...]" as a glob
pip install "ctransformers>=0.2.24" --quiet
pip install langchain --quiet
pip install "transformers[torch]" --quiet

In [1]:
from ctransformers import AutoModelForCausalLM, AutoTokenizer

## Mistral-7B-Instruct-v0.1-GGUF Model

The 🐋 [Mistral-7B-Instruct-v0.1-GGUF](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF) 🐋 model was fine-tuned on top of Mistral 7B using a variety of publicly available conversation datasets.

### Loading the model

The following cell loads the pretrained model. Because the model file is around 3.8 GB, it can take a minute or so to load.

In [2]:
llm = AutoModelForCausalLM.from_pretrained("TheBloke/Mistral-7B-Instruct-v0.1-GGUF", model_file="mistral-7b-instruct-v0.1.Q4_K_M.gguf", model_type="mistral")

Fetching 1 files:   0%|          | 0/1 [00:00<?, ?it/s]

# How do large language models work?

## Prompt engineering

Remember that LLMs are trained to predict the next word when given a sequence of words. 

![LLM-concept](img/LLM-generation.png)

The way that we prompt an LLM can make a big difference in how it responds and how useful its output is. This depends not only on the type of query, but also on the nuances of how individual models were trained. Let's start by asking the Mistral model to describe the concept of photosynthesis.

## Let's try a basic prompt

In the following cell, we provide the model three parameters:
1. the prompt: "Explain photosynthesis"
2. `max_new_tokens`, which is the maximum length in tokens of the model's response
3. `temperature`, which you can think of as a measure of how *creative* the model can be in its response


<div class="alert alert-block alert-info"><b>Note:</b> If you are running this on a CPU instance of Studio Lab, keep in mind that the following cell may take a couple of minutes to run.</div>

In [4]:
prompt = "Explain photosynthesis"

for text in llm(prompt, max_new_tokens=250, temperature=0.1, stream=True):
    print(text, end="", flush=True)

.

Photosynthesis is the process by which plants, algae, and some bacteria convert sunlight, water, and carbon dioxide into glucose (sugar), oxygen, and other chemical compounds. It is an essential process for life on Earth because it provides the primary source of organic compounds and oxygen.

The process of photosynthesis occurs in two stages: the light-dependent reactions and the light-independent reactions. During the light-dependent reactions, light energy is absorbed by pigments in the chloroplasts, primarily chlorophyll. This energy is used to generate ATP (adenosine triphosphate) and NADPH (nicotinamide adenine dinucleotide phosphate), which are high-energy molecules that are used in the second stage of photosynthesis.

During the light-independent reactions, also known as the Calvin cycle, ATP and NADPH are used to power a series of chemical reactions that convert carbon dioxide into glucose. This process is also known as carbon fixation. The glucose produced during photosynt

---
<div class="alert alert-block alert-info"><b>The model did pretty well. Notice that it started by generating a period ("."). That's because LLMs are trained to generate the next most probable token given an input, so the model first completed our unfinished sentence. Let's see if we can make the answer more concise, and we will end our prompt with a period this time.</b></div>
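
To build some intuition for the `temperature` setting used in the call above, here is a small standalone sketch. It assumes temperature is applied in the common way, by dividing the model's raw scores (logits) by the temperature before converting them to probabilities; the candidate tokens and their scores are made-up illustrative values, not output from Mistral, and the library's internals may differ in detail.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    scaled = [x / temperature for x in logits]  # temperature must be > 0
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens (illustrative values only).
candidates = ["sugar", "energy", "magic"]
logits = [2.0, 1.0, 0.1]

for t in (0.1, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    summary = ", ".join(f"{tok}={p:.3f}" for tok, p in zip(candidates, probs))
    print(f"temperature={t}: {summary}")
```

At a low temperature such as 0.1, almost all of the probability lands on the top-scoring token, which is why the responses in this notebook are fairly repeatable. Higher temperatures spread probability across more candidates, which makes sampling more varied.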

## Modifying the basic prompt
Below we will try giving the model more specific instructions.


In [5]:
prompt = "Explain photosynthesis in three sentences."

for text in llm(prompt, max_new_tokens=250, temperature=0.1, stream=True):
    print(text, end="", flush=True)

 Photosynthesis is the process by which plants, algae, and some bacteria convert sunlight, water, and carbon dioxide into glucose, oxygen, and other chemical compounds. This process occurs in the chloroplasts of plant cells and involves two stages: the light-dependent reactions, which use light energy to produce ATP and NADPH, and the light-independent reactions, also known as the Calvin cycle, which use these energy-rich molecules to fix carbon dioxide into glucose. Photosynthesis is a vital process for life on Earth, providing the primary source of food and oxygen for nearly all living organisms.

---

<div class="alert alert-block alert-info"><b>This response is more concise; our prompt worked quite well.</b></div>


# Fine-Tune with prompt

When you interact with large language models, you can enhance model performance through weight fine-tuning and instruction fine-tuning.

- Weight fine-tuning involves taking a pre-trained language model and continuing to train it on a downstream task, tweaking the weights of the model through additional training on task-specific data. This allows the model to become specialized for a particular application. One weight fine-tuning technique is Low-Rank Adaptation (LoRA). Read more about it: [A Brief Introduction to LoRA: Low-Rank Adaptation of Large Language Models](https://wandb.ai/sauravmaheshkar/LoRA/reports/A-Brief-Introduction-to-LoRA-Low-Rank-Adaptation-of-Large-Language-Models--Vmlldzo2MDAyOTU3)

- Instruction fine-tuning trains a pre-trained model on examples of instructions paired with the desired responses, so that it learns to follow natural language instructions. Once a model has been instruction-tuned, you can guide its behavior on a downstream task through the prompt alone. For example, you could provide a prompt like "Act as a helpful assistant that answers questions clearly and honestly." This prompts the model to adapt its behavior without changing the underlying weights. You can learn more about instruction fine-tuning in [How to Fine-Tune an LLM Part 1: Preparing a Dataset for Instruction Tuning](https://wandb.ai/capecape/alpaca_ft/reports/How-to-Fine-Tune-an-LLM-Part-1-Preparing-a-Dataset-for-Instruction-Tuning--Vmlldzo1NTcxNzE2)

Weight fine-tuning fits the model more closely to the prepared dataset, but it takes much more effort, time, and compute resources. In this notebook, we start with a model that has already been instruction fine-tuned and steer it with prompts.

# Instruction fine-tuning

So far we've been prompting the model to complete our inputs in such a way that we'd get the answers we wanted. Luckily, LLMs can be instruction tuned, meaning we can instruct the model to do specific tasks. This particular model was instruction tuned as described in the [model card](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF). It appears to use the following prompt template:
```
<s>[INST] {prompt} [/INST]
```

Here is an example prompt exchange using the model template:

```
<s>[INST] What is your favourite condiment? [/INST]
Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s>
[INST] Do you have mayonnaise recipes? [/INST]
```
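
The exchange above can also be assembled programmatically. The sketch below is one possible way to do it; `build_mistral_prompt` is a hypothetical helper written for this notebook, not part of `ctransformers` or LangChain, and it simply mirrors the layout of the example exchange, with the assistant reply shortened.

```python
def build_mistral_prompt(turns):
    """Format (user, assistant) turns with the [INST] template shown above.

    `turns` is a list of (user_message, assistant_reply) pairs; use None as the
    reply of the final pair to leave the prompt open for the model to answer.
    """
    lines = []
    for index, (user, assistant) in enumerate(turns):
        bos = "<s>" if index == 0 else ""  # <s> opens the conversation only
        lines.append(f"{bos}[INST] {user} [/INST]")
        if assistant is not None:
            lines.append(f"{assistant}</s>")  # </s> closes each completed reply
    return "\n".join(lines)


prompt = build_mistral_prompt([
    ("What is your favourite condiment?", "I'm quite partial to fresh lemon juice."),
    ("Do you have mayonnaise recipes?", None),
])
print(prompt)
```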

## Let's try it:

The following cell uses the prompt template to wrap our request as an instruction and asks the model to keep its answer brief.

In [3]:
prompt="""
<s>[INST] Please tell me about how mistral winds have attracted super-orcas. Keep your answer brief. [/INST]
"""

print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=500, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

Prompt:

<s>[INST] Please tell me about how mistral winds have attracted super-orcas. Keep your answer brief. [/INST]

Response:
Mistral winds are strong, cold winds that blow across the Mediterranean Sea from France to Spain. These winds are known for their strength and direction, which can reach up to 100 km/h and come from a specific direction. Super-orcas, also known as killer whales, have been attracted to these mistral winds due to their ability to generate large waves that can be used for hunting prey. The strong currents and waves created by the mistral winds provide an ideal environment for super-orcas to hunt and rest, making them a popular destination for these marine mammals.

---
<div class="alert alert-block alert-info"><b>The model followed the instruction and kept its answer brief. However, manually supplying the full prompt template for each query is cumbersome. In the next section we will look at how to simplify and automate prompt generation dynamically.</b></div>

# Dynamic prompting
Now that we know how to submit instructions to the Mistral LLM, let's try assembling prompts dynamically: part of the prompt is fixed, and part is provided on the fly. We achieve this by creating a prompt template with variables. The value for each variable is supplied in a separate statement, which can be executed further down in the code. This decouples the static part of the prompt from the variable part, making the entire prompt dynamic.

In [4]:
from langchain.prompts import PromptTemplate

prompt_template = PromptTemplate.from_template(
    """
<s>[INST] {question} [/INST]    
"""
)

In [5]:
prompt = prompt_template.format(question="Explain photosynthesis in three sentences")
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=500, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

Prompt:

<s>[INST] Explain photosynthesis in three sentences [/INST]    

Response:
Photosynthesis is the process by which plants, algae and some bacteria convert sunlight, water and carbon dioxide into glucose, oxygen and other chemical compounds. It is an essential process for the survival of most life forms on earth as it provides the primary source of energy and food. The process occurs in two stages; the light-dependent reactions which take place in the thylakoid membranes of chloroplasts, and the light-independent reactions which occur in the stroma of the same organelles.

---
<div class="alert alert-block alert-info"><b>Using prompt templates makes it much easier to automate away the formatting nuances of different LLMs, which can vary widely. Also notice that we did not finish our sentence with a period and the model did not try to complete it, because this time we signaled that this is an instruction that needs to be followed.</b></div>
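
To illustrate how templates hide per-model formatting, the sketch below keeps one template per model family and renders the same question with each. It uses plain `str.format` so it runs without LangChain. The `mistral-instruct` entry is the template used in this notebook; the `chatml` entry, the dictionary keys, and the `render_prompt` helper are assumptions added for illustration, so check the model card of any model you swap in.

```python
# Template strings keyed by a label of our own choosing. The Mistral entry is the
# template used in this notebook; the ChatML entry is an assumption about what a
# ChatML-style model expects, so check the model card before relying on it.
PROMPT_TEMPLATES = {
    "mistral-instruct": "<s>[INST] {question} [/INST]",
    "chatml": "<|im_start|>user\n{question}<|im_end|>\n<|im_start|>assistant\n",
}


def render_prompt(model_family, question):
    """Fill the template registered for `model_family` with the user's question."""
    try:
        template = PROMPT_TEMPLATES[model_family]
    except KeyError:
        raise ValueError(f"No prompt template registered for {model_family!r}") from None
    return template.format(question=question)


for family in PROMPT_TEMPLATES:
    print(f"--- {family} ---")
    print(render_prompt(family, "Explain photosynthesis in three sentences"))
```

With this layout, switching models means changing one key rather than editing every prompt string in the notebook.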


# Few-shot prompts

We can influence the model's output style by using what's called few-shot prompting, a technique that shows the model the exact behavior we expect through a few examples. Few-shot prompting can be a powerful approach to improve model performance for a given task without additional training or fine-tuning.

As you can see in the cell below, we provide the model with two examples of a prompt and the expected response, where the number of sentences in each response matches the request. We also influence the output formatting by demonstrating that we prefer the sentences in each response to be numbered.

We end the full prompt with our request for a three-sentence explanation of photosynthesis and let the model complete the example.

In [6]:
prompt="""
<s>[INST] Explain precipitation in two sentences [/INST]
1. In meteorology, precipitation is any product of the condensation of atmospheric water vapor that falls from clouds due to gravitational pull.
2. The main forms of precipitation include drizzle, rain, sleet, snow, ice pellets, graupel and hail.</s>
[INST] Explain condensation in one sentence [/INST]
1. Condensation is the change of the state of matter from the gas phase into the liquid phase, and is the reverse of vaporization.</s>
[INST] Explain photosynthesis in three sentences  [/INST]
"""
for text in llm(prompt, max_new_tokens=250, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

1. Photosynthesis is the process by which plants, algae, and some bacteria convert light energy from the sun into chemical energy in the form of glucose or other sugars.
2. This process occurs in the chloroplasts of plant cells, where chlorophyll and other pigments absorb light photons and use them to power a series of chemical reactions that split water molecules into hydrogen and oxygen.
3. The oxygen produced during photosynthesis is released into the atmosphere as a byproduct, while the glucose can be used by the plant for energy or stored for later use.

<div class="alert alert-block alert-info"><b>And now with a LangChain template:</b></div>

In [9]:
from langchain.prompts.few_shot import FewShotPromptTemplate
from langchain.prompts.prompt import PromptTemplate

fs_template = PromptTemplate.from_template(
 """<s>[INST] {question} [/INST]
{output}</s>"""
)


examples = [
  {
    "question": "Explain precipitation in two sentences.",
    "output": "1. In meteorology, precipitation is any product of the condensation of atmospheric water vapor that falls from clouds due to gravitational pull.\n2. The main forms of precipitation include drizzle, rain, sleet, snow, ice pellets, graupel and hail."
  },
  {
    "question": "Explain condensation in one sentence.",
    "output": "1. Condensation is the change of the state of matter from the gas phase into the liquid phase, and is the reverse of vaporization."
  }
]

prompt_fs = FewShotPromptTemplate(
    examples=examples,
    example_prompt=fs_template,
    suffix="<s>[INST]{question}[/INST]",
    input_variables=["question"]
)

prompt = prompt_fs.format(question="Explain photosynthesis in three sentences.")
print("Prompt:")
print(prompt)
      
print("Response:")      
for text in llm(prompt, max_new_tokens=250, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

Prompt:
<s>[INST] Explain precipitation in two sentences. [/INST]
1. In meteorology, precipitation is any product of the condensation of atmospheric water vapor that falls from clouds due to gravitational pull.
2. The main forms of precipitation include drizzle, rain, sleet, snow, ice pellets, graupel and hail.</s>

<s>[INST] Explain condensation in one sentence. [/INST]
1. Condensation is the change of the state of matter from the gas phase into the liquid phase, and is the reverse of vaporization.</s>

<s>[INST]Explain photosynthesis in three sentences.[/INST]
Response:
 1. Photosynthesis is the process by which plants use sunlight, carbon dioxide, and water to produce oxygen and glucose.
2. It occurs in the chloroplasts of plant cells, where chlorophyll captures light energy and uses it to convert chemical energy from water molecules into glucose.
3. The byproduct of photosynthesis is oxygen, which is released into the atmosphere during the process.

<div class="alert alert-block alert-info"><b>Using a couple of examples, we were able to teach the model to number the sentences in its response and to limit its response to the number of sentences requested.</b></div>
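
If you want to verify this behavior automatically rather than by eye, a rough check could count the numbered lines in a response. The helper names below are hypothetical, the regular expression is only a heuristic for the `1. ...` style used in the examples, and the sample text is an abbreviated version of the response above.

```python
import re


def count_numbered_items(response):
    """Count lines that start with a number followed by a period, e.g. '1. ...'."""
    return len(re.findall(r"^[ \t]*\d+\.\s+\S", response, flags=re.MULTILINE))


def follows_format(response, expected_items):
    """True when the response contains exactly the requested number of items."""
    return count_numbered_items(response) == expected_items


# Abbreviated version of the model response shown above.
sample = (
    " 1. Photosynthesis is the process by which plants use sunlight.\n"
    "2. It occurs in the chloroplasts of plant cells.\n"
    "3. The byproduct of photosynthesis is oxygen."
)
print(count_numbered_items(sample))
print(follows_format(sample, expected_items=3))
```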


# Augmenting LLM response with context

Without providing any real-time context, the model is limited to knowledge it gained during training which may be incomplete or out of date. However, we can improve responses by providing relevant context that the model can use before formulating a response. 

To illustrate the benefit of providing context we will first query the model about which instance types are available to use for managed spot training in SageMaker. 


In [10]:
prompt = prompt_template.format(question="Which instances can I use with Managed Spot Training in SageMaker?")

print(f"Prompt:\n{prompt}")
print("Response:")      
for text in llm(prompt, max_new_tokens=300, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

Prompt:

<s>[INST] Which instances can I use with Managed Spot Training in SageMaker? [/INST]    

Response:
Managed Spot Training is a feature of Amazon SageMaker that allows you to train machine learning models using Amazon EC2 Spot Instances. You can use Managed Spot Training for a wide range of scenarios, including:

1. Image and video classification: You can use Managed Spot Training to train models for image and video classification tasks, such as object detection, facial recognition, and activity recognition.
2. Natural language processing (NLP): You can use Managed Spot Training to train NLP models for tasks such as text classification, sentiment analysis, and named entity recognition.
3. Predictive analytics: You can use Managed Spot Training to train predictive models for a variety of applications, including fraud detection, customer churn prediction, and demand forecasting.
4. Deep learning: You can use Managed Spot Training to train deep learning models for tasks such as im

---
<div class="alert alert-block alert-info"><b>This response does not clearly and directly answer our question. Without any provided context, the LLM falls back on its own "knowledge" to answer. Let's try providing some context for the model to use as a reference. We will add a new dynamic field, "context", to our prompt template and provide up-to-date information along with the user's question.</b></div>

In [11]:
from langchain.prompts import PromptTemplate

context_template = PromptTemplate.from_template(
    """
<s>[INST] Given the context below, answer the question that follows. If you do not know the answer and the context doesn't contain the answer truthfully say "I don't know".

Context: {context}
Question: {question}[/INST]
"""
)

context = """Managed Spot Training can be used with all instances supported in Amazon SageMaker. Managed Spot Training is supported in all AWS Regions where Amazon SageMaker is currently available."""


prompt=context_template.format(context=context, question="Which instances can I use with Managed Spot Training in SageMaker?")
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=150, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

Prompt:

<s>[INST] Given the context below, answer the question that follows. If you do not know the answer and the context doesn't contain the answer truthfully say "I don't know".

Context: Managed Spot Training can be used with all instances supported in Amazon SageMaker. Managed Spot Training is supported in all AWS Regions where Amazon SageMaker is currently available.
Question: Which instances can I use with Managed Spot Training in SageMaker?[/INST]

Response:
You can use all instances supported in Amazon SageMaker with Managed Spot Training.

---
<div class="alert alert-block alert-info"><b>Great! The model used the provided context to answer the user question. Now let's use the same context and ask the model about something that is not covered in the context. </b></div>

In [12]:
prompt=context_template.format(context=context, question="Which instances can I use with Amazon ECS?")
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=150, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

Prompt:

<s>[INST] Given the context below, answer the question that follows. If you do not know the answer and the context doesn't contain the answer truthfully say "I don't know".

Context: Managed Spot Training can be used with all instances supported in Amazon SageMaker. Managed Spot Training is supported in all AWS Regions where Amazon SageMaker is currently available.
Question: Which instances can I use with Amazon ECS?[/INST]

Response:
I don't know, the context does not mention anything about using Managed Spot Training with Amazon ECS.

---
<div class="alert alert-block alert-info"><b>The model followed our direction and admitted that it did not know the answer when the answer could not be determined from the provided context. This is a powerful method that is used heavily in RAG (Retrieval Augmented Generation) applications. For example, a company may want an IT support bot to answer only from internal documentation provided to the model as context. </b></div>
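
In a RAG application the context is selected automatically instead of being pasted in by hand. The sketch below shows the idea with a deliberately simple stand-in for retrieval: it picks the snippet that shares the most words with the question and fills the same context template. Real systems typically use embeddings and a vector store. The function names and the second snippet are hypothetical and exist only to make the example runnable.

```python
import re

# Hypothetical snippets standing in for a document store. The first one comes from
# the context used above; the second is a placeholder invented for illustration.
DOCUMENTS = [
    "Managed Spot Training can be used with all instances supported in Amazon SageMaker.",
    "Placeholder snippet: the support desk is open on weekdays.",
]

CONTEXT_TEMPLATE = """<s>[INST] Given the context below, answer the question that follows. If you do not know the answer and the context doesn't contain the answer truthfully say "I don't know".

Context: {context}
Question: {question}[/INST]
"""


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents):
    """Return the document sharing the most words with the question (toy scoring)."""
    question_words = tokenize(question)
    return max(documents, key=lambda doc: len(question_words & tokenize(doc)))


def build_rag_prompt(question, documents):
    """Pick a context snippet for the question and fill the context template."""
    context = retrieve(question, documents)
    return CONTEXT_TEMPLATE.format(context=context, question=question)


question = "Which instances can I use with Managed Spot Training in SageMaker?"
print(build_rag_prompt(question, DOCUMENTS))
```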

# Hallucinations

Hallucination is when a model makes up information and presents it as if it were true. Leading questions can often cause hallucinations. Let's see what happens if we ask about a non-existent new launch at re:Invent. 

In [13]:
prompt = prompt_template.format(question="What type of new chemical solution AWS announced at re:Invent this year?")
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=250, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

Prompt:

<s>[INST] What type of new chemical solution AWS announced at re:Invent this year? [/INST]    

Response:
At re:Invent 2021, AWS announced a new chemical solution called "AWS Elemental AI for Chemical Processing." This solution is designed to help chemical processors optimize their operations by using machine learning and artificial intelligence (AI) techniques. It can be used to predict equipment failures, improve yield and throughput, reduce energy consumption, and enhance product quality. The solution leverages AWS's existing services such as Amazon SageMaker, Amazon Redshift, and Amazon Kinesis Data Streams to provide a comprehensive platform for chemical processors.

---
<div class="alert alert-block alert-info"><b>The model incorrectly states that AWS announced a new chemical solution called "AWS Elemental AI for Chemical Processing" at re:Invent 2021 (you may get a different answer). One approach to combating LLM hallucination is to give the model an "out" by giving it explicit instructions not to make up information and to admit when it doesn't know something. Let's try the same question again, this time adding extra instructions to our prompt as follows: </b></div>

In [15]:
prompt = prompt_template.format(question="What type of new chemical solution AWS announced at re:Invent this year? Do not make up facts. Say I don't know if you have no information about something or unsure.")
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=250, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

Prompt:

<s>[INST] What type of new chemical solution AWS announced at re:Invent this year? Do not make up facts. Say I don't know if you have no information about something or unsure. [/INST]    

Response:
I apologize, but I do not have any specific information on the new chemical solution that AWS announced at re:Invent this year. Can you please provide me with more context or details so that I can assist you better?

---
<div class="alert alert-block alert-info"><b>By giving the model clear instructions and an out, we were able to get the model to admit it didn't have enough context to answer the question rather than making up something.</b></div>
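
If you apply this pattern to many questions, it can help to append the instruction automatically and to flag responses where the model declined to answer. The sketch below is one rough way to do that; the helper names are hypothetical, and the marker phrases are heuristics taken from the responses seen in this notebook, so a model can easily decline in ways they miss.

```python
GUARDRAIL = (
    "Do not make up facts. Say I don't know if you have no information "
    "about something or unsure."
)

# Heuristic phrases only; a model can decline in many other ways.
ABSTENTION_MARKERS = (
    "i don't know",
    "i do not know",
    "do not have any specific information",
)


def with_guardrail(question):
    """Append the 'do not make up facts' instruction to a question."""
    return f"{question} {GUARDRAIL}"


def is_abstention(response):
    """Rough check for whether the model declined instead of answering."""
    lowered = response.lower()
    return any(marker in lowered for marker in ABSTENTION_MARKERS)


print(with_guardrail("What type of new chemical solution AWS announced at re:Invent this year?"))
print(is_abstention("I apologize, but I do not have any specific information on that."))
```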

# Chatbot

Let's make a simple chatbot. There is no special library to include and no setting to apply to the LLM; all we need is prompt engineering!

Let's use what we know about prompts and in-context learning to create a simple chatbot.

In [18]:
question = "Who is Jeff Bezos?"

prompt = prompt_template.format(question=question)
print(f"Prompt:\n{prompt}")
print("Response:")
resp = llm(prompt, max_new_tokens=250, temperature=0.1, stop="</s>")
print(resp)

Prompt:

<s>[INST] Who is Jeff Bezos? [/INST]    

Response:
Jeff Bezos is an American entrepreneur and business magnate, born on July 12, 1964 in Seattle, Washington. He is best known as the founder of Amazon, a multinational technology company focusing on e-commerce, cloud computing, digital streaming, and artificial intelligence. Bezos has also ventured into space exploration with his company Blue Origin. He has been ranked as one of the world's wealthiest people by Forbes and Time magazine.


### Follow-up questions
Now if we ask the LLM a follow-up question without providing context from the conversation so far, it will not be able to provide an accurate answer.

In [19]:
question = "How old is he?"

prompt = prompt_template.format(question=question)
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=300, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

Prompt:

<s>[INST] How old is he? [/INST]    

Response:
I don't have access to personal information about individuals unless it has been shared with me in the course of our conversation. I am designed to respect user privacy and confidentiality. Therefore, I don't know how old he is. Can you please provide more context or information about who you are referring to?

---
<div class="alert alert-block alert-info"><b>The model doesn't know who "he" is. Without the chat history context it is not able to answer.</b></div>

### Providing chat context
Let's use the context template we created earlier and feed in the previous response about Jeff Bezos. 

In [20]:
prompt = context_template.format(context=resp, question=question)
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=300, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

Prompt:

<s>[INST] Given the context below, answer the question that follows. If you do not know the answer and the context doesn't contain the answer truthfully say "I don't know".

Context: Jeff Bezos is an American entrepreneur and business magnate, born on July 12, 1964 in Seattle, Washington. He is best known as the founder of Amazon, a multinational technology company focusing on e-commerce, cloud computing, digital streaming, and artificial intelligence. Bezos has also ventured into space exploration with his company Blue Origin. He has been ranked as one of the world's wealthiest people by Forbes and Time magazine.
Question: How old is he?[/INST]

Response:
Jeff Bezos was born on July 12, 1964. To calculate his current age, subtract his birth year from the current year (2021), and you get 57 years old.

---
<div class="alert alert-block alert-info"><b>This is a simplified example, but by including the past chat context, we can support a back and forth chat bot interaction with the LLM.</b></div>
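
To extend this to more than one follow-up, the conversation history can be stored and passed as context on every turn. The sketch below is a minimal illustration under stated assumptions: `SimpleChat` is a hypothetical class, the history format (`User:` / `Assistant:` lines) is an arbitrary choice, and `fake_llm` is a stand-in for the real model so the example runs without loading any weights.

```python
CONTEXT_TEMPLATE = """<s>[INST] Given the context below, answer the question that follows. If you do not know the answer and the context doesn't contain the answer truthfully say "I don't know".

Context: {context}
Question: {question}[/INST]
"""


class SimpleChat:
    """Keep the conversation so far and pass it to the model as context."""

    def __init__(self, generate, max_turns=5):
        self.generate = generate    # any callable mapping a prompt string to a reply string
        self.max_turns = max_turns  # cap on remembered turns to bound the prompt length
        self.history = []           # list of (question, answer) pairs

    def ask(self, question):
        recent = self.history[-self.max_turns:]
        context = "\n".join(f"User: {q}\nAssistant: {a}" for q, a in recent)
        prompt = CONTEXT_TEMPLATE.format(context=context or "(none)", question=question)
        answer = self.generate(prompt).strip()
        self.history.append((question, answer))
        return answer


def fake_llm(prompt):
    """Stand-in for the real model so the sketch runs without downloading weights."""
    return "He was born in 1964." if "Jeff Bezos" in prompt else "I don't know"


chat = SimpleChat(generate=fake_llm)
chat.ask("Who is Jeff Bezos?")
print(chat.ask("How old is he?"))
```

To try it with the model loaded in this notebook, you could pass something like `lambda p: llm(p, max_new_tokens=300, temperature=0.1, stop="</s>")` as `generate`. Capping the history with `max_turns` is a simple way to keep the prompt from growing without bound.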

# More examples of LLM use cases:

## The following are optional prompts you can try running to keep exploring.


For simpler questions and prompts, we can often get away without strict instruction formatting:

In [21]:
prompt = """Rewrite the following sentence in better English:
I think I want to apply for this position but don't know how, can you help?
"""

for text in llm(prompt, max_new_tokens=150, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

I am considering applying for this position, however, I am uncertain about the process and would appreciate your assistance.

In [23]:
prompt = """I am getting the following error when attempting to run this Python code, can you explain why?
  File "/home/studio-lab-user/sagemaker-studiolab-notebooks/ha.py", line 2, in <module>
    import zip
ModuleNotFoundError: No module named 'zip'
"""

for text in llm(prompt, max_new_tokens=150, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

```
This error occurs when the Python interpreter is unable to find the `zip` module, which is a built-in module in Python that provides functionality for working with archives. The `zip` module is required by the code you provided, so it needs to be imported in order for the code to run.

To fix this error, you can try the following steps:

1. Make sure that you are running the code on a system that has Python installed. If you don't have Python installed, you can download and install it from the official website (<https://www.python.org/downloads/>).
2. Check that the version of Python you are using is compatible with the version

In [24]:
prompt = """Reduct PII from the following paragraph. Replace any PII with ###.
Paragraph:
Jeff Bezos lives at 1 Main St. Miami, FL 39812. His phone number is 111-123-4567.
"""

for text in llm(prompt, max_new_tokens=250, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)


Reduced Paragraph:
Jeff Bezos lives at ### St. ###, ### ### ###. His phone number is ###-###-####.

---
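The last two examples didn't quite work. In Python, `zip` is a built-in function rather than a module (archive handling lives in the `zipfile` module), so the model's explanation of the error is wrong, and the redacted paragraph still contains the name while mangling the address. Perhaps the model is not strong enough for this type of task.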

## Next steps: Want to use a different model with this notebook?

Hugging Face hosts many models that you can drop into code like this. We encourage you to play with the prompts above and with other models. Depending on the model you choose, you may need to modify the code above, in particular the prompt template.

