# Experimenting with Prompt Engineering - Chatbot

In this notebook we will explore common prompt engineering techniques to get the most out of Large Language Models (LLMs). 

First you will experiment by loading a 7 billion parameter LLM within the notebook environment and throwing some prompts its way to see how different prompt strategies can impact the results of the model. 

After trying different prompt engineering techniques, you will learn how to run a simple chatbot using what you learned. 

Finally, we'll review some pointers for how to put these concepts into practice by building on the sample code and swapping in different LLMs to create your own solutions.

This notebook is used in the AIM219 workshop at re:Invent. Please refer to the presentation slides here: [Learn and experiment with LLMs in Amazon SageMaker Studio Lab (AIM219)](https://speakerdeck.com/icoxfog417/learn-and-experiment-with-llms-in-amazon-sagemaker-studio-lab-aim219). 

### Working Environment 

[![Open In Studio Lab](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/aws/studio-lab-examples/blob/main/generative-ai/mistral/prompting-mistral7B.ipynb)


This notebook has been designed, written and tested to run for free on [Amazon SageMaker Studio Lab](https://studiolab.sagemaker.aws/) with CPU.  Studio Lab is a free machine learning (ML) development environment that provides compute and storage (up to 15GB) at no cost with NO credit card required.

You can sign up for Amazon SageMaker Studio Lab here: https://studiolab.sagemaker.aws/

> Whatever environment you end up using, make sure you have at least 12 GB of disk space available to run this code.
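
If you are unsure how much space is left, a quick check along the following lines may help. This is a minimal sketch using only the Python standard library; the helper names `free_disk_gb` and `has_enough_space` are made up for illustration, and the 12 GB threshold is simply the figure from the note above.

```python
import shutil

REQUIRED_GB = 12  # figure taken from the note above


def free_disk_gb(path="."):
    """Return the free disk space at `path` in gigabytes (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_space(path=".", required_gb=REQUIRED_GB):
    """True when at least `required_gb` gigabytes are free at `path`."""
    return free_disk_gb(path) >= required_gb


print(f"Free space: {free_disk_gb():.1f} GB")
print("Enough for this notebook:", has_enough_space())
```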

### Libraries
First, if needed, install `ctransformers`, which provides Python bindings for Transformer models implemented in C/C++ to speed up inference of models from the [`transformers` library](https://github.com/huggingface/transformers) ecosystem. If you run this notebook locally or on a GPU instance (e.g., g4 or g5) instead of Studio Lab, please check that an appropriate version of the CUDA toolkit (`nvcc`) is installed. If you get a "command not found" error when running `nvcc`, [installing cuda-toolkit](https://anaconda.org/nvidia/cuda-toolkit) after checking which CUDA version is needed via `nvidia-smi` should fix the issue.

In [1]:
%pip install transformers[torch]
%pip install ctransformers
%pip install langchain

Note: you may need to restart the kernel to use updated packages.


In [2]:
from ctransformers import AutoModelForCausalLM, AutoTokenizer

In [3]:
!pip3 install huggingface-hub



In [4]:
!huggingface-cli download TheBloke/Mistral-7B-Instruct-v0.1-GGUF mistral-7b-instruct-v0.1.Q4_K_M.gguf --local-dir . 

mistral-7b-instruct-v0.1.Q4_K_M.gguf


## Mistral-7B-Instruct-v0.1-GGUF Model

The 🐋 [Mistral-7B-Instruct-v0.1-GGUF](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF) 🐋 model was fine-tuned on top of Mistral 7B using a variety of publicly available conversation datasets. 

### Loading the model

The following cell will load the pretrained model. Because the model itself is around 3.8 GB, it can take a minute or so to load. 

In [5]:
llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Mistral-7B-Instruct-v0.1-GGUF",
    model_file="mistral-7b-instruct-v0.1.Q4_K_M.gguf",
    model_type="mistral",
)

Fetching 1 files:   0%|          | 0/1 [00:00<?, ?it/s]


# How do large language models work?

## Prompt engineering

Remember that LLMs are trained to predict the next word when given a sequence of words. 

![LLM-concept](llm-generation.png)
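
To make the idea of next-word prediction concrete, here is a toy sketch. It is not how Mistral works internally: the probability table is invented for illustration, and it only looks at the last word, whereas a real LLM conditions on the whole sequence and works with tokens rather than words.

```python
# Toy "language model": for each word, the probability of each possible next word.
# These numbers are invented for illustration only.
NEXT_WORD_PROBS = {
    "Explain": {"photosynthesis": 0.9, "rain": 0.1},
    "photosynthesis": {".": 0.6, "in": 0.4},
    "in": {"plants": 1.0},
    "plants": {".": 1.0},
}


def generate(prompt_words, max_new_tokens=5):
    """Greedy decoding: repeatedly append the most probable next word."""
    words = list(prompt_words)
    for _ in range(max_new_tokens):
        candidates = NEXT_WORD_PROBS.get(words[-1])
        if not candidates:  # no known continuation, stop generating
            break
        words.append(max(candidates, key=candidates.get))
    return words


print(generate(["Explain"]))
```

The loop keeps extending the input one word at a time, which is the same basic pattern an LLM follows when it continues a prompt.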

The way that we prompt an LLM can make a big difference in how it responds and how useful its output is. This depends not only on the type of query, but also on nuances of how individual models were trained. Let's start by asking the Mistral model to describe the concept of photosynthesis.

## Let's try a basic prompt

In the following cell, we provide the model three parameters:
1. the prompt: "Explain photosythesis"
2. `max_new_tokens`, which is the maximum length in tokens for the model's response
3. `temperature`, which controls how random the sampling is; you can think of it as a measure of how *creative* the model can be in its response
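
To make the effect of temperature more concrete, the sketch below applies a temperature-scaled softmax to a few made-up token scores. This is a common way to describe temperature sampling and is meant as an illustration only: the `logits` values are hypothetical, and the code does not claim to mirror how `ctransformers` implements sampling internally.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, scaled by `temperature`."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]

for t in (0.1, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all of the probability lands on the top-scoring token, so the output is close to deterministic. Higher temperatures flatten the distribution and give less likely tokens a better chance of being picked.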


<div class="alert alert-block alert-info"><b>Note:</b> If you are running this on a CPU instance of Studio Lab, keep in mind that the following cell may take a couple of minutes to run.</div>

In [6]:
prompt = "Explain photosythesis"

for text in llm(prompt, max_new_tokens=250, temperature=0.1, stream=True):
    print(text, end="", flush=True)

.

Photosynthesis is the process by which plants, algae, and some bacteria convert sunlight, water, and carbon dioxide into glucose (sugar), oxygen, and other chemical compounds. It is an essential process for life on Earth because it provides the primary source of organic compounds and oxygen.

The process of photosynthesis occurs in two stages: the light-dependent reactions and the light-independent reactions. During the light-dependent reactions, energy from sunlight is absorbed by pigments such as chlorophyll in the thylakoid membranes of the chloroplasts. This energy is used to generate ATP (adenosine triphosphate) and NADPH (nicotinamide adenine dinucleotide phosphate), which are high-energy molecules that are used in the second stage of photosynthesis.

During the light-independent reactions, also known as the Calvin cycle, ATP and NADPH are used to power a series of chemical reactions that convert carbon dioxide into glucose. This process takes place in the stroma of the ch

---
<div class="alert alert-block alert-info"><b>The model did pretty well. Notice it started by generating a period ("."). That's because LLMs are trained to generate the next most probable token, given an input. Let's see if we can make the answer more concise; we will also end our prompt with a period this time.</b></div>

## Modifying the basic prompt
Below we will try giving the model more specific instructions.


In [7]:
prompt = "Explain photosythesis in three sentences."

for text in llm(prompt, max_new_tokens=250, temperature=0.1, stream=True):
    print(text, end="", flush=True)

 Photosynthesis is the process by which plants, algae, and some bacteria convert sunlight, water, and carbon dioxide into glucose, oxygen, and other chemical compounds. It is an essential process for life on Earth as it provides the primary source of energy and food for almost all living organisms. The process occurs in two stages: the light-dependent reactions, which take place in the thylakoid membranes of chloroplasts, and the light-independent reactions, also known as the Calvin cycle, which occur in the stroma of chloroplasts.

---

<div class="alert alert-block alert-info"><b>This response is more concise; our prompt worked quite well.</b></div>


# What is fine-tuning?

When interacting with large language models, you can enhance the model's performance through fine-tuning.

Instruction tuning is a type of fine-tuning. We should be careful with the term "fine-tuning" because it encompasses many types of training in the large language model context. For example, continued pre-training is another type of fine-tuning that aims to add knowledge beyond what the base model has. Training on domain-specific corpora such as medical articles, social media texts, and specific language data are examples. In contrast, instruction tuning aims to refine model behavior rather than add knowledge. As a result of instruction tuning, a model can produce accurate responses to short or formatted prompts. It is just like adding a call-and-response style to the model.

Recent research makes it possible to "switch" a model's style by swapping additional weights that are attached to the base model. [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685) is one way to freeze the original model weights and fine-tune only a small set of additional weights. For example, you can generate artwork in different styles by switching the LoRA weights of Stable Diffusion. Various weights can be found online on sites such as [CivitAI](https://civitai.com/). (However, we should be mindful of copyright: in most cases, emulating a specific author's style and damaging their reputation or opportunities is forbidden by law.)

"Fine-tuning", which includes both instruction tuning and continued pre-training, requires datasets, computing resources, and extensive training time and effort. We have confirmed that we can change the model's behavior by changing only the prompt. This is called "prompt engineering". We need to consider the most appropriate way to elicit the desired responses from large language models. The course "[Finetuning Large Language Models](https://www.deeplearning.ai/short-courses/finetuning-large-language-models/)" from DeepLearning.AI provides a helpful guide for this decision.

# Instruction fine-tuning

Luckily, `Mistral-7B-Instruct` is already instruction tuned, meaning we can instruct the model to do specific tasks. This particular model was instruction tuned as described in the [model card](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF). It appears the model uses the following prompt format:
```
<s>[INST] {prompt} [/INST]
```

Here is an example prompt exchange using the model template:

```
<s>[INST] What is your favourite condiment? [/INST]
Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s>
[INST] Do you have mayonnaise recipes? [/INST]
```
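
The exchange above can also be assembled programmatically. The following is a minimal sketch that only follows the format shown in the example; `build_mistral_prompt` is a hypothetical helper name, not part of any library, and the exact whitespace a given model expects should be checked against its model card.

```python
def build_mistral_prompt(history, user_message):
    """Assemble a multi-turn prompt in the [INST] format shown above.

    history: list of (user_text, assistant_text) pairs from earlier turns.
    user_message: the new user input the model should answer.
    """
    prompt = "<s>"
    for user_text, assistant_text in history:
        # Each completed turn ends with the end-of-sequence marker </s>.
        prompt += f"[INST] {user_text} [/INST]\n{assistant_text}</s>\n"
    prompt += f"[INST] {user_message} [/INST]"
    return prompt


history = [
    ("What is your favourite condiment?",
     "Well, I'm quite partial to a good squeeze of fresh lemon juice."),
]
print(build_mistral_prompt(history, "Do you have mayonnaise recipes?"))
```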

## Let's try it:

The following cell wraps a question in the prompt template and instructs the model to keep its answer brief.

In [26]:
prompt = """
<s>[INST] Please tell me about how mistral winds have attracted super-orcas. Keep your answer brief. [/INST]
"""

print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=500, temperature=0.1, stop="<|im_end|>", stream=True):
    print(text, end="", flush=True)


Prompt:

<s>[INST] Please tell me about how mistral winds have attracted super-orcas. Keep your answer brief. [/INST]

Response:

Mistral winds are strong, cold winds that blow across the Mediterranean Sea from France to Italy. These winds have been known to attract super-orcas, also known as killer whales, due to their nutrient-rich waters and favorable conditions for hunting prey. The mistral winds create upwelling, which brings nutrients from the deep ocean to the surface, providing a plentiful food source for the orcas. Additionally, the strong currents created by the mistral winds help to concentrate schools of fish, making it easier for the orcas to hunt. Overall, the combination of nutrient-rich waters and favorable hunting conditions has made the Mediterranean Sea an attractive destination for super-orcas, drawing them from as far away as Norway and Iceland.

---
<div class="alert alert-block alert-info"><b>The model followed the instruction and kept its answer brief. However, manually supplying the full prompt template for each query is cumbersome. In the next section we will look at how to simplify and automate prompt generation dynamically.</b></div>

# Dynamic prompting
Now that we know how to submit instructions to the Mistral LLM, let's try assembling prompts dynamically: part of the prompt is fixed, and part is provided on the fly. We achieve this by creating a prompt template with variables. The value for each variable is supplied in a separate statement, which can be executed further down in the code. This decouples the static part of the prompt from the variable part, making the entire prompt dynamic. 

In [9]:
from langchain.prompts import PromptTemplate

prompt_template = PromptTemplate.from_template(
    """
<s>[INST] {question} [/INST]    
"""
)

In [10]:
prompt = prompt_template.format(question="Explain photosynthesis in three sentences")
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=500, temperature=0.1, stop="<|im_end|>", stream=True):
    print(text, end="", flush=True)

Prompt:

<s>[INST] Explain photosynthesis in three sentences [/INST]    

Response:
Photosynthesis is the process by which plants, algae and some bacteria convert sunlight, water and carbon dioxide into glucose, oxygen and other chemical compounds. It is an essential process for the survival of most life forms on Earth, as it provides the primary source of energy and food for nearly all living organisms. The process occurs in two stages: the light-dependent reactions, which take place in the thylakoid membranes of chloroplasts and involve the absorption of light energy by pigments such as chlorophyll, and the light-independent reactions, also known as the Calvin cycle, which occur in the stroma of chloroplasts and involve the fixation of carbon dioxide into glucose through a series of chemical reactions.

---
<div class="alert alert-block alert-info"><b>Using prompt templates makes it much easier to automate away the formatting nuances of different LLMs, which can vary widely. Also notice that we did not finish our sentence with a period and the model did not try to complete it, because this time we signaled that this is a command, an instruction which needs to be followed.</b></div>
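
One way to keep such formatting differences in a single place is a small registry of prompt formats. The sketch below uses only the Python standard library. The `mistral-instruct` entry mirrors the template used in this notebook, while the `chatml` entry is an assumption added for contrast and should be verified against the model card of whichever model you load. `PROMPT_FORMATS` and `format_prompt` are hypothetical names.

```python
# Prompt formats keyed by a model family name. The "mistral-instruct" entry
# mirrors the template used in this notebook; "chatml" is included only as an
# example of a different convention (an assumption, check the model card).
PROMPT_FORMATS = {
    "mistral-instruct": "<s>[INST] {question} [/INST]",
    "chatml": "<|im_start|>user\n{question}<|im_end|>\n<|im_start|>assistant\n",
}


def format_prompt(question, model_family="mistral-instruct"):
    """Wrap `question` in the prompt format registered for `model_family`."""
    try:
        template = PROMPT_FORMATS[model_family]
    except KeyError:
        raise ValueError(f"Unknown model family: {model_family!r}") from None
    return template.format(question=question)


print(format_prompt("Explain photosynthesis in three sentences"))
print(format_prompt("Explain photosynthesis in three sentences", "chatml"))
```

With this layout, swapping in a different model mostly means adding one entry to the dictionary rather than editing every prompt.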


# Few-shot prompts

We can influence the model's output style by using what's called few-shot prompting, a technique which shows the model the exact behavior we expect through a few examples. Few-shot prompting can be a powerful approach to improve model performance for a given task without additional training or fine-tuning. 

As you can see in the cell below, we provide the model with two examples of a prompt and the expected response, where the number of response sentences matches the request. We also influence the output formatting by demonstrating that we prefer the sentences in each response to be numbered. 

Finally, the prompt ends with our request for a three-sentence explanation of photosynthesis, and we let the model complete the example.  

In [11]:
prompt = """
<s>[INST] Explain precipitation in two sentences [/INST]
1. In meteorology, precipitation is any product of the condensation of atmospheric water vapor that falls from clouds due to gravitational pull.
2. The main forms of precipitation include drizzle, rain, sleet, snow, ice pellets, graupel and hail.</s>
[INST] Explain condensation in one sentence [/INST]
1. Condensation is the change of the state of matter from the gas phase into the liquid phase, and is the reverse of vaporization.</s>
[INST] Explain photosynthesis in three sentences  [/INST]
"""
for text in llm(prompt, max_new_tokens=250, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

1. Photosynthesis is the process by which plants, algae, and some bacteria convert sunlight, water, and carbon dioxide into glucose (sugar), oxygen, and other chemical compounds.
2. It is an essential process for the survival of most life forms on Earth, as it provides the primary source of energy and organic compounds for almost all living organisms.
3. Photosynthesis occurs in two stages: the light-dependent reactions, which take place in the thylakoid membranes of chloroplasts, and the light-independent reactions, also known as the Calvin cycle, which occur in the stroma of chloroplasts.

<div class="alert alert-block alert-info"><b>And now with langchain template:</b></div>

In [12]:
from langchain.prompts.few_shot import FewShotPromptTemplate
from langchain.prompts.prompt import PromptTemplate

fs_template = PromptTemplate.from_template(
    """<s>[INST] {question} [/INST]
{output}</s>"""
)


examples = [
    {
        "question": "Explain precipitation in two sentences.",
        "output": "1. In meteorology, precipitation is any product of the condensation of atmospheric water vapor that falls from clouds due to gravitational pull.\n2. The main forms of precipitation include drizzle, rain, sleet, snow, ice pellets, graupel and hail.",
    },
    {
        "question": "Explain condensation in one sentence.",
        "output": "1. Condensation is the change of the state of matter from the gas phase into the liquid phase, and is the reverse of vaporization.",
    },
]

prompt_fs = FewShotPromptTemplate(
    examples=examples,
    example_prompt=fs_template,
    suffix="<s>[INST]{question}[/INST]",
    input_variables=["question"],
)

prompt = prompt_fs.format(question="Explain photosynthesis in three sentences.")
print("Prompt:")
print(prompt)

print("Response:")
for text in llm(prompt, max_new_tokens=250, temperature=0.1, stop="<|im_end|>", stream=True):
    print(text, end="", flush=True)

Prompt:
<s>[INST] Explain precipitation in two sentences. [/INST]
1. In meteorology, precipitation is any product of the condensation of atmospheric water vapor that falls from clouds due to gravitational pull.
2. The main forms of precipitation include drizzle, rain, sleet, snow, ice pellets, graupel and hail.</s>

<s>[INST] Explain condensation in one sentence. [/INST]
1. Condensation is the change of the state of matter from the gas phase into the liquid phase, and is the reverse of vaporization.</s>

<s>[INST]Explain photosynthesis in three sentences.[/INST]
Response:
 1. Photosynthesis is the process by which plants, algae, and some bacteria convert sunlight, water, and carbon dioxide into glucose (sugar), oxygen, and other chemical compounds.
2. It is a vital process for the survival of most life forms on Earth, as it provides the primary source of energy and organic compounds necessary for growth and reproduction.
3. Photosynthesis occurs in two stages: the light-dependent react

<div class="alert alert-block alert-info"><b>Using a couple examples, we were able to teach the model to respond by specifically numbering the sentences in its response and limiting its response to the number of sentences requested.</b></div>
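
If you want to check programmatically that a response follows the demonstrated format, a small validator like the one below can help. This is a sketch with hypothetical helper names (`numbered_sentences`, `follows_format`); it only checks that lines are numbered 1..n and does not verify that each item is a single complete sentence.

```python
import re


def numbered_sentences(response):
    """Return (number, text) pairs for each line that starts with '<number>. '."""
    items = []
    for line in response.strip().splitlines():
        match = re.match(r"\s*(\d+)\.\s+(.*)", line)
        if match:
            items.append((int(match.group(1)), match.group(2)))
    return items


def follows_format(response, expected_count):
    """True when the response holds exactly `expected_count` items numbered 1..n."""
    numbers = [number for number, _ in numbered_sentences(response)]
    return numbers == list(range(1, expected_count + 1))


# Shortened stand-in for a model response.
sample = """1. Photosynthesis converts sunlight, water, and carbon dioxide into glucose.
2. It is essential for most life on Earth.
3. It occurs in two stages."""
print(follows_format(sample, 3))
```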


# Augmenting LLM response with context

Without providing any real-time context, the model is limited to knowledge it gained during training which may be incomplete or out of date. However, we can improve responses by providing relevant context that the model can use before formulating a response. 

To illustrate the benefit of providing context we will first query the model about which instance types are available to use for managed spot training in SageMaker. 


In [13]:
prompt = prompt_template.format(
    question="Which instances can I use with Managed Spot Training in SageMaker?"
)

print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=300, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

Prompt:

<s>[INST] Which instances can I use with Managed Spot Training in SageMaker? [/INST]    

Response:
Managed Spot Training is a feature of Amazon SageMaker that allows you to train machine learning models using Amazon EC2 Spot Instances, which are instances that are available at a lower cost than On-Demand Instances. You can use Managed Spot Training in the following scenarios:

1. Batch Inference: You can use Managed Spot Training for batch inference when you need to process large amounts of data and want to minimize costs.
2. Model Training: You can use Managed Spot Training for model training when you have a long-running training job that can tolerate interruptions.
3. Hyperparameter Tuning: You can use Managed Spot Training for hyperparameter tuning when you need to optimize the performance of your model and want to minimize costs.
4. Model Serving: You can use Managed Spot Training for model serving when you have a large number of requests and want to minimize costs.
5. Da

---
<div class="alert alert-block alert-info"><b>This response does not clearly and directly answer our question. Without provided context, the LLM falls back on its own "knowledge" to answer. Let's try providing some context for the model to use as reference. We will add a new dynamic field "context" to our prompt template and provide up-to-date information when we ask the user's question.</b></div>

In [14]:
from langchain.prompts import PromptTemplate

context_template = PromptTemplate.from_template(
    """
<s>[INST] Given the context below, answer the question that follows. If you do not know the answer and the context doesn't contain the answer truthfully say "I don't know".

Context: {context}
Question: {question}[/INST]
"""
)

context = """Managed Spot Training can be used with all instances supported in Amazon SageMaker. Managed Spot Training is supported in all AWS Regions where Amazon SageMaker is currently available."""


prompt = context_template.format(
    context=context,
    question="Which instances can I use with Managed Spot Training in SageMaker?",
)
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=150, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

Prompt:

<s>[INST] Given the context below, answer the question that follows. If you do not know the answer and the context doesn't contain the answer truthfully say "I don't know".

Context: Managed Spot Training can be used with all instances supported in Amazon SageMaker. Managed Spot Training is supported in all AWS Regions where Amazon SageMaker is currently available.
Question: Which instances can I use with Managed Spot Training in SageMaker?[/INST]

Response:
You can use all instances supported in Amazon SageMaker with Managed Spot Training.

---
<div class="alert alert-block alert-info"><b>Great! The model used the provided context to answer the user question. Now let's use the same context and ask the model about something that is not covered in the context. </b></div>

In [15]:
prompt = context_template.format(
    context=context, question="Which instances can I use with Amazon ECS?"
)
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=150, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

Prompt:

<s>[INST] Given the context below, answer the question that follows. If you do not know the answer and the context doesn't contain the answer truthfully say "I don't know".

Context: Managed Spot Training can be used with all instances supported in Amazon SageMaker. Managed Spot Training is supported in all AWS Regions where Amazon SageMaker is currently available.
Question: Which instances can I use with Amazon ECS?[/INST]

Response:
I don't know, the context does not mention anything about using Managed Spot Training with Amazon ECS.

---
<div class="alert alert-block alert-info"><b>The model followed our direction and admitted that it did not know the answer when it could not be determined from the provided context. This is a powerful method that is used heavily in RAG (Retrieval Augmented Generation) applications. For example, a company may only want an IT support bot to provide answers based on internal documentation provided to the model as context. </b></div>
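
In a RAG application the context is not typed in by hand but looked up for each question. The sketch below shows the idea with a deliberately naive word-overlap search over an in-memory list. Real systems typically use embeddings and a vector store instead. The second document is a made-up placeholder, and `retrieve` and `tokenize` are hypothetical helper names.

```python
import re

# Small in-memory "knowledge base". The first entry is the context used above;
# the second is a made-up placeholder so that there is something to choose from.
DOCUMENTS = [
    "Managed Spot Training can be used with all instances supported in Amazon SageMaker. "
    "Managed Spot Training is supported in all AWS Regions where Amazon SageMaker is currently available.",
    "Placeholder document: the office cafeteria is open on weekdays from 8 am to 3 pm.",
]

# Same wording as the context template used in this notebook.
CONTEXT_PROMPT = (
    "<s>[INST] Given the context below, answer the question that follows. "
    "If you do not know the answer and the context doesn't contain the answer "
    'truthfully say "I don\'t know".\n\n'
    "Context: {context}\nQuestion: {question}[/INST]"
)


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents):
    """Return the document sharing the most words with the question."""
    question_words = tokenize(question)
    return max(documents, key=lambda doc: len(question_words & tokenize(doc)))


question = "Which instances can I use with Managed Spot Training in SageMaker?"
context = retrieve(question, DOCUMENTS)
print(CONTEXT_PROMPT.format(context=context, question=question))
```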

# Hallucinations

Hallucination is when a model makes up information and presents it as if it were true. Leading questions can often cause hallucinations. Let's see what happens if we ask about a non-existent new launch at re:Invent. 

In [16]:
prompt = prompt_template.format(
    question="What type of new chemical solution AWS announced at re:Invent this year?"
)
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=250, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

Prompt:

<s>[INST] What type of new chemical solution AWS announced at re:Invent this year? [/INST]    

Response:
At re:Invent 2021, AWS announced a new chemical solution called "AWS Elemental AI for Chemical Processing." This solution is designed to help chemical processors optimize their operations and reduce costs by using machine learning algorithms to predict equipment failures, optimize reaction conditions, and improve product quality. The solution leverages AWS's existing expertise in machine learning and data analytics to provide a scalable and cost-effective solution for the chemical industry.

---
<div class="alert alert-block alert-info"><b>The model incorrectly states that AWS announced a new chemical solution called "AWS Elemental AI for Chemical Processing" at re:Invent 2021 (you may get a different answer). One approach to combating LLM hallucination is to give the model an "out" by explicitly instructing it not to make up information and to admit when it doesn't know something. Let's try the same question again, adding these instructions to our prompt as follows: </b></div>

In [17]:
prompt = prompt_template.format(
    question="What type of new chemical solution AWS announced at re:Invent this year? Do not make up facts. Say I don't know if you have no information about something or unsure."
)
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=250, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

Prompt:

<s>[INST] What type of new chemical solution AWS announced at re:Invent this year? Do not make up facts. Say I don't know if you have no information about something or unsure. [/INST]    

Response:
I apologize, but I do not have any specific information on the new chemical solution that AWS announced at re:Invent this year. Can you please provide me with more context or details so that I can assist you better?

---
<div class="alert alert-block alert-info"><b>By giving the model clear instructions and an out, we were able to get the model to admit it didn't have enough context to answer the question rather than making up something.</b></div>
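
If you use this pattern in an application, it can be handy to append the guard instruction automatically and to detect when the model declined to answer. The following is a rough sketch under stated assumptions: the helper names are hypothetical, and the list of abstention phrases is a heuristic that will miss other wordings.

```python
GUARD_INSTRUCTION = (
    "Do not make up facts. Say \"I don't know\" if you have no information "
    "about something or are unsure."
)

# Phrases treated as a sign that the model declined to answer. This list is
# an assumption for illustration and will miss other wordings.
ABSTENTION_MARKERS = (
    "i don't know",
    "i do not know",
    "i do not have any specific information",
)


def with_guard(question):
    """Append the anti-hallucination instruction to a user question."""
    return f"{question} {GUARD_INSTRUCTION}"


def looks_like_abstention(response):
    """Heuristic check: does the response contain a known abstention phrase?"""
    lowered = response.lower()
    return any(marker in lowered for marker in ABSTENTION_MARKERS)


print(with_guard("What type of new chemical solution AWS announced at re:Invent this year?"))
print(looks_like_abstention("I apologize, but I do not have any specific information on that."))
```

An application could use a check like `looks_like_abstention` to fall back to a search step or a human handoff instead of showing the raw reply.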

# Chatbot

Let's make a simple chatbot. There is no special library to include and no setting to apply to the LLM; all we need is prompt engineering!

Let's use what we know about prompts and in-context learning to create a simple chatbot. 

In [18]:
question = "Who is Jeff Bezos?"

prompt = prompt_template.format(question=question)
print(f"Prompt:\n{prompt}")
print("Response:")
resp = llm(prompt, max_new_tokens=250, temperature=0.1, stop="</s>")
print(resp)

Prompt:

<s>[INST] Who is Jeff Bezos? [/INST]    

Response:
Jeff Bezos is an American entrepreneur and business magnate. He is the founder, CEO, CTO, and lead shareholder of Amazon, a multinational technology company focusing on e-commerce, cloud computing, digital streaming, and artificial intelligence. He is also the founder of Blue Origin, a space exploration company. Bezos has been ranked as the world's wealthiest person by Forbes multiple times.


### Follow-up questions
Now if we ask the LLM a follow-up question without providing context from the conversation so far, it will not be able to provide an accurate answer.

In [19]:
question = "How old is he?"

prompt = prompt_template.format(question=question)
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=300, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

Prompt:

<s>[INST] How old is he? [/INST]    

Response:
I don't have access to personal information about individuals unless it has been shared with me in the course of our conversation. I am designed to respect user privacy and confidentiality. Therefore, I don't know how old he is. Can you please provide more context or information about who you are referring to?

---
<div class="alert alert-block alert-info"><b>The model doesn't know who "he" is. Without the chat history context it is not able to answer.</b></div>

### Providing chat context
Let's use the context template we created earlier and feed in the previous response about Jeff Bezos. 

In [20]:
prompt = context_template.format(context=resp, question=question)
print(f"Prompt:\n{prompt}")
print("Response:")
for text in llm(prompt, max_new_tokens=300, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

Prompt:

<s>[INST] Given the context below, answer the question that follows. If you do not know the answer and the context doesn't contain the answer truthfully say "I don't know".

Context: Jeff Bezos is an American entrepreneur and business magnate. He is the founder, CEO, CTO, and lead shareholder of Amazon, a multinational technology company focusing on e-commerce, cloud computing, digital streaming, and artificial intelligence. He is also the founder of Blue Origin, a space exploration company. Bezos has been ranked as the world's wealthiest person by Forbes multiple times.
Question: How old is he?[/INST]

Response:
I don't know, the context does not provide information about Jeff Bezos' age.

---
<div class="alert alert-block alert-info"><b>This is a simplified example, but by including the past chat context, we can support a back and forth chat bot interaction with the LLM.</b></div>
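
To turn this into a back-and-forth loop, the conversation history can be stored and replayed in every prompt. The class below is a minimal sketch, not a definitive implementation: `ChatSession` and `fake_llm` are hypothetical names, the history is replayed in the `[INST]` format used throughout this notebook, and a stand-in function replaces the real model so the example runs without downloading weights.

```python
class ChatSession:
    """Keeps the conversation so far and replays it in every prompt."""

    def __init__(self, generate):
        # `generate` is any callable that maps a prompt string to a reply string.
        self.generate = generate
        self.history = []  # list of (user_text, assistant_text) pairs

    def build_prompt(self, user_message):
        """Replay all earlier turns, then append the new user message."""
        prompt = "<s>"
        for user_text, assistant_text in self.history:
            prompt += f"[INST] {user_text} [/INST]\n{assistant_text}</s>\n"
        return prompt + f"[INST] {user_message} [/INST]"

    def ask(self, user_message):
        """Send the message with full history and remember the reply."""
        reply = self.generate(self.build_prompt(user_message)).strip()
        self.history.append((user_message, reply))
        return reply


def fake_llm(prompt):
    """Stand-in for the real model so the sketch runs without model weights."""
    return f"(reply to a prompt containing {prompt.count('[INST]')} user turn(s))"


chat = ChatSession(fake_llm)
print(chat.ask("Who is Jeff Bezos?"))
print(chat.ask("How old is he?"))
```

In this notebook you could pass something like `lambda p: llm(p, max_new_tokens=250, temperature=0.1, stop="</s>")` as `generate`. Keep in mind that the prompt grows with every turn, so long conversations will eventually need to be truncated or summarized to fit the model's context window.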

# More examples of LLM use cases:

## The following are optional prompts you can try running to keep exploring.


For simpler questions/prompts we can often get away without strict instruction formatting:

In [21]:
prompt = """Rewrite the following sentence in better English:
I think I want to apply for this position but don't know how, can you help?
"""

for text in llm(prompt, max_new_tokens=150, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

I am considering applying for this position, however, I am uncertain about the process and would appreciate your assistance.

In [22]:
prompt = """I am getting the following error when attempting to run this pythong code, can you explain why?
  File "/home/studio-lab-user/sagemaker-studiolab-notebooks/ha.py", line 2, in <module>
    import zip
ModuleNotFoundError: No module named 'zip'
"""

for text in llm(prompt, max_new_tokens=150, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)

This error occurs because the `zip` module is not installed on your system. The `zip` module provides functionality for working with compressed files in Python. You can install it using pip by running the following command in your terminal or command prompt:
```
pip install zipfile
```
Once you have installed the `zipfile` package, you should be able to run your code without encountering this error.

In [23]:
prompt = """Reduct PII from the following paragraph. Replace any PII with ###.
Paragraph:
Jeff Bezos lives at 1 Main St. Miami, FL 39812. His phone number is 111-123-4567.
"""

for text in llm(prompt, max_new_tokens=250, temperature=0.1, stop="</s>", stream=True):
    print(text, end="", flush=True)


Reduced Paragraph:
### lives at ### ###, ### ### ###. His phone number is ### ### ### ### ###.

---
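The last one or two examples didn't quite work. For instance, the model recommends `pip install zipfile`, but `zipfile` is part of Python's standard library and does not need to be installed. Perhaps the model is not strong enough for this type of task. 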

## Next steps: Want to use a different model with this notebook?

Hugging Face hosts many models that you can drop into code like this. We encourage you to play with the prompts above and with other models. You may need to modify the code above depending on the model you choose.

