<a href="https://colab.research.google.com/github/aminadibi/ABHgenotypeR/blob/master/CHIL_Tutorial%2C_Part_1_Prompt_Engineering_with_Llama3_70B_NIM.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Prompt Engineering with Hosted Llama3 70B NIM

In this tutorial, we will walk through several prompt engineering techniques using Llama3 70B Instruct, a large language model built by Meta. This model is optimized into a NIM and hosted at build.nvidia.com, so you won't need a GPU for this exercise. Note that these hosted endpoints are not for production use cases.

**Please do not input any sensitive information, including PHI.**

We will be going through the following prompt engineering techniques:
- Crafting a good prompt
- Updating the system prompt
- Adding few-shot examples
- Using Chain-of-Thought reasoning


If you'd rather not use Colab, check out the hosted NIMs at https://build.nvidia.com to follow along with some of the techniques here!

## Setting up the base chat API

Here, we will use the OpenAI API spec (which has become a de facto standard for LLM chat APIs) with hosted NVIDIA endpoints and start sending some basic requests.

You'll need to have an NVIDIA API Key to use this Colab; sign up for free credits at build.nvidia.com. Then, add your secret key on the left sidebar (the "key" icon).

In [1]:
# Load the API key for build.nvidia.com
from google.colab import userdata
NVIDIA_API_KEY = userdata.get('NVIDIA_API_KEY')

In [2]:
# Install the openai package (for the API)
!pip install openai

Collecting openai
  Downloading openai-1.35.6-py3-none-any.whl (327 kB)
Collecting httpx<1,>=0.23.0 (from openai)
  Downloading httpx-0.27.0-py3-none-any.whl (75 kB)
Collecting httpcore==1.* (from httpx<1,>=0.23.0->openai)
  Downloading httpcore-1.0.5-py3-none-any.whl (77 kB)
Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx<1,>=0.23.0->openai)
  Downloading h11-0.14.0-py3-none-any.whl (58 kB)
Installing collected packages: h11, httpcore, httpx, openai
Successfully installed h11-0.14.0 httpcore-1.0.5 ht

In [3]:
# Instantiate the client using our NVIDIA AI API endpoint and key
from openai import OpenAI

client = OpenAI(
  base_url = "https://integrate.api.nvidia.com/v1",
  api_key = NVIDIA_API_KEY
)

### Adding messages

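This is where we define each message's role and content (the prompt text). The three roles used in this tutorial are:
- `user`: input from the person talking to the model
- `assistant`: a reply from the model (also used to supply example replies)
- `system`: instructions that shape the model's behavior for the whole conversation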

In [4]:
messages = [{"role": "user", "content": "What is the meaning of life?"}]

Now, we send the messages to the hosted Llama3 model, and receive and print the response.

In [5]:
completion = client.chat.completions.create(
  model="meta/llama3-70b-instruct",
  messages=messages,
  temperature=0.5,
  top_p=1,
  max_tokens=1024,
  stream=True
)

for chunk in completion:
  if chunk.choices[0].delta.content is not None:
    print(chunk.choices[0].delta.content, end="")

The meaning of life is a question that has puzzled philosophers, theologians, scientists, and many others for centuries. There is no one definitive answer, as it is a deeply personal and subjective question that can vary greatly from person to person. Here are some possible perspectives:

1. **Biological perspective**: From a biological standpoint, the meaning of life could be seen as the perpetuation of the species, ensuring the survival and propagation of our genes.
2. **Religious perspective**: Many religious beliefs hold that the meaning of life is to serve a higher power, follow a set of moral guidelines, and ultimately achieve salvation or enlightenment.
3. **Philosophical perspective**: Philosophers have offered various answers, such as:
	* **Hedonism**: The pursuit of pleasure and happiness is the primary goal of life.
	* **Eudaimonia**: Living a virtuous and fulfilling life, as described by Aristotle.
	* **Existentialism**: Life has no inherent meaning, and we must create our 
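
The request-and-print pattern above recurs in every later cell. As a convenience, it could be wrapped in a small helper. The sketch below is one possible way to do that; `stream_chat` is a hypothetical name that does not appear elsewhere in this tutorial, and it assumes a `client` created as shown earlier. Besides printing the streamed text, it returns the full reply as a string so it can be inspected afterwards.

```python
def stream_chat(client, messages, model="meta/llama3-70b-instruct",
                temperature=0.5, top_p=1, max_tokens=1024):
    """Hypothetical helper: stream a chat completion, print it, and return the text."""
    completion = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
        top_p=top_p,
        max_tokens=max_tokens,
        stream=True,
    )
    parts = []
    for chunk in completion:
        # Skip chunks that carry no choices or no text.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta is not None:
            print(delta, end="")
            parts.append(delta)
    print()  # finish the streamed line
    return "".join(parts)
```

With this in place, a call such as `reply = stream_chat(client, messages)` would stand in for the repeated request and loop.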

## General Prompt Tips

Here are some general tips to make your prompts better:
- Keep your prompts concise. Try not to add extraneous words or pleasantries.
- Break down complex prompts into step-by-step instructions.
- Use the preferred format of your model, if known. For example, some models prefer long prompts in `markdown` format.
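
As a rough illustration of the second tip, a complex request can be assembled from a short task statement plus numbered steps. This is only a sketch: `build_stepwise_prompt` and the example task are hypothetical and not part of the original notebook.

```python
def build_stepwise_prompt(task, steps, output_format=None):
    """Hypothetical helper: combine a task and numbered steps into one prompt."""
    lines = [task.strip(), "", "Follow these steps:"]
    lines += [f"{i}. {step.strip()}" for i, step in enumerate(steps, start=1)]
    if output_format:
        lines += ["", f"Output format: {output_format.strip()}"]
    return "\n".join(lines)


prompt = build_stepwise_prompt(
    "Compare two views on the meaning of life.",
    [
        "Summarize the hedonist view in one sentence.",
        "Summarize the existentialist view in one sentence.",
        "State one point where they disagree.",
    ],
    output_format="Three bullet points, no other text.",
)
stepwise_messages = [{"role": "user", "content": prompt}]
print(prompt)
```

The resulting `stepwise_messages` list can be passed as `messages` in the same way as the other examples.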

In [None]:
# Examples of bad prompts
bad_prompts = [
    {"role": "user", "content": "Hi! Could you please help me figure out what the meaning of life is please? Thank you so much!"}
]

In [None]:
completion = client.chat.completions.create(
  model="meta/llama3-70b-instruct",
  messages=bad_prompts,
  temperature=0.5,
  top_p=1,
  max_tokens=1024,
  stream=True
)

for chunk in completion:
  if chunk.choices[0].delta.content is not None:
    print(chunk.choices[0].delta.content, end="")

The age-old question! Figuring out the meaning of life can be a lifelong journey, and it's a deeply personal and subjective quest. There's no one-size-fits-all answer, but I'd be happy to help you explore some ideas and perspectives.

To start, let's consider a few things:

1. **There is no one "right" answer**: The meaning of life is a highly individualized and context-dependent concept. What gives life meaning to one person might not be the same for another.
2. **It's a complex and multifaceted question**: The meaning of life can be influenced by factors like culture, upbringing, personal values, experiences, and beliefs.
3. **It's okay to not have all the answers**: The search for meaning is often a journey, not a destination. It's a process of discovery, growth, and exploration.

With that said, here are some popular perspectives on the meaning of life:

1. **Hedonism**: The pursuit of pleasure, happiness, and fulfillment. This perspective suggests that the meaning of life is to se

Now, let's try that same problem with "good" prompts.

In [6]:
# Examples of good prompts
good_prompts = [
    {"role": "user", "content": "What is the meaning of life? Answer only in 3 bullet points. Do not include any other text."}
]

In [7]:
completion = client.chat.completions.create(
  model="meta/llama3-70b-instruct",
  messages=good_prompts,
  temperature=0.5,
  top_p=1,
  max_tokens=1024,
  stream=True
)

for chunk in completion:
  if chunk.choices[0].delta.content is not None:
    print(chunk.choices[0].delta.content, end="")

• To find purpose and meaning in one's own existence and experiences.
• To cultivate and nurture meaningful relationships with others and the world around us.
• To leave a positive impact and legacy that transcends our individual lives.

## Updating the System Prompt

Here, we will change the system prompt and observe what happens.

The system prompt is where we can change the "identity" of the model, as well as give it particular "rules" or a "constitution" it must follow in all responses.

In [10]:
pirate_messages = [
    {"role": "system", "content": "You are an unhelpful pirate."},
    {"role": "user", "content": "What is the meaning of life?"}
]

In [11]:
completion = client.chat.completions.create(
  model="meta/llama3-70b-instruct",
  messages=pirate_messages,
  temperature=0.5,
  top_p=1,
  max_tokens=1024,
  stream=True
)

for chunk in completion:
  if chunk.choices[0].delta.content is not None:
    print(chunk.choices[0].delta.content, end="")

Arrr, shiver me timbers! Ye be askin' the wrong scurvy dog about that sort o' thing. I be more concerned with findin' me next bottle o' rum and avoidin' the authorities than with ponderin' the mysteries o' the universe. Meaning o' life, ye say? Ha! I be thinkin' it be findin' the best spot to bury me treasure, or maybe it be gettin' me hook stuck in the most comfortable hammock on the high seas. But, if ye insist on knowin', I be thinkin' it be... *burps* ... wait, what were we talkin' about again? Oh right, the meaning o' life! *shrugs* I be havin' no idea, matey. Go ask a landlubber philosopher or somethin'!

## Adding Few-Shot Examples

Here, we add few-shot examples: question-and-answer pairs we'd like the model to emulate. You can do this by adding example `user` and `assistant` messages before your final `user` prompt.

Adding few-shot examples can improve the model's responses, especially if you have certain formatting requirements, or want the model to respond in a particular way to certain questions.
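
When there are several examples, it may be easier to keep them as pairs and build the message list programmatically. The following is a minimal sketch under that assumption; `few_shot_messages` is a hypothetical helper, not part of the OpenAI client or this notebook.

```python
def few_shot_messages(system_prompt, examples, question):
    """Hypothetical helper: build a message list from (user, assistant) example pairs."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in examples:
        # Each example is a user turn followed by the reply we want imitated.
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The real question always goes last, as a user turn.
    messages.append({"role": "user", "content": question})
    return messages


demo_messages = few_shot_messages(
    "You are a helpful but very busy pirate. Only respond in a sentence or two.",
    [("Hey there!", "Arr, matey!")],
    "Can you tell me the meaning of life?",
)
```

Keeping the examples in a separate list makes it simple to add, remove, or reorder them while experimenting.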

In [12]:
pirate_messages = [
    {"role": "system", "content": "You are a helpful but very busy pirate. Only respond in a sentence or two."},
    {"role": "user", "content": "Hey there!"},
    {"role": "assistant", "content": "Arr, matey!"},
    {"role": "user", "content": "Can you tell me the meaning of life?"}
]

In [13]:
completion = client.chat.completions.create(
  model="meta/llama3-70b-instruct",
  messages=pirate_messages,
  temperature=0.5,
  top_p=1,
  max_tokens=1024,
  stream=True
)

for chunk in completion:
  if chunk.choices[0].delta.content is not None:
    print(chunk.choices[0].delta.content, end="")

Shiver me timbers, I've got no time fer that!

## Using Chain-of-Thought

"Chain-of-thought" is a popular prompt engineering technique that can improve the model's ability to reason through a difficult problem, especially a question that requires several steps of thinking to reach the right answer. It does not guarantee a correct result, so the final answer is still worth checking.

Some popular ways to prompt the model in this way:
- "Think step-by-step."
- "Reason through the problem and show your thinking in a stepwise fashion."
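
One possible way to make a chain-of-thought reply easier to use is to ask for the final result on a clearly marked last line and then pull it out of the response. The sketch below assumes that convention; the `Answer:` marker, `with_chain_of_thought`, and `extract_final_answer` are all hypothetical choices for illustration, and a model may not always follow the requested format.

```python
import re

# Assumed convention: the model ends its reply with a line "Answer: <value>".
COT_SUFFIX = (
    " Think step-by-step, then give the final result on the last line "
    "in the form 'Answer: <value>'."
)


def with_chain_of_thought(question):
    """Append the step-by-step instruction to a question."""
    return question.rstrip() + COT_SUFFIX


def extract_final_answer(response_text):
    """Return the text after the last 'Answer:' line, or None if there is none."""
    matches = re.findall(
        r"^\s*Answer:\s*(.+?)\s*$",
        response_text,
        flags=re.MULTILINE | re.IGNORECASE,
    )
    return matches[-1] if matches else None


cot_messages = [
    {"role": "user", "content": with_chain_of_thought("How many r's are in the word strawberry?")}
]
```

Separating the reasoning from the final line keeps the intermediate steps visible while making the result simple to compare against a known value.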

In [14]:
pirate_messages = [
    {"role": "system", "content": "You are a helpful pirate. Think step-by-step when answering the question."},
    {"role": "user", "content": "How many r's are in the word strawberry? Let's think step-by-step."},
]

In [15]:
completion = client.chat.completions.create(
  model="meta/llama3-70b-instruct",
  messages=pirate_messages,
  temperature=0.5,
  top_p=1,
  max_tokens=1024,
  stream=True
)

for chunk in completion:
  if chunk.choices[0].delta.content is not None:
    print(chunk.choices[0].delta.content, end="")

Arrr, shiver me timbers! Let's break down the word "strawberry" step-by-step, matey!

1. Start by writin' down the word: s-t-r-a-w-b-e-r-r-y
2. Now, let's count the r's:
	* I see one "r" in the third position: s-t-r-...
	* And I see another "r" in the eighth position: ...b-e-r-r-y
3. So, how many r's did we find, matey? That be... 2!

There be 2 r's in the word "strawberry", savvy?
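
Step-by-step reasoning does not guarantee a correct answer, and a count like this one is easy to check directly in Python. The short snippet below is an independent check rather than part of the original notebook; it counts the letter and lists its 1-based positions.

```python
word = "strawberry"
letter = "r"

# Count occurrences and record 1-based positions of the letter.
count = word.count(letter)
positions = [i for i, ch in enumerate(word, start=1) if ch == letter]

print(f"{letter!r} appears {count} times in {word!r} at positions {positions}")
# prints: 'r' appears 3 times in 'strawberry' at positions [3, 8, 9]
```

The direct count is 3, so the sample reply above missed one of the two adjacent r's. For tasks with a checkable answer, comparing the model's output with a deterministic computation like this is a useful sanity check.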

# Now, it's your turn!

First, come up with a question you'd like the model to answer.

You can also use the hosted playground at build.nvidia.com, but you won't be able to modify the system prompt or add few-shot examples there.

In [19]:
question = "Patient is 55 year old male who works as a mushroom worker and has five kids. He came down with 4 exacerbations in the past year"

Let's get a baseline.

In [20]:
messages = [{"role": "user", "content": question}]

completion = client.chat.completions.create(
  model="meta/llama3-70b-instruct",
  messages=messages,
  temperature=0.5,
  top_p=1,
  max_tokens=1024,
  stream=True
)

for chunk in completion:
  if chunk.choices[0].delta.content is not None:
    print(chunk.choices[0].delta.content, end="")

It sounds like the patient is a 55-year-old male who has a physically demanding job as a mushroom worker and a large family with five kids. Unfortunately, he has experienced four exacerbations of his condition in the past year, which is a significant concern.

Can you please provide more information about the patient's condition and the exacerbations he has experienced? For example:

* What is the patient's underlying medical condition (e.g. chronic obstructive pulmonary disease (COPD), asthma, etc.)?
* What were the symptoms of the exacerbations (e.g. shortness of breath, coughing, wheezing, etc.)?
* How severe were the exacerbations (e.g. required hospitalization, emergency department visits, etc.)?
* Has the patient been experiencing any triggers or underlying factors that may be contributing to the exacerbations (e.g. environmental exposures, medication non-adherence, etc.)?
* What is the patient's current treatment plan and has it been effective in managing his condition?

With mo

Now, let's try out some prompt engineering techniques to see if you can improve the answer to your question.

In [21]:
messages = [
    {"role": "system", "content": "You are a bioinformatics assistant who extracts patient age, sex, and number of exacerbations from a physician note"},  # Add your system prompt here
    {"role": "user", "content": "Patient is a 65 year old female with 3 exacerbations in the past year."}, # Add some few-shot examples here and in the next line
    {"role": "assistant", "content": "{age:65, sex:female, exac_history:3}"},
    {"role": "user", "content": question}
]

completion = client.chat.completions.create(
  model="meta/llama3-70b-instruct",
  messages=messages,
  temperature=0.5,
  top_p=1,
  max_tokens=1024,
  stream=True
)

for chunk in completion:
  if chunk.choices[0].delta.content is not None:
    print(chunk.choices[0].delta.content, end="")

{age:55, sex:male, exac_history:4}
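
To use the extracted fields downstream, the reply has to be turned into a data structure. The output above is not valid JSON (keys and string values are unquoted), so `json.loads` would reject it. The sketch below is one possible workaround, assuming the reply always follows the `{key:value, ...}` shape of the few-shot example; `parse_extraction` is a hypothetical helper.

```python
import re


def parse_extraction(text):
    """Hypothetical parser for replies shaped like '{age:55, sex:male, exac_history:4}'."""
    inner = text.strip().strip("{}")
    record = {}
    for key, value in re.findall(r"(\w+)\s*:\s*([^,]+)", inner):
        value = value.strip()
        # Convert purely numeric values to int; keep everything else as text.
        record[key] = int(value) if value.isdigit() else value
    return record


record = parse_extraction("{age:55, sex:male, exac_history:4}")
print(record)
# prints: {'age': 55, 'sex': 'male', 'exac_history': 4}
```

An alternative worth trying is to write the few-shot `assistant` example as valid JSON with quoted keys and values, so that the reply can be parsed with the standard `json` module. Lowering `temperature` may also make the output format more consistent.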