<a href="https://colab.research.google.com/github/rishil-git/Simple-DevOps-Project/blob/master/001_Prompt_Eng.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install -U \
  openai



# Guidelines for Prompting
In this lesson, you'll practice two prompting principles and their related tactics in order to write effective prompts for large language models.

## Setup
#### Load the API key and relevant Python libaries.

In this course, we've provided some code that loads the OpenAI API key for you.

In [None]:
from openai import OpenAI
import os

from google.colab import userdata
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')

#### helper function
Throughout this course, we will use OpenAI's `gpt-4o` model and the [chat completions endpoint](https://platform.openai.com/docs/guides/chat).

In [None]:
openai_client = OpenAI(api_key=OPENAI_API_KEY)

In [None]:
  response = openai_client.chat.completions.create(
      model='gpt-4o',
      messages=[{
          "role":"user",
          "content":"How Are you today ?"
      }

      ]
  )
  print(response.choices[0].message.content)

Thank you for asking! I'm just a computer program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?


In [None]:
def get_completion(prompt, model="gpt-4o"):
    messages = [{"role": "user", "content": prompt}]
    response = openai_client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0, # this is the degree of randomness of the model's output
    )
    return response.choices[0].message.content

## Prompting Principles
- **Principle 1: Write clear and specific instructions**
- **Principle 2: Give the model time to “think”**



### Principle 1 Tactics

#### Tactic 1: Use delimiters to clearly indicate distinct parts of the input
- Delimiters can be anything like: ```, """, < >, `<tag> </tag>`, `:`

In [None]:
pharma_text = """
CTX-214 Phase III trial evaluated efficacy and safety in 400 patients with Type 2 Diabetes.
The study spanned 15 global sites and lasted 24 weeks.
Primary endpoint: Reduction in HbA1c.
Results: Statistically significant improvement with no serious adverse events reported.
"""

prompt = f"""
You are a clinical research analyst.
Summarize the information between the <trial> tags below into a single sentence
focusing on: drug/trial name, indication, patient count, duration, outcome, and safety.

---
<trial>
{pharma_text}
</trial>
---
"""
response = get_completion(prompt)
print(response)

The CTX-214 Phase III trial, conducted over 24 weeks across 15 global sites, demonstrated a statistically significant reduction in HbA1c in 400 patients with Type 2 Diabetes, with no serious adverse events reported.


#### Tactic 2: Ask for a structured output
- JSON, HTML

In [None]:
output_format = "JSON"
prompt = f"""
Generate a list of three fictional pharmaceutical drugs \
along with their manufacturers and therapeutic areas.
Provide them in {output_format} format with the following keys:
drug_id, name, manufacturer, therapeutic_area.
"""
response = get_completion(prompt)
print(response)

```json
[
    {
        "drug_id": "RX001",
        "name": "CardioRelief",
        "manufacturer": "HealthGenix Pharmaceuticals",
        "therapeutic_area": "Cardiovascular Health"
    },
    {
        "drug_id": "RX002",
        "name": "NeuroCalm",
        "manufacturer": "NeuroPharm Solutions",
        "therapeutic_area": "Neurology"
    },
    {
        "drug_id": "RX003",
        "name": "GastroEase",
        "manufacturer": "DigestiveCare Inc.",
        "therapeutic_area": "Gastroenterology"
    }
]
```


#### Tactic 3: Ask the model to check whether conditions are satisfied

In [None]:
text_1 = f"""
The clinical trial began with a screening visit to determine subject eligibility,
including medical history and laboratory tests. Upon passing the screening,
participants entered a 2-week washout period where any previous medications were discontinued.
After the washout, subjects received the first dose of the investigational drug under supervision.
Vital signs were monitored every 30 minutes for the next 4 hours.
Subjects then returned for follow-up visits on Days 7, 14, and 28 to assess safety and efficacy.
"""

prompt = f"""
You will be provided with clinical procedure text delimited by triple quotes.
If it contains a sequence of procedural steps, re-write them in the following format:

Step 1 - ...
Step 2 - ...
…
Step N - …

If the text does not contain a sequence of steps or instructions, \
then simply write \"No steps provided.\"

\"\"\"{text_1}\"\"\"
"""
response = get_completion(prompt)
print("Completion for Text 1:")
print(response)

Completion for Text 1:
Step 1 - Screening visit to determine subject eligibility, including medical history and laboratory tests.  
Step 2 - Enter a 2-week washout period where any previous medications are discontinued.  
Step 3 - Receive the first dose of the investigational drug under supervision.  
Step 4 - Monitor vital signs every 30 minutes for the next 4 hours.  
Step 5 - Return for follow-up visits on Days 7, 14, and 28 to assess safety and efficacy.


In [None]:
text_2 = f"""
The clinical trial results were promising, with a significant reduction in blood pressure
observed in the treatment group. Participants reported minimal side effects, and
overall tolerability was high. The study was conducted across multiple sites and
included a diverse patient population. Investigators noted improvements in adherence
and patient-reported outcomes, especially among those who had previously shown resistance
to standard antihypertensive therapies.
"""

prompt = f"""
You will be provided with text delimited by triple quotes.
If it contains a sequence of instructions, \
re-write those instructions in the following format:

Step 1 - ...
Step 2 - …
…
Step N - …

If the text does not contain a sequence of instructions, \
then simply write "No steps provided."

\"\"\"{text_2}\"\"\"
"""
response = get_completion(prompt)
print("Completion for Text 2:")
print(response)

Completion for Text 2:
No steps provided.


#### Tactic 4: "Few-shot" prompting
Few-shot prompting helps bridge the gap between what a large language model (LLM) can do and what you want it to do, without **fine-tuning**.

Large Language Models (LLMs) like GPT-4 are trained on a vast general corpus (Wikipedia, GitHub, books, etc.), but:
```
  •	They may not fully understand specialized business terms (e.g., pharma, finance, legal).
  
  •	They might misinterpret task-specific intents (e.g., how you want to classify a complaint or triage a support ticket).
```

Two-shot prompting is useful when the model needs guidance through multiple diverse examples to generalize a task better, especially in ambiguous or domain-specific contexts. It strikes a balance between zero-shot simplicity and few-shot specificity.

In [None]:
prompt = f"""
You are a medical triage assistant. Read each HCP query and classify it into one of the following:
["Adverse Event", "Product Inquiry", "Off-label Use Inquiry", "Medical Literature Request", "Clinical Trial Request", "Other"]

Examples:

Q: "Patient developed severe rash after starting DrugX. Is this expected?"
A: Adverse Event

Q: "Can you provide the mechanism of action of DrugY?"
A: Product Inquiry

Q: "Is DrugZ effective in pediatric patients with rheumatoid arthritis?"
A: Off-label Use Inquiry

Q: "Please send me publications related to DrugA's efficacy in migraines."
A: Medical Literature Request

Q: "How do I enroll my patients into the ongoing phase 3 trial for DrugB?"
A: Clinical Trial Request

Q: "Where can I find pricing information for DrugC?"
A: Other

Now classify the following:

Q: "My patient experienced dizziness after taking the first dose of DrugP. Should I report it?"
"""
response = get_completion(prompt)
print(response)

Adverse Event


### Principle 2: Give the model time to “think”

#### Tactic 1: Specify the steps required to complete a task

In [None]:
clinical_trial_text = f"""
A recent Phase IIb study evaluated the investigational drug GLX-108 in 280 patients with moderate-to-severe asthma.
The randomized, double-blind trial was conducted over 16 weeks across 12 sites in Europe.
Primary outcome: improvement in FEV1 (forced expiratory volume).
The trial met its endpoint, showing a significant improvement in lung function with a favorable safety profile.
"""

# Pharma-style multi-step prompt
prompt_1 = f"""
Perform the following actions:
1 - Summarize the following clinical trial text (delimited by triple backticks) into a single sentence.
2 - Translate the summary into Spanish.
3 - List each drug or compound name mentioned in the Spanish summary.
4 - Output a JSON object with the following keys: spanish_summary, drug_names.

Text:
```{clinical_trial_text}```
"""
response = get_completion(prompt_1)
print("Completion for prompt 1:")
print(response)

Completion for prompt 1:
1 - Summary: A Phase IIb study of GLX-108 in 280 asthma patients showed significant lung function improvement and a favorable safety profile.

2 - Spanish Translation: Un estudio de fase IIb de GLX-108 en 280 pacientes con asma mostró una mejora significativa en la función pulmonar y un perfil de seguridad favorable.

3 - Drug Names: GLX-108

4 - JSON Object:
```json
{
  "spanish_summary": "Un estudio de fase IIb de GLX-108 en 280 pacientes con asma mostró una mejora significativa en la función pulmonar y un perfil de seguridad favorable.",
  "drug_names": ["GLX-108"]
}
```


#### Ask for output in a specified format

In [None]:
# Define clinical text input
text = f"""
The Phase III trial of the drug Lorafenib, developed by BioThera, enrolled 600 patients with late-stage melanoma.
The study demonstrated a 32% improvement in progression-free survival compared to the control arm.
No new safety concerns were identified during the 18-month follow-up.
"""

# Define multi-step prompt
prompt = f"""
Your task is to perform the following actions:
1 - Summarize the following text delimited by < > using a single concise sentence.
2 - Translate the summary into French.
3 - Identify all proper names (e.g., drug names, companies, or diseases) in the French summary.
4 - Output a JSON object with the following fields:
   - summary: French translated summary
   - num_names: total count of identified names
   - schema: a list of dictionaries describing each field in the JSON, containing:
     * field: the field name
     * description: explanation of what the field contains
     * datatype: data type (e.g., string, integer)

Use the following format:
Text: <text to summarize>
Summary: <summary>
Translation: <summary translation>
Names: <list of proper names in French summary>
Output JSON: <json with summary, num_names, schema<field_of_json, description_of_field, datatype>>

Text: <{text}>
"""

# Call your completion function (assume get_completion uses OpenAI or similar)
response = get_completion(prompt)
print("\nCompletion for prompt:")
print(response)


Completion for prompt:
Summary: The Phase III trial of Lorafenib by BioThera showed a 32% improvement in progression-free survival for late-stage melanoma patients without new safety concerns.

Translation: L'essai de phase III de Lorafenib par BioThera a montré une amélioration de 32 % de la survie sans progression pour les patients atteints de mélanome à un stade avancé sans nouveaux problèmes de sécurité.

Names: Lorafenib, BioThera

Output JSON: {
  "summary": "L'essai de phase III de Lorafenib par BioThera a montré une amélioration de 32 % de la survie sans progression pour les patients atteints de mélanome à un stade avancé sans nouveaux problèmes de sécurité.",
  "num_names": 2,
  "schema": [
    {
      "field": "summary",
      "description": "French translated summary of the original text.",
      "datatype": "string"
    },
    {
      "field": "num_names",
      "description": "Total count of identified proper names in the French summary.",
      "datatype": "integer"
    

#### Tactic 2: Instruct the model to work out its own solution before rushing to a conclusion

In [None]:
prompt = f"""
Determine if the student's solution is correct or not.

Question:
I'm building a solar power installation and I need \
 help working out the financials.
- Land costs $100 / square foot
- I can buy solar panels for $250 / square foot
- I negotiated a contract for maintenance that will cost \
me a flat $100k per year, and an additional $10 / square \
foot
What is the total cost for the first year of operations
as a function of the number of square feet.

Student's Solution:
Let x be the size of the installation in square feet.
Costs:
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 100x
Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
"""
response = get_completion(prompt)
print(response)

The student's solution is correct. The total cost for the first year of operations as a function of the number of square feet \( x \) is calculated as follows:

1. **Land cost**: \( 100x \) (since land costs $100 per square foot)
2. **Solar panel cost**: \( 250x \) (since solar panels cost $250 per square foot)
3. **Maintenance cost**: \( 100,000 + 10x \) (a flat $100,000 per year plus $10 per square foot)

Adding these costs together gives:

\[
100x + 250x + 100,000 + 10x = 360x + 100,000
\]

It seems there was a mistake in the student's solution regarding the maintenance cost per square foot. The correct maintenance cost should be \( 10x \) instead of \( 100x \). Therefore, the correct total cost should be:

\[
360x + 100,000
\]

The student's solution incorrectly calculated the maintenance cost per square foot, leading to an incorrect total cost expression.


#### Note that the student's solution is actually not correct.
#### We can fix this by instructing the model to work out its own solution first.

In [None]:
prompt = f"""
Your task is to determine if the student's solution \
is correct or not.
To solve the problem do the following:
- First, work out your own solution to the problem including the final total.
- Then compare your solution to the student's solution \
and evaluate if the student's solution is correct or not.
Don't decide if the student's solution is correct until
you have done the problem yourself.

Use the following format:
Question:
```
question here
```
Student's solution:
```
student's solution here
```
Actual solution:
```
steps to work out the solution and your solution here
```
Is the student's solution the same as actual solution \
just calculated:
```
yes or no
```
Student grade:
```
correct or incorrect
```

Question:
```
I'm building a solar power installation and I need help \
working out the financials.
- Land costs $100 / square foot
- I can buy solar panels for $250 / square foot
- I negotiated a contract for maintenance that will cost \
me a flat $100k per year, and an additional $10 / square \
foot
What is the total cost for the first year of operations \
as a function of the number of square feet.
```
Student's solution:
```
Let x be the size of the installation in square feet.
Costs:
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 100x
Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
```
Actual solution:
"""
response = get_completion(prompt)
print(response)

```
Let x be the size of the installation in square feet.
Costs:
1. Land cost: 100x (since land costs $100 per square foot)
2. Solar panel cost: 250x (since solar panels cost $250 per square foot)
3. Maintenance cost: 100,000 + 10x (since the maintenance costs a flat $100k per year plus $10 per square foot)

Total cost for the first year of operations:
= Land cost + Solar panel cost + Maintenance cost
= 100x + 250x + (100,000 + 10x)
= 100x + 250x + 100,000 + 10x
= 360x + 100,000
```
Is the student's solution the same as actual solution just calculated:
```
no
```
Student grade:
```
incorrect
```


## Model Limitations: Hallucinations
- Boie is a real company, the product name is not real.

In [None]:
prompt = f"""
Tell me about AeroGlide UltraSlim Smart Toothbrush by Boie
"""
response = get_completion(prompt)
print(response)

## Try experimenting on your own!