# Welcome to the Prompt Engineering Workshop
Let's first install the prerequisite Python packages.

In [None]:
# pandas is a library that can parse CSV files
!pip install pandas
# We will be calling Claude, so we also need to install the anthropic SDK
!pip install anthropic

We will be generating code that parses a CSV file, so let's quickly create a CSV to work with.

In [12]:
# For the workshop, we write the file dynamically so that everyone can access the same data
import csv

# Create data as list of lists
data = [
    ["Flow ID","Tx Frames","Rx Frames","Frames Delta","Loss %","Tx Frame Rate","Rx Frame Rate","Tx L1 Rate (bps)","Rx L1 Rate (bps)","Rx Bytes","Tx Rate (Bps)","Rx Rate (Bps)","Tx Rate (bps)","Rx Rate (bps)","Tx Rate (Kbps)","Rx Rate (Kbps)","Tx Rate (Mbps)","Rx Rate (Mbps)","Store-Forward Avg Latency (ns)","Store-Forward Min Latency (ns)","Store-Forward Max Latency (ns)","First TimeStamp","Last TimeStamp"],
    ["Flow 1",1000,1000,0,0.000,0.000,0.000,0.000,0.000,66000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0,0,0,"00:00:01.063","00:00:04.393"],
    ["Flow 2",1000,100,0,0.000,0.000,0.000,0.000,0.000,66000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0,0,0,"00:00:01.063","00:00:04.393"],
    ["Flow 3",1000,500,0,0.000,0.000,0.000,0.000,0.000,66000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0,0,0,"00:00:01.063","00:00:04.393"],
    ["Flow 4",1000,"N/A",0,0.000,0.000,0.000,0.000,0.000,66000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0,0,0,"00:00:01.063","00:00:04.393"]
]

# Write to CSV file
with open('stats.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(data)


In [11]:
# Check that the file was created (plain Python, so it also works on Windows, where `cat` is unavailable)
with open('stats.csv') as f:
    print(f.read())


## Let's get started with our first prompt
### Level 1: The "Zero-Shot"
Asking the model to perform a task with no examples, no context, and no constraints. You are relying entirely on the assumptions the model picked up from its training data.

In [None]:
from anthropic import AnthropicFoundry
import re

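def extract_and_save_python_code(text, output_file='output.py'):
    """Save every fenced python block found in text to output_file and return the code."""
    # Pattern to match ```python ... ``` blocks
    pattern = r'```python\s*(.*?)```'

    # Find all matches (re.DOTALL allows . to match newlines)
    matches = re.findall(pattern, text, re.DOTALL)

    # Join all code blocks if multiple exist; if the model returned bare code
    # with no fences, fall back to the raw text instead of writing an empty file
    python_code = '\n\n'.join(matches) if matches else text.strip()

    # Write to file
    with open(output_file, 'w', encoding='utf-8') as f:
        f.write(python_code)

    return python_code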

# API Configuration
endpoint = "https://foundary-codegen-1.services.ai.azure.com/anthropic/"
deployment_name = "claude-sonnet-4-5"
api_key = ""

client = AnthropicFoundry(
    api_key=api_key,
    base_url=endpoint
)

prompt = """
Write a Python script to read a CSV file called 'stats.csv' and
calculate the total loss by looking at Tx Frames and Rx Frames.
I want you to calculate loss as (tx - rx) / tx * 100 and print the result for Flow 2, Flow 3 and Flow 4.
We will have multiple columns, but the relevant columns are Flow ID, Tx Frames, Rx Frames.
Return only the Python code without any explanation or comments.
"""

message = client.messages.create(
    model=deployment_name,
    messages=[
        {"role": "user", "content": prompt}
    ],
    max_tokens=1024,
)

print(message.content[0].text)
extract_and_save_python_code(message.content[0].text, output_file='traffic_loss_calculator.py')


In [None]:
# Let's run the Python code generated by the model
!python traffic_loss_calculator.py

Got an error, right? Do you know why? Because the LLM knows nothing about the format of your CSV; for example, it cannot know that `Rx Frames` sometimes contains `N/A`.
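
To see the kind of failure a zero-shot script tends to hit, here is a minimal sketch. It assumes the generated code converts the frame counts with a plain `int()` call, which is a guess about the model's output rather than a record of it; `naive_loss` is a hypothetical name used only for illustration.

```python
def naive_loss(tx: str, rx: str) -> float:
    """Loss percentage the way a format-unaware script might compute it."""
    return (int(tx) - int(rx)) / int(tx) * 100


# Flow 2 has numeric values, so the conversion succeeds
print(naive_loss("1000", "100"))

# Flow 4 holds the placeholder "N/A", which int() cannot parse
try:
    naive_loss("1000", "N/A")
except ValueError as err:
    print(f"Conversion failed: {err}")
```

A single unparsable cell is enough to stop the whole script unless the code was written with that possibility in mind.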

### Level 2: The "Few-Shot"
Providing examples (shots) of the input data and the desired output format. This steers the model's pattern recognition toward the format your data actually has.
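
The data sample in such a prompt can be typed by hand, or it can be generated from the file itself. The sketch below takes the second route. `build_csv_preview` and its parameters are hypothetical helpers introduced here for illustration; they are not part of the workshop code.

```python
import csv
from typing import List


def build_csv_preview(path: str, columns: List[str], n_rows: int = 3) -> str:
    """Return the header plus the first n_rows rows, restricted to the given columns."""
    lines = [",".join(columns)]
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        for i, row in enumerate(reader):
            if i >= n_rows:
                break
            # Keep only the columns the prompt cares about
            lines.append(",".join(str(row[col]) for col in columns))
    return "\n".join(lines)
```

Calling `build_csv_preview('stats.csv', ['Flow ID', 'Tx Frames', 'Rx Frames'])` would give a short text snippet that can be pasted or formatted into the prompt. Note that the first few rows may not contain the messy values (such as `N/A`), so it is worth checking that the preview shows them.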

In [None]:
prompt = """
Write a Python script to read a CSV file called 'stats.csv'.
Calculate the total loss by looking at Tx Frames and Rx Frames.
I want you to calculate loss as (tx - rx) / tx * 100 and print the result for Flow 2, Flow 3 and Flow 4.
We will have multiple columns, but the relevant columns are Flow ID, Tx Frames, Rx Frames.

It's a messy CSV.
Here is what part of my CSV looks like:
"Flow ID","Tx Frames","Rx Frames"
"Flow 1",1000,1000,....
"Flow 2",1000,"N/A",....
.
.
.
"Flow N",1000,200,....

Return only the python code without any explanation or comments.
"""

message = client.messages.create(
    model=deployment_name,
    messages=[
        {"role": "user", "content": prompt}
    ],
    max_tokens=1024,
)

print(message.content[0].text)
extract_and_save_python_code(message.content[0].text, output_file='traffic_loss_calculator2.py')

In [None]:
# Now let's run the new code generated by the model
!python traffic_loss_calculator2.py

Nice, right? But there is still a lot it cannot handle, such as proper type casts, missing rows, and missing files.
This leads to:

### Level 3: Chain of Thought

Forcing the model to "show its work" or verbalize its reasoning before generating code. This tends to reduce hallucination and makes it more likely that edge cases are considered.
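
As a rough picture of what considering edge cases means for this task, the sketch below handles a missing file, non-numeric values, and type conversion. It is an illustrative hand-written example, not the model's actual output; `safe_int`, `loss_percent`, and `read_losses` are hypothetical names, and the code assumes the `Flow ID` column is present.

```python
import csv
from typing import Dict, Optional


def safe_int(value: Optional[str]) -> Optional[int]:
    """Convert a CSV field to int, returning None for blanks or text such as 'N/A'."""
    try:
        return int(value.strip())
    except (ValueError, AttributeError):
        return None


def loss_percent(tx: Optional[int], rx: Optional[int]) -> Optional[float]:
    """Return (tx - rx) / tx * 100, or None when the inputs are unusable."""
    if tx is None or rx is None or tx == 0:
        return None
    return (tx - rx) / tx * 100


def read_losses(path: str) -> Dict[str, Optional[float]]:
    """Map each Flow ID to its loss percentage; an absent file yields an empty dict."""
    try:
        with open(path, newline="") as f:
            return {
                row["Flow ID"]: loss_percent(
                    safe_int(row.get("Tx Frames")),
                    safe_int(row.get("Rx Frames")),
                )
                for row in csv.DictReader(f)
            }
    except FileNotFoundError:
        print(f"File not found: {path}")
        return {}
```

With the workshop data, Flow 2 (1000 frames sent, 100 received) works out to 90% loss and Flow 3 (1000 sent, 500 received) to 50%, while Flow 4 maps to `None` because its `Rx Frames` value is `N/A`.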

In [None]:
prompt = """
Write a Python script to read a CSV file called 'stats.csv'.
Calculate the total loss by looking at Tx Frames and Rx Frames.
I want you to calculate loss as (tx - rx) / tx * 100 and print the result for Flow 2, Flow 3 and Flow 4.
We will have multiple columns, but the relevant columns are Flow ID, Tx Frames, Rx Frames.

It's a messy CSV.
Here is what part of my CSV looks like:
"Flow ID","Tx Frames","Rx Frames"
"Flow 1",1000,1000,....
"Flow 2",1000,"N/A",....
.
.
.
"Flow N",1000,200,....

Before coding, consider the following edge cases and handle them in your code:
1. How do we handle missing files?
2. How do we handle non-numeric rows?
3. How do we safely convert types?

Return only the python code without any explanation or comments.
"""

message = client.messages.create(
    model=deployment_name,
    messages=[
        {"role": "user", "content": prompt}
    ],
    max_tokens=1024,
)

print(message.content[0].text)
extract_and_save_python_code(message.content[0].text, output_file='traffic_loss_calculator3.py')

In [None]:
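# Let's now delete the file and see if the new code handles it gracefully
# (os.remove is used instead of `rm` so the cell also works on Windows)
import os
if os.path.exists('stats.csv'):
    os.remove('stats.csv')
!python traffic_loss_calculator3.py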

Seems pretty neat! But is it maintainable? Does it look like code written by a senior engineer?

### Level 4: Role + Constraints

Assigning a Persona (Senior Engineer) and strict Constraints (Standards) to force production-quality structure and formatting.
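
For orientation, a script that satisfies the four constraints used in this level (`argparse`, type hints, `logging`, modular functions) could have roughly the shape below. It is a hand-written sketch under assumptions, not the model's output: the function names and the `--file` option are hypothetical, and the model may choose different ones.

```python
import argparse
import csv
import logging
from typing import List, Optional

logger = logging.getLogger("traffic_loss")


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse command-line options; --file is an assumed option name."""
    parser = argparse.ArgumentParser(description="Report frame loss per flow.")
    parser.add_argument("--file", default="stats.csv", help="Path to the CSV file")
    return parser.parse_args(argv)


def to_int(value: Optional[str]) -> Optional[int]:
    """Return value as int, or None if it is missing or non-numeric."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def report(path: str) -> int:
    """Log the loss for each flow and return the number of rows reported."""
    reported = 0
    try:
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                tx, rx = to_int(row.get("Tx Frames")), to_int(row.get("Rx Frames"))
                if tx is None or rx is None or tx == 0:
                    logger.warning("Skipping %s: unusable frame counts", row.get("Flow ID"))
                    continue
                logger.info("%s loss: %.2f%%", row.get("Flow ID"), (tx - rx) / tx * 100)
                reported += 1
    except FileNotFoundError:
        logger.error("File not found: %s", path)
    return reported


def main(argv: Optional[List[str]] = None) -> int:
    """Entry point: configure logging, then report on the requested file."""
    logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
    args = parse_args(argv)
    return 0 if report(args.file) > 0 else 1
```

From a notebook, `main(["--file", "stats.csv"])` runs the sketch directly; a standalone script would typically call `main()` under an `if __name__ == "__main__":` guard. Each function does one job, which makes the pieces easy to test in isolation.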

In [None]:
prompt = """
Write a Python script to read a CSV file called 'stats.csv'.
Calculate the total loss by looking at Tx Frames and Rx Frames.
I want you to calculate loss as (tx - rx) / tx * 100 and print the result for Flow 2, Flow 3 and Flow 4.
We will have multiple columns, but the relevant columns are Flow ID, Tx Frames, Rx Frames.

It's a messy CSV.
Here is what part of my CSV looks like:
"Flow ID","Tx Frames","Rx Frames"
"Flow 1",1000,1000,....
"Flow 2",1000,"N/A",....
.
.
.
"Flow N",1000,200,....

Before coding, consider the following edge cases and handle them in your code:
1. How do we handle missing files?
2. How do we handle non-numeric rows?
3. How do we safely convert types?

Act as a Senior Engineer.
Constraints:
1. Use `argparse` for CLI usage.
2. Use Type Hinting (PEP 484).
3. Use `logging` instead of print.
4. Modular functions.

Return only the python code without any explanation or comments.
"""
message = client.messages.create(
    model=deployment_name,
    messages=[
        {"role": "user", "content": prompt}
    ],
    max_tokens=2048,
)

print(message.content[0].text)
extract_and_save_python_code(message.content[0].text, output_file='traffic_loss_calculator4.py')

In [None]:
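# Let's run the generated code to see if it works.
# stats.csv was deleted in the previous level, so re-run the CSV-creation cell first.
# The --file flag assumes the model named its argparse option that way; adjust it if the generated script differs.
!python traffic_loss_calculator4.py --file stats.csv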