## Genki-BOT Secure API Workshop
Welcome to this workshop that will teach you the basics of working with the Genki-BOT API via the OpenAI Python SDK.

### 01: Setting up the client, selecting a model and generating your first response

**Installing the OpenAI SDK**

To get started, the first thing we have to do is install the OpenAI Python package.

We do this by running the pip package installer in a terminal, or directly from the notebook as shown below:

In [1]:
! pip install -q openai #q is for quiet mode, so we don't see the output of the installation

You should now be able to import the AzureOpenAI client class from the SDK:

In [1]:
from openai import AzureOpenAI

To use the API, we have to instantiate a client object.

The client object needs three inputs:
- **api_key**: Your personal API key generated via the Genki-BOT API interface at https://genki-bot.ffdb.com/api-keys
- **api_version**: Any supported API version from Microsoft, refer to Microsoft for the latest overview: https://learn.microsoft.com/en-us/azure/ai-services/openai/api-version-lifecycle .
- **azure_endpoint**: For this workshop, use "https://ca-gateway-genki-bot-dev.whitehill-48675d4c.swedencentral.azurecontainerapps.io". An updated production URL will be shown in the Genki-BOT API interface.

In [2]:
client = AzureOpenAI(
    api_key="insert-your-key-here",  ## INSERT YOUR KEY FROM GENKI
    api_version="2025-03-01-preview",
    azure_endpoint="https://ca-gateway-genki-bot-dev.whitehill-48675d4c.swedencentral.azurecontainerapps.io",
)
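
Hardcoding the key in a notebook makes it easy to share by accident. As a minimal sketch, the client settings could instead be read from environment variables. The variable names `GENKI_API_KEY`, `GENKI_API_VERSION` and `GENKI_ENDPOINT` are assumptions made up for this illustration and are not defined by Genki-BOT.

```python
import os


def load_client_settings(env=None):
    """Collect keyword arguments for AzureOpenAI from environment variables.

    The variable names below are hypothetical; pick whatever fits your setup.
    """
    env = os.environ if env is None else env
    api_key = env.get("GENKI_API_KEY")
    if not api_key:
        raise RuntimeError("Set GENKI_API_KEY before creating the client.")
    return {
        "api_key": api_key,
        # Fall back to the workshop values when nothing else is configured
        "api_version": env.get("GENKI_API_VERSION", "2025-03-01-preview"),
        "azure_endpoint": env.get(
            "GENKI_ENDPOINT",
            "https://ca-gateway-genki-bot-dev.whitehill-48675d4c.swedencentral.azurecontainerapps.io",
        ),
    }


# In the notebook (requires the openai package):
# client = AzureOpenAI(**load_client_settings())
```

With this approach the key never appears in the notebook itself, only in the environment of the machine that runs it.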

The OpenAI client supports a few different API styles. The most recent is the Responses API, a newer alternative to the so-called Chat Completions API. We will use it in most of the examples.

Azure docs on Responses API: https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/responses?tabs=python-key

**Model support**
In order to generate a response from a language model, we need to specify which model to use.

The Genki-BOT API currently supports:
- General-purpose language models:
    - gpt-4o
    - gpt-4.1
    - gpt-4o-mini
- Reasoning-optimized models:
    - o3
    - o3-mini
    - o4-mini
- Embedding model for text similarity/search (these are not called via the Responses API):
    - text-embedding-3-large

**Input**
Secondly, we need to define the text input the model should respond to. The input generally follows the structure of a multi-turn conversation, where each entry is defined by a role ('user', 'system' or 'assistant') and its text content.
- 'user' is the role for any general text you want processed, whether it is a document or a question. This is the role you will typically use.
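
To make the multi-turn structure concrete, here is a small sketch that assembles such an input list from a system message and a sequence of earlier turns. The helper name `build_input` is made up for this illustration; the resulting list has the same shape as the `input` argument used in the cells of this section.

```python
def build_input(system_prompt, turns):
    """Build a conversation-style input list.

    turns is a list of (role, text) tuples in chronological order.
    """
    allowed_roles = {"user", "assistant"}
    messages = [{"role": "system", "content": system_prompt}]
    for role, text in turns:
        if role not in allowed_roles:
            raise ValueError(f"Unexpected role: {role!r}")
        messages.append({"role": role, "content": text})
    return messages


conversation = build_input(
    "You answer briefly.",
    [
        ("user", "Tell me a joke about cats"),
        ("assistant", "Why was the cat sitting on the computer? It wanted to keep an eye on the mouse!"),
        ("user", "Now explain why that is funny"),
    ],
)

for message in conversation:
    print(message["role"], "->", message["content"])
```

Passing earlier 'assistant' replies back in like this is how the model gets to see the conversation history, since each request is otherwise independent.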


In [3]:
response = client.responses.create(
    model="gpt-4o", # Replace with your model deployment name 
    input=[
        {"role": "user", 
         "content": "Tell me a joke about cats"}
    ]
)

print(response.output_text)

Why was the cat sitting on the computer?

It wanted to keep an eye on the mouse!


**Exercise 1**
1. Change the text input and generate a new reply

2. Change the model by selecting another language or reasoning model and generate a new reply (do not use the embedding model)

3. Inspect the response object. It contains more than just the reply text. Add this line of code: `print(response.model_dump_json(indent=2))`

4. The Responses API also supports a 'system' role. This role should be placed first. It defines the behavior of the AI model and affects every response the model gives to the input text that follows.

   Run the cell below and look at the output. Now change the content of the system role, e.g. to "Always translate the input to Japanese", and try other system messages.

In [4]:
response = client.responses.create(
    model="gpt-4o",
    input=[
        {
            "role": "system", 
            "content": "You are a pharma nerd, you always provide answers with a pharma twist"
         },
        {
            "role": "user", 
            "content": "Tell me a joke about cats"
        }
    ]
)

print(response.output_text)

Why was the cat sitting on the pharmacist's counter?

Because it had a "purr-scription" that needed filling!


### 02: Structured outputs

For many use cases we want the output of the language model to follow a strict format, with each value of a specific type (number, text, list, etc.).

By using structured outputs, we can define a schema for the reply we want from the AI model, and the model will adhere to this structure. In Python, such structures are created via a 'class', here a Pydantic model. If we pass the class as the text_format argument, the parsed output will follow the class structure.

In [8]:
from pydantic import BaseModel

class Deviation(BaseModel):
    incident_date: str
    involved_personnel: list[str]
    incident_description: str

response = client.responses.parse(
    model="gpt-4o", 
    input=[
        {
            "role": "system", 
            "content": "You are a deviation analyst. You are given a description of an incident and you need to extract the information and return it in a structured way."
        },
        {
            "role": "user", 
            "content": "Peter and Lisa were working at the manufacturing site when Peter dropped the wrench into the instrument. It happened on Wednesday the 15th of May 2024."
        }
    ],
    text_format=Deviation
)

# output_parsed is an instance of the Deviation class
incident = response.output_parsed
print(f'Incident date: {incident.incident_date}')
print(f'Involved personnel: {incident.involved_personnel}')
print(f'Incident description: {incident.incident_description}')

Incident date: 2024-05-15
Involved personnel: ['Peter', 'Lisa']
Incident description: Peter dropped the wrench into the instrument while working on the manufacturing site.
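
Because `incident_date` is declared as a plain string, the schema only guarantees that some text comes back, not that it is a valid date in a particular format. A minimal sketch of normalizing the value afterwards, using only the standard library, could look like this. The helper name and the list of fallback formats are assumptions for illustration.

```python
from datetime import date, datetime


def parse_incident_date(raw):
    """Return a date for an ISO string or a few numeric formats, else None."""
    try:
        return date.fromisoformat(raw)
    except ValueError:
        pass
    # Fallback formats are an arbitrary selection for this sketch
    for fmt in ("%d/%m/%Y", "%d.%m.%Y", "%Y/%m/%d"):
        try:
            return datetime.strptime(raw, fmt).date()
        except ValueError:
            continue
    return None


print(parse_incident_date("2024-05-15"))
print(parse_incident_date("15/05/2024"))
print(parse_incident_date("sometime in May"))
```

An alternative worth trying in Exercise 2 is to tighten the schema itself, for example by describing the expected date format in the system message.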


**Exercise 2**
1. Change the description of the deviation incident and run the code again [the content field with Peter and Lisa ...]
2. Add a field to the Deviation class and re-run the response.
3. Update the system text, how does this affect the outcome [the content field with 'You are a deviation ...]

### 03: Image inputs

Multimodal models such as gpt-4o also support image inputs and can analyze and extract information from even fairly complex images. To send a local image file to the Azure OpenAI API, we first convert it into a text format called base64.

Below is a simple example where the user input also includes an image from the FLB website.

In [53]:
import base64

# Function to encode the image
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")


# Path to your image
image_path = "PFP-Cover-Image-Sustainability.jpg"

# Getting the Base64 string
base64_image = encode_image(image_path)


response = client.responses.create(
    model="gpt-4o",
    input=[
        {
            "role": "user",
            "content": [
                { "type": "input_text", "text": "what's in this image?" },
                {
                    "type": "input_image",
                    "image_url": f"data:image/jpeg;base64,{base64_image}",
                },
            ],
        }
    ],
)

print(response.output_text)

The image shows an aerial view of a lush green forest with winding rivers. Overlaid on this scenery is text that reads "Partners for the Planet" alongside a globe-like design.
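
The cell above hardcodes `image/jpeg` in the data URL, which would be wrong for a PNG or other file type. As a sketch, the MIME type can be guessed from the file extension with the standard library. The function name `image_to_data_url` is made up for this illustration.

```python
import base64
import mimetypes
import tempfile
from pathlib import Path


def image_to_data_url(image_path):
    """Encode an image file as a base64 data URL with a matching MIME type."""
    path = Path(image_path)
    mime_type, _ = mimetypes.guess_type(path.name)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognised image file: {path.name}")
    encoded = base64.b64encode(path.read_bytes()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


# Demo with placeholder bytes, so the sketch runs without a real image
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.png"
    sample.write_bytes(b"placeholder bytes, not a real image")
    print(image_to_data_url(sample)[:30])
```

The returned string can be used as the `image_url` value of an `input_image` entry, in place of the f-string in the cell above.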


**Exercise 3**
1. Inspect the .jpg image. Does the image align with the output of the model?
2. Replace the image_path name with `ai-generated-notes.jpg` and see whether the model can read handwritten text
3. Change the input text so that the AI extracts the full text from the AI-generated notes and prints it out
4. If you have an image on your computer or phone [non-confidential], try to upload this file and have the model analyze it

### 04: Text embeddings

Text embedding models are AI models trained to convert text into vectors, so that we can work mathematically with meaning and context. This is often useful if we want to compare hundreds, thousands or even millions of text records using efficient algorithms. It is a core component of a RAG (retrieval-augmented generation) system, where the most naive retrieval technique simply uses embeddings to retrieve relevant pieces of text.

Let us first embed a single piece of text and inspect the output:

In [62]:
SOP_1_embedding = client.embeddings.create(
    input="SOP 1: Hygiene and cleaning in the production area",
    model="text-embedding-3-large"
)

print(SOP_1_embedding.data[0].embedding[:20])

[-0.0007559562800452113, -0.006044063251465559, -0.009024844504892826, 0.01776272989809513, 5.85126290388871e-05, -0.0327419638633728, 0.009397890418767929, 0.02189493179321289, 0.013695093803107738, 0.0015513693215325475, -0.02716062031686306, -0.023860597983002663, -0.008931582793593407, -0.00921136699616909, -0.010825509205460548, 0.02123492769896984, 0.010337679646909237, -0.015337930992245674, -0.02281319908797741, 0.014204445295035839]


Now let us create 2 more embeddings. In order to compare how similar the embeddings are to each other, we can use the cosine similarity metric which is one of the most common metrics for this purpose.

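In general, the cosine similarity score lies in the range -1 to 1, where 1 means the two vectors point in the same direction. For text embeddings like these, the scores are typically positive. To easily calculate the cosine similarity between multiple pairs of embeddings, we will use the scikit-learn library.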
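
To show what the metric computes, here is a plain-Python sketch of cosine similarity for a single pair of vectors: the dot product divided by the product of the vector lengths. The function name `cosine_similarity_pair` is chosen to avoid clashing with the scikit-learn function, and the two-dimensional vectors are toy values.

```python
import math


def cosine_similarity_pair(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


print(cosine_similarity_pair([1.0, 0.0], [2.0, 0.0]))   # same direction
print(cosine_similarity_pair([1.0, 0.0], [0.0, 3.0]))   # orthogonal
print(cosine_similarity_pair([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
```

Note that the length of the vectors does not matter, only their direction, which is why the first pair scores the maximum even though one vector is twice as long.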

In [1]:
! pip install -q scikit-learn

In [None]:
SOP_2_embedding = client.embeddings.create(
    input="SOP 2: Cleaning and disinfection of the storage facility",
    model="text-embedding-3-large"
)

SOP_3_embedding = client.embeddings.create(
    input="SOP 3: Training of employees on forklift operation",
    model="text-embedding-3-large"
)

# Calculate the cosine similarity between the embeddings
from sklearn.metrics.pairwise import cosine_similarity

similarity_matrix = cosine_similarity([SOP_1_embedding.data[0].embedding, 
                                       SOP_2_embedding.data[0].embedding, 
                                       SOP_3_embedding.data[0].embedding])

print(f'Similarity between SOP 1 and SOP 2: {similarity_matrix[0][1]}')
print(f'Similarity between SOP 1 and SOP 3: {similarity_matrix[0][2]}')
print(f'Similarity between SOP 2 and SOP 3: {similarity_matrix[1][2]}')

Similarity between SOP 1 and SOP 2: 0.6562185395643199
Similarity between SOP 1 and SOP 3: 0.3927057636338306
Similarity between SOP 2 and SOP 3: 0.42579820373983357
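
The naive retrieval step of a RAG system can be sketched with the same idea: embed the query, score every document against it, and keep the best matches. The three-dimensional vectors below are made-up stand-ins for real embeddings, so the sketch runs without calling the API. In practice the vectors would come from `client.embeddings.create`.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def retrieve(query_vector, documents, top_k=2):
    """Return the top_k (score, text) pairs, highest similarity first."""
    scored = [(cosine(query_vector, vector), text) for text, vector in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy vectors invented for illustration, not real embeddings
documents = [
    ("SOP 1: Hygiene and cleaning in the production area", [0.9, 0.1, 0.0]),
    ("SOP 2: Cleaning and disinfection of the storage facility", [0.8, 0.3, 0.1]),
    ("SOP 3: Training of employees on forklift operation", [0.1, 0.2, 0.9]),
]
query_vector = [0.7, 0.5, 0.1]  # stand-in for an embedded question about storage cleaning

results = retrieve(query_vector, documents)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

Scoring every document like this is fine for a handful of SOPs. For large collections, a vector index is the usual next step.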


**Exercise 4**
1. Do the similarity scores above match what you would expect?
2. Modify the existing SOP input texts and see how the scores are affected. Do the results match your intuition?

### 05: Tool Usage & Function calling with Chat Completions

#### 05.1 Why use Chat Completions for tools?

- Chat Completions is mature and widely documented, and the Azure client exposes the same call shape as the OpenAI endpoint, so most sample code works on Azure with little or no change.
- The tools field lets the model decide when to call helper code, and then finish the reply using your function’s JSON result.

#### 05.2 Define the helper function & schema

In this example, we will create a helper function that can look up live weather data if the user's question implies the need for knowing a specific city's weather for the answer.

In [21]:
import json
from datetime import datetime, timezone

import requests  # third-party package: pip install requests


def get_current_weather(location: str, unit: str = "celsius") -> str:
    """
    Look up live weather via the free open-meteo.com REST API and
    return a JSON string (because the Chat API expects a string payload).
    """
    # Resolve the place name to coordinates
    geo = requests.get(
        "https://geocoding-api.open-meteo.com/v1/search",
        params={"name": location, "count": 1},
        timeout=10
    ).json()

    if not geo.get("results"):
        return json.dumps({"error": f"Location '{location}' not found."})

    lat, lon = geo["results"][0]["latitude"], geo["results"][0]["longitude"]
    # Fetch the current conditions for those coordinates
    weather = requests.get(
        "https://api.open-meteo.com/v1/forecast",
        params={
            "latitude": lat,
            "longitude": lon,
            "current_weather": True,
            "temperature_unit": unit,
        },
        timeout=10
    ).json()["current_weather"]

    return json.dumps({
        **weather,
        "requested_location": location,
        # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp
        "retrieved_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ"),
        "unit": unit
    })

#### 05.3 Tool schema
The schema mirrors OpenAI’s function-calling spec.

In [18]:
# JSON schema that tells the model how to call the function
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",          # <-- required
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and country, e.g. 'Copenhagen, Denmark'"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature unit"
                }
            },
            "required": ["location"]
        }
    }
}

**Why the wrapper?**

Each entry in tools must declare "type": "function" and nest the definition under a "function" key. If that key or its "name" sub-property is missing, you’ll get an error such as “Missing required parameter: 'tools[0].name'”.
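
A quick local check can catch a malformed definition before it is sent to the service. The sketch below is a hypothetical helper, not part of the SDK, and it only checks the few properties mentioned in this section.

```python
def check_tool_definition(tool):
    """Return a list of problems found in a Chat Completions tool definition."""
    problems = []
    if tool.get("type") != "function":
        problems.append('"type" must be "function"')
    function = tool.get("function")
    if not isinstance(function, dict):
        problems.append('missing "function" wrapper')
        return problems
    if not function.get("name"):
        problems.append('missing "function.name"')
    parameters = function.get("parameters", {})
    properties = parameters.get("properties", {})
    for required_name in parameters.get("required", []):
        if required_name not in properties:
            problems.append(f'required parameter "{required_name}" is not in "properties"')
    return problems


# A definition without the wrapper, for comparison with a wrapped one
flat_tool = {"type": "function", "name": "get_current_weather"}
wrapped_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}

print(check_tool_definition(flat_tool))
print(check_tool_definition(wrapped_tool))
```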

#### 05.4 Utility: run a chat with automatic function use

In [24]:
def ask_llm(prompt: str, deployment_name: str = "gpt-4o") -> tuple:  # returns (answer, tool_result)
    # STEP 1 ─ Let the model decide if it needs the tool
    first = client.chat.completions.create(
        model      = deployment_name,
        messages   = [{"role": "user", "content": prompt}],
        tools      = [weather_tool],
        tool_choice= "auto"
    )

    assistant_msg = first.choices[0].message

    # If the model requested the function ...
    if assistant_msg.tool_calls:
        tool_call   = assistant_msg.tool_calls[0]
        args        = json.loads(tool_call.function.arguments)
        tool_result = get_current_weather(**args)

        # STEP 2 ─ Send the tool result back so the model can finish
        second = client.chat.completions.create(
            model    = deployment_name,
            messages = [
                {"role": "user", "content": prompt},
                assistant_msg,   # the function-call request
                {
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "name": tool_call.function.name,
                    "content": tool_result
                }
            ]
        )
        return second.choices[0].message.content, tool_result

    # No function needed
    return assistant_msg.content, None

This two-call flow follows the sequence described in both the OpenAI and Azure function-calling docs.
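
The helper above only handles the first entry in `tool_calls`, but a model may request several calls in one turn. As a sketch, the calls can be dispatched through a registry that maps tool names to Python functions. The objects below are stand-ins built with `SimpleNamespace` that mimic the `id`, `function.name` and `function.arguments` attributes used above, so the example runs without the API. `fake_weather` is a placeholder for the real helper.

```python
import json
from types import SimpleNamespace


def run_tool_calls(tool_calls, registry):
    """Execute every requested tool call and build the matching 'tool' messages."""
    messages = []
    for call in tool_calls:
        name = call.function.name
        handler = registry.get(name)
        if handler is None:
            content = json.dumps({"error": f"Unknown tool '{name}'"})
        else:
            try:
                arguments = json.loads(call.function.arguments or "{}")
                content = handler(**arguments)
            except (json.JSONDecodeError, TypeError) as exc:
                # Malformed JSON or arguments that do not fit the function
                content = json.dumps({"error": str(exc)})
        messages.append({"role": "tool", "tool_call_id": call.id, "content": content})
    return messages


def fake_weather(location, unit="celsius"):
    """Placeholder for get_current_weather that needs no network access."""
    return json.dumps({"location": location, "temperature": 17.8, "unit": unit})


fake_calls = [
    SimpleNamespace(
        id="call_1",
        function=SimpleNamespace(
            name="get_current_weather",
            arguments='{"location": "Copenhagen, Denmark"}',
        ),
    ),
    SimpleNamespace(id="call_2", function=SimpleNamespace(name="unknown_tool", arguments="{}")),
]

tool_messages = run_tool_calls(fake_calls, {"get_current_weather": fake_weather})
for message in tool_messages:
    print(message)
```

The returned messages would be appended after the assistant message in the second request, one per `tool_call_id`.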

#### 05.5 Example A – Weather question (tool invoked)

In [None]:
# Weather question – tool will be invoked
answer, tool_result = ask_llm("I'm visiting Copenhagen today - do I need an umbrella?")

print("Answer:", answer)
print("\nTool result:", tool_result)

Answer: The current weather in Copenhagen is partly cloudy with a temperature of 17.8°C. There is no mention of rain at this moment, so you probably don't need an umbrella right now. However, weather conditions can change, so you might want to keep an umbrella handy, just in case.

Tool result: {"time": "2025-06-17T07:30", "interval": 900, "temperature": 17.8, "windspeed": 15.1, "winddirection": 281, "is_day": 1, "weathercode": 3, "requested_location": "Copenhagen, Denmark", "retrieved_at": "2025-06-17T07:39:51.732977Z", "unit": "celsius"}


What happens behind the scenes?

1. The model detects the user wants weather.
2. It returns a tool call request with arguments {"location": "Copenhagen, Denmark"}.
3. Your code (here, the ask_llm helper) runs get_current_weather, gets the JSON, and sends a follow-up request containing the function output.
4. The model writes the final answer, weaving the fresh weather into natural language.
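
The tool result contains a numeric `weathercode`, which the model has to interpret on its own. One option is to add a readable description before sending the result back. The sketch below maps a subset of the WMO weather codes as I understand them from the Open-Meteo documentation. Treat the table as an assumption and verify it against the official list before relying on it.

```python
import json

# Partial WMO weather code table (assumed from the Open-Meteo docs, not exhaustive)
WEATHER_CODES = {
    0: "clear sky",
    1: "mainly clear",
    2: "partly cloudy",
    3: "overcast",
    45: "fog",
    61: "slight rain",
    63: "moderate rain",
    65: "heavy rain",
    80: "slight rain showers",
    95: "thunderstorm",
}


def add_weather_description(tool_result_json):
    """Add a human-readable description to a JSON tool result, if it has a code."""
    data = json.loads(tool_result_json)
    if "weathercode" in data:
        data["weather_description"] = WEATHER_CODES.get(data["weathercode"], "unknown")
    return json.dumps(data)


print(add_weather_description('{"temperature": 17.8, "weathercode": 3}'))
```

Giving the model the description explicitly reduces the chance that it guesses the meaning of a code incorrectly.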

#### 05.6 Example B - Generic question (tool not invoked)

In [None]:
# Non-weather question – tool is skipped
answer, tool_result = ask_llm("Tell me a Danish pastry joke!")


print("Answer:", answer)
print("\nTool result:", tool_result)

Answer: Why did the Danish pastry go to school?

Because it wanted to be a little “butter” at everything!

Tool result: None


Because the user’s request isn’t weather-related, the model simply replies without invoking the function.

#### 05.7 Things to try

1. Change the location or set "unit": "fahrenheit" and observe the tool arguments.
2. Add error handling – ask for an imaginary city and see the graceful fallback.
3. Chain tools – combine the weather data with a packing-list generator.