## Check your connection to Generative AI Hub


👉 First, assign the values from `generative-ai-codejam/.aicore-config.json` to environment variables.

That way the Generative AI Hub [Python SDK](https://pypi.org/project/generative-ai-hub-sdk/) will connect to Generative AI Hub.

👉 For the Python SDK to know which resource group to use, you also need to set the `resource group` in the [`variables.py`](variables.py) file to your own `resource group` (e.g. **team-01**) that you created in the SAP AI Launchpad in exercise [00-connect-AICore-and-AILaunchpad](000-connect-AICore-and-AILaunchpad.md).

In [1]:
%pip install "generative-ai-hub-sdk[all]" hdbcli scipy pypdf --break-system-packages

In [2]:
import json
import os
from ai_core_sdk.ai_core_v2_client import AICoreV2Client
# Load the service key credentials from a local JSON file
with open('creds.json') as f:
    credCF = json.load(f)
 
# Set environment variables
def set_environment_vars(credCF):
    env_vars = {
        'AICORE_AUTH_URL': credCF['url'] + '/oauth/token',
        'AICORE_CLIENT_ID': credCF['clientid'],
        'AICORE_CLIENT_SECRET': credCF['clientsecret'],
        'AICORE_BASE_URL': credCF["serviceurls"]["AI_API_URL"] + "/v2",
        'AICORE_RESOURCE_GROUP': "grounding"
    }
 
    for key, value in env_vars.items():
        os.environ[key] = value
        # Print only the variable name so that secrets never end up in the notebook output
        print(f"{key} is set")
 
# Create AI Core client instance
def create_ai_core_client(credCF):
    set_environment_vars(credCF)  # Ensure environment variables are set
    return AICoreV2Client(
        base_url=os.environ['AICORE_BASE_URL'],
        auth_url=os.environ['AICORE_AUTH_URL'],
        client_id=os.environ['AICORE_CLIENT_ID'],
        client_secret=os.environ['AICORE_CLIENT_SECRET'],
        resource_group=os.environ['AICORE_RESOURCE_GROUP']
    )
 
ai_core_client = create_ai_core_client(credCF)

AICORE_AUTH_URL is set
AICORE_CLIENT_ID is set
AICORE_CLIENT_SECRET is set
AICORE_BASE_URL is set
AICORE_RESOURCE_GROUP is set
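
Before creating the client, it can help to confirm that every variable the cell above sets is actually present. The following is a minimal sketch: the helper name `missing_aicore_vars` is hypothetical and not part of the SDK, and the list of names simply mirrors the keys used above.

```python
import os

# Names mirror the keys set in the cell above.
REQUIRED_AICORE_VARS = (
    "AICORE_AUTH_URL",
    "AICORE_CLIENT_ID",
    "AICORE_CLIENT_SECRET",
    "AICORE_BASE_URL",
    "AICORE_RESOURCE_GROUP",
)


def missing_aicore_vars(env=None):
    """Return the required variable names that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AICORE_VARS if not env.get(name)]


# Example with a plain dict instead of the real environment (placeholder values)
example_env = {
    "AICORE_AUTH_URL": "https://example.invalid/oauth/token",
    "AICORE_CLIENT_ID": "demo",
}
print(missing_aicore_vars(example_env))
```

Calling `missing_aicore_vars()` without arguments checks the real process environment, and an empty list means all five variables are set.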


In [3]:
# #import init_env
# #import variables
# #import importlib
# #variables = importlib.reload(variables)

# # TODO: You need to specify which model you want to use. In this case we are directing our prompt
# # to the OpenAI API directly, so you need to pick one of the GPT models. Make sure the model is actually deployed
# # in Generative AI Hub. You might also want to choose a model that can process images as well.
# # E.g. 'gpt-4o-mini' or 'gpt-4o'
# MODEL_NAME = 'gpt-4o'
# # Do not modify the `assert` line below
# assert MODEL_NAME!='', """You should change the variable `MODEL_NAME` with the name of your deployed model (like 'gpt-4o-mini') first!"""

# #init_env.set_environment_variables()
# # Do not modify the `assert` line below 
# assert variables.RESOURCE_GROUP!='', """You should change the value assigned to the `RESOURCE_GROUP` in the `variables.py` file to your own resource group first!"""
# print(f"Resource group is set to: {variables.RESOURCE_GROUP}")

## Prompt an LLM with Generative AI Hub

In [4]:
from gen_ai_hub.proxy.native.openai import chat
from IPython.display import Markdown

messages = [
    {
        "role": "user", 
        "content": "What is the underlying model architecture of an LLM? Explain it as short as possible."
    }
]

kwargs = dict(model_name="gpt-4o-mini", messages=messages)

response = chat.completions.create(**kwargs)
display(Markdown(response.choices[0].message.content))

The underlying architecture of a Large Language Model (LLM) is typically based on the Transformer model. The key components of this architecture include:

1. **Attention Mechanism**: Specifically, self-attention allows the model to weigh the significance of different words relative to one another, enabling it to capture contextual relationships effectively.

2. **Layers**: The Transformer consists of multiple stacked layers, each containing two main sub-layers: a multi-head self-attention mechanism and a feed-forward neural network.

3. **Positional Encoding**: Since Transformers do not have a built-in sense of word order, positional encodings are added to input embeddings to retain the sequence information.

4. **Training Objective**: LLMs are usually trained using unsupervised learning techniques, such as predicting the next word in a sequence (language modeling).

This architecture allows LLMs to process and generate human-like text by understanding context, relationships, and nuances within language.

## Understanding roles
Most LLMs distinguish the roles `system`, `user`, and `assistant` (called `model` in Gemini), which you can use to steer the model's response. In the previous step you only used the `user` role to ask your question.

👉 Try out different `system` messages to change the response. You can also tell the model not to engage in small talk or to answer questions only on a certain topic. Then try different user prompts as well!

In [5]:
messages = [
    {   "role": "system", 
        # TODO try changing the system prompt
        "content": "Speak like Yoda from Star Wars."
    }, 
    {
        "role": "user", 
        "content": "What is the underlying model architecture of an LLM? Explain it as short as possible."
    }
]

kwargs = dict(model_name="gpt-4o-mini", messages=messages)
response = chat.completions.create(**kwargs)
display(Markdown(response.choices[0].message.content))

An LLM, hmm, based on transformer architecture it is. Attention mechanisms, self-attention layers, and feedforward networks, it employs. Training on vast data, it learns patterns and context. Simple, yet powerful, yes.

👉 Also try to have it speak like a pirate.

👉 Now let's be more serious! Tell it to behave like an SAP consultant talking to AI Developers.

👉 Ask it to behave like an SAP Consultant talking to ABAP Developers and to make ABAP comparisons.
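
The variations suggested above can be prepared side by side. This is only a sketch: the `PERSONAS` dictionary, its wording, and the `build_messages` helper are illustrative assumptions, not part of the SDK.

```python
QUESTION = "What is the underlying model architecture of an LLM? Explain it as short as possible."

# Hypothetical system prompts based on the suggestions above
PERSONAS = {
    "pirate": "Speak like a pirate.",
    "consultant_ai": "You are an SAP consultant talking to AI developers.",
    "consultant_abap": "You are an SAP consultant talking to ABAP developers. Use ABAP comparisons.",
}


def build_messages(system_prompt, user_prompt):
    """Combine a system prompt and a user prompt into a chat message list."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


# One message list per persona, all asking the same question
message_sets = {name: build_messages(prompt, QUESTION) for name, prompt in PERSONAS.items()}
for name, persona_messages in message_sets.items():
    print(name, "->", persona_messages[0]["content"])
```

Each list in `message_sets` can be passed as `messages` to `chat.completions.create`, in the same way as in the cell above, which makes it easy to compare how the system prompt changes the answer.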

## Hallucinations
👉 Run the following question.

In [6]:
messages = [
    {   "role": "system", 
        "content": "You are an SAP Consultant."
    }, 
    {
        "role": "user", 
        "content": "How does the data masking of the orchestration service work?"
    }
]

kwargs = dict(model_name="gpt-4o-mini", messages=messages)
response = chat.completions.create(**kwargs)
display(Markdown(response.choices[0].message.content))

Data masking in the context of orchestration services typically involves the processes and techniques used to obscure specific data within a dataset, thereby protecting sensitive information while retaining its usability for analytics, development, or testing purposes. Although SAP provides various orchestration services (like SAP Data Intelligence, SAP Process Orchestration, etc.), the fundamental principles of data masking generally apply across platforms.

Here's an overview of how data masking might work within an orchestration service:

1. **Definition of Sensitivity**: Identify the data elements that need to be masked based on business rules and compliance requirements. This may include personal information such as names, identification numbers, financial data, etc.

2. **Masking Techniques**: Various techniques can be employed, including:
   - **Static Data Masking**: This replaces the sensitive data with masked values and is typically used in non-production environments.
   - **Dynamic Data Masking**: This keeps the original data intact in the backend, but provides masked values to users based on their roles or permissions.
   - **Tokenization**: Sensitive data is replaced with non-sensitive tokens that can represent the original data without exposing it.
   - **Shuffling**: Rearranging the values within a column or dataset so that the actual data is not easily identifiable.
   - **Data Obfuscation**: Modifying the data format or appearance without significant data loss.

3. **Integration within Orchestration Workflows**: The orchestration service can include data masking as part of its workflow processes. This means:
   - **Data Extraction**: When extracting data from source systems, the orchestration service can apply masking logic to sensitive data before it is moved to another system or environment.
   - **Data Transformation**: During data transformation processes (ETL - Extract, Transform, Load), masking can be incorporated into transformation rules.

4. **Automation and Monitoring**: Data masking can be automated within workflows to ensure that any sensitive data flows are consistently masked. Additionally, monitoring can be set up to ensure compliance and traceability.

5. **Auditing and Compliance**: Organizations often require audits to ensure that data masking procedures follow regulations (like GDPR, HIPAA, etc.). Orchestration services should include mechanisms for logging and monitoring access to sensitive data.

6. **User Role Management**: Access controls can be integrated within the orchestration service to ensure that users view only masked data unless they have permission to see original data.

7. **Performance Considerations**: It's important to maintain performance when applying data masking techniques, especially in high-transaction environments.

To implement data masking effectively, organizations often combine these techniques to suit their specific needs, ensuring that they meet both security requirements and operational efficiency. Always consult relevant technical documentation specific to your SAP environment for detailed implementation approaches and tools available.

☝️ Compare the response to [SAP Help - Generative AI Hub SDK](https://help.sap.com/doc/generative-ai-hub-sdk/CLOUD/en-US/_reference/orchestration-service.html). 

👉 What did the model respond? Was it the truth or a hallucination?

👉 Which questions work well, which questions still do not work so well?
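
One common way to reduce hallucinations is to include trusted reference text in the prompt and instruct the model to answer only from it. The sketch below assumes a hypothetical helper `build_grounded_messages`; the context is a placeholder that you would replace with a paragraph copied from the documentation, and the instruction wording is only an example.

```python
def build_grounded_messages(question, context):
    """Build chat messages that ask the model to answer only from the given context."""
    system_prompt = (
        "You are an SAP Consultant. Answer only from the provided context. "
        "If the context does not contain the answer, say that you do not know."
    )
    user_prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


grounded_messages = build_grounded_messages(
    "How does the data masking of the orchestration service work?",
    # Placeholder: replace with text copied from the SAP Help page
    "<paste the relevant paragraph from the SAP Help page here>",
)
print(grounded_messages[1]["content"])
```

Passing `grounded_messages` to `chat.completions.create` lets you compare the grounded answer with the ungrounded one from the cell above.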

# Use Multimodal Models

Multimodal models can use different inputs such as text, audio and images. In Generative AI Hub on SAP AI Core you can access multiple multimodal models (e.g. `gpt-4o-mini`).

👉 If you have not deployed one of the gpt-4o-mini models in previous exercises, then go back to the model library and deploy a model that can also process images.

👉 Now run the code snippet below to get a description of [the image of the AI Foundation Architecture](documents/ai-foundation-architecture.png). Such descriptions can be used, for example, as alternative text for screen readers or other assistive technology.

👉 You can upload your own image and play around with it!

In [7]:
import base64

# get the image from the documents folder
with open("documents/ai-foundation-architecture.png", "rb") as image_file:
    image_data = base64.b64encode(image_file.read()).decode('utf-8')

messages = [{"role": "user", "content": [
            {"type": "text", "text": "Describe the images as an alternative text."},
            {"type": "image_url", "image_url": {
                "url": f"data:image/png;base64,{image_data}"}
            }]
        }]

kwargs = dict(model_name="gpt-4o-mini", messages=messages)
response = chat.completions.create(**kwargs)

display(Markdown(response.choices[0].message.content))

The image is a structured diagram titled "AI Foundation on SAP BTP." It outlines various components related to AI services and management within the SAP Business Technology Platform. 

At the top, there are sections labeled "AI Services," displaying four features: Document Processing, Recommendation, Machine Translation, and Generative AI Management, highlighted in a pink box. Under Generative AI Management, there are three subcategories: Toolset, Trust & Control, and Access.

Beneath that, the "AI Workload Management" section includes Training and Inference, while the "Business Data & Context" segment features Vector Engine and Data Management in blue boxes.

Further down, the "Foundation Models" area lists three categories: SAP built, Hosted, and Remote, with an additional category for Fine-tuned. At the bottom, there is a section titled "Lifecycle Management" in a blue box. The overall layout is organized and visually segmented for clarity.
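
The cell above hard-codes `image/png` in the data URL. If you upload your own image in another format (for example a JPEG), the MIME type should match the file. A minimal sketch, assuming a hypothetical helper `image_to_data_url` that guesses the type from the file extension with Python's standard `mimetypes` module:

```python
import base64
import mimetypes


def image_to_data_url(path):
    """Encode an image file as a base64 data URL with a MIME type guessed from its extension."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {path!r}")
    with open(path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string can be used as the `url` value of the `image_url` entry in place of the f-string in the cell above.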

# Extracting text from images
Nora loves banana bread and thinks recipes are a good example of how LLMs can extract complex text from images, such as [a picture of a banana bread recipe](documents/bananabread.png). Try your own recipe if you like :)

This exercise also shows how you can use the output of an LLM in other systems, as you can tell the LLM how to output information, for example in JSON format.

In [8]:
# get the image from the documents folder
with open("documents/bananabread.png", "rb") as image_file:
    image_data = base64.b64encode(image_file.read()).decode('utf-8')

messages = [{"role": "user", "content": [
            {"type": "text", "text": "Extract the ingredients and instructions in two different json files."},
            {"type": "image_url", "image_url": {
                "url": f"data:image/png;base64,{image_data}"}
            }]
        }]

kwargs = dict(model_name="gpt-4o-mini", messages=messages)
response = chat.completions.create(**kwargs)

display(Markdown(response.choices[0].message.content))

Here are the ingredients and instructions extracted from the banana bread recipe into two separate JSON files.

**Ingredients JSON:**

```json
{
  "dry_ingredients": {
    "all_purpose_flour": "260 g",
    "sugar": "200 g",
    "baking_soda": "6 g",
    "salt": "3 g"
  },
  "wet_ingredients": {
    "banana": "225 g",
    "large_eggs": "2",
    "vegetable_oil": "100 g",
    "whole_milk": "55 g",
    "vanilla_extract": "5 g"
  },
  "toppings": {
    "chocolate_chips": "100 g",
    "walnuts": "100 g (optional)"
  }
}
```

**Instructions JSON:**

```json
{
  "instructions": [
    "Preheat oven to 180 °C",
    "Mash banana in a bowl",
    "Combine banana and other wet ingredients together in the same bowl",
    "Mix all dry ingredients together in a separate bowl",
    "Use a whisk to combine dry mixture into wet mixture until smooth",
    "Pour mixture into a greased/buttered loaf pan",
    "Place the pan into preheated oven on the middle rack and bake for about 60 minutes or until a toothpick comes out clean",
    "Enjoy!"
  ]
}
```
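
To use this output in another system, you first have to pull the JSON out of the Markdown answer. The following is a minimal sketch, assuming the model wraps each JSON document in a fenced block labelled `json` as in the response above. The helper `extract_json_blocks` is hypothetical, and real responses may be formatted differently.

```python
import json
import re

# Matches a fenced block labelled "json" and captures its body
JSON_BLOCK = re.compile(r"`{3}json\s*(.*?)`{3}", re.DOTALL)


def extract_json_blocks(markdown_text):
    """Return every fenced JSON block in the text as a parsed Python object."""
    return [json.loads(body) for body in JSON_BLOCK.findall(markdown_text)]


# Small stand-in for response.choices[0].message.content
fence = "`" * 3
sample = "**Ingredients JSON:**\n" + fence + "json\n" + '{"sugar": "200 g"}' + "\n" + fence + "\n"
print(extract_json_blocks(sample))
```

With the response above, the first element would hold the ingredients and the second the instructions. `json.loads` raises `json.JSONDecodeError` if the model returns invalid JSON, so it is worth handling that case before passing the data on.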

[Next exercise](04-create-embeddings.ipynb)
