# Initial configuration of SAP AI Core for BYOM-OSS-LLM-AI-CORE
This notebook automates the initial configuration of the application BYOM-OSS-LLM-AI-CORE, which brings open-source LLMs into SAP AI Core. Alternatively, you can perform the same steps with SAP AI Launchpad. It covers the following steps:
- Onboarding Git Repository
- Create an Application and Synchronize
- Create the configurations for scenarios ollama, local-ai, llama.cpp, and vllm

### 0: Replace <YOUR_DOCKER_SECRET> and <YOUR_DOCKER_USER> in serving templates
First of all, replace the placeholders `<YOUR_DOCKER_SECRET>` and `<YOUR_DOCKER_USER>` with your Docker secret and user in the following serving templates.
```yaml
    spec: |
      predictor:
        imagePullSecrets:
          - name: <YOUR_DOCKER_SECRET>
          ...
        containers:
            - name: kserve-container
              image: docker.io/<YOUR_DOCKER_USER>/ollama:ai-core
```
- [../byom-oss-llm-templates/llama.cpp-template.yaml](../byom-oss-llm-templates/llama.cpp-template.yaml)
- [../byom-oss-llm-templates/local-ai-template.yaml](../byom-oss-llm-templates/local-ai-template.yaml)
- [../byom-oss-llm-templates/ollama-template.yaml](../byom-oss-llm-templates/ollama-template.yaml)
- [../byom-oss-llm-templates/vllm-template.yaml](../byom-oss-llm-templates/vllm-template.yaml)
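
The replacement above can also be scripted. The following is a minimal sketch, assuming the four templates contain the literal placeholder strings shown in the YAML excerpt. The helper names `fill_placeholders` and `update_templates` and the sample values are hypothetical and not part of the repository.

```python
from pathlib import Path

# The four serving templates listed above, relative to this notebook.
TEMPLATE_DIR = Path("../byom-oss-llm-templates")
TEMPLATE_FILES = [
    "llama.cpp-template.yaml",
    "local-ai-template.yaml",
    "ollama-template.yaml",
    "vllm-template.yaml",
]


def fill_placeholders(text: str, docker_secret: str, docker_user: str) -> str:
    """Return text with both Docker placeholders replaced."""
    return (
        text.replace("<YOUR_DOCKER_SECRET>", docker_secret)
            .replace("<YOUR_DOCKER_USER>", docker_user)
    )


def update_templates(docker_secret: str, docker_user: str) -> None:
    """Rewrite each template file in place, skipping files that are missing."""
    for file_name in TEMPLATE_FILES:
        path = TEMPLATE_DIR / file_name
        if not path.exists():
            print(f"Skipped (not found): {path}")
            continue
        path.write_text(fill_placeholders(path.read_text(), docker_secret, docker_user))
        print(f"Updated: {path}")


# Hypothetical values for illustration only; nothing is written to disk here.
sample = "image: docker.io/<YOUR_DOCKER_USER>/ollama:ai-core"
print(fill_placeholders(sample, "my-docker-secret", "my-docker-user"))
```

Calling `update_templates("my-docker-secret", "my-docker-user")` with your own values would edit the files locally. Since SAP AI Core reads the templates from the onboarded Git repository, the edited files presumably still need to be committed and pushed to your fork.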

#### 1: Copy [config.template.json](config.template.json) as [config.json](config.json)

In [None]:
%%sh
cp config.template.json config.json

#### 2: Review and Update configuration in [config.json](config.json)
Please read the **comments** carefully in [config.json](config.json) and update the necessary configurations.  
- **name**: used as name of git repository and application. 
- **resource_group**: "default" will be used if not specified. Creating a dedicated resource group is optional but recommended; if you create one, update it in [config.json](config.json). The "default" resource group exists in every AI Core instance. AI Core on the free tier plan cannot create a new resource group.
- **ai_core_sk**: update with your own AI Core Service Key
- **git_repo**: update the git repo configuration with your own
    - repo_url: url to your forked repository. It should be: https://github.com/<YOUR_GITHUB_ORG_OR_USER>/btp-generative-ai-hub-use-cases
    - user: update with your github user
    - access_token: update with your github user access token
- **application**: The SAP AI Core application hosts the scenarios (ollama etc.) that serve open-source LLMs in SAP AI Core
    - path_in_repo: relative path to the serving templates. No change needed.
- **configurations**: Review the configurations of the scenarios. By default, they are configured to load a quantized mistral-7b model with [resource plan infer.s in SAP AI Core](https://help.sap.com/docs/sap-ai-core/sap-ai-core-service-guide/choose-resource-plan-c58d4e584a5b40a2992265beb9b6be3c) defined in [../byom-oss-llm-templates](../byom-oss-llm-templates). It is recommended to start with the default configurations in config.json.
    - **Ollama**: No configuration required for ollama. Pull the model dynamically in [ollama/ollama.ipynb](ollama/ollama.ipynb)
    - **LocalAI**: LocalAI enables you to [preload models during startup](https://localai.io/advanced/#preloading-models-during-startup). The initial configuration in config.json preloads the model [Mistral-7B-OpenOrca-GGUF](https://github.com/go-skynet/model-gallery/blob/main/mistral.yaml) with local-ai on resource plan 'infer.s' defined in [local-ai-template.yaml](../byom-oss-llm-templates/local-ai-template.yaml). GPU acceleration isn't enabled in its model config file, so inference is quite slow. To enable GPU acceleration for a model, set the options shown below in its model config YAML file; see [mixtral-Q6.yaml](https://github.com/go-skynet/model-gallery/blob/main/mixtral-Q6.yaml) for an example. For all available options, please review the [full config model file reference](https://localai.io/advanced/#full-config-model-file-reference).
        ```yaml
        f16: true 
        mmap: true 
        gpu_layers: xx 
        ```
        In addition, you can install more models through the endpoint `/models/apply` in [local-ai/local-ai.ipynb](local-ai/local-ai.ipynb). Please refer to https://localai.io/advanced/#preloading-models-during-startup
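
Before moving on, it can help to confirm that no placeholder was left unreplaced in config.json. The sketch below walks a parsed JSON structure and reports every string that still contains a token of the form `<UPPER_CASE_NAME>`. It assumes the template marks placeholders that way, as `repo_url` does above. The helper `find_placeholders` and the sample dictionary are hypothetical.

```python
import re

# Matches unreplaced placeholders such as <YOUR_GITHUB_ORG_OR_USER>.
PLACEHOLDER = re.compile(r"<[A-Z0-9_]+>")


def find_placeholders(node, path=""):
    """Recursively collect (path, value) pairs for strings that still contain a placeholder."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            found.extend(find_placeholders(value, f"{path}.{key}" if path else key))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            found.extend(find_placeholders(value, f"{path}[{index}]"))
    elif isinstance(node, str) and PLACEHOLDER.search(node):
        found.append((path, node))
    return found


# Hypothetical excerpt of config.json, for illustration only.
sample_config = {
    "name": "byom-open-source-llms",
    "git_repo": {
        "repo_url": "https://github.com/<YOUR_GITHUB_ORG_OR_USER>/btp-generative-ai-hub-use-cases",
        "user": "my-github-user",
    },
}
for location, value in find_placeholders(sample_config):
    print(f"{location} still contains a placeholder: {value}")
```

Applied to the real file, the same function would be called on the result of `json.load` for config.json; an empty result means no such placeholder remains.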

#### 2: Load the configurations from [config.json](config.json)
The AI Core service key is located in section ai_core_sk of [config.json](config.json).<br/>
Please update it with your own service key before running this notebook.

In [61]:
import json
from time import sleep

with open("config.json") as f:
    config = json.load(f)

# Initializations
resource_group = config.get("resource_group", "default")
name = config.get("name", "open-source-llms")
print("Configurations loaded from config.json")
print("name: ", name, "resource_group: ", resource_group )

Configurations loaded from config.json


#### 3: Initialize AI Core SDK Client
The AI Core service key is located in section ai_core_sk of [config.json](config.json).<br/>
Please update it with your own service key before running this notebook.

In [62]:
from ai_core_sdk.ai_core_v2_client import AICoreV2Client
from ai_core_sdk.models import ParameterBinding

ai_core_sk = config["ai_core_service_key"]
client = AICoreV2Client(base_url=ai_core_sk.get("serviceurls").get("AI_API_URL")+"/v2",
                        auth_url=ai_core_sk.get("url")+"/oauth/token",
                        client_id=ai_core_sk.get("clientid"),
                        client_secret=ai_core_sk.get("clientsecret"),
                        resource_group=resource_group)
print(f"resource group: {resource_group}, name: {name}")


resource group: oss-llm, name: byom-open-source-llms
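
If the service key is incomplete, the cell above fails with an unhelpful `AttributeError` or `TypeError`. As a hedged sketch, the keyword arguments can be derived in a small function that first checks for the fields the cell reads (`clientid`, `clientsecret`, `url`, `serviceurls.AI_API_URL`). The function name `client_settings` and the sample key values are hypothetical.

```python
def client_settings(service_key: dict, resource_group: str = "default") -> dict:
    """Derive the keyword arguments used for AICoreV2Client from a service key."""
    required = ["clientid", "clientsecret", "url", "serviceurls"]
    missing = [field for field in required if field not in service_key]
    if "serviceurls" in service_key and "AI_API_URL" not in service_key["serviceurls"]:
        missing.append("serviceurls.AI_API_URL")
    if missing:
        raise ValueError(f"Service key is missing: {', '.join(missing)}")
    return {
        "base_url": service_key["serviceurls"]["AI_API_URL"] + "/v2",
        "auth_url": service_key["url"] + "/oauth/token",
        "client_id": service_key["clientid"],
        "client_secret": service_key["clientsecret"],
        "resource_group": resource_group,
    }


# Hypothetical service key with made-up values, for illustration only.
sample_key = {
    "clientid": "my-client-id",
    "clientsecret": "my-client-secret",
    "url": "https://auth.example.com",
    "serviceurls": {"AI_API_URL": "https://api.example.com"},
}
settings = client_settings(sample_key, "oss-llm")
print(settings["base_url"])
print(settings["auth_url"])
```

With a real service key, the client could then be created as `AICoreV2Client(**client_settings(ai_core_sk, resource_group))`, which passes the same arguments as the cell above.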


#### 4: Create a dedicated resource group (Optional but recommended)
The resource_group defined here must match the resource_group in [config.json](config.json). It defaults to "oss-llm".

In [None]:
resource_group = "oss-llm"

response = client.resource_groups.create(resource_group_id = resource_group)
print(response.__dict__)

### 5: Create repository and application

In [64]:
# Onboard repository
repo_config = config["git_repo"]
repository = client.repositories.create(name,
                                        url=repo_config.get("repo_url"),
                                        username=repo_config.get("user"),
                                        password=repo_config.get("access_token")
                                        )
print(repository)

# Create application
app_config = config["application"]
application = client.applications.create(revision=app_config.get("revision", "HEAD"),
                                        path=app_config.get("path_in_repo"),
                                        application_name=name,
                                        repository_name=name
                                        )
print(application)

Message: Repository has been on-boarded.
Id: byom-open-source-llms, Message: Application has been successfully created.


### 6: Check if application has synced and scenario created

In [65]:
max_tries = 10
interval_s = 20

for _ in range(max_tries):
    app_status = client.applications.get_status(name)
    print(f"Health Status: {app_status.health_status}, Sync Status: {app_status.sync_status}, Sync Finished at: {app_status.sync_finished_at}")

    if app_status.sync_status == "Synced":
        break

    # Trigger a synchronization of the application, then wait before polling again
    client.applications.refresh(name)
    sleep(interval_s)

if app_status.sync_status == "Synced":
    print("Application synced")
    # Check scenarios
    scenarios = client.scenario.query()
    synced_names = {s.name for s in scenarios.resources}

    # Report each scenario from config.json by name
    for scenario in config["scenarios"]:
        scenario_name = scenario["name"]
        if scenario_name in synced_names:
            print(f"Scenario {scenario_name} synced")
        else:
            print(f"Scenario {scenario_name} not yet available")

else:
    # If the sync keeps failing, something is likely wrong in the templates under path_in_repo of the git repository.
    print(f"Application not yet synced after {max_tries} tries. Please check the templates and execute this cell again.")

Health Status: Healthy, Sync Status: Synced, Sync Finished at: 2024-03-20T05:11:45Z
Application synced
Scenario ollama synced
Scenario local-ai synced
Scenario llama.cpp synced
Scenario vllm synced
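
The polling pattern in the cell above (check status, trigger a refresh, wait, retry) can be factored into a reusable helper. The following is a sketch under stated assumptions: `wait_for_status` is a hypothetical name, the status values other than "Synced" are stand-ins, and, unlike the cell above, it does not refresh or sleep after the final attempt.

```python
from time import sleep
from typing import Callable


def wait_for_status(get_status: Callable[[], str],
                    target: str = "Synced",
                    max_tries: int = 10,
                    interval_s: float = 20,
                    on_retry: Callable[[], None] = lambda: None,
                    sleep_fn: Callable[[float], None] = sleep) -> bool:
    """Poll get_status() until it returns target or max_tries is exhausted."""
    for attempt in range(1, max_tries + 1):
        status = get_status()
        print(f"Attempt {attempt}/{max_tries}: {status}")
        if status == target:
            return True
        if attempt < max_tries:
            on_retry()            # e.g. trigger a refresh before waiting
            sleep_fn(interval_s)
    return False


# Stand-in status source for illustration: reports "Synced" on the third poll.
fake_statuses = iter(["Unknown", "OutOfSync", "Synced"])
synced = wait_for_status(lambda: next(fake_statuses), interval_s=0)
print("Application synced" if synced else "Application not yet synced")
```

With the client from this notebook, the call would look like `wait_for_status(lambda: client.applications.get_status(name).sync_status, on_retry=lambda: client.applications.refresh(name))`.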


### 7: Create configurations

In [66]:
def update_json_file(file_path, key, value):
    # Load the JSON configuration file
    with open(file_path, 'r') as file:
        config = json.load(file)

    # Update the value
    config[key] = value

    # Write the updated configuration back to the file
    with open(file_path, 'w') as file:
        json.dump(config, file, indent=4)
        print(f"{file_path} updated. {key}: {value}")

In [67]:
# Create serving configurations
conf_list = config["configurations"]

for conf in conf_list:
    parameter_bindings = [ParameterBinding(pb['key'], pb['value']) for pb in conf["parameters"]]    
    configuration = client.configuration.create(
        name=conf["name"],
        scenario_id=conf["scenario_id"],
        executable_id=conf["executable_id"],
        parameter_bindings=parameter_bindings,
    )
    print(f'--------------{conf["scenario_id"]}--------------')
    print(configuration)

    # Update the configuration_id in env.json under the corresponding folder
    # which will be used in continuos-deployment.ipynb to create deployment automatically.
    update_json_file(f'{conf["executable_id"]}/env.json',"configuration_id", configuration.id)
    config_id = configuration.id

--------------byom-ollama-server--------------
Id: 61fb062e-4bdb-40d7-918b-ee48fac7d48b, Message: Configuration created
ollama/env.json updated. configuration_id: 61fb062e-4bdb-40d7-918b-ee48fac7d48b
--------------byom-local-ai-server--------------
Id: c972d7d5-4357-4aa0-9671-a9bcd4469538, Message: Configuration created
local-ai/env.json updated. configuration_id: c972d7d5-4357-4aa0-9671-a9bcd4469538
--------------byom-llama.cpp-server--------------
Id: 33a87ba9-48e1-431c-b963-282665428ca4, Message: Configuration created
llama.cpp/env.json updated. configuration_id: 33a87ba9-48e1-431c-b963-282665428ca4
--------------byom-vllm-server--------------
Id: 6a640f56-df21-487e-855b-5e1cde04bd9c, Message: Configuration created
vllm/env.json updated. configuration_id: 6a640f56-df21-487e-855b-5e1cde04bd9c
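
The loop above reads `name`, `scenario_id`, `executable_id`, and a list of `parameters` with `key` and `value` from each entry of `config["configurations"]`. A malformed entry therefore raises a `KeyError` partway through, after earlier configurations have already been created. As a minimal sketch, each entry can be checked up front; `check_configuration` is a hypothetical helper, and the configuration name and parameter in the sample are made up.

```python
REQUIRED_FIELDS = ("name", "scenario_id", "executable_id", "parameters")


def check_configuration(conf: dict) -> list:
    """Return a list of problems found in one entry of config["configurations"]."""
    problems = [f"missing field '{field}'" for field in REQUIRED_FIELDS if field not in conf]
    for index, binding in enumerate(conf.get("parameters", [])):
        for field in ("key", "value"):
            if field not in binding:
                problems.append(f"parameters[{index}] has no '{field}'")
    return problems


# Hypothetical entry for illustration: the second parameter lacks a value.
sample_conf = {
    "name": "ollama-conf",
    "scenario_id": "byom-ollama-server",
    "executable_id": "ollama",
    "parameters": [
        {"key": "paramA", "value": "1"},
        {"key": "paramB"},
    ],
}
for problem in check_configuration(sample_conf):
    print(problem)
```

Running the check over all entries before calling `client.configuration.create` avoids ending up with only some of the configurations created.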
