# Sample notebook: end-to-end ML flow using the FedML DSP Library in an NVIDIA GPU notebook

## The FedML DSP Library reads the training data from SAP Datasphere, trains the model, deploys it to SAP AI Core, and writes the inference result back to SAP Datasphere.

## Install the fedml_dsp library

In [None]:
%pip install fedml-dsp

In [None]:
from fedml_dsp import DbConnection, Fedml
import cudf, cuml, cupy
import json

## 1. Connect to SAP Datasphere, Explore & Acquire Data

### 1.1 Create a DbConnection instance to get data from SAP Datasphere

In [None]:
with open('config.json', 'r') as f:
    config = json.load(f)
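
When checking that `config.json` loaded as expected, it is safer not to echo credentials into the notebook output. The helper below is a minimal sketch and is not part of `fedml_dsp`; the name `redact`, the list of sensitive key fragments, and the example values are all assumptions made for illustration.

```python
import json

# Hypothetical helper (not part of fedml_dsp): mask values whose key names
# look sensitive so a loaded configuration can be inspected safely.
SENSITIVE_FRAGMENTS = ("password", "secret", "token", "key")

def redact(value, fragments=SENSITIVE_FRAGMENTS):
    """Return a copy of a JSON-like structure with sensitive values masked."""
    if isinstance(value, dict):
        return {
            k: "***" if any(f in str(k).lower() for f in fragments)
            else redact(v, fragments)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(v, fragments) for v in value]
    return value

# Made-up values for illustration; in the notebook you would pass `config`.
example = {
    "address": "example-host",
    "port": "443",
    "user": "demo",
    "password": "not-a-real-password",
}
print(json.dumps(redact(example), indent=2))
```

The original dictionary is left unchanged, so the real values remain available to the connection code.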

In [None]:
db = DbConnection()

### 1.2 Query the SAP Datasphere data using SQL queries and get the data as a cuDF DataFrame

In [None]:
%%time
data = db.get_data_with_headers_cudf('SMALLCOVTYPE_VIEW', 1)

In [None]:
type(data)

In [None]:
data.head(5)

In [None]:
data.info()

### 1.3 Preprocess the data

In [None]:
def preprocess_data(concat_data):
    
    # separate the target column
    label = concat_data['cover_type']
    
    # drop the target from the features and cast them to float64
    df_X = concat_data.drop(['cover_type'], axis=1)
    df_X = df_X.astype('float64')
    
    return df_X, label

In [None]:
df_X, label = preprocess_data(data)

## 2. Train the model on the acquired data

In [None]:
x_train, x_test, y_train, y_test = cuml.train_test_split(df_X, label, train_size=0.8)
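
To make the effect of `train_size=0.8` concrete, the following is a minimal pure-Python sketch of a shuffled 80/20 split over row indices. It only illustrates the idea and is not cuML's implementation; the function name `split_indices` and the `seed` parameter are hypothetical, and cuML's rounding and shuffling may differ.

```python
import random

def split_indices(n_rows, train_size=0.8, seed=0):
    """Shuffle row indices and cut them into train and test index lists."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)   # seeded so the split is repeatable
    n_train = int(n_rows * train_size)     # number of rows kept for training
    return indices[:n_train], indices[n_train:]

train_idx, test_idx = split_indices(10, train_size=0.8, seed=42)
print(len(train_idx), len(test_idx))
```

Every row lands in exactly one of the two lists, so the test rows are never seen during training.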

### 2.1 Train the LogisticRegression model using the fit method

In [None]:
from cuml import LogisticRegression

In [None]:
model = LogisticRegression().fit(x_train, y_train)

### 2.2 Save the model

In [None]:
import joblib

In [None]:
# save
joblib.dump(model, 'LR_model.pkl')

## 3. Deploy the model to SAP AI Core

For more details on how to deploy a machine learning model to SAP AI Core, and for a YAML file template, see the [FedML DSP documentation](https://github.tools.sap/btp-use-case-factory/FedML/blob/main/DSP/fedml-dsp.md).

Prerequisite for the cells below: you have containerized your model for deployment and hosted the image in a container registry (Docker Hub, ECR, etc.).
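
The serving code inside the container is not shown in this notebook. As a rough sketch of what its request handling might look like, the function below is consistent with how the notebook calls the deployment: section 4 sends the test rows as JSON records, and section 5.1 reads a `prediction` field from the response. The names `handle_inference` and `dummy_predict` are hypothetical, and a real container would wrap this logic in a web framework and load the saved model.

```python
import json

def handle_inference(body, predict_fn):
    """Parse a JSON 'records' payload, run predict_fn, and build the JSON reply."""
    records = json.loads(body)              # expected: list of {column: value} dicts
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of row objects")
    predictions = predict_fn(records)
    return json.dumps({"prediction": list(predictions)})

# Stand-in for a real model: predicts class 1 for every row.
def dummy_predict(rows):
    return [1 for _ in rows]

reply = handle_inference('[{"elevation": 2596.0}, {"elevation": 2590.0}]', dummy_predict)
print(reply)
```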

### 3.1 Create an AI Core Service Key

You can create an AI Core service key by following the steps [here](https://help.sap.com/docs/sap-ai-core/sap-ai-core-service-guide/create-service-key) and [here](https://developers.sap.com/tutorials/ai-core-setup.html).

In [None]:
fedml = Fedml(aic_service_key='aic_service_key.json')

### 3.2 Onboard AI Core resources

You need to onboard the AI Core resources you will use. These include your GitHub repository, your AI Core resource group, and a secret that gives AI Core pull permissions for your Docker registry.

In [None]:
fedml.onboard_ai_core(create_resource_group=False,
                    resource_group="<your resource group>", 
                    onboard_new_repo=False,
                    github_info_path="github_info.json",
                    secret_path="secret.json")
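
Because the onboarding call depends on several local JSON files, a quick pre-flight check can catch a missing or malformed file before any request is sent. The sketch below only verifies that each file exists and parses as JSON; it makes no assumption about the fields inside, and the helper name `check_json_files` is hypothetical.

```python
import json
from pathlib import Path

def check_json_files(paths):
    """Map each path to None if it parses as JSON, otherwise to an error message."""
    results = {}
    for p in paths:
        path = Path(p)
        if not path.is_file():
            results[p] = "file not found"
            continue
        try:
            json.loads(path.read_text(encoding="utf-8"))
            results[p] = None
        except json.JSONDecodeError as exc:
            results[p] = f"invalid JSON: {exc.msg}"
    return results

# File names taken from the cells in this notebook.
files = ["aic_service_key.json", "github_info.json", "secret.json"]
for name, problem in check_json_files(files).items():
    print(name, "OK" if problem is None else problem)
```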

### 3.3 Register application

Register the application you want to use in AI Core. This step is only needed for a new application; if you are deploying with an existing AI Core application, skip it.

In [None]:
application_details = {
    "application_name": "<your application name>",
    "revision":"HEAD",
    "repository_url": "https://github.com/username/repo_name", # Change this
    "path": "deployment"}

In [None]:
fedml.register_application(application_details=application_details)

### 3.4 Deploy to AI Core

In [None]:
deployment_config = {
    "name": "<application name>", 
    "resource_group": "<resource group name>", 
    "scenario_id": "<scenario id>", 
    "executable_id": "<executable id>"
}
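
The values above are template placeholders and must be replaced before deploying. A small check like the following sketch can flag any that were left unfilled; the helper name `unfilled_placeholders` and the example values are hypothetical.

```python
import re

# Matches values that consist entirely of an angle-bracket placeholder.
PLACEHOLDER = re.compile(r"^<[^<>]+>$")

def unfilled_placeholders(config):
    """Return the keys whose values still look like '<...>' template placeholders."""
    return [
        key for key, value in config.items()
        if isinstance(value, str) and PLACEHOLDER.match(value.strip())
    ]

example_config = {
    "name": "<application name>",
    "resource_group": "default",        # made-up value for illustration
    "scenario_id": "<scenario id>",
    "executable_id": "my-executable",   # made-up value for illustration
}
print(unfilled_placeholders(example_config))
```

In the notebook, the same function could be applied to `deployment_config` and the deployment skipped while the returned list is non-empty.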

In [None]:
endpoint = fedml.ai_core_deploy(deployment_config=deployment_config)

## 4. Run inference on the deployed model by passing the test data

In [None]:
headers = {"Authorization":fedml.get_ai_core_token(),
           "ai-resource-group": "<resource group name>",
           "Content-Type": "text/csv"}

In [None]:
response_data = fedml.ai_core_inference(endpoint=endpoint,headers=headers,body=x_test.to_json(orient='records'))
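
The request body is produced with `to_json(orient='records')`, which serializes the frame as a JSON array with one object per row. The sketch below rebuilds that structure with the standard library to show its shape; the column subset and values are made up, and whitespace may differ from what the DataFrame method emits.

```python
import json

columns = ["elevation", "aspect", "slope"]
rows = [
    [2596.0, 51.0, 3.0],   # made-up values for illustration
    [2590.0, 56.0, 2.0],
]

# One {column: value} object per row, as in the 'records' orientation.
records = [dict(zip(columns, row)) for row in rows]
body = json.dumps(records)
print(body)
```

Since this payload is JSON, `application/json` is the conventional media type for it; whether the deployed container inspects the `Content-Type` header is not shown in this notebook.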

In [None]:
res = response_data.json()  # parse the JSON response body into a dict

In [None]:
print(res)

## 5. Store the inference result in SAP Datasphere

### 5.1 Store the inference result in a DataFrame

In [None]:
dwc_data = x_test.iloc[:,:-1]

In [None]:
dwc_data = dwc_data.assign(cover_type = res['prediction'])
dwc_data
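
Assigning the predictions as a new column only works if the response contains exactly one prediction per row. The following sketch shows one way to verify this before the assignment, assuming the response is a dictionary with a `prediction` list as used above; the helper name `extract_predictions` is hypothetical.

```python
def extract_predictions(result, expected_rows):
    """Return result['prediction'] after checking its type and length."""
    if "prediction" not in result:
        raise KeyError("response has no 'prediction' field")
    predictions = result["prediction"]
    if not isinstance(predictions, list):
        raise TypeError("'prediction' should be a list")
    if len(predictions) != expected_rows:
        raise ValueError(
            f"expected {expected_rows} predictions, got {len(predictions)}"
        )
    return predictions

# Made-up response for illustration.
print(extract_predictions({"prediction": [2, 1, 2]}, expected_rows=3))
```

In the notebook this could be called as `extract_predictions(res, len(dwc_data))`.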

### 5.2 Create a table in Datasphere for storing the inference result

In [None]:
db.create_table("CREATE TABLE Log_Reg_Model (elevation INTEGER PRIMARY KEY, aspect INTEGER, slope INTEGER,\
    horizontal_distance_to_hydrology INTEGER, vertical_distance_to_hydrology INTEGER, \
    horizontal_distance_to_roadways INTEGER, hillshade_9am INTEGER, hillshade_noon INTEGER, hillshade_3pm INTEGER,\
    horizontal_distance_to_fire_points INTEGER, wilderness_area_1 INTEGER, wilderness_area_2 INTEGER,\
    wilderness_area_3 INTEGER, wilderness_area_4 INTEGER, soil_type_1 bool, soil_type_2 bool, soil_type_3 bool,\
    soil_type_4 bool, soil_type_5 bool, soil_type_6 bool, soil_type_7 bool, soil_type_8 bool, soil_type_9 bool,\
    soil_type_10 bool, soil_type_11 bool, soil_type_12 bool, soil_type_13 bool, soil_type_14 bool, soil_type_15 bool,\
    soil_type_16 bool, soil_type_17 bool, soil_type_18 bool, soil_type_19 bool, soil_type_20 bool, soil_type_21 bool,\
    soil_type_22 bool, soil_type_23 bool, soil_type_24 bool, soil_type_25 bool, soil_type_26 bool, soil_type_27 bool,\
    soil_type_28 bool, soil_type_29 bool, soil_type_30 bool, soil_type_31 bool, soil_type_32 bool, soil_type_33 bool,\
    soil_type_34 bool, soil_type_35 bool, soil_type_36 bool, soil_type_37 bool, soil_type_38 bool, soil_type_39 bool,\
    soil_type_40 bool, cover_type INTEGER)")
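
Typing 40 near-identical `soil_type_N` columns by hand is error-prone. As an alternative, the statement can be assembled programmatically. This is a sketch that mirrors the column names and types used in the cell above; the function name `build_create_table` is hypothetical.

```python
def build_create_table(table_name):
    """Assemble the CREATE TABLE statement for the cover-type result table."""
    integer_cols = [
        "elevation", "aspect", "slope",
        "horizontal_distance_to_hydrology", "vertical_distance_to_hydrology",
        "horizontal_distance_to_roadways",
        "hillshade_9am", "hillshade_noon", "hillshade_3pm",
        "horizontal_distance_to_fire_points",
    ]
    integer_cols += [f"wilderness_area_{i}" for i in range(1, 5)]
    bool_cols = [f"soil_type_{i}" for i in range(1, 41)]

    defs = [f"{col} INTEGER" for col in integer_cols]
    defs[0] += " PRIMARY KEY"   # elevation is the primary key, as in the cell above
    defs += [f"{col} bool" for col in bool_cols]
    defs.append("cover_type INTEGER")
    return f"CREATE TABLE {table_name} ({', '.join(defs)})"

ddl = build_create_table("Log_Reg_Model")
print(ddl)
```

The resulting string could then be passed to `db.create_table(ddl)` in place of the hand-written statement.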

In [None]:
db.insert_into_table('Log_Reg_Model', dwc_data)