# Decanter AI Core SDK Jupyter Notebook Example
This example shows how to use the Python package decanter-ai-core-sdk, covering installation, API usage, and ways to chart an experiment's attributes and data values.

## Running Example

In [1]:
import os
from decanter import core
from decanter.core.core_api import TrainInput, PredictInput

### Create Context
Creating a Context sets up the connection to the Decanter Core server and creates an event loop. Since Jupyter already has a running event loop, the SDK simply uses the current one. See more [here](https://www.notion.so/API-615d2fba4e7f45c4b5fe63cc192e481f#bb4f0a4b2847450abc4f80b025469170).

In Jupyter, a running event loop already exists at startup.

In [2]:
import asyncio
loop = asyncio.get_running_loop()
loop.is_running()

True
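
The check above can be wrapped in a small helper that also works outside Jupyter, where no loop is running yet. This is a minimal sketch using only the standard `asyncio` module; the names `get_or_create_loop` and `uses_running_loop` are made up for illustration and are not part of the SDK.

```python
import asyncio


def get_or_create_loop():
    """Return the running event loop, or create a new one if none is running."""
    try:
        # Succeeds inside Jupyter (or any coroutine), where a loop is already running.
        return asyncio.get_running_loop()
    except RuntimeError:
        # Plain scripts have no running loop, so create one explicitly.
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        return loop


async def uses_running_loop():
    # Within a coroutine the helper should hand back the loop that is driving it.
    return get_or_create_loop() is asyncio.get_running_loop()
```

In a notebook cell the helper returns Jupyter's own loop; in a plain script it falls back to a freshly created one.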

Note: It is fine to recreate the context to change the username, password, or host.

In [3]:
# # enable default logger
# core.enable_default_logger()
# # set the username, password, host
# context = core.Context.create(
#         username='gp', password='gp-admin', host='http://192.168.2.12:2999')

client = core.CoreClient(
        username='gp', password='gp-admin', host='http://192.168.2.12:2999')
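
The cell above hard-codes the credentials. One possible alternative, sketched here, is to read them from environment variables and pass them to `core.CoreClient` as keyword arguments. The variable names (`DECANTER_SDK_USERNAME` and so on) and the helper `load_credentials` are hypothetical, chosen only for this sketch and not taken from the SDK.

```python
import os

# Hypothetical variable names chosen for this sketch; they are not defined by the SDK.
ENV_KEYS = {
    "username": "DECANTER_SDK_USERNAME",
    "password": "DECANTER_SDK_PASSWORD",
    "host": "DECANTER_SDK_HOST",
}


def load_credentials(env=None):
    """Collect the CoreClient keyword arguments from environment variables."""
    env = os.environ if env is None else env
    missing = [var for var in ENV_KEYS.values() if not env.get(var)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {arg: env[var] for arg, var in ENV_KEYS.items()}


# Usage in the notebook (requires the SDK and a reachable server):
# client = core.CoreClient(**load_credentials())
```

Keeping the lookup in one function makes a missing variable fail early with a clear message instead of surfacing later as a connection error.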

### Create Decanter Core Client
CoreClient handles calling the API and retrieving the results.

In [6]:
client = core.CoreClient()

Example of using `client.upload`, `client.train`, and `client.predict`.
While jobs are running, a progress bar shows the progress of each one.

In [5]:
# open train & test file
train_file_path = '/Users/matthewk/Desktop/Intern/decanter-ai-core-sdk/examples/data/train.csv'
test_file_path = '/Users/matthewk/Desktop/Intern/decanter-ai-core-sdk/examples/data/test.csv'
train_file = open(train_file_path , 'r')
test_file = open(test_file_path , 'r')

# upload data to corex 
train_data = client.upload(file=train_file, name="train_data")
test_data = client.upload(file=test_file, name="test_data")
# from decanter.core.jobs import DataUpload
# train_data = DataUpload.create(data_id = "{data_id}", name="train_data")


In [7]:
train_data

<decanter.core.jobs.data_upload.DataUpload at 0x7fd521e2ad68>

In [8]:
test_data.accessor['uri']

'hdfs://192.168.2.12:8020/data/5f607fb3ad2c960001f71109'

In [9]:
test_data.id

'5f607fb3ad2c960001f71109'

In [11]:
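# Note: the setup below uses test_data's uri and id, so the returned object
# (assigned to train_data) refers to the test file's contents.
train_data = client.setup(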
        data_source={
            'uri': test_data.accessor['uri'],
            'format': 'csv'
            },
        data_id=test_data.id,
        data_columns=[
            {
                'id': 'Survived',
                'data_type': 'categorical'
            }, {
                'id': 'Age',
                'data_type': 'numerical'
            }],
        name='mysetup')

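
The `data_columns` argument in the setup call above is a list of dictionaries with an `id` and a `data_type`. When many columns need a type, writing that list by hand gets verbose. The sketch below builds the same shape from a plain mapping; `build_data_columns` is a hypothetical helper written for this example, not an SDK function.

```python
def build_data_columns(column_types):
    """Turn {'column': 'type'} into the list-of-dicts shape used by data_columns."""
    return [
        {"id": column, "data_type": data_type}
        for column, data_type in column_types.items()
    ]


# Same two columns as in the setup call above.
data_columns = build_data_columns({"Survived": "categorical", "Age": "numerical"})
print(data_columns)
```

The resulting list can then be passed as `data_columns=data_columns` in the `client.setup` call.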

In [12]:
train_data

<decanter.core.jobs.data_upload.DataUpload at 0x7fd521f6da58>

In [13]:
train_input = TrainInput(data=train_data, target='Survived', algos=["XGBoost"], max_model=2, tolerance=0.9)

In [14]:
train_input

<decanter.core.core_api.train_input.TrainInput at 0x7fd521f84668>

In [15]:
train_input.data

<decanter.core.jobs.data_upload.DataUpload at 0x7fd521f6da58>

In [16]:
# set train parameters train model
exp = client.train(train_input=train_input, select_model_by='mean_per_class_error', name='myexp')

# set predict parameters and predict result
predict_input = PredictInput(data=test_data, experiment=exp)
pred_res = client.predict(predict_input=predict_input, name='mypred')


In [17]:
# Use get() to read attributes safely when the corresponding jobs may not be finished yet
exp.get(attr='attributes')
exp.best_model.get(attr='importances')
pred_res.get(attr='schema')

In [18]:
predict_input = PredictInput(data=test_data, experiment=exp)#, select_model='recommendation', select_opt='auc')

In [19]:
pred_res = client.predict(predict_input=predict_input, name='mypred')


In [20]:
pred_res.show_df()

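| Unnamed: 0 | Survived | prediction | 0 | 1 |
|---|---|---|---|---|
| 0 | 0 | 0 | 0.825907 | 0.174093 |
| 1 | 0 | 0 | 0.731170 | 0.268830 |
| 2 | 0 | 0 | 0.789807 | 0.210193 |
| 3 | 1 | 1 | 0.241964 | 0.758036 |
| 4 | 1 | 1 | 0.265757 | 0.734243 |
| ... | ... | ... | ... | ... |
| 173 | 0 | 0 | 0.648509 | 0.351491 |
| 174 | 1 | 1 | 0.239233 | 0.760767 |
| 175 | 0 | 0 | 0.426663 | 0.573337 |
| 176 | 1 | 0 | 0.578562 | 0.421438 |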
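
As a quick sanity check, the preview rows above can be compared by hand. The sketch below uses only the standard library and the nine rows that are visible in the truncated output, so it says nothing about the rows hidden behind `...`. Treating the `0` and `1` columns as per-class scores is an assumption based on their values, not something the output states.

```python
import csv
import io

# The nine rows visible in the truncated preview above; the "..." row is omitted,
# so this covers only part of the full prediction result.
PREVIEW = """\
Unnamed: 0,Survived,prediction,0,1
0,0,0,0.825907,0.174093
1,0,0,0.731170,0.268830
2,0,0,0.789807,0.210193
3,1,1,0.241964,0.758036
4,1,1,0.265757,0.734243
173,0,0,0.648509,0.351491
174,1,1,0.239233,0.760767
175,0,0,0.426663,0.573337
176,1,0,0.578562,0.421438
"""

rows = list(csv.DictReader(io.StringIO(PREVIEW)))

# Count rows where the predicted label equals the true label.
matches = sum(row["Survived"] == row["prediction"] for row in rows)

# Columns "0" and "1" are assumed to be per-class scores; compute their row sums.
score_sums = [float(row["0"]) + float(row["1"]) for row in rows]

print(f"{matches} of {len(rows)} preview rows match")
```

On these nine rows the predicted label matches `Survived` in eight cases, and the two score columns sum to 1 in every row.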


### Show Data
Use `data.show_df()` or `predict_result.show_df()` to get the data as a pandas DataFrame.
Use `data.show()` or `predict_result.show()` to display the data as text.

In [38]:
train_data.show_df()

| Unnamed: 0 | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 714 | 0 | 3 | Larsson, Mr. August Viktor | male | 29.0 | 0 | 0 | 7545 | 9.4833 | | S |
| 1 | 715 | 0 | 2 | Greenberg, Mr. Samuel | male | 52.0 | 0 | 0 | 250647 | 13.0000 | | S |
| 2 | 716 | 0 | 3 | Soholt, Mr. Peter Andreas Lauritz Andersen | male | 19.0 | 0 | 0 | 348124 | 7.6500 | F G73 | S |
| 3 | 717 | 1 | 1 | Endres, Miss. Caroline Louise | female | 38.0 | 0 | 0 | PC 17757 | 227.5250 | C45 | C |
| 4 | 718 | 1 | 2 | Troutt, Miss. Edwina Celia "Winnie" | female | 27.0 | 0 | 0 | 34218 | 10.5000 | E101 | S |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 173 | 887 | 0 | 2 | Montvila, Rev. Juozas | male | 27.0 | 0 | 0 | 211536 | 13.0000 | | S |
| 174 | 888 | 1 | 1 | Graham, Miss. Margaret Edith | female | 19.0 | 0 | 0 | 112053 | 30.0000 | B42 | S |
| 175 | 889 | 0 | 3 | Johnston, Miss. Catherine Helen "Carrie" | female | | 1 | 2 | W./C. 6607 | 23.4500 | | S |
| 176 | 890 | 1 | 1 | Behr, Mr. Karl Howell | male | 26.0 | 0 | 0 | 111369 | 30.0000 | C148 | C |


### Show Model Attributes
Pass the metric, the score types you wish to see (as a list), and the experiment to `core.plot.show_model_attr(metric, score_types, exp)`.

In [None]:
core.plot.show_model_attr(metric='mean_per_class_error', score_types=['validation', 'cv_averages'], exp=exp)

### Get Data Values with a Data Instance
Plot a chart from the values in a data instance.

In [None]:
import matplotlib.pyplot as plt
import numpy as np


labels = ['Survived', 'Dead']
labels_gender = ['male', 'female', 'male', 'female']


df = train_data.show_df()
df_s = df.loc[df["Survived"] == 1]
df_d = df.loc[df["Survived"] == 0]

sizes = [df_s.shape[0], df_d.shape[0]]

sizes_gender = []
for d in [df_s, df_d]:
    s1 = d.loc[d["Sex"] == "male"].shape[0]
    s2 = d.loc[d["Sex"] == "female"].shape[0]
    sizes_gender.append(s1)
    sizes_gender.append(s2)

    
colors = ['#99ff99', '#ff6666']
colors_gender = ['#87CEFA','#FFB6C1', '#87CEFA','#FFB6C1']
 
# Plot
plt.pie(sizes, labels=labels, colors=colors, startangle=90,frame=True)
plt.pie(sizes_gender,colors=colors_gender ,radius=0.75,startangle=90)
centre_circle = plt.Circle((0,0),0.5,color='black', fc='white',linewidth=0)
fig = plt.gcf()
fig.gca().add_artist(centre_circle)
 
plt.axis('equal')
plt.tight_layout()
plt.show()
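
The nested pie above only lines up if the inner ring follows the same order as the outer one: survived before dead, and male before female within each group. The sketch below reproduces that ordering with the standard library, using only the `(Survived, Sex)` pairs from the nine rows visible in the truncated `train_data.show_df()` preview; the variable `preview` is illustrative and the counts do not describe the full data.

```python
from collections import Counter

# (Survived, Sex) pairs copied from the nine rows visible in the truncated
# train_data.show_df() preview; the full data has more rows.
preview = [
    (0, "male"), (0, "male"), (0, "male"), (1, "female"), (1, "female"),
    (0, "male"), (1, "female"), (0, "female"), (1, "male"),
]

counts = Counter(preview)

# Outer ring: survived first, then dead, matching the order of `labels`.
sizes = [
    sum(n for (survived, _), n in counts.items() if survived == 1),
    sum(n for (survived, _), n in counts.items() if survived == 0),
]

# Inner ring: male then female within each outer slice, matching `labels_gender`.
sizes_gender = [
    counts[(survived, sex)] for survived in (1, 0) for sex in ("male", "female")
]

print(sizes, sizes_gender)
```

Each consecutive pair in `sizes_gender` sums to the matching entry in `sizes`, which is what keeps the inner slices aligned under their outer slice.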