# AutoML

* Select your experiment type: Classification, Regression or Time Series Forecasting
* Choose the data source and format, then fetch the data
* Choose your compute target: local or remote
* Configure the automated machine learning experiment settings
* Run an automated machine learning experiment
* Explore model metrics
* Register and deploy model

| Classification | Regression | Time Series Forecasting |
|---|---|---|
| Logistic Regression | Elastic Net | Elastic Net |
| Light GBM | Light GBM | Light GBM |
| Gradient Boosting | Gradient Boosting | Gradient Boosting |
| Decision Tree | Decision Tree | Decision Tree |
| K Nearest Neighbors | K Nearest Neighbors | K Nearest Neighbors |
| Linear SVC | LARS Lasso | LARS Lasso |
| C-Support Vector Classification (SVC) | Stochastic Gradient Descent (SGD) | Stochastic Gradient Descent (SGD) |
| Random Forest | Random Forest | Random Forest |
| Extremely Randomized Trees | Extremely Randomized Trees | Extremely Randomized Trees |
| Xgboost | Xgboost | Xgboost |
| DNN Classifier | DNN Regressor | DNN Regressor |
| DNN Linear Classifier | Linear Regressor | Linear Regressor |
| Naive Bayes |  |  |
| Stochastic Gradient Descent (SGD) |  |  |

* **Chapter 1: Local dataset for a classification problem**

## Training & tuning

>### Local

In [1]:
from azureml.core import Workspace
from azureml.core.authentication import InteractiveLoginAuthentication
from azureml.train.automl import AutoMLConfig
import azureml.dataprep as dprep

In [2]:
ws = Workspace(
            subscription_id='21868db0-87fa-4f03-bb0f-db7749ad7c9f',
            resource_group='Darwin-Dataanalysis-ML',
            workspace_name='wei'
        )

In [3]:
ws.get_details()

{'id': '/subscriptions/21868db0-87fa-4f03-bb0f-db7749ad7c9f/resourceGroups/Darwin-Dataanalysis-ML/providers/Microsoft.MachineLearningServices/workspaces/wei',
 'name': 'wei',
 'location': 'northeurope',
 'type': 'Microsoft.MachineLearningServices/workspaces',
 'tags': {},
 'workspaceid': '63a55e15-586b-41e2-88d1-a78eb29a2429',
 'description': '',
 'friendlyName': '',
 'creationTime': '2019-06-24T11:17:48.0404884+00:00',
 'containerRegistry': '/subscriptions/21868db0-87fa-4f03-bb0f-db7749ad7c9f/resourceGroups/Darwin-Dataanalysis-ML/providers/Microsoft.ContainerRegistry/registries/weic7e5d00d',
 'keyVault': '/subscriptions/21868db0-87fa-4f03-bb0f-db7749ad7c9f/resourcegroups/darwin-dataanalysis-ml/providers/microsoft.keyvault/vaults/wei6205928243',
 'applicationInsights': '/subscriptions/21868db0-87fa-4f03-bb0f-db7749ad7c9f/resourcegroups/darwin-dataanalysis-ml/providers/microsoft.insights/components/wei3487882946',
 'identityPrincipalId': '326a1b58-3711-44c8-ad79-a9b689d96ac2',
 'identit

In [12]:
import pandas as pd
from sklearn import datasets

data_train = datasets.load_digits()

pd.DataFrame(data_train.data[100:,:]).to_csv("data/X_train_classifieur.csv", index=False)
pd.DataFrame(data_train.target[100:]).to_csv("data/y_train_classifieur.csv", index=False)

In [4]:
ds = ws.get_default_datastore()
ds.upload(src_dir='./data', target_path='digitsdata', overwrite=True, show_progress=True)

Uploading an estimated of 4 files
Uploading ./data\.ipynb_checkpoints\X_train_classifieur-checkpoint.csv
Uploading ./data\.ipynb_checkpoints\y_train_classifieur-checkpoint.csv
Uploading ./data\X_train_classifieur.csv
Uploading ./data\y_train_classifieur.csv
Uploaded ./data\y_train_classifieur.csv, 1 files out of an estimated total of 4
Uploaded ./data\.ipynb_checkpoints\y_train_classifieur-checkpoint.csv, 2 files out of an estimated total of 4
Uploaded ./data\.ipynb_checkpoints\X_train_classifieur-checkpoint.csv, 3 files out of an estimated total of 4
Uploaded ./data\X_train_classifieur.csv, 4 files out of an estimated total of 4
Uploaded 4 files


$AZUREML_DATAREFERENCE_2279f27742b14d64a01fea19c600721a
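
The upload log above shows that `ds.upload(src_dir='./data', ...)` also pushed the Jupyter `.ipynb_checkpoints` copies of the CSV files. One way to avoid that is to build the file list explicitly. This is a minimal sketch: `list_csv_files` is a hypothetical helper (not part of the Azure ML SDK), and the commented call assumes the datastore's `upload_files` method is available in your SDK version.

```python
import os

def list_csv_files(src_dir):
    """Return paths of the CSV files in src_dir, skipping hidden folders
    such as .ipynb_checkpoints."""
    selected = []
    for root, dirs, files in os.walk(src_dir):
        # Prune hidden directories in place so os.walk does not descend into them
        dirs[:] = [d for d in dirs if not d.startswith('.')]
        for name in sorted(files):
            if name.endswith('.csv'):
                selected.append(os.path.join(root, name))
    return selected

# Assumed usage with the datastore from the cell above (not run here):
# ds.upload_files(files=list_csv_files('./data'),
#                 relative_root='./data',
#                 target_path='digitsdata',
#                 overwrite=True,
#                 show_progress=True)
```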

In [8]:
X = dprep.auto_read_file(path=ds.path('digitsdata/X_train_classifieur.csv'))
y = dprep.auto_read_file(path=ds.path('digitsdata/y_train_classifieur.csv'))


In [9]:
X_local = pd.read_csv('data/X_train_classifieur.csv')
y_local = pd.read_csv('data/y_train_classifieur.csv')

In [10]:
automl_config = AutoMLConfig(task = 'classification',
                                 debug_log = 'automl_errors.log',
                                 project_folder = './automl-classification',
                                 X = X,
                                 y = y,
                                 n_cross_validations=3,
                                 iteration_timeout_minutes=1,
                                 primary_metric='accuracy',
                                )

In [11]:
from azureml.core.experiment import Experiment

In [29]:
experiment = Experiment(ws, name='automl-classification')

In [30]:
run = experiment.submit(automl_config, show_output=True)



Running on local machine
Parent Run ID: AutoML_ea5434ca-7441-4f1e-9bdf-01768aaf84e5
Current status: DatasetCrossValidationSplit. Generating CV splits.
Current status: ModelSelection. Beginning model selection.

****************************************************************************************************
ITERATION: The iteration being evaluated.
PIPELINE: A summary description of the pipeline being evaluated.
DURATION: Time taken for the current iteration.
METRIC: The result of computing score on the fitted pipeline.
BEST: The best observed score thus far.
****************************************************************************************************

 ITERATION   PIPELINE                                       DURATION      METRIC      BEST
         0   StandardScalerWrapper SGD                      0:00:10       0.9476    0.9476
         1   StandardScalerWrapper SGD                      0:00:09       0.9458    0.9476
         2   MinMaxScaler SGD                           

In [32]:
best_run, fitted_model = run.get_output()



In [33]:
best_run

Experiment,Id,Type,Status,Details Page,Docs Page
automl-classification,AutoML_ea5434ca-7441-4f1e-9bdf-01768aaf84e5_40,,Completed,Link to Azure Portal,Link to Documentation


In [34]:
fitted_model

Pipeline(memory=None,
     steps=[('prefittedsoftvotingclassifier', PreFittedSoftVotingClassifier(classification_labels=None,
               estimators=[('12', Pipeline(memory=None,
     steps=[('TruncatedSVDWrapper', TruncatedSVDWrapper(n_components=0.45526315789473687, random_state=None)), ('SVCWrapper', SVCWrapper(C=232.99...
               flatten_transform=None,
               weights=[0.375, 0.25, 0.125, 0.125, 0.125]))])

In [54]:
pred_local = fitted_model.predict(X_local)

In [62]:
(y_local.values.reshape(-1) == pred_local).mean()
#pred_local

1.0

In [64]:
X_local.shape

(1697, 64)
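
The accuracy of 1.0 above is measured on `X_local`, the same 1697 rows the model was trained on, so it says little about how the model generalises. Because only rows 100 onward were written to the training CSVs, the first 100 digits were never seen during training and can serve as a small hold-out set. A minimal sketch, assuming the model exposes `predict`; `accuracy` and `holdout_accuracy` are hypothetical helper names.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the label."""
    y_true, y_pred = list(y_true), list(y_pred)
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty sample")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def holdout_accuracy(model, n_holdout=100):
    """Score a fitted model on the first n_holdout digits, which were left
    out of the training CSVs (those contain rows n_holdout onward)."""
    from sklearn import datasets  # imported lazily; needs scikit-learn
    digits = datasets.load_digits()
    X_test = digits.data[:n_holdout, :]
    y_test = digits.target[:n_holdout]
    return accuracy(y_test, model.predict(X_test))

# Assumed usage in the notebook (not run here):
# holdout_accuracy(fitted_model)
```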

>### Remote

In [71]:
from azureml.core.compute import AmlCompute
from azureml.core.compute import ComputeTarget

amlcompute_cluster_name = "automlcl"  # Name your cluster
provisioning_config = AmlCompute.provisioning_configuration(vm_size="STANDARD_D2_V2",
                                                            min_nodes=0,
                                                            max_nodes=4)
compute_target = ComputeTarget.create(
    ws, amlcompute_cluster_name, provisioning_config)
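
The cell above always asks for a new cluster and does not wait for provisioning to finish. A common pattern is to reuse the cluster when it already exists and otherwise create it and block until it is ready. The sketch below shows that pattern under the assumption that the same SDK v1 classes are installed; `get_or_create_cluster` is a hypothetical helper name, and its defaults mirror the values used above.

```python
def get_or_create_cluster(ws, name="automlcl", vm_size="STANDARD_D2_V2",
                          min_nodes=0, max_nodes=4):
    """Return the existing AmlCompute cluster called `name`, or create it."""
    # Imported lazily so this sketch can be loaded without the Azure ML SDK
    from azureml.core.compute import AmlCompute, ComputeTarget
    from azureml.core.compute_target import ComputeTargetException

    try:
        # Attaches to the cluster if it is already provisioned
        return ComputeTarget(workspace=ws, name=name)
    except ComputeTargetException:
        config = AmlCompute.provisioning_configuration(
            vm_size=vm_size, min_nodes=min_nodes, max_nodes=max_nodes)
        target = ComputeTarget.create(ws, name, config)
        # Block until provisioning has finished
        target.wait_for_completion(show_output=True)
        return target

# Assumed usage in the notebook (not run here):
# compute_target = get_or_create_cluster(ws)
```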


In [73]:
from azureml.core.runconfig import RunConfiguration
from azureml.core.conda_dependencies import CondaDependencies

run_config = RunConfiguration(framework="python")
run_config.target = compute_target
run_config.environment.docker.enabled = True

dependencies = CondaDependencies.create(
    pip_packages=["scikit-learn", "scipy", "numpy"])
run_config.environment.python.conda_dependencies = dependencies

In [77]:
automl_config = AutoMLConfig(
                             task='classification',
                             debug_log='automl_errors.log',
                             compute_target=compute_target,
                             run_configuration=run_config,
                             X = X,
                             y = y,
                             n_cross_validations=3,
                             iteration_timeout_minutes=1,
                             primary_metric='accuracy',
                             )

In [81]:
experiment = Experiment(ws, 'automl_remote')

In [82]:
run = experiment.submit(automl_config, show_output=True)

Running on remote compute: automlcl
Parent Run ID: AutoML_eb99ba56-0b42-4cd2-885d-07b9b535e8a2
Current status: ModelSelection. Beginning model selection.

****************************************************************************************************
ITERATION: The iteration being evaluated.
PIPELINE: A summary description of the pipeline being evaluated.
DURATION: Time taken for the current iteration.
METRIC: The result of computing score on the fitted pipeline.
BEST: The best observed score thus far.
****************************************************************************************************

 ITERATION   PIPELINE                                       DURATION      METRIC      BEST
         0   StandardScalerWrapper SGD                      0:00:53       0.9476    0.9476
         1   StandardScalerWrapper SGD                      0:00:45       0.9388    0.9476
         2   MinMaxScaler SGD                               0:00:53       0.9205    0.9476
         3   MinMaxSc

In [89]:
run

Experiment,Id,Type,Status,Details Page,Docs Page
automl_remote,AutoML_eb99ba56-0b42-4cd2-885d-07b9b535e8a2,automl,Completed,Link to Azure Portal,Link to Documentation


In [90]:
best_run, fitted_model = run.get_output()

In [92]:
fitted_model

Pipeline(memory=None,
     steps=[('prefittedsoftvotingclassifier', PreFittedSoftVotingClassifier(classification_labels=None,
               estimators=[('25', Pipeline(memory=None,
     steps=[('PCA', PCA(copy=True, iterated_power='auto', n_components=0.95, random_state=None,
  svd_solver='auto', tol=0.0, whiten=False)), ('S...666666666666, 0.16666666666666666, 0.16666666666666666, 0.16666666666666666, 0.16666666666666666]))])

## Deployment

>### Register

In [93]:
description = 'My AutoML classifier model'
model = run.register_model(description = description)

Registering model AutoMLeb99ba560best


In [94]:
model

Model(workspace=Workspace.create(name='wei', subscription_id='21868db0-87fa-4f03-bb0f-db7749ad7c9f', resource_group='Darwin-Dataanalysis-ML'), name=AutoMLeb99ba560best, id=AutoMLeb99ba560best:2, version=2, tags={}, properties={})

> To obtain the registered model by name

In [123]:
import os
import urllib.request
from azureml.core import Model

model_path = Model.get_model_path('AutoMLeb99ba560best')

In [124]:
model_path

'azureml-models\\AutoMLeb99ba560best\\2\\model.pkl'

> To download the model

In [121]:
model.download(target_dir='./registred_model/', exist_ok=True)

'registred_model\\model.pkl'

In [None]:
!pip install -U scikit-learn==0.21.3 --user

> To test the model locally

In [133]:
from sklearn.externals import joblib
model = joblib.load('registred_model\\model.pkl')
model.predict(X_local)

AttributeError: 'SVC' object has no attribute '_impl'
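
An `AttributeError` raised by a freshly unpickled estimator, like the one above, is often a sign that the local scikit-learn version differs from the one that trained and pickled the model. That is an assumption about this case; the notebook does not record the training version. A minimal sketch for checking the installed version before loading the pickle, using only the standard library (Python 3.8+); `installed_version` and `matches_expected` are hypothetical helper names.

```python
from importlib import metadata  # Python 3.8+

def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

def matches_expected(package, expected):
    """True only when `package` is installed at exactly the `expected` version."""
    return installed_version(package) == expected

# Assumed usage; replace the version with the one used by the training run:
# if not matches_expected('scikit-learn', '0.21.3'):
#     print('scikit-learn mismatch:', installed_version('scikit-learn'))
```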

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,54,55,56,57,58,59,60,61,62,63
0,0.00,0.00,0.00,2.00,13.00,0.00,0.00,0.00,0.00,0.00,...,0.00,0.00,0.00,0.00,0.00,0.00,13.00,6.00,0.00,0.00
1,0.00,0.00,1.00,12.00,5.00,0.00,0.00,0.00,0.00,0.00,...,6.00,0.00,0.00,0.00,3.00,10.00,16.00,12.00,1.00,0.00
2,0.00,0.00,12.00,16.00,16.00,8.00,0.00,0.00,0.00,3.00,...,0.00,0.00,0.00,0.00,11.00,16.00,12.00,0.00,0.00,0.00
3,0.00,4.00,13.00,16.00,16.00,12.00,3.00,0.00,0.00,3.00,...,0.00,0.00,0.00,3.00,15.00,12.00,2.00,0.00,0.00,0.00
4,0.00,0.00,0.00,8.00,14.00,4.00,0.00,0.00,0.00,0.00,...,0.00,0.00,0.00,0.00,0.00,10.00,13.00,8.00,0.00,0.00
5,0.00,0.00,9.00,15.00,5.00,0.00,0.00,0.00,0.00,0.00,...,15.00,0.00,0.00,0.00,5.00,12.00,12.00,9.00,1.00,0.00
6,0.00,0.00,0.00,5.00,11.00,1.00,0.00,0.00,0.00,0.00,...,6.00,0.00,0.00,0.00,0.00,6.00,14.00,16.00,8.00,0.00
7,0.00,0.00,0.00,0.00,6.00,10.00,0.00,0.00,0.00,0.00,...,0.00,0.00,0.00,0.00,0.00,0.00,8.00,15.00,0.00,0.00
8,0.00,0.00,2.00,11.00,16.00,4.00,0.00,0.00,0.00,0.00,...,0.00,0.00,0.00,0.00,0.00,12.00,7.00,0.00,0.00,0.00
9,0.00,0.00,15.00,16.00,16.00,12.00,2.00,0.00,0.00,2.00,...,0.00,0.00,0.00,1.00,15.00,16.00,15.00,3.00,0.00,0.00


>### Scoring file

In [125]:
%%writefile score.py
import json
import numpy as np
# Note: sklearn.externals.joblib was removed in scikit-learn 0.23;
# on newer versions use `import joblib` instead.
from sklearn.externals import joblib
from azureml.core.model import Model

def init():
    # Load the registered model once, when the service starts
    global model

    model_path = Model.get_model_path('AutoMLeb99ba560best')
    model = joblib.load(model_path)

def run(raw_data):
    # The payload is a JSON string of the form {"data": [[...], [...]]}
    data = np.array(json.loads(raw_data)['data'])
    # Make predictions
    predictions = model.predict(data)

    # You can return any data type as long as it is JSON-serializable
    return {'predictions': predictions.tolist()}

Writing score.py


>### ACI

In [110]:
from azureml.core.webservice import AciWebservice, Webservice
from azureml.core.model import Model

deployment_config = AciWebservice.deploy_configuration(cpu_cores = 1, memory_gb = 1)
# service = Model.deploy(ws, "aciservice", [model], inference_config, deployment_config)
# service.wait_for_deployment(show_output = True)
# print(service.state)

In [111]:
AciWebservice.deploy_configuration

<function azureml.core.webservice.aci.AciWebservice.deploy_configuration(cpu_cores=None, memory_gb=None, tags=None, properties=None, description=None, location=None, auth_enabled=None, ssl_enabled=None, enable_app_insights=None, ssl_cert_pem_file=None, ssl_key_pem_file=None, ssl_cname=None, dns_name_label=None)>

In [112]:
from azureml.core.model import InferenceConfig
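
The commented-out `Model.deploy` call earlier needs an `inference_config`, which the notebook imports but never builds. The sketch below shows one way the pieces could fit together, assuming an SDK version in which `InferenceConfig` accepts an `environment` argument. The service name comes from the commented line; the environment name `automl-digits-env` and the pip package list are illustrative assumptions and should match whatever the training run actually used.

```python
# Resource settings copied from the deploy_configuration cell above
ACI_SETTINGS = {"cpu_cores": 1, "memory_gb": 1}

def deploy_to_aci(ws, model, service_name="aciservice", entry_script="score.py"):
    """Deploy a registered model to Azure Container Instances (sketch)."""
    # Imported lazily so this sketch can be loaded without the Azure ML SDK
    from azureml.core import Environment
    from azureml.core.conda_dependencies import CondaDependencies
    from azureml.core.model import InferenceConfig, Model
    from azureml.core.webservice import AciWebservice

    # Hypothetical environment name and package list; adjust to the training setup
    env = Environment(name="automl-digits-env")
    env.python.conda_dependencies = CondaDependencies.create(
        pip_packages=["azureml-defaults", "azureml-train-automl",
                      "scikit-learn", "numpy"])

    inference_config = InferenceConfig(entry_script=entry_script, environment=env)
    deployment_config = AciWebservice.deploy_configuration(**ACI_SETTINGS)

    service = Model.deploy(ws, service_name, [model],
                           inference_config, deployment_config)
    # Block until the container is up, then report its state
    service.wait_for_deployment(show_output=True)
    print(service.state)
    return service

# Assumed usage in the notebook (not run here):
# service = deploy_to_aci(ws, model)
```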

In [98]:
from azureml.core.webservice import AciWebservice, AksWebservice, LocalWebservice

In [106]:
import json
test_sample = json.dumps({'data': [
    list(X_local.iloc[0,:]), 
    list(X_local.iloc[4,:]), 
]})

In [107]:
test_sample

'{"data": [[0.0, 0.0, 0.0, 2.0, 13.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 8.0, 15.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, 16.0, 5.0, 2.0, 0.0, 0.0, 0.0, 0.0, 15.0, 12.0, 1.0, 16.0, 4.0, 0.0, 0.0, 4.0, 16.0, 2.0, 9.0, 16.0, 8.0, 0.0, 0.0, 0.0, 10.0, 14.0, 16.0, 16.0, 4.0, 0.0, 0.0, 0.0, 0.0, 0.0, 13.0, 8.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 13.0, 6.0, 0.0, 0.0], [0.0, 0.0, 0.0, 8.0, 14.0, 4.0, 0.0, 0.0, 0.0, 0.0, 7.0, 16.0, 7.0, 0.0, 0.0, 0.0, 0.0, 0.0, 14.0, 10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 16.0, 6.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0, 16.0, 16.0, 10.0, 0.0, 0.0, 0.0, 0.0, 2.0, 16.0, 12.0, 14.0, 6.0, 0.0, 0.0, 0.0, 0.0, 12.0, 15.0, 11.0, 10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 10.0, 13.0, 8.0, 0.0, 0.0]]}'
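
The payload above wraps the feature rows under a `data` key, so the scoring side has to unwrap that key before predicting. The following is a minimal local sketch of that round trip with a stand-in model, useful for checking the payload shape without a deployed service; `decode_payload`, `score`, and `ConstantModel` are hypothetical names, and the three-feature rows are shortened for illustration (the real rows have 64 features).

```python
import json

def decode_payload(raw_data):
    """Parse a '{"data": [[...], ...]}' JSON string into a list of feature rows."""
    rows = json.loads(raw_data)["data"]
    if not rows or any(len(row) != len(rows[0]) for row in rows):
        raise ValueError("payload must contain equally sized, non-empty rows")
    return rows

class ConstantModel:
    """Stand-in for the fitted pipeline: predicts 0 for every row."""
    def predict(self, rows):
        return [0 for _ in rows]

def score(raw_data, model):
    """Mimic the web service entry point: decode, predict, return a JSON-ready dict."""
    return {"predictions": list(model.predict(decode_payload(raw_data)))}

sample = json.dumps({"data": [[0.0, 2.0, 13.0], [8.0, 14.0, 4.0]]})
print(score(sample, ConstantModel()))  # prints {'predictions': [0, 0]}
```

Once a service is deployed, the same JSON string can be sent with `service.run(input_data=test_sample)`.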