# Automated ML

TODO: Import Dependencies. In the cell below, import all the dependencies that you will need to complete the project.

In [1]:
from azureml.data.dataset_factory import TabularDatasetFactory
from azureml.core import Workspace, Experiment, Dataset, Model, Environment
from azureml.train.automl import AutoMLConfig
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.model import InferenceConfig
from azureml.core.webservice import AciWebservice


from sklearn.model_selection import train_test_split
import pandas as pd
import os
import json
import joblib
import sklearn

## Dataset

### Overview
TODO: In this markdown cell, give an overview of the dataset you are using. Also mention the task you will be performing.

### Kaggle - Housing Prices Competition for Kaggle Learn Users
We will be using the "Housing Prices Competition for Kaggle Learn Users" training and test datasets for this capstone project.

This is a regression competition in which competitors try to predict the price of the houses in the test dataset using the training dataset.

The original dataset was first published by Dean De Cock in his paper [Ames, Iowa: Alternative to the Boston Housing Data as an End of Semester Regression Project](https://www.researchgate.net/publication/267976209_Ames_Iowa_Alternative_to_the_Boston_Housing_Data_as_an_End_of_Semester_Regression_Project), which appeared in the Journal of Statistics Education (November 2011).

For competition purposes, nearly all of the data has been divided into two parts: a "training dataset" and a "test dataset". We will use the training dataset for training and the test dataset for submission to the competition.

TODO: Get data. In the cell below, write code to access the data you will be using in this project. Remember that the dataset needs to be external.

In [2]:
ws = Workspace.from_config()

# choose a name for experiment
experiment_name = 'automl_experiment'

experiment=Experiment(ws, experiment_name)

print('Workspace name: ' + ws.name, 
      'Azure region: ' + ws.location, 
      'Subscription id: ' + ws.subscription_id, 
      'Resource group: ' + ws.resource_group, sep = '\n')

Performing interactive authentication. Please follow the instructions on the terminal.
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code RYT78Q4EY to authenticate.
You have logged in. Now let us find all the subscriptions to which you have access...
Interactive authentication successfully completed.
Workspace name: quick-starts-ws-138013
Azure region: southcentralus
Subscription id: 48a74bb7-9950-4cc1-9caa-5d50f995cc55
Resource group: aml-quickstarts-138013


In [3]:
# https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.compute.amlcompute(class)?view=azure-ml-py#provisioning-configuration-vm-size-----vm-priority--dedicated---min-nodes-0--max-nodes-none--idle-seconds-before-scaledown-none--admin-username-none--admin-user-password-none--admin-user-ssh-key-none--vnet-resourcegroup-name-none--vnet-name-none--subnet-name-none--tags-none--description-none--remote-login-port-public-access--notspecified--

# https://docs.microsoft.com/en-us/azure/machine-learning/how-to-tune-hyperparameters
# https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/ml-frameworks/scikit-learn/train-hyperparameter-tune-deploy-with-sklearn/train-hyperparameter-tune-deploy-with-sklearn.ipynb


# Choose a name for your CPU cluster
cpu_cluster_name = "cpu-cluster"

# Verify that cluster does not exist already
try:
    cpu_cluster = ComputeTarget(workspace=ws, name=cpu_cluster_name)
    print('Found existing cluster, use it.')
except ComputeTargetException:
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2',
                                                              max_nodes=4)
    cpu_cluster = ComputeTarget.create(ws, cpu_cluster_name, compute_config)

cpu_cluster.wait_for_completion(show_output=True)

Creating
Succeeded
AmlCompute wait for completion finished

Minimum number of nodes requested have been provisioned


In [4]:
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

#global X_train, X_test, y_train, y_test

def clean_data(data, test):

    # Convert dataset to pandas dataframe
    X = data.to_pandas_dataframe()
    X_test = test.to_pandas_dataframe()
    # Set Id to index
    X.set_index('Id',inplace=True)
    X_test.set_index('Id',inplace=True)
    # Remove rows with missing target, separate target from predictors
    X.dropna(axis=0, subset=['SalePrice'], inplace=True)
    y = X.SalePrice 
    # Remove target and 'Utilities' 
    X.drop(['SalePrice', 'Utilities'], axis=1, inplace=True)
    X_test.drop(['Utilities'], axis=1, inplace=True)
    # Split the data
    X_train, X_valid, y_train, y_valid = train_test_split(X,y)
    # Select object columns
    categorical_cols = [cname for cname in X_train.columns if X_train[cname].dtype == "object"]
    # Select numeric columns
    numerical_cols = [cname for cname in X_train.columns if X_train[cname].dtype in ['int64','float64']]

    # Imputation lists
    # Numerical columns whose missing values are imputed with a constant
    constant_num_cols = ['GarageYrBlt', 'MasVnrArea']
    # Numerical columns whose missing values are imputed with the mean
    mean_num_cols = list(set(numerical_cols).difference(set(constant_num_cols)))
    # Categorical columns whose missing values are imputed with a constant
    constant_categorical_cols = ['Alley', 'MasVnrType', 'BsmtQual', 'BsmtCond','BsmtExposure', 'BsmtFinType1', 'BsmtFinType2', 'FireplaceQu', 'GarageType', 'GarageFinish', 'GarageQual', 'GarageCond', 'PoolQC', 'Fence', 'MiscFeature']
    # Categorical columns whose missing values are imputed with the most frequent value
    mf_categorical_cols = list(set(categorical_cols).difference(set(constant_categorical_cols)))

    my_cols = constant_num_cols + mean_num_cols + constant_categorical_cols + mf_categorical_cols

    # Define transformers
    # Preprocessing for numerical data - mean
    numerical_transformer_m = Pipeline(steps=[('imputer', SimpleImputer(strategy='mean')),('scaler', StandardScaler())])
    # Preprocessing for numerical data - constant
    numerical_transformer_c = Pipeline(steps=[('imputer', SimpleImputer(strategy='constant', fill_value=0)),('scaler', StandardScaler())])

    # Preprocessing for categorical data for most frequent
    categorical_transformer_mf = Pipeline(steps=[('imputer', SimpleImputer(strategy='most_frequent')), ('onehot', OneHotEncoder(handle_unknown = 'ignore', sparse = False))])
    # Preprocessing for categorical data for constant
    categorical_transformer_c = Pipeline(steps=[('imputer', SimpleImputer(strategy='constant', fill_value='NA')), ('onehot', OneHotEncoder(handle_unknown = 'ignore', sparse = False))])

    # Bundle preprocessing for numerical and categorical data
    preprocessor = ColumnTransformer(transformers=[
        ('num_mean', numerical_transformer_m, mean_num_cols),
        ('num_constant', numerical_transformer_c, constant_num_cols),
        ('cat_mf', categorical_transformer_mf, mf_categorical_cols),
        ('cat_c', categorical_transformer_c, constant_categorical_cols)])

    # Transform data
    X_train = preprocessor.fit_transform(X_train)
    X_valid = preprocessor.transform(X_valid)
    X_test = preprocessor.transform(X_test)
    
    
    # Concat datasets
    # https://stackoverflow.com/questions/41989950/numpy-array-concatenate-valueerror-all-the-input-arrays-must-have-same-number
    # Convert the target Series to a NumPy array before adding the column axis
    train_data = np.concatenate([X_train, y_train.to_numpy()[:, None]], axis=1)
    valid_data = np.concatenate([X_valid, y_valid.to_numpy()[:, None]], axis=1)
    
    
    # Return data
    return train_data, valid_data, X_test
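
To see on a small scale what the preprocessing in `clean_data` does, the sketch below applies the same kind of `ColumnTransformer` to a hypothetical three-column toy frame. The values are made up; only the column names are borrowed from the housing data. It assumes scikit-learn, pandas and NumPy are installed, and it uses `sparse_threshold=0` to obtain a dense array, which is a convenience for the sketch rather than what the function above does.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data; NaN marks a missing value.
toy = pd.DataFrame({
    "LotArea": [8450.0, 9600.0, np.nan, 11250.0],
    "GarageYrBlt": [2003.0, np.nan, 2001.0, 1998.0],
    "Alley": pd.Series(["Grvl", np.nan, np.nan, "Pave"], dtype=object),
})

# One pipeline per imputation strategy, mirroring the idea used in clean_data.
num_mean = Pipeline([("imputer", SimpleImputer(strategy="mean")),
                     ("scaler", StandardScaler())])
num_const = Pipeline([("imputer", SimpleImputer(strategy="constant", fill_value=0)),
                      ("scaler", StandardScaler())])
cat_const = Pipeline([("imputer", SimpleImputer(strategy="constant", fill_value="NA")),
                      ("onehot", OneHotEncoder(handle_unknown="ignore"))])

toy_preprocessor = ColumnTransformer(
    transformers=[("num_mean", num_mean, ["LotArea"]),
                  ("num_constant", num_const, ["GarageYrBlt"]),
                  ("cat_c", cat_const, ["Alley"])],
    sparse_threshold=0,  # always return a dense NumPy array
)

toy_out = toy_preprocessor.fit_transform(toy)
print(toy_out.shape)  # 2 numeric columns + 3 one-hot columns (Grvl, NA, Pave)
```

The mean-imputed entry ends up at zero after scaling, because the imputed value equals the column mean that `StandardScaler` subtracts.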

In [5]:
# Get the dataset
ds_train = Dataset.get_by_name(ws, name='Housing Prices Dataset')
ds_test = Dataset.get_by_name(ws, name='Housing Prices Test Dataset')

# Use the clean_data function to clean your data.
train_data, valid_data, test_data = clean_data(ds_train, ds_test)
print(train_data.shape)
print(valid_data.shape)
print(test_data.shape)

(1095, 399)
(365, 399)
(1459, 398)


In [7]:
train_data[1,398]

324000.0
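
The value above is a sale price: `clean_data` appends the target as the last column, so index 398 holds `SalePrice` and columns 0 to 397 hold the features. A minimal NumPy sketch of that layout, using made-up numbers:

```python
import numpy as np

# Hypothetical feature matrix (3 rows, 2 features) and target vector.
features = np.array([[0.1, 1.0],
                     [0.2, 0.0],
                     [0.3, 1.0]])
target = np.array([208500.0, 324000.0, 181500.0])

# target[:, None] reshapes (3,) into (3, 1) so it can be appended as a column.
combined = np.concatenate([features, target[:, None]], axis=1)

print(combined.shape)    # (3, 3)
print(combined[1, -1])   # 324000.0
```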

In [6]:
# AutoMLConfig expects a TabularDataset, so the cleaned NumPy arrays are
# written to CSV files and loaded back as tabular datasets
# https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-register-datasets#create-a-filedataset
print(type(train_data))

# create the data folder if it does not exist
os.makedirs("data", exist_ok=True)

# convert the training array to a dataframe
# https://stackoverflow.com/questions/11106536/adding-row-column-headers-to-numpy-arrays
number_of_columns = train_data.shape[1]
names = list(range(number_of_columns))
test_ds_names = list(range(number_of_columns - 1))
train_path = 'data/train_cleaned.csv'
cleaned_train_data = pd.DataFrame(train_data, columns=names)
cleaned_train_data.to_csv(train_path, index=False, header=True, sep=',')

# convert valid dataframe
valid_path = 'data/valid_cleaned.csv'
cleaned_valid_data = pd.DataFrame(valid_data, columns=names)
cleaned_valid_data.to_csv(valid_path, index=False, header=True, sep=',')

# convert test dataframe
test_path = 'data/test_cleaned.csv'
cleaned_test_data = pd.DataFrame(test_data, columns=test_ds_names)
cleaned_test_data.to_csv(test_path, index=False, header=True, sep=',')

# get the datastore to upload prepared data
datastore = ws.get_default_datastore()

# upload the local file from src_dir to the target_path in datastore
datastore.upload(src_dir='data', target_path='data')

# create a dataset referencing the cloud location
train_dataset = Dataset.Tabular.from_delimited_files(path = [(datastore, ('data/train_cleaned.csv'))])

# create a dataset referencing the cloud location
valid_dataset = Dataset.Tabular.from_delimited_files(path = [(datastore, ('data/valid_cleaned.csv'))])

# create a dataset referencing the cloud location
test_dataset = Dataset.Tabular.from_delimited_files(path = [(datastore, ('data/test_cleaned.csv'))])

<class 'numpy.ndarray'>
Uploading an estimated of 7 files
Uploading data/data_description.txt
Uploaded data/data_description.txt, 1 files out of an estimated total of 7
Uploading data/sample_submission.csv
Uploaded data/sample_submission.csv, 2 files out of an estimated total of 7
Uploading data/test.csv
Uploaded data/test.csv, 3 files out of an estimated total of 7
Uploading data/test_cleaned.csv
Uploaded data/test_cleaned.csv, 4 files out of an estimated total of 7
Uploading data/train.csv
Uploaded data/train.csv, 5 files out of an estimated total of 7
Uploading data/train_cleaned.csv
Uploaded data/train_cleaned.csv, 6 files out of an estimated total of 7
Uploading data/valid_cleaned.csv
Uploaded data/valid_cleaned.csv, 7 files out of an estimated total of 7
Uploaded 7 files
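
Because the cleaned arrays are written with integer column names, the CSV header stores them as text. When the file is read back with pandas the labels are strings such as `"0"` and `"398"`, which is consistent with the label column being referred to by a string name. The sketch below shows the round trip with plain pandas and an in-memory buffer; that `TabularDataset` behaves the same way is an assumption, not something verified here.

```python
import io
import numpy as np
import pandas as pd

# Hypothetical 2x3 array standing in for the cleaned training data.
arr = np.array([[0.5, 1.0, 100000.0],
                [0.7, 0.0, 150000.0]])
frame = pd.DataFrame(arr, columns=list(range(arr.shape[1])))

# Write to an in-memory CSV and read it back.
buffer = io.StringIO()
frame.to_csv(buffer, index=False, header=True)
buffer.seek(0)
restored = pd.read_csv(buffer)

print(list(frame.columns))     # integer labels
print(list(restored.columns))  # string labels
```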


## AutoML Configuration

TODO: Explain why you chose the AutoML settings and configuration you used below.

In [24]:
print(valid_dataset.to_pandas_dataframe().head(1))

      0     1     2     3     4     5     6     7    8    9  ...  389  390  \
0 -0.11 -0.08 -0.96 -0.05 -0.08 -0.67 -1.35 -0.78 0.79 1.41  ... 0.00 0.00   

   391  392  393  394  395  396  397       398  
0 0.00 1.00 0.00 1.00 0.00 0.00 0.00 137000.00  

[1 rows x 399 columns]


In [8]:
project_folder = './'
# TODO: Put your automl settings here
automl_settings = {
    "experiment_timeout_minutes": 30,
    "max_cores_per_iteration": -1,
    "max_concurrent_iterations": 4,
    "n_cross_validations": 5,
    "enable_early_stopping": True,
}

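# TODO: Put your automl config here
# Note: automl_settings is defined above but is not passed to AutoMLConfig,
# so this run uses the AutoML defaults for those options.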
automl_config = AutoMLConfig(compute_target = ws.compute_targets['cpu-cluster'],
                             task = "regression",
                             primary_metric = 'normalized_root_mean_squared_error',
                             training_data=train_dataset,
                             validation_data=valid_dataset,
                             label_column_name="398",   
                             path = project_folder
                            )
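
The primary metric chosen above, `normalized_root_mean_squared_error`, is commonly described as the RMSE divided by the range of the target values, so lower is better and scores are comparable across targets of different scale. The sketch below computes it that way with NumPy on made-up prices; the exact normalization AutoML applies internally is assumed here, not taken from this notebook.

```python
import numpy as np

def normalized_rmse(y_true, y_pred):
    """RMSE divided by the range (max - min) of the true values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return rmse / (y_true.max() - y_true.min())

# Hypothetical prices and predictions.
y_true = [100000.0, 200000.0, 300000.0]
y_pred = [110000.0, 190000.0, 300000.0]

score = normalized_rmse(y_true, y_pred)
print(round(score, 4))
```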

In [9]:
# TODO: Submit your experiment
remote_run = experiment.submit(automl_config)

Running on remote.


## Run Details

OPTIONAL: Write about the different models trained and their performance. Why do you think some models did better than others?

TODO: In the cell below, use the `RunDetails` widget to show the different experiments.

In [10]:
from azureml.widgets import RunDetails
RunDetails(remote_run).show()
remote_run.wait_for_completion(show_output=True)
assert(remote_run.get_status()=="Completed")

_AutoMLWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', 's…


Current status: FeaturesGeneration. Generating features for the dataset.
Current status: DatasetFeaturizationCompleted. Completed fit featurizers and featurizing the dataset.
Current status: ModelSelection. Beginning model selection.

****************************************************************************************************
DATA GUARDRAILS: 

TYPE:         Missing feature values imputation
STATUS:       PASSED
DESCRIPTION:  No feature missing values were detected in the training data.
              Learn more about missing value imputation: https://aka.ms/AutomatedMLFeaturization

****************************************************************************************************

TYPE:         High cardinality feature detection
STATUS:       PASSED
DESCRIPTION:  Your inputs were analyzed, and no high cardinality features were detected.
              Learn more about high cardinality feature handling: https://aka.ms/AutomatedMLFeaturization

*********************************

## Best Model

TODO: In the cell below, get the best model from the automl experiments and display all the properties of the model.



In [11]:
# Retrieve and save your best automl model.

# https://github.com/MicrosoftLearning/DP100/blob/master/08B%20-%20Using%20Automated%20Machine%20Learning.ipynb
# Get the best run object
best_run, fitted_model = remote_run.get_output()
print("Summary:")
print(remote_run.summary())
print("********************\n")
print("Best run:")
print(best_run)
print("********************\n")
print("Estimator:")
print(fitted_model.steps[-1])
print("********************\n")
print("Model:")
print(fitted_model)
print("********************\n")
best_run_metrics = best_run.get_metrics()
print('MAE:', best_run_metrics['mean_absolute_error'])
print('RMSLE:', best_run_metrics['root_mean_squared_log_error'])

print("********************\n")

for metric_name in best_run_metrics:
    metric = best_run_metrics[metric_name]
    print(metric_name, metric)

Package:azureml-automl-runtime, training version:1.21.0, current version:1.20.0
Package:azureml-core, training version:1.21.0.post1, current version:1.20.0
Package:azureml-dataprep, training version:2.8.2, current version:2.7.3
Package:azureml-dataprep-native, training version:28.0.0, current version:27.0.0
Package:azureml-dataprep-rslex, training version:1.6.0, current version:1.5.0
Package:azureml-dataset-runtime, training version:1.21.0, current version:1.20.0
Package:azureml-defaults, training version:1.21.0, current version:1.20.0
Package:azureml-interpret, training version:1.21.0, current version:1.20.0
Package:azureml-pipeline-core, training version:1.21.0, current version:1.20.0
Package:azureml-telemetry, training version:1.21.0, current version:1.20.0
Package:azureml-train-automl-client, training version:1.21.0, current version:1.20.0
Package:azureml-train-automl-runtime, training version:1.21.0, current version:1.20.0


Summary:
[['StackEnsemble', 1, 0.03380328643849116], ['VotingEnsemble', 1, 0.03180440083786807], ['RandomForest', 3, 0.04429112365301695], ['ElasticNet', 4, 0.056939635214845054], ['GradientBoosting', 2, 0.04160443656951125], ['XGBoostRegressor', 2, 0.03180748661834252], ['DecisionTree', 16, 0.048254770835566146], ['LassoLars', 2, 0.05435186707617986], ['LightGBM', 1, 0.03993076835714785]]
********************

Best run:
Run(Experiment: automl_experiment,
Id: AutoML_fede95ff-f89e-4f15-986e-44861eac6025_30,
Type: azureml.scriptrun,
Status: Completed)
********************

Estimator:
('prefittedsoftvotingregressor', PreFittedSoftVotingRegressor(estimators=[('1',
                                          Pipeline(memory=None,
                                                   steps=[('maxabsscaler',
                                                           MaxAbsScaler(copy=True)),
                                                          ('xgboostregressor',
                            

In [12]:
pred = fitted_model.predict(test_dataset.to_pandas_dataframe())
pred

array([126831.42107202, 156172.9848931 , 171035.24002545, ...,
       167362.4379617 , 121088.78678041, 228484.94880558])
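
The `remote_run.summary()` output printed earlier lists, for each algorithm, what appear to be its name, the number of iterations, and its best score; this reading of the columns is an assumption based on the output. Since the primary metric is an error measure, the lowest score wins. A small sketch that ranks the reported values:

```python
# Values copied from the summary output: [algorithm, count, best score].
summary = [
    ['StackEnsemble', 1, 0.03380328643849116],
    ['VotingEnsemble', 1, 0.03180440083786807],
    ['RandomForest', 3, 0.04429112365301695],
    ['ElasticNet', 4, 0.056939635214845054],
    ['GradientBoosting', 2, 0.04160443656951125],
    ['XGBoostRegressor', 2, 0.03180748661834252],
    ['DecisionTree', 16, 0.048254770835566146],
    ['LassoLars', 2, 0.05435186707617986],
    ['LightGBM', 1, 0.03993076835714785],
]

# Sort ascending by score, because a lower error is better.
ranked = sorted(summary, key=lambda row: row[2])
for name, count, score in ranked:
    print(f"{name:<18}{count:>3}  {score:.5f}")

best_name = ranked[0][0]
```

The top entry is the voting ensemble, which matches the `PreFittedSoftVotingRegressor` shown as the best estimator, with the XGBoost regressor a very close second.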

In [98]:
my_array = test_dataset.to_pandas_dataframe().to_numpy()
my_array

predict = fitted_model.predict(my_array)
predict

DataException: DataException:
	Message: Expected column(s) 0 not found in fitted data.
	InnerException: None
	ErrorResponse 
{
    "error": {
        "code": "UserError",
        "message": "Expected column(s) 0 not found in fitted data.",
        "target": "X",
        "inner_error": {
            "code": "BadArgument",
            "inner_error": {
                "code": "MissingColumnsInData"
            }
        },
        "reference_code": "17049f70-3bbe-4060-a63f-f06590e784e5"
    }
}
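
The `DataException` above is consistent with the model having been fitted on a frame whose columns are named `0`, `1`, and so on: a NumPy array carries no column labels, so the fitted pipeline cannot find them. The sketch below shows, with plain pandas on made-up values, how `to_numpy()` drops the labels and how they can be restored before calling `predict`. The string column naming is an assumption based on the error message.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with string column names, like the cleaned test data.
frame = pd.DataFrame([[0.1, 0.2, 1.0],
                      [0.3, 0.4, 0.0]], columns=["0", "1", "2"])

# Converting to an ndarray keeps the values but discards the labels.
as_array = frame.to_numpy()
print(type(as_array).__name__, as_array.shape)

# Rebuild a labeled frame from the array before predicting.
relabeled = pd.DataFrame(
    as_array, columns=[str(i) for i in range(as_array.shape[1])]
)
print(list(relabeled.columns))
```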

In [43]:
test_dataset.to_pandas_dataframe().head(1)

       0      1      2      3     4     5     6      7      8      9  ...  388  389  390  391  392  393  394  395  396  397
0  -0.11  -0.79  -0.96  -0.05  -0.4  -0.7  1.66  -0.78  -1.02  -1.05  ...  0.0  0.0  1.0  0.0  0.0  0.0  1.0  0.0  0.0  0.0


In [13]:
sample_submission_file = pd.read_csv("data/sample_submission.csv")
output = pd.DataFrame({'Id': sample_submission_file.Id,
                       'SalePrice': pred})
output.to_csv('data/submission.csv', index=False)
print ("Submission file is saved")

Submission file is saved


In [14]:
#TODO: Save the best model
# https://knowledge.udacity.com/questions/357007
os.makedirs('outputs', exist_ok=True)
joblib.dump(fitted_model, 'automl_model.pkl')


['automl_model.pkl']

## Model Deployment

Remember that you have to deploy only one of the two models you trained. Perform the steps in the rest of this notebook only if you wish to deploy this model.

TODO: In the cell below, register the model, create an inference config and deploy the model as a web service.

In [15]:
# Register model
model = Model.register(workspace=ws, model_path='automl_model.pkl', model_name='best_automl_run')
# List the models registered in the workspace
for registered_model in Model.list(ws):
    print("Model Name: {}\n".format(registered_model.name))
    print(registered_model)
    print("********************\n")

Registering model best_automl_run
Model Name: best_automl_run

Model(workspace=Workspace.create(name='quick-starts-ws-138013', subscription_id='48a74bb7-9950-4cc1-9caa-5d50f995cc55', resource_group='aml-quickstarts-138013'), name=best_automl_run, id=best_automl_run:1, version=1, tags={}, properties={})
********************



In [16]:
env = Environment.get(workspace=ws, name="AzureML-AutoML")

In [17]:
print("packages", env.python.conda_dependencies.serialize_to_string())

packages channels:
- anaconda
- conda-forge
- pytorch
dependencies:
- python=3.6.2
- pip=20.2.4
- pip:
  - azureml-core==1.21.0.post1
  - azureml-pipeline-core==1.21.0
  - azureml-telemetry==1.21.0
  - azureml-defaults==1.21.0
  - azureml-interpret==1.21.0
  - azureml-automl-core==1.21.0
  - azureml-automl-runtime==1.21.0
  - azureml-train-automl-client==1.21.0
  - azureml-train-automl-runtime==1.21.0.post1
  - azureml-dataset-runtime==1.21.0
  - inference-schema
  - py-cpuinfo==5.0.0
  - boto3==1.15.18
  - botocore==1.18.18
- numpy~=1.18.0
- scikit-learn==0.22.1
- pandas~=0.25.0
- py-xgboost<=0.90
- fbprophet==0.5
- holidays==0.9.11
- setuptools-git
- psutil>5.0.0,<6.0.0
name: azureml_7ade26eb614f97df8030bc480da59236



In [66]:
%%writefile conda_dependencies.yml

dependencies:
- python=3.6.2
- scikit-learn
- joblib
- numpy
- pip:
  - azureml-defaults
  - inference-schema[numpy-support]
  - azureml-train-automl
  - xgboost

Overwriting conda_dependencies.yml


In [18]:
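from azureml.core import Environment

# Note: this loads './my-env.yml', while the previous cell writes
# 'conda_dependencies.yml'; make sure file_path points at the intended file.
my_env = Environment.from_conda_specification(name = 'my-env', file_path = './my-env.yml')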

In [41]:
with open('score.py') as f:
    print(f.read())

import joblib
import numpy as np
import os
import json

# The init() method is called once, when the web service starts up.
#
# Typically you would deserialize the model file, as shown here using joblib,
# and store it in a global variable so your run() method can access it later.
def init():
    global model

    # The AZUREML_MODEL_DIR environment variable indicates
    # a directory containing the model file you registered.
    model_filename = 'automl_model.pkl'
    model_path = os.path.join(os.environ['AZUREML_MODEL_DIR'], model_filename)

    model = joblib.load(model_path)


# The run() method is called each time a request is made to the scoring API.
#
def run(input_data):
    try:
        #print(input_data)
        data = json.loads(input_data)['data']
        #print(data)
        data = np.array(data)
        print(data)
        result = model.predict(data)
        print(result)
        return json.dumps({"result": result.tolist()})
    except Exception as e:
        result = 
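
The listing of `score.py` above is cut off inside the `except` branch. Independently of that file, the sketch below shows one way a `run()` function can accept records keyed by column name, build a labeled DataFrame, and report errors as JSON. `StubModel` and the payload are hypothetical stand-ins so that the sketch runs locally without the registered model; it is not the script that was deployed.

```python
import json
import pandas as pd

class StubModel:
    """Hypothetical stand-in for the fitted model: sums each row."""
    def predict(self, frame):
        return frame.sum(axis=1).to_numpy()

model = StubModel()

def run(input_data):
    try:
        records = json.loads(input_data)["data"]   # list of {column: value} dicts
        frame = pd.DataFrame(records)              # keeps the column names
        result = model.predict(frame)
        return json.dumps({"result": result.tolist()})
    except Exception as e:
        # Report the problem to the caller as JSON instead of raising.
        return json.dumps({"error": str(e)})

payload = json.dumps({"data": [{"0": 1.0, "1": 2.0}, {"0": 3.0, "1": 4.0}]})
print(run(payload))          # {"result": [3.0, 7.0]}
print(run(json.dumps([1])))  # a JSON object with an "error" key
```

Passing records as dictionaries keeps the column names attached to the values, which matters when the model looks features up by name.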

In [150]:
service_name = 'my-automl-service-2'
#my_model = Model(ws, 'best_automl_run', version=1)
my_model = Model(ws, 'best_automl_run')
inference_config = InferenceConfig(entry_script='score.py', environment=my_env)
aci_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

service = Model.deploy(workspace=ws,
                       name=service_name,
                       models=[my_model],
                       inference_config=inference_config,
                       deployment_config=aci_config,
                       overwrite=True)
service.wait_for_deployment(show_output=True)

Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.
Running..........................................
Succeeded
ACI service creation operation finished, operation "Succeeded"


In [151]:
# Enable application insights
service.update(enable_app_insights=True)

TODO: In the cell below, send a request to the web service you deployed to test it.

In [152]:
test_sample = json.dumps([
    [-0.1091467318432988,-0.7878328282055617,-0.9620381458295,-0.05176425589470907,-0.3990418044871181,-0.6996614738795086,1.6647989626058908,-0.7828122736455998,-1.019409385612943,-1.0547672602550182,-0.80414438831858,-1.1755218737677209,0.5839329469649059,1.820249248431149,-0.11592277681125393,-1.2088300494050561,-0.8081486915878783,0.0653853220487234,0.395140194492683,1.1641317088210494,-0.21245574299649977,-0.945306321497596,-0.6923580278210051,-0.24337571736088476,-0.11745323705982251,-0.6730127578674934,-1.0282712862192567,-0.33876020915131566,-0.35817772778407514,0.09708150263564544,-0.8572262486740709,-0.08922048501487677,0.3587383764498879,0.2083304398259691,-0.580433554353486,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0
.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0],
    [-0.1091467318432988,-0.078394549517165,-0.9620381458295,-0.05176425589470907,0.6465132166485319,0.4326060132625712,1.6647989626058908,1.178696741553329,-1.019409385612943,0.17523289946728718,-0.80414438831858,-1.3205145278182204,-0.2873085802339204,-0.27976173751075345,-0.11592277681125393,-0.36970960100246153,-0.8081486915878783,1.0941417825339512,0.395140194492683,-0.7573159342214862,-0.21245574299649977,-0.3306326742055131,-0.1539438427249521,-0.24337571736088476,-0.11745323705982251,-0.3638240675754198,-1.0282712862192567,-0.4368931268816567,-0.35817772778407514,0.34872690226248776,-0.8572262486740709,21.966141419285748,2.3565088172763637,0.20182846372195212,0.02115689742032186,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,
0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0]
])


In [157]:
# Get the test data
my_test_values = test_dataset.to_pandas_dataframe()
test_list = my_test_values.values.tolist()

# Get first item
my_list=test_list[0]
#print(my_list)
my_data={}
print(len(my_list))

# Create dictionary
for count in range(len(my_list)):
    my_data[str(count)]=my_list[count]
print("My Data: ")
print(my_data)

# Wrap the record in the request payload
print("Data: ")
data = {"data":
        [
          my_data,
      ]
    }
# Convert to JSON string
input_data = json.dumps(data)
print(input_data)
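
The loop above maps each position in the row to a string key. The same payload can be built with a dict comprehension; the sketch uses a short made-up row in place of the 398 test values.

```python
import json

# Hypothetical short row standing in for one cleaned test record.
my_list = [-0.11, -0.79, 1.0]

# Key each value by its column position, as a string.
my_data = {str(i): value for i, value in enumerate(my_list)}

input_data = json.dumps({"data": [my_data]})
print(input_data)   # {"data": [{"0": -0.11, "1": -0.79, "2": 1.0}]}
```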

In [147]:
import requests

# Set the content type
headers = {'Content-Type': 'application/json'}

# Make the request and display the response
resp = requests.post(service.scoring_uri, input_data, headers=headers)

print(resp)

<Response [502]>


In [153]:
import json

output = service.run(test_sample)

print(output)

ERROR:azureml.core.webservice.aci:Received bad response from service. More information can be found by calling `.get_logs()` on the webservice object.
Response Code: 502
Headers: {'Connection': 'keep-alive', 'Content-Length': '60', 'Content-Type': 'text/html; charset=utf-8', 'Date': 'Mon, 08 Feb 2021 00:27:48 GMT', 'Server': 'nginx/1.10.3 (Ubuntu)', 'X-Ms-Request-Id': '7198c91f-0152-4db9-a2cd-df49affa25ff', 'X-Ms-Run-Function-Failed': 'True'}
Content: b'dispatcher for __array_function__ did not return an iterable'



WebserviceException: WebserviceException:
	Message: Received bad response from service. More information can be found by calling `.get_logs()` on the webservice object.
Response Code: 502
Headers: {'Connection': 'keep-alive', 'Content-Length': '60', 'Content-Type': 'text/html; charset=utf-8', 'Date': 'Mon, 08 Feb 2021 00:27:48 GMT', 'Server': 'nginx/1.10.3 (Ubuntu)', 'X-Ms-Request-Id': '7198c91f-0152-4db9-a2cd-df49affa25ff', 'X-Ms-Run-Function-Failed': 'True'}
Content: b'dispatcher for __array_function__ did not return an iterable'
	InnerException None
	ErrorResponse 
{
    "error": {
        "message": "Received bad response from service. More information can be found by calling `.get_logs()` on the webservice object.\nResponse Code: 502\nHeaders: {'Connection': 'keep-alive', 'Content-Length': '60', 'Content-Type': 'text/html; charset=utf-8', 'Date': 'Mon, 08 Feb 2021 00:27:48 GMT', 'Server': 'nginx/1.10.3 (Ubuntu)', 'X-Ms-Request-Id': '7198c91f-0152-4db9-a2cd-df49affa25ff', 'X-Ms-Run-Function-Failed': 'True'}\nContent: b'dispatcher for __array_function__ did not return an iterable'"
    }
}

TODO: In the cell below, print the logs of the web service and delete the service

In [148]:
print(service.get_logs())

2021-02-08T00:19:08,285679005+00:00 - rsyslog/run 
2021-02-08T00:19:08,286104309+00:00 - gunicorn/run 
2021-02-08T00:19:08,286473912+00:00 - iot-server/run 
rsyslogd: /azureml-envs/azureml_7ade26eb614f97df8030bc480da59236/lib/libuuid.so.1: no version information available (required by rsyslogd)
2021-02-08T00:19:08,304988361+00:00 - nginx/run 
/usr/sbin/nginx: /azureml-envs/azureml_7ade26eb614f97df8030bc480da59236/lib/libcrypto.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_7ade26eb614f97df8030bc480da59236/lib/libcrypto.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_7ade26eb614f97df8030bc480da59236/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_7ade26eb614f97df8030bc480da59236/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml

In [154]:
# Delete the service
service.delete()

In [42]:
# Delete compute cluster
cpu_cluster.delete()

## References
- De Cock, Dean. (2011). Ames, Iowa: Alternative to the Boston Housing Data as an End of Semester Regression Project. Journal of Statistics Education, 19. doi:10.1080/10691898.2011.11889627.
- [Deployment to Cloud Example](https://github.com/ErkanHatipoglu/MachineLearningNotebooks/tree/master/how-to-use-azureml/deployment/deploy-to-cloud)