# Automated ML

Import Dependencies. In the cell below, import all the dependencies that you will need to complete the project.

In [1]:
import requests
import json
import logging
import joblib
import pandas as pd
from sklearn.model_selection import train_test_split

import azureml.core
from azureml.core.workspace import Workspace
from azureml.core.experiment import Experiment
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.train.automl import AutoMLConfig
from azureml.core.compute_target import ComputeTargetException
from azureml.core.dataset import Dataset
from azureml.data.dataset_factory import TabularDatasetFactory
from azureml.widgets import RunDetails
from azureml.core.model import Model, InferenceConfig
from azureml.core.webservice import AciWebservice

# Check core SDK version number
print("SDK version:", azureml.core.VERSION)

SDK version: 1.33.0


## Initialize Workspace

Initialize a workspace object from persisted configuration. Make sure the config file is present at ./config.json
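
As a minimal sketch of what that file contains: `Workspace.from_config()` expects a JSON document with the keys `subscription_id`, `resource_group` and `workspace_name`. The helper names `write_workspace_config` and `missing_keys` below are hypothetical (they are not part of the SDK), and the values are placeholders.

```python
import json
import os
import tempfile

# Keys that the workspace config file is expected to carry
REQUIRED_KEYS = {"subscription_id", "resource_group", "workspace_name"}


def write_workspace_config(path, subscription_id, resource_group, workspace_name):
    """Write a config.json in the layout read by Workspace.from_config()."""
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    with open(path, "w") as f:
        json.dump(config, f, indent=2)
    return config


def missing_keys(path):
    """Return the required keys that are absent from the config file."""
    with open(path) as f:
        return REQUIRED_KEYS - set(json.load(f))


# Placeholder values; a temporary folder is used so nothing real is overwritten
config_path = os.path.join(tempfile.mkdtemp(), "config.json")
write_workspace_config(config_path, "<subscription-id>", "<resource-group>", "<workspace-name>")
print(missing_keys(config_path))  # an empty set means all keys are present
```

Checking the file this way before calling `Workspace.from_config()` can make a missing or misspelled key easier to spot than the SDK error would.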

In [2]:
ws = Workspace.from_config()

print('Workspace name: ' + ws.name,
      'Azure region: ' + ws.location,
      'Subscription id: ' + ws.subscription_id,
      'Resource group: ' + ws.resource_group, sep='\n')


Workspace name: workspace-rvl
Azure region: westeurope
Subscription id: b17f1c19-34a2-47b8-a207-40ea477828fc
Resource group: resource-group-rvl


## Create an Azure ML experiment
Let's create an experiment named "automl-experiment-heart-failure-classification" and a folder to hold the training scripts. The script runs will be recorded under the experiment in Azure.

The best practice is to use a separate folder for each step's scripts and their dependent files, and to specify that folder as the step's source directory. This reduces the size of the snapshot created for the step, since only that folder is snapshotted. Because a change to any file in the source directory triggers a re-upload of the snapshot, keeping folders separate also preserves step reuse when nothing in a step's `source_directory` has changed.


In [3]:
experiment_name = 'automl-experiment-heart-failure-classification'
project_folder = './classification-project'

experiment = Experiment(ws, experiment_name)
experiment

Name,Workspace,Report Page,Docs Page
automl-experiment-heart-failure-classification,workspace-rvl,Link to Azure Machine Learning studio,Link to Documentation


## Create or Attach an AmlCompute cluster
We will need to create a compute target for our AutoML run. We will use `vm_size = Standard_D2_V2` in our provisioning configuration and select `max_nodes` to be no greater than 4.

In [4]:
aml_compute_cluster_name = "cpu-cluster"

# Verify that cluster doesn't exist already
try:
    aml_compute = ComputeTarget(workspace=ws, name=aml_compute_cluster_name)
    print("Found existing cluster, use it.")

except ComputeTargetException:
    compute_config = AmlCompute.provisioning_configuration(vm_size="Standard_D2_V2",
                                                           min_nodes=0,
                                                           max_nodes=4)
    aml_compute = ComputeTarget.create(workspace=ws,
                                       name=aml_compute_cluster_name,
                                       provisioning_configuration=compute_config)
    
aml_compute.wait_for_completion(show_output=True)

Found existing cluster, use it.
Succeeded
AmlCompute wait for completion finished

Minimum number of nodes requested have been provisioned


In [5]:
compute_targets = ws.compute_targets

for i, key in enumerate(compute_targets):
    print(f"{i+1}. Compute target\n\tname: {compute_targets[key].name}\n\tType: {compute_targets[key].type}\n")
    

1. Compute target
	name: compute-instance-rvl
	Type: ComputeInstance

2. Compute target
	name: cpu-cluster
	Type: AmlCompute



## Dataset

### Overview

This dataset contains the medical records of 299 patients who had heart failure, collected during their follow-up period. Each patient profile has 13 columns: 12 clinical features and the target.

The 12 clinical input features and the target feature are:

- age: age of the patient (years)
- anaemia: decrease of red blood cells or hemoglobin (boolean)
- high blood pressure: if the patient has hypertension (boolean)
- creatinine phosphokinase (CPK): level of the CPK enzyme in the blood (mcg/L)
- diabetes: if the patient has diabetes (boolean)
- ejection fraction: percentage of blood leaving the heart at each contraction (percentage)
- platelets: platelets in the blood (kiloplatelets/mL)
- sex: woman or man (binary)
- serum creatinine: level of serum creatinine in the blood (mg/dL)
- serum sodium: level of serum sodium in the blood (mEq/L)
- smoking: if the patient smokes or not (boolean)
- time: follow-up period (days)
- [target] death event: if the patient died during the follow-up period (boolean)

The task is to predict whether the patient died during the follow-up period. We use the `DEATH_EVENT` column as the target; since it is a boolean variable, this is a binary classification task.

In the cell below, we write code to access the data that we will be using in this project.

We first try to load the dataset from the Workspace. If it is not found (for example, because it was deleted), it is recreated from the URL of the CSV file.

Make sure `key` matches the name of the dataset registered in the Workspace; we also provide a matching description. If the name is unknown, loop over `ws.datasets.keys()` and `print()` each one.


In [6]:
ws.datasets.keys()

KeysView({'Sample: Diabetes': DatasetRegistration(id='e05bc75c-873f-4d8a-94aa-9f546bf91115', name='Sample: Diabetes', version=1, description='', tags={'opendatasets': 'sample-diabetes'}), 'mnist_opendataset': DatasetRegistration(id='27c2b726-2af7-4301-b7ce-eacbe267c6d1', name='mnist_opendataset', version=1, description='training and test dataset', tags={})})

In [7]:
key = "Health-Failure"  # must match the name the dataset is registered under
description_text = "Heart failure dataset for mortality prediction"

dataset_url = "https://archive.ics.uci.edu/ml/machine-learning-databases/00519/heart_failure_clinical_records_dataset.csv"

if key in ws.datasets.keys():
    dataset = ws.datasets[key]
    print("The Dataset was found!")
else:
    # Create AML Dataset and register it into Workspace
    dataset = Dataset.Tabular.from_delimited_files(dataset_url)
    #Register Dataset in Workspace
    dataset = dataset.register(workspace=ws,
                               name=key,
                               description=description_text)

df = dataset.to_pandas_dataframe()

In [8]:
df.describe()

Unnamed: 0,age,anaemia,creatinine_phosphokinase,diabetes,ejection_fraction,high_blood_pressure,platelets,serum_creatinine,serum_sodium,sex,smoking,time,DEATH_EVENT
count,299.0,299.0,299.0,299.0,299.0,299.0,299.0,299.0,299.0,299.0,299.0,299.0,299.0
mean,60.833893,0.431438,581.839465,0.41806,38.083612,0.351171,263358.029264,1.39388,136.625418,0.648829,0.32107,130.26087,0.32107
std,11.894809,0.496107,970.287881,0.494067,11.834841,0.478136,97804.236869,1.03451,4.412477,0.478136,0.46767,77.614208,0.46767
min,40.0,0.0,23.0,0.0,14.0,0.0,25100.0,0.5,113.0,0.0,0.0,4.0,0.0
25%,51.0,0.0,116.5,0.0,30.0,0.0,212500.0,0.9,134.0,0.0,0.0,73.0,0.0
50%,60.0,0.0,250.0,0.0,38.0,0.0,262000.0,1.1,137.0,1.0,0.0,115.0,0.0
75%,70.0,1.0,582.0,1.0,45.0,1.0,303500.0,1.4,140.0,1.0,1.0,203.0,1.0
max,95.0,1.0,7861.0,1.0,80.0,1.0,850000.0,9.4,148.0,1.0,1.0,285.0,1.0
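
Because `DEATH_EVENT` is a 0/1 column, its mean is the fraction of patients who died. The short calculation below turns the `count` and `mean` values from the summary above into class counts; it only re-uses the printed numbers and does not touch the dataset itself.

```python
# Values copied from the df.describe() output above
n_patients = 299            # "count" row
death_event_mean = 0.32107  # "mean" of the DEATH_EVENT column

# The mean of a 0/1 column is the share of ones, so count * mean gives the positives
deaths = round(n_patients * death_event_mean)
survivors = n_patients - deaths

print(f"deaths={deaths}, survivors={survivors}, positive rate={deaths / n_patients:.1%}")
# deaths=96, survivors=203, positive rate=32.1%
```

Roughly one patient in three belongs to the positive class, which is worth keeping in mind when reading accuracy-style metrics later on.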


In [10]:
import os

# Split the dataset into training and testing datasets (fixed seed for reproducibility)
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)

# Save training data to a CSV file, creating the folder if needed
os.makedirs("./data", exist_ok=True)
train_df.to_csv("./data/train_data.csv", index=False)

# Upload the saved training data and create a dataset in Azure ML.
# Note: upload() skips files that already exist at the target unless overwrite=True is passed.
data_store = ws.get_default_datastore()
data_store.upload(src_dir="./data", target_path="training_data")
train_ds = TabularDatasetFactory.from_delimited_files(path=[(data_store, 'training_data/train_data.csv')])


Uploading an estimated of 1 files
Target already exists. Skipping upload for training_data/train_data.csv
Uploaded 0 files
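
The split above samples rows without regard to class, so the share of `DEATH_EVENT = 1` rows can drift between the training and test parts. As a hedged illustration (not what this notebook ran), the pure-Python sketch below shows the idea of a stratified split; `stratified_split` is a hypothetical helper, and the toy labels only mimic the class counts derived from the `describe()` output (203 survivors, 96 deaths). In scikit-learn the same effect is available through the `stratify` and `random_state` arguments of `train_test_split`.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_size=0.2, seed=42):
    """Return (train_idx, test_idx) with roughly the same class ratio in both parts."""
    rng = random.Random(seed)

    # Group row indices by class label
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_size)  # per-class test share
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels with the same class counts as the dataset (assumption for illustration)
labels = [0] * 203 + [1] * 96
train_idx, test_idx = stratified_split(labels)

test_rate = sum(labels[i] for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx), round(test_rate, 3))
```

Sampling each class separately keeps the positive rate of the test part close to the overall rate, which matters more on small datasets such as this one.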


## Review the Dataset Result

You can peek the result of a TabularDataset at any range using `skip(i)` and `take(j).to_pandas_dataframe()`. Doing so evaluates only `j` records for all the steps in the `TabularDataset`, which makes it fast even against large datasets.

`TabularDataset` objects are composed of a list of transformation steps (optional).
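
The reason `skip` and `take` stay cheap is lazy evaluation: only the rows that are needed get read. The sketch below is an analogy in plain Python using `itertools.islice`, not the way `TabularDataset` is implemented; `records` and `skip_take` are hypothetical names.

```python
from itertools import islice


def records():
    """Stand-in for a large data source; counts how many rows were actually read."""
    for i in range(1_000_000):
        records.read += 1
        yield {"row": i}


records.read = 0


def skip_take(source, skip, take):
    """Lazily skip `skip` rows and materialise only the next `take` rows."""
    return list(islice(source, skip, skip + take))


preview = skip_take(records(), skip=10, take=5)
print([r["row"] for r in preview], "rows read:", records.read)
# [10, 11, 12, 13, 14] rows read: 15
```

Only 15 of the one million rows are produced, which mirrors why previewing a slice of a large dataset returns quickly.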

In [11]:
train_ds.take(5).to_pandas_dataframe()

Unnamed: 0,age,anaemia,creatinine_phosphokinase,diabetes,ejection_fraction,high_blood_pressure,platelets,serum_creatinine,serum_sodium,sex,smoking,time,DEATH_EVENT
0,58.0,1,145,0,25,0,219000.0,1.2,137,1,1,170,1
1,45.0,0,582,0,38,1,422000.0,0.8,137,0,0,245,0
2,50.0,0,1548,0,30,1,211000.0,0.8,138,1,0,108,0
3,43.0,1,358,0,50,0,237000.0,1.3,135,0,0,97,0
4,50.0,1,249,1,35,1,319000.0,1.0,128,0,0,28,1


## AutoML Configuration

As mentioned in the dataset section above, we are dealing with binary classification. Therefore the argument `task` is set to `classification`, and since we are predicting `DEATH_EVENT`, we set `label_column_name="DEATH_EVENT"`.

To help manage child runs and when they can be performed, we recommend you create a dedicated cluster per experiment, and match the number of `max_concurrent_iterations` of your experiment to the number of nodes in the cluster. This way, you use all the nodes of the cluster at the same time with the number of concurrent child runs/iterations you want.

Configure `max_concurrent_iterations` in your `AutoMLConfig` object. If it is not configured, then by default only one concurrent child run/iteration is allowed per experiment.
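
A minimal sketch of that recommendation: cap the requested concurrency at the number of nodes the cluster can scale to, then pass the result into the settings dictionary. `concurrency_for_cluster` is a hypothetical helper, and the value 8 is only an example request.

```python
def concurrency_for_cluster(requested_iterations, cluster_max_nodes):
    """Cap concurrent child runs at the number of nodes the cluster can scale to."""
    if requested_iterations < 1 or cluster_max_nodes < 1:
        raise ValueError("both values must be at least 1")
    return min(requested_iterations, cluster_max_nodes)


# The cluster in this notebook was provisioned with max_nodes=4
automl_settings = {
    "experiment_timeout_minutes": 30,
    "max_concurrent_iterations": concurrency_for_cluster(8, cluster_max_nodes=4),
}
print(automl_settings["max_concurrent_iterations"])  # 4
```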

Most of the remaining arguments are self-explanatory. To automate feature engineering, set `featurization` to `"auto"`, so that features that best characterize the patterns in the data are used to create predictive models.

In [12]:
# AutoMl settings
automl_settings = {
    "experiment_timeout_minutes": 30,
    "max_concurrent_iterations": 4,
    "primary_metric": "AUC_weighted",
    "enable_early_stopping": True,
    "verbosity": logging.INFO
}

# AutoMl config
automl_config = AutoMLConfig(compute_target=aml_compute,
                             task="classification",
                             training_data=train_ds,
                             label_column_name="DEATH_EVENT",
                             n_cross_validations=5,
                             featurization="auto",
                             path=project_folder,
                             debug_log="automl_errors.log",
                             **automl_settings)

In [13]:
# Submit your experiment
remote_run = experiment.submit(automl_config, show_output=True)

Submitting remote run.
No run_configuration provided, running on cpu-cluster with default configuration
Running on remote compute: cpu-cluster


Experiment,Id,Type,Status,Details Page,Docs Page
automl-experiment-heart-failure-classification,AutoML_3896527e-fe9d-4dc0-a4ec-b9878390f92c,automl,NotStarted,Link to Azure Machine Learning studio,Link to Documentation



Current status: FeaturesGeneration. Generating features for the dataset.
Current status: ModelSelection. Beginning model selection.

****************************************************************************************************
DATA GUARDRAILS: 

TYPE:         Class balancing detection
STATUS:       PASSED
DESCRIPTION:  Your inputs were analyzed, and all classes are balanced in your training data.
              Learn more about imbalanced data: https://aka.ms/AutomatedMLImbalancedData

****************************************************************************************************

TYPE:         Missing feature values imputation
STATUS:       PASSED
DESCRIPTION:  No feature missing values were detected in the training data.
              Learn more about missing value imputation: https://aka.ms/AutomatedMLFeaturization

****************************************************************************************************

TYPE:         High cardinality feature detection
STATUS

## Run Details

In the cell below, use the `RunDetails` widget to show the different experiments.

In [14]:
RunDetails(remote_run).show()
remote_run.wait_for_completion(show_output=True)

_AutoMLWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', 's…

Experiment,Id,Type,Status,Details Page,Docs Page
automl-experiment-heart-failure-classification,AutoML_3896527e-fe9d-4dc0-a4ec-b9878390f92c,automl,Completed,Link to Azure Machine Learning studio,Link to Documentation




****************************************************************************************************
DATA GUARDRAILS: 

TYPE:         Class balancing detection
STATUS:       PASSED
DESCRIPTION:  Your inputs were analyzed, and all classes are balanced in your training data.
              Learn more about imbalanced data: https://aka.ms/AutomatedMLImbalancedData

****************************************************************************************************

TYPE:         Missing feature values imputation
STATUS:       PASSED
DESCRIPTION:  No feature missing values were detected in the training data.
              Learn more about missing value imputation: https://aka.ms/AutomatedMLFeaturization

****************************************************************************************************

TYPE:         High cardinality feature detection
STATUS:       PASSED
DESCRIPTION:  Your inputs were analyzed, and no high cardinality features were detected.
              Learn more abo

{'runId': 'AutoML_3896527e-fe9d-4dc0-a4ec-b9878390f92c',
 'target': 'cpu-cluster',
 'status': 'Completed',
 'startTimeUtc': '2021-08-24T06:27:57.369741Z',
 'endTimeUtc': '2021-08-24T07:05:18.958263Z',
 'properties': {'num_iterations': '1000',
  'training_type': 'TrainFull',
  'acquisition_function': 'EI',
  'primary_metric': 'AUC_weighted',
  'train_split': '0',
  'acquisition_parameter': '0',
  'num_cross_validation': '5',
  'target': 'cpu-cluster',
  'DataPrepJsonString': '{\\"training_data\\": {\\"datasetId\\": \\"19be7d47-7277-41a0-9a34-38dc733a5ef6\\"}, \\"datasets\\": 0}',
  'EnableSubsampling': None,
  'runTemplate': 'AutoML',
  'azureml.runsource': 'automl',
  'display_task_type': 'classification',
  'dependencies_versions': '{"azureml-widgets": "1.33.0", "azureml-train": "1.33.0", "azureml-train-restclients-hyperdrive": "1.33.0", "azureml-train-core": "1.33.0", "azureml-train-automl": "1.33.0", "azureml-train-automl-runtime": "1.33.0", "azureml-train-automl-client": "1.33.0", 

## Best Model

In the cell below, get the best model from the automl experiments and display all the properties of the model.



In [15]:
# Retrieve your best automl model.
best_run, best_model = remote_run.get_output()
best_run_metrics = best_run.get_metrics()

In [16]:
best_model

Pipeline(memory=None,
         steps=[('datatransformer',
                 DataTransformer(enable_dnn=False, enable_feature_sweeping=True, feature_sweeping_config={}, feature_sweeping_timeout=86400, featurization_config=None, force_text_dnn=False, is_cross_validation=True, is_onnx_compatible=False, observer=None, task='classification', working_dir='/mnt/batch/tasks/shared/LS_root/mount...
)), ('svcwrapper', SVCWrapper(C=1.2067926406393288, break_ties=False, cache_size=200, class_weight='balanced', coef0=0.0, decision_function_shape='ovr', degree=3, gamma='scale', kernel='rbf', max_iter=-1, probability=True, random_state=None, shrinking=True, tol=0.001, verbose=False))], verbose=False)), ('18', Pipeline(memory=None, steps=[('sparsenormalizer', Normalizer(copy=True, norm='l1')), ('lightgbmclassifier', LightGBMClassifier(boosting_type='goss', colsample_bytree=0.7922222222222222, learning_rate=0.09473736842105263, max_bin=80, max_depth=6, min_child_weight=0, min_data_in_leaf=0.051728965517

In [17]:
best_run

Experiment,Id,Type,Status,Details Page,Docs Page
automl-experiment-heart-failure-classification,AutoML_3896527e-fe9d-4dc0-a4ec-b9878390f92c_48,azureml.scriptrun,Completed,Link to Azure Machine Learning studio,Link to Documentation


In [18]:
best_run_metrics

{'recall_score_micro': 0.8323581560283688,
 'matthews_correlation': 0.6011658827195101,
 'AUC_macro': 0.9254142154142153,
 'recall_score_macro': 0.7786141636141636,
 'recall_score_weighted': 0.8323581560283688,
 'average_precision_score_macro': 0.9049394939703133,
 'AUC_weighted': 0.9254142154142153,
 'precision_score_macro': 0.8249180999181001,
 'precision_score_weighted': 0.8446117962740305,
 'norm_macro_recall': 0.5572283272283273,
 'accuracy': 0.8323581560283688,
 'precision_score_micro': 0.8323581560283688,
 'average_precision_score_weighted': 0.9329489070476112,
 'weighted_accuracy': 0.8687154900753924,
 'f1_score_weighted': 0.8261379589370126,
 'balanced_accuracy': 0.7786141636141636,
 'AUC_micro': 0.9163027435113928,
 'average_precision_score_micro': 0.9219125394213965,
 'log_loss': 0.41467446104078487,
 'f1_score_macro': 0.7881405015647374,
 'f1_score_micro': 0.8323581560283688,
 'confusion_matrix': 'aml://artifactId/ExperimentRun/dcid.AutoML_3896527e-fe9d-4dc0-a4ec-b9878390f9
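
Some of these metrics are related, which gives a quick consistency check. The sketch below assumes that `norm_macro_recall` is the macro recall rescaled so that random guessing scores 0, i.e. `(recall_score_macro - R) / (1 - R)` with `R = 1 / number_of_classes`; that definition is an assumption on my part, and the numbers are copied from the output above.

```python
import math

# Values copied from best_run_metrics above
recall_score_macro = 0.7786141636141636
balanced_accuracy = 0.7786141636141636
norm_macro_recall = 0.5572283272283273
n_classes = 2  # binary classification

# Expected macro recall of a random classifier (assumed definition of R)
random_recall = 1 / n_classes
recomputed = (recall_score_macro - random_recall) / (1 - random_recall)

print(math.isclose(recomputed, norm_macro_recall, rel_tol=1e-9))  # True
print(balanced_accuracy == recall_score_macro)                    # True
```

Under that assumption the reported values agree, and balanced accuracy coincides with macro recall, as expected for a classification task.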

In [19]:
best_run.get_details()

{'runId': 'AutoML_3896527e-fe9d-4dc0-a4ec-b9878390f92c_48',
 'target': 'cpu-cluster',
 'status': 'Completed',
 'startTimeUtc': '2021-08-24T07:03:24.658132Z',
 'endTimeUtc': '2021-08-24T07:05:16.282761Z',
 'properties': {'runTemplate': 'automl_child',
  'pipeline_id': '__AutoML_Ensemble__',
  'pipeline_spec': '{"pipeline_id":"__AutoML_Ensemble__","objects":[{"module":"azureml.train.automl.ensemble","class_name":"Ensemble","spec_class":"sklearn","param_args":[],"param_kwargs":{"automl_settings":"{\'task_type\':\'classification\',\'primary_metric\':\'AUC_weighted\',\'verbosity\':20,\'ensemble_iterations\':15,\'is_timeseries\':False,\'name\':\'automl-experiment-heart-failure-classification\',\'compute_target\':\'cpu-cluster\',\'subscription_id\':\'b17f1c19-34a2-47b8-a207-40ea477828fc\',\'region\':\'westeurope\',\'spark_service\':None}","ensemble_run_id":"AutoML_3896527e-fe9d-4dc0-a4ec-b9878390f92c_48","experiment_name":"automl-experiment-heart-failure-classification","workspace_name":"work

In [20]:
best_run.get_properties()

{'runTemplate': 'automl_child',
 'pipeline_id': '__AutoML_Ensemble__',
 'pipeline_spec': '{"pipeline_id":"__AutoML_Ensemble__","objects":[{"module":"azureml.train.automl.ensemble","class_name":"Ensemble","spec_class":"sklearn","param_args":[],"param_kwargs":{"automl_settings":"{\'task_type\':\'classification\',\'primary_metric\':\'AUC_weighted\',\'verbosity\':20,\'ensemble_iterations\':15,\'is_timeseries\':False,\'name\':\'automl-experiment-heart-failure-classification\',\'compute_target\':\'cpu-cluster\',\'subscription_id\':\'b17f1c19-34a2-47b8-a207-40ea477828fc\',\'region\':\'westeurope\',\'spark_service\':None}","ensemble_run_id":"AutoML_3896527e-fe9d-4dc0-a4ec-b9878390f92c_48","experiment_name":"automl-experiment-heart-failure-classification","workspace_name":"workspace-rvl","subscription_id":"b17f1c19-34a2-47b8-a207-40ea477828fc","resource_group_name":"resource-group-rvl"}}]}',
 'training_percent': '100',
 'predicted_cost': None,
 'iteration': '48',
 '_aml_system_scenario_identifi

In [21]:
# Save the best model
joblib.dump(best_model, filename="./best_automl_model.joblib")

['./best_automl_model.joblib']

## Model Deployment

Remember that you have to deploy only one of the two models you trained. Perform the steps in the rest of this notebook only if you wish to deploy this model.

In the cell below, register the model, create an inference config and deploy the model as a web service.
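
The inference config points at an entry script (`./score.py`) that is not shown in this notebook. As a hedged sketch of the usual shape of such a script, it exposes an `init()` function that loads the model once and a `run()` function that handles each request. The `_StandInModel` class below is a hypothetical placeholder so the example runs on its own; a real script would load the registered model file instead.

```python
import json

model = None


class _StandInModel:
    """Hypothetical placeholder so the sketch runs without the real model file."""

    def predict(self, rows):
        return [0 for _ in rows]


def init():
    """Called once when the service starts: load the model into a global."""
    global model
    # A real entry script would typically load the serialized model here,
    # e.g. with joblib.load() from a path under the AZUREML_MODEL_DIR
    # environment variable (assumption: not taken from this notebook).
    model = _StandInModel()


def run(raw_data):
    """Called per request: parse the JSON body and return predictions."""
    try:
        rows = json.loads(raw_data)["data"]
        # A real AutoML model would usually expect a pandas DataFrame here
        return list(model.predict(rows))
    except Exception as exc:
        # Returning the message keeps failures visible to the caller
        return {"error": str(exc)}


init()
print(run(json.dumps({"data": [{"age": 63.0, "time": 186}]})))  # [0]
```

The `{"data": [...]}` payload layout matches the request body built later in this notebook.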

In [22]:
model = Model.register(workspace=ws,
                       model_name="heart_failure_pred_model", 
                       model_path="./best_automl_model.joblib",
                       description="Best AutoML model"
                      )

Registering model heart_failure_pred_model


In [23]:
env = best_run.get_environment()

inf_config = InferenceConfig(environment=env,
                             entry_script='./score.py')

deployment_config = AciWebservice.deploy_configuration(cpu_cores=1,
                                                       memory_gb=1,
                                                       auth_enabled=False,
                                                       enable_app_insights=True)

service = Model.deploy(workspace=ws, 
                       name="automl-service",
                       models=[model],
                       inference_config=inf_config,
                       deployment_config=deployment_config)

service.wait_for_deployment(show_output=True)

Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.
Running
2021-08-24 07:24:30+00:00 Creating Container Registry if not exists.
2021-08-24 07:24:30+00:00 Registering the environment.
2021-08-24 07:24:30+00:00 Use the existing image.
2021-08-24 07:24:31+00:00 Generating deployment configuration.
2021-08-24 07:24:33+00:00 Submitting deployment to compute..
2021-08-24 07:24:47+00:00 Checking the status of deployment automl-service..
2021-08-24 07:28:24+00:00 Checking the status of inference endpoint automl-service.
Succeeded
ACI service creation operation finished, operation "Succeeded"


Check the state of the deployed service and get its URIs 

In [24]:
print(f"Service State: {service.state}\n")
print(f"Scoring URI: {service.scoring_uri}\n")
print(f"Swagger URI: {service.swagger_uri}\n")

Service State: Healthy

Scoring URI: http://2afdfafb-f8cc-4ad4-84f8-648df017062a.westeurope.azurecontainer.io/score

Swagger URI: http://2afdfafb-f8cc-4ad4-84f8-648df017062a.westeurope.azurecontainer.io/swagger.json



In the cell below, send a request to the web service you deployed to test it.

In [25]:
# URL for the web service
scoring_uri = service.scoring_uri

# If the service is authenticated, set the key or token
#key = ""

# 3 sets of data to score, so we get three results back
data_df = test_df.sample(n=3)
labels = data_df.pop('DEATH_EVENT')


# Convert to JSON string
input_data = json.dumps({"data": data_df.to_dict(orient='records')})
with open("input_data.json", 'w') as _f:
    _f.write(input_data)

print(input_data)

{"data": [{"age": 63.0, "anaemia": 1, "creatinine_phosphokinase": 1767, "diabetes": 0, "ejection_fraction": 45, "high_blood_pressure": 0, "platelets": 73000.0, "serum_creatinine": 0.7, "serum_sodium": 137, "sex": 1, "smoking": 0, "time": 186}, {"age": 60.0, "anaemia": 0, "creatinine_phosphokinase": 897, "diabetes": 1, "ejection_fraction": 45, "high_blood_pressure": 0, "platelets": 297000.0, "serum_creatinine": 1.0, "serum_sodium": 133, "sex": 1, "smoking": 0, "time": 80}, {"age": 50.0, "anaemia": 1, "creatinine_phosphokinase": 1051, "diabetes": 1, "ejection_fraction": 30, "high_blood_pressure": 0, "platelets": 232000.0, "serum_creatinine": 0.7, "serum_sodium": 136, "sex": 0, "smoking": 0, "time": 246}]}


In [31]:
# Set the content type
headers = {"Content-Type": "application/json"}

# If authentication is enabled, set the authorization header
#headers["Authorization"] = f"Bearer {key}"

# Make the request and display the response
resp = requests.post(scoring_uri, input_data, headers=headers)
print(resp.json())

[0, 0, 0]
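
For services deployed with authentication enabled, the same request needs an `Authorization` header. The sketch below builds such a request with the standard library's `urllib` instead of `requests`, purely so it runs without extra packages; `build_scoring_request`, the localhost URI and the key value are hypothetical, and nothing is actually sent.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, records, key=None):
    """Build a POST request for the scoring endpoint; add a bearer key if given."""
    headers = {"Content-Type": "application/json"}
    if key:
        headers["Authorization"] = f"Bearer {key}"
    body = json.dumps({"data": records}).encode("utf-8")
    return urllib.request.Request(scoring_uri, data=body, headers=headers, method="POST")


# Placeholder URI and key, for illustration only
req = build_scoring_request(
    "http://localhost:8000/score",
    [{"age": 63.0, "anaemia": 1}],
    key="hypothetical-key",
)
print(req.get_method(), req.get_header("Authorization"))

# Sending it would look like: urllib.request.urlopen(req, timeout=30)
```

Setting an explicit timeout when sending avoids a notebook cell hanging indefinitely if the endpoint is unreachable.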


In [32]:
print(f"Predictions from Service: {resp.json()}\n")
print(f"Data Labels: {labels.tolist()}")

Predictions from Service: [0, 0, 0]

Data Labels: [0, 0, 0]


In the cell below, print the logs of the web service and delete the service.

In [30]:
print(service.get_logs())

2021-08-24T07:28:10,510071271+00:00 - gunicorn/run 
Dynamic Python package installation is disabled.
Starting HTTP server
2021-08-24T07:28:10,511791718+00:00 - rsyslog/run 
2021-08-24T07:28:10,514880282+00:00 - iot-server/run 
2021-08-24T07:28:10,515976976+00:00 - nginx/run 
rsyslogd: /azureml-envs/azureml_3489174eb648a475780c9959ff366072/lib/libuuid.so.1: no version information available (required by rsyslogd)
EdgeHubConnectionString and IOTEDGE_IOTHUBHOSTNAME are not set. Exiting...
2021-08-24T07:28:10,693107410+00:00 - iot-server/finish 1 0
2021-08-24T07:28:10,694493528+00:00 - Exit code 1 is normal. Not restarting iot-server.
Starting gunicorn 20.1.0
Listening at: http://127.0.0.1:31311 (12)
Using worker: sync
worker timeout is set to 300
Booting worker with pid: 42
SPARK_HOME not set. Skipping PySpark Initialization.
Generating new fontManager, this may take some time...
Initializing logger
2021-08-24 07:28:14,022 | root | INFO | Starting up app insights client
logging socket was 

In [34]:
service.delete()