# Automated ML

Import the dependencies needed to complete the project.

In [1]:
import azureml.core
from azureml.core.experiment import Experiment
from azureml.core.workspace import Workspace
from azureml.core.dataset import Dataset
from azureml.train.automl import AutoMLConfig
from azureml.train.automl.run import AutoMLRun

ModuleNotFoundError: No module named 'azureml.core'
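The error above means the Azure ML SDK is not installed in the active environment. As a minimal sketch, the helper below (the name `missing_modules` is hypothetical, not part of any SDK) uses only the standard library to report which required modules cannot be found before the import cell is run. The list of required names is an assumption based on the imports used in this notebook.

```python
import importlib.util

def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    missing = []
    for name in names:
        try:
            # find_spec imports parent packages, so a missing parent raises.
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing

# Assumed list, taken from the imports used in this notebook.
required = ["azureml.core", "azureml.train.automl", "sklearn", "joblib"]
print(missing_modules(required))
```

An empty list means every import in the cell above should succeed; any name printed needs to be installed first.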

## Dataset

### Overview
The National Basketball Association (NBA) Games Data Kaggle dataset (https://www.kaggle.com/datasets/nathanlauga/nba-games) covers all NBA games from 2004 to 2020. The information was collected from the NBA stats website and contains key information about the performance of each team during each game. I will use this data to train a model that identifies the key characteristics of a winning NBA team. With this data-driven approach, NBA teams can choose where to focus their strategy and practice when preparing for a successful season.

In [3]:
ws = Workspace.from_config()

# Choose a name for the experiment
experiment_name = 'nba_automl'

experiment = Experiment(ws, experiment_name)

## AutoML Configuration

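The AutoML run below is a classification task on the `HOME_TEAM_WINS` label. It uses 3-fold cross-validation, `AUC_weighted` as the primary metric, early stopping, at most 2 concurrent iterations, and an experiment timeout of 0.25 hours (15 minutes).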

In [4]:
# Read Data
dataset = Dataset.get_by_name(ws, name='nba-games-data')
df = dataset.to_pandas_dataframe()

from sklearn.model_selection import train_test_split
train_data, test_data = train_test_split(df, test_size=0.15)
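The split above is unseeded, so each execution holds out a different 15% of the rows. As a sketch of why a fixed seed matters, the stand-in below uses only the standard library rather than scikit-learn; `holdout_split` is a hypothetical helper and its rounding of the test size is an illustrative choice. In the notebook itself, passing `random_state` to `train_test_split` serves the same purpose.

```python
import random

def holdout_split(rows, test_size=0.15, seed=42):
    """Shuffle a copy of `rows` with a fixed seed and hold out a test share."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    # The first n_test shuffled rows become the test set, the rest train.
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(100))
train_a, test_a = holdout_split(rows)
train_b, test_b = holdout_split(rows)

print(len(train_a), len(test_a))
print(test_a == test_b)  # same seed, same held-out rows
```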

In [5]:
automl_settings = {
    "n_cross_validations": 3,
    "primary_metric": "AUC_weighted",
    "enable_early_stopping": True,
    "max_concurrent_iterations": 2,
    "experiment_timeout_hours": 0.25,
}

automl_config = AutoMLConfig(
    task="classification",
    debug_log="automl_errors.log",
    training_data=train_data,
    label_column_name='HOME_TEAM_WINS',
    **automl_settings)
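The `**automl_settings` syntax unpacks the dictionary into keyword arguments, so each key becomes a named parameter of the constructor. The sketch below shows this with a hypothetical stand-in function (`describe_config` is not part of the Azure ML SDK) and converts the timeout to minutes.

```python
automl_settings = {
    "n_cross_validations": 3,
    "primary_metric": "AUC_weighted",
    "enable_early_stopping": True,
    "max_concurrent_iterations": 2,
    "experiment_timeout_hours": 0.25,
}

def describe_config(task, **settings):
    """Hypothetical stand-in for AutoMLConfig: echoes the arguments it gets."""
    return {"task": task, **settings}

# Each dictionary key arrives as its own keyword argument.
config = describe_config(task="classification", **automl_settings)

# 0.25 hours expressed in minutes.
timeout_minutes = automl_settings["experiment_timeout_hours"] * 60

print(sorted(config))
print(timeout_minutes)
```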

In [6]:
remote_run = experiment.submit(automl_config)



| Experiment | Id | Type | Status | Details Page | Docs Page |
| --- | --- | --- | --- | --- | --- |
| nba_automl | AutoML_c0d4f8bc-c737-42c2-8800-8776418cc59a | automl | Preparing | Link to Azure Machine Learning studio | Link to Documentation |


2022-11-16:04:44:25,476 INFO     [explanation_client.py:334] Using default datastore for uploads


## Run Details


In [7]:
from azureml.widgets import RunDetails
RunDetails(remote_run).show()


## Best Model

Retrieve the best model from the AutoML experiment and display its metrics.



In [8]:
best_auto_run, fitted_auto_model = remote_run.get_output()
best_auto_run.get_metrics()

{'precision_score_micro': 0.8749886078923025,
 'matthews_correlation': 0.7409749307723649,
 'norm_macro_recall': 0.7378136468114699,
 'f1_score_micro': 0.8749886078923025,
 'recall_score_weighted': 0.8749886078923025,
 'recall_score_macro': 0.868906823405735,
 'AUC_weighted': 0.9495522139946031,
 'average_precision_score_micro': 0.9530815446942466,
 'precision_score_weighted': 0.874748374645893,
 'AUC_micro': 0.9515323725645869,
 'average_precision_score_weighted': 0.9519651975315454,
 'log_loss': 0.2822017167785791,
 'AUC_macro': 0.9495522139946031,
 'recall_score_micro': 0.8749886078923025,
 'f1_score_macro': 0.8703357534006483,
 'average_precision_score_macro': 0.9492868060781564,
 'accuracy': 0.8749886078923025,
 'precision_score_macro': 0.872077349708066,
 'balanced_accuracy': 0.868906823405735,
 'weighted_accuracy': 0.8806961847171122,
 'f1_score_weighted': 0.8747213241302378,
 'accuracy_table': 'aml://artifactId/ExperimentRun/dcid.AutoML_c0d4f8bc-c737-42c2-8800-8776418cc59a_26/a
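As a sanity check on the metrics above, `norm_macro_recall` can be recomputed from `recall_score_macro`. The sketch below assumes the definition given in the Azure ML documentation, (recall_score_macro - R) / (1 - R), where R is the macro recall expected from random predictions (0.5 for a binary label such as `HOME_TEAM_WINS`). The helper name is hypothetical and the values are copied from the output above.

```python
import math

# Values copied from the get_metrics() output above.
metrics = {
    "recall_score_macro": 0.868906823405735,
    "norm_macro_recall": 0.7378136468114699,
    "balanced_accuracy": 0.868906823405735,
}

def norm_macro_recall(recall_macro, n_classes=2):
    """Rescale macro recall so that random guessing maps to 0 and perfect to 1."""
    r = 1.0 / n_classes  # assumed baseline: recall of random predictions
    return (recall_macro - r) / (1.0 - r)

recomputed = norm_macro_recall(metrics["recall_score_macro"])

print(math.isclose(recomputed, metrics["norm_macro_recall"], rel_tol=1e-9))
print(metrics["balanced_accuracy"] == metrics["recall_score_macro"])
```

The second check reflects that balanced accuracy is the average of per-class recall, which is why the two values in the output are identical.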

In [9]:
import joblib
joblib.dump(fitted_auto_model, 'nba-games-model.pkl')

['nba-games-model.pkl']

In [14]:
remote_run.register_model(
    model_name='nba-games-auto',
    tags={'version':'1'}
)

Model(workspace=Workspace.create(name='quick-starts-ws-215640', subscription_id='3e42d11f-d64d-4173-af9b-12ecaa1030b3', resource_group='aml-quickstarts-215640'), name=nba-games-auto, id=nba-games-auto:1, version=1, tags={'version': '1'}, properties={})

In [11]:
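import joblib
# Unpickling the AutoML model requires the azureml AutoML packages
# to be importable in the environment where this cell runs.
model = joblib.load('../nba-games-model.pkl')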

ModuleNotFoundError: No module named 'azureml.automl'
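This error is expected when a pickled model is loaded in an environment that lacks the packages defining its classes: a pickle stores a reference to each class by module path, not the class code itself. The sketch below reproduces the behaviour with the standard library only. The module name `fake_automl_pkg` and the class `TinyModel` are hypothetical stand-ins, not Azure ML objects.

```python
import pickle
import sys
import types

# Register a throwaway module that "defines" a tiny model class.
module = types.ModuleType("fake_automl_pkg")
TinyModel = type("TinyModel", (object,), {"__module__": "fake_automl_pkg"})
module.TinyModel = TinyModel
sys.modules["fake_automl_pkg"] = module

model = TinyModel()
model.coef = 1.5
payload = pickle.dumps(model)  # stores "fake_automl_pkg.TinyModel" by name

# Simulate an environment where the defining package is not installed.
del sys.modules["fake_automl_pkg"]
try:
    pickle.loads(payload)
    error_name = None
except ModuleNotFoundError as exc:
    error_name = type(exc).__name__
print(error_name)

# Once the module is importable again, the same bytes load without error.
sys.modules["fake_automl_pkg"] = module
restored = pickle.loads(payload)
print(restored.coef)
```

The same reasoning suggests that loading `nba-games-model.pkl` locally would need the Azure ML AutoML packages installed in the local environment.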

In [None]:
type(model)

In [10]:
!pip install scikit-learn

Collecting scikit-learn
  Downloading scikit_learn-1.1.3-cp310-cp310-macosx_10_9_x86_64.whl (8.7 MB)
Collecting scipy>=1.3.2
  Downloading scipy-1.9.3-cp310-cp310-macosx_10_9_x86_64.whl (34.3 MB)
Collecting threadpoolctl>=2.0.0
  Using cached threadpoolctl-3.1.0-py3-none-any.whl (14 kB)
Installing collected packages: threadpoolctl, scipy, scikit-learn
Successfully installed scikit-learn-1.1.3 scipy-1.9.3 threadpoolctl-3.1.0
