# Automated ML

TODO: Import Dependencies. In the cell below, import all the dependencies that you will need to complete the project.

In [1]:
import azureml.core
from azureml.core.experiment import Experiment
from azureml.core.workspace import Workspace
from azureml.core.dataset import Dataset
from azureml.train.automl import AutoMLConfig
from azureml.train.automl.run import AutoMLRun

## Dataset

### Overview
The National Basketball Association (NBA) Games Data Kaggle dataset (https://www.kaggle.com/datasets/nathanlauga/nba-games) covers all NBA games from 2004 to 2020. The information was collected from the NBA stats website and contains key information about the performance of each team during each game. I will use this data to train a model that learns which characteristics matter most when an NBA team wins. With this data-driven approach, NBA teams can choose which areas to focus on in strategy and practice when preparing for a successful season.

In [2]:
ws = Workspace.from_config()

# choose a name for experiment
experiment_name = 'nba_automl'

experiment = Experiment(ws, experiment_name)

## AutoML Configuration

TODO: Explain why you chose the automl settings and configuration you used below.

In [3]:
# Read Data
dataset = Dataset.get_by_name(ws, name='nba-games-data')
df = dataset.to_pandas_dataframe()

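from sklearn.model_selection import train_test_split

# Hold out 15% of the rows as a test set; a fixed seed keeps the split reproducible
train_data, test_data = train_test_split(df, test_size=0.15, random_state=42)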
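
The split above is purely random, so the share of home wins can drift slightly between the training and test sets. As a minimal, dependency-free sketch of what a stratified split does instead, the hypothetical helper `stratified_split` below shuffles each label group separately and takes the same fraction from each. The toy rows and the seed are illustrative assumptions, not values from the real dataset.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_key, test_size=0.15, seed=42):
    """Split rows into (train, test), keeping the label ratio similar in both."""
    rng = random.Random(seed)  # fixed seed makes the split repeatable
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    train, test = [], []
    for label_rows in by_label.values():
        rng.shuffle(label_rows)
        n_test = round(len(label_rows) * test_size)
        test.extend(label_rows[:n_test])
        train.extend(label_rows[n_test:])
    return train, test


# Toy rows standing in for the NBA games table (values are made up).
games = [{"GAME_ID": i, "HOME_TEAM_WINS": int(i % 5 < 3)} for i in range(100)]
train_rows, test_rows = stratified_split(games, "HOME_TEAM_WINS")
```

In the notebook itself, the same effect should be achievable by passing `stratify=df['HOME_TEAM_WINS']` to `train_test_split`.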

In [4]:
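automl_settings = {
    "n_cross_validations": 3,          # 3-fold cross-validation
    "primary_metric": "AUC_weighted",  # metric used to rank candidate models
    "enable_early_stopping": True,     # stop early if the score stops improving
    "max_concurrent_iterations": 2,    # run at most 2 iterations in parallel
    "experiment_timeout_hours": 0.25,  # cap the experiment at 15 minutes
}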

automl_config = AutoMLConfig(
    task="classification",
    debug_log="automl_errors.log",
    training_data=train_data,
    label_column_name="HOME_TEAM_WINS",
    **automl_settings,
)
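
To make the `n_cross_validations` setting concrete, here is a rough sketch of how three-fold cross-validation partitions row indices: each fold serves once as the validation set while the remaining rows are used for training. The helper `k_fold_indices` is hypothetical, and AutoML's own fold assignment may differ (for example, by shuffling or stratifying the rows first).

```python
def k_fold_indices(n_rows, k=3):
    """Yield (train_idx, valid_idx) pairs for k contiguous, near-equal folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        valid_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_rows))
        yield train_idx, valid_idx
        start += size


folds = list(k_fold_indices(10, k=3))
for train_idx, valid_idx in folds:
    print(len(train_idx), "training rows,", len(valid_idx), "validation rows")
```

Every row is validated exactly once, so the averaged score uses all of the training data without ever scoring a model on rows it was fitted on.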

In [5]:
remote_run = experiment.submit(automl_config)

| Experiment | Id | Type | Status | Details Page | Docs Page |
| --- | --- | --- | --- | --- | --- |
| nba_automl | AutoML_a8c66d27-3f6c-488f-8d6f-7df5558d8ff3 | automl | Preparing | Link to Azure Machine Learning studio | Link to Documentation |

## Run Details


In [6]:
from azureml.widgets import RunDetails
RunDetails(remote_run).show()

## Best Model

TODO: In the cell below, get the best model from the automl experiments and display all the properties of the model.



In [7]:
best_auto_run, fitted_auto_model = remote_run.get_output()
best_auto_run.get_metrics()
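
The metrics returned above should include the primary metric, `AUC_weighted`. As I understand it, this is the per-class ROC AUC averaged with weights proportional to class frequency; for a binary label such as `HOME_TEAM_WINS`, that reduces to the ordinary ROC AUC. The sketch below is not AutoML's implementation. It is a small illustration of what ROC AUC measures: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The function name `roc_auc` and the toy inputs are assumptions for illustration.

```python
def roc_auc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Compare every positive score with every negative score.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predicted win probabilities (made up for illustration).
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is why the metric is less sensitive to the home/away win imbalance than plain accuracy.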

In [None]:
import joblib
joblib.dump(fitted_auto_model, 'nba-games-model.pkl')