## Model Training
### Train the PyTorch Deep Learning regression model with Azure ML service on a remote Azure ML Compute resource

#### <font color='red'> Before you begin: download the training dataset from Kaggle and save it in the "data" folder as "train.csv". You need to log in to Kaggle to download the dataset. </font>

#### Set up diagnostics collection

In [1]:
from azureml.telemetry import set_diagnostics_collection

set_diagnostics_collection(send_diagnostics=True)

Turning diagnostics collection on. 


#### Initialize the Azure ML Workspace

In [2]:
from azureml.core.workspace import Workspace

ws = Workspace.from_config()
print("Workspace name: " + ws.name,
      "Azure region: " + ws.location,
      "Resource group: " + ws.resource_group, sep="\n")

Found the config file in: C:\AI+ Tour Tutorials\Azure ML service\housing\AzureML\aml_config\config.json
Workspace name: ML-Service-Workspace
Azure region: eastus
Resource group: ML-Service-RG
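
If `Workspace.from_config()` cannot load the workspace, a quick check of the config file can help. The sketch below assumes the usual `config.json` layout with the keys `subscription_id`, `resource_group` and `workspace_name`; the helper name `missing_config_keys` and the placeholder subscription value are made up for illustration.

```python
import json

# Keys that a workspace config.json is assumed to contain
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def missing_config_keys(config_text):
    """Return the required keys that are absent or empty in the JSON text."""
    config = json.loads(config_text)
    return [key for key in REQUIRED_KEYS if not config.get(key)]


# Example config text; the subscription id is a placeholder, not a real value
example_config = json.dumps({
    "subscription_id": "<your-subscription-id>",
    "resource_group": "ML-Service-RG",
    "workspace_name": "ML-Service-Workspace",
})

print(missing_config_keys(example_config))
```

To check a real file, read its contents with `open(...).read()` and pass the text to `missing_config_keys`.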


#### Attach your compute target

In [3]:
from azureml.core.compute import ComputeTarget

cluster_name = "gpucluster"
compute_target = ComputeTarget(workspace=ws, name=cluster_name)

print(compute_target.status.serialize())

{'allocationState': 'Steady', 'allocationStateTransitionTime': '2019-02-15T13:47:12.310000+00:00', 'creationTime': '2019-02-15T13:47:08.984315+00:00', 'currentNodeCount': 0, 'errors': None, 'modifiedTime': '2019-02-15T13:47:26.304081+00:00', 'nodeStateCounts': {'idleNodeCount': 0, 'leavingNodeCount': 0, 'preemptedNodeCount': 0, 'preparingNodeCount': 0, 'runningNodeCount': 0, 'unusableNodeCount': 0}, 'provisioningState': 'Succeeded', 'provisioningStateTransitionTime': None, 'scaleSettings': {'minNodeCount': 0, 'maxNodeCount': 4, 'nodeIdleTimeBeforeScaleDown': 'PT120S'}, 'targetNodeCount': 0, 'vmPriority': 'Dedicated', 'vmSize': 'STANDARD_NC6'}
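
The serialized status is a plain dictionary, so it can be condensed into something easier to read. The following is a minimal sketch that uses only the fields visible in the output above; the helper names `idle_timeout_seconds` and `summarize_cluster` are illustrative and not part of the Azure ML SDK.

```python
import re


def idle_timeout_seconds(duration):
    """Convert a simple ISO 8601 duration such as 'PT120S' or 'PT2M' to seconds."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?", duration)
    if match is None or not any(match.groups()):
        raise ValueError(f"Unsupported duration: {duration!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def summarize_cluster(status):
    """Build a one-line summary from a serialized compute status dictionary."""
    scale = status["scaleSettings"]
    idle = idle_timeout_seconds(scale["nodeIdleTimeBeforeScaleDown"])
    return (f"{status['vmSize']} ({status['vmPriority']}): "
            f"{status['currentNodeCount']} of {scale['maxNodeCount']} nodes up, "
            f"scales down after {idle}s idle")


# Subset of the status dictionary printed above
status = {
    "currentNodeCount": 0,
    "scaleSettings": {"minNodeCount": 0, "maxNodeCount": 4,
                      "nodeIdleTimeBeforeScaleDown": "PT120S"},
    "vmPriority": "Dedicated",
    "vmSize": "STANDARD_NC6",
}

print(summarize_cluster(status))
```

In the notebook, `summarize_cluster(compute_target.status.serialize())` should work the same way, provided the dictionary keeps the fields shown above.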


#### Specify the training script folder

In [4]:
script_folder = "./script"

#### Create an Experiment in your Workspace to track the training runs

In [5]:
from azureml.core import Experiment

experiment_name = "pytorch-dl-regression"
experiment = Experiment(ws, name=experiment_name)

#### Upload data to the cloud

In [6]:
ds = ws.get_default_datastore()
print(ds.datastore_type, ds.account_name, ds.container_name)

ds.upload(src_dir="../data", target_path="pytorch-dl-regression", overwrite=True, show_progress=True)

AzureBlob mlservicstoragevqkhmalr azureml-blobstore-03a77933-b9d0-4918-bd23-4f23d00afafb
Uploading ../data\train.csv
Uploaded ../data\train.csv, 1 files out of an estimated total of 1


$AZUREML_DATAREFERENCE_e2775e6b2b5740878e5a2b7612f6803f

#### Create a Run Configuration or an Estimator to submit training jobs to your compute target. Here we create an Estimator that is specific to PyTorch

In [7]:
from azureml.train.dnn import PyTorch

script_params = {
    "--data-folder": ds.as_mount(),
    "--num_hidden_layers": 1,
    "--hidden_layer_size": 256,
    "--dropout_rate": 0.1,
    "--learning_rate": 0.005
}

estimator = PyTorch(source_directory=script_folder,
                    script_params=script_params,
                    compute_target=compute_target,
                    entry_script="train_model.py",
                    use_gpu=True,
                    conda_packages=["scikit-learn", "pandas"]
                    )
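
The entries in `script_params` reach `train_model.py` as command-line arguments. The script itself is not shown in this notebook, so the sketch below is only a guess at how it might read them: the function names, the defaults, the `/mnt/datastore` placeholder and the assumption that `train.csv` sits under the `pytorch-dl-regression` upload path are all illustrative.

```python
import argparse
import posixpath


def parse_training_args(argv):
    """Parse the flags that the estimator's script_params pass to the script."""
    parser = argparse.ArgumentParser(description="Hypothetical CLI for train_model.py")
    parser.add_argument("--data-folder", dest="data_folder", required=True,
                        help="Path where the datastore is mounted on the compute node")
    parser.add_argument("--num_hidden_layers", type=int, default=1)
    parser.add_argument("--hidden_layer_size", type=int, default=256)
    parser.add_argument("--dropout_rate", type=float, default=0.1)
    parser.add_argument("--learning_rate", type=float, default=0.005)
    return parser.parse_args(argv)


def train_csv_path(data_folder, target_path="pytorch-dl-regression"):
    """Assumed location of train.csv: mount point + upload target path."""
    return posixpath.join(data_folder, target_path, "train.csv")


# Explicit argument list for illustration; "/mnt/datastore" is a placeholder mount path
args = parse_training_args([
    "--data-folder", "/mnt/datastore",
    "--num_hidden_layers", "1",
    "--hidden_layer_size", "256",
    "--dropout_rate", "0.1",
    "--learning_rate", "0.005",
])

print(args)
print(train_csv_path(args.data_folder))
```

Inside a real training script, the call would typically be `parse_training_args(sys.argv[1:])`, with the mount path filled in by the service at run time.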

#### Submit your training job

In [8]:
run = experiment.submit(estimator)
print(run)

Run(Experiment: pytorch-dl-regression,
Id: pytorch-dl-regression_1550239454_4cf766ba,
Type: azureml.scriptrun,
Status: Starting)


#### Get more details about your run

In [9]:
print(run.get_details())

{'runId': 'pytorch-dl-regression_1550239454_4cf766ba', 'target': 'gpucluster', 'status': 'Starting', 'properties': {'azureml.runsource': 'experiment', 'ContentSnapshotId': 'd929bd20-3ecf-4ecd-b161-91b249ccecd2'}, 'runDefinition': {'Script': 'train_model.py', 'Arguments': ['--data-folder', '$AZUREML_DATAREFERENCE_workspaceblobstore', '--num_hidden_layers', '1', '--hidden_layer_size', '256', '--dropout_rate', '0.1', '--learning_rate', '0.005'], 'SourceDirectoryDataStore': None, 'Framework': 0, 'Communicator': 0, 'Target': 'gpucluster', 'DataReferences': {'workspaceblobstore': {'DataStoreName': 'workspaceblobstore', 'Mode': 'Mount', 'PathOnDataStore': None, 'PathOnCompute': None, 'Overwrite': False}}, 'JobName': None, 'AutoPrepareEnvironment': True, 'MaxRunDurationSeconds': None, 'NodeCount': 1, 'Environment': {'Python': {'InterpreterPath': 'python', 'UserManagedDependencies': False, 'CondaDependencies': {'name': 'project_environment', 'dependencies': ['python=3.6.2', {'pip': ['azureml-de
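
The details dictionary is long, so it can be handy to pull out just the submitted arguments. A small sketch, assuming the `Arguments` list alternates flag and value as in the output above; `arguments_to_dict` is an illustrative helper, not an SDK function.

```python
def arguments_to_dict(arguments):
    """Turn ['--flag', 'value', ...] into {'--flag': 'value', ...}."""
    if len(arguments) % 2 != 0:
        raise ValueError("Expected alternating flag/value pairs")
    return dict(zip(arguments[0::2], arguments[1::2]))


# Arguments list copied from the run details shown above
arguments = [
    "--data-folder", "$AZUREML_DATAREFERENCE_workspaceblobstore",
    "--num_hidden_layers", "1",
    "--hidden_layer_size", "256",
    "--dropout_rate", "0.1",
    "--learning_rate", "0.005",
]

submitted = arguments_to_dict(arguments)
for flag, value in submitted.items():
    print(flag, "=", value)
```

In the notebook, the list is available as `run.get_details()["runDefinition"]["Arguments"]`, going by the structure printed above.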

#### Monitor your job

In [10]:
from azureml.widgets import RunDetails

RunDetails(run).show()

_UserRunWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': True, 'log_level': 'INFO', 's…
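
Where the widget cannot render (for example, outside a notebook), polling the run status is a simple alternative. This is a rough sketch rather than the SDK's own mechanism: `wait_for_terminal_status` is a hypothetical helper, and the set of terminal states is an assumption based on the commonly reported statuses `Completed`, `Failed` and `Canceled`. A stand-in status source replaces the real run so the example works offline.

```python
import time

# Assumed terminal run statuses
TERMINAL_STATES = {"Completed", "Failed", "Canceled"}


def wait_for_terminal_status(get_status, poll_seconds=30.0,
                             timeout_seconds=3600.0, sleep=time.sleep):
    """Call get_status() until it returns a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Run still in state {status!r} after {timeout_seconds}s")
        sleep(poll_seconds)


# Stand-in for a real run: yields a fixed sequence of statuses
fake_statuses = iter(["Starting", "Queued", "Running", "Completed"])
final_status = wait_for_terminal_status(lambda: next(fake_statuses), poll_seconds=0)
print(final_status)
```

With a real run, `wait_for_terminal_status(run.get_status)` would poll the service instead. The SDK's `run.wait_for_completion(show_output=True)` is likely the more convenient choice when simply blocking until the run ends is enough.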

#### Cancel your job while it is still running, if needed

In [11]:
# run.cancel()