#### This notebook focuses on Azure terminology, the available methods, and the overall process
It contains no machine learning calculations or models; any modeling code is referred to as model.py

<h3>Getting Started</h3>
1.	Setting Up Machine<br>
    a.	Go to Azure portal<br>
    b.	+ Create a resource <br>
        i.	Create a new Machine Learning resource<br>
        ii.	Workspace name<br>
        iii.	Subscription<br>
        iv.	Resource group<br>
        v.	Location<br>
    c.	Launch Azure Machine Learning studio (ml.azure.com)<br>
2.	Create compute instance<br>
<img src="./images/Picture1.png"/>

On the Compute Instances tab, if you already have a compute instance, start it; otherwise create a new compute instance with the following settings:
<ul>
    <li>Virtual Machine type: CPU</li>
    <li>Virtual Machine size: Standard_DS11_v2</li>
    <li>Compute name: enter a unique name</li>
</ul>


<img src="./images/Picture2.png"/>

Clone the ml-basics repo
1.	Launch JupyterLab
2.	Open a terminal
3.	Then run these bash commands:<br>
    `cd Users`<br>
    `git clone https://github.com/microsoftdocs/ml-basics`
4.	The files should now all be there


In a production environment, you'd typically set the minimum number of nodes value to 0 so that compute is only started when it is needed. However, compute can take a while to start, so to reduce the amount of time you spend waiting for it in this module, you've initialized it with two permanently running nodes. <br>
If you decide not to complete this module, be sure to stop your compute instance, edit the compute cluster to reset the minimum number of nodes to 0, and delete the inference cluster in order to avoid leaving your compute running and incurring unnecessary charges to your Azure subscription. Alternatively, if you're finished exploring Azure Machine Learning, delete the entire resource group in your Azure subscription.<br>

Difference between the compute types:
<br>•	<u>compute instances</u>: a managed, single-node development workstation in the cloud, used here to run notebooks and JupyterLab.
<br>•	<u>compute clusters</u>: a single-node or multi-node compute target. Jobs execute in a containerized environment, with your model dependencies packaged in a Docker container.
<br>•	<u>inference clusters</u>: used to deploy an inference pipeline as a real-time service. Inference is the phase in which the deployed model is used to make predictions.

<h4>Clean up</h4>
Always clean up when you are done: stop or delete any compute you no longer need.
This prevents unnecessary charges to your Azure subscription.

<hr>
<h3>Workspace</h3>
There are several ways to create a workspace.
<ol>
    <li>Use the Azure portal, as above.</li>
    <li>Use the Azure ML Python SDK:<br>
    <code>from azureml.core import Workspace<br>
    ws = Workspace.create(name='aml-workspace', 
                      subscription_id='123456-abc-123...',
                      resource_group='aml-resources',
                      create_resource_group=True,
                      location='eastus'
        )</code>
    </li>
    <li>Use the Azure CLI:<br><code>az ml workspace create -w 'aml-workspace' -g 'aml-resources'</code>
    <br><code>-w</code> is the workspace name and <code>-g</code> is the resource group.
    </li>
    <li>Use an Azure Resource Manager template.</li>
</ol>
<code>Workspace.create</code> takes the workspace name, the subscription ID, the resource group (and whether to create it), and the location.
<br>You can also connect to an existing workspace using a config file named `config.json` with the following properties:
<br><code>{
    "subscription_id": "1234567-abcde-890-fgh...",
    "resource_group": "aml-resources",
    "workspace_name": "aml-workspace"
}</code>
        <br>
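
A minimal sketch of the config-file route. It writes a `config.json` with the three properties above using only the standard library, and wraps the SDK's `Workspace.from_config` call in a helper that is defined but not called here, because calling it needs the `azureml-core` package and valid Azure credentials. The helper names, the temporary folder, and the placeholder values are assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path

REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def write_workspace_config(folder, subscription_id, resource_group, workspace_name):
    """Write a config.json holding the three workspace properties."""
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    path = Path(folder) / "config.json"
    path.write_text(json.dumps(config, indent=4))
    return path


def load_workspace(config_path):
    """Connect to the workspace described by config_path (needs azureml-core and Azure credentials)."""
    from azureml.core import Workspace  # imported lazily so the sketch runs without the SDK
    return Workspace.from_config(path=str(config_path))


# Placeholder values only; substitute your own subscription details.
demo_dir = tempfile.mkdtemp()
config_path = write_workspace_config(demo_dir, "<subscription-id>", "aml-resources", "aml-workspace")
loaded = json.loads(config_path.read_text())
print(sorted(loaded))
```

With the SDK installed and credentials available, `ws = load_workspace(config_path)` would then give the same kind of `ws` object used in the rest of this notebook.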

In [None]:
# Note: Workspace.get retrieves an existing workspace; Workspace.create makes a new one
from azureml.core import Workspace

ws = Workspace.get(name='practice1',
                   subscription_id='5c2f11df-87b0-4d35-8e09-fbadc191e71e',
                   resource_group='exam_practice1')

# List the compute targets attached to the workspace
for compute_name, compute in ws.compute_targets.items():
    print(compute.name, ":", compute.type)

<h3>Experiments</h3>
An experiment can be run multiple times, with different data, code, or settings.

Every experiment run produces log files. To write to the log, just `print` statements in your code. However, if you want to record named metrics for comparison across runs, use the Run object, which provides a range of logging functions specifically for this purpose. These include:
<ul>
    <li>log: Record a single named value.</li>
    <li>log_list: Record a named list of values.</li>
    <li>log_row: Record a row with multiple columns.</li>
    <li>log_table: Record a dictionary as a table.</li>
    <li>log_image: Record an image file or a plot.</li>
</ul>
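
A rough sketch of how the logging functions listed above are called. The function takes any object with the same method names as `azureml.core.Run`; the `RecordingRun` class is a hypothetical stand-in (not part of the SDK) so the sketch runs offline, and the metric names and values are made up for illustration.

```python
def log_example_metrics(run):
    """Call each logging function once on a Run-like object."""
    run.log("accuracy", 0.91)                            # single named value
    run.log_list("losses", [0.9, 0.5, 0.3])              # named list of values
    run.log_row("split", train=700, test=300)            # one row; columns are keyword arguments
    run.log_table("confusion", {"tp": [40], "fp": [5]})  # dictionary as a table
    # run.log_image("roc", plot=fig) would record a matplotlib figure;
    # it is left out here to avoid a plotting dependency.


class RecordingRun:
    """Hypothetical stand-in for Run that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def log(self, name, value):
        self.calls.append(("log", name, value))

    def log_list(self, name, value):
        self.calls.append(("log_list", name, value))

    def log_row(self, name, **kwargs):
        self.calls.append(("log_row", name, kwargs))

    def log_table(self, name, value):
        self.calls.append(("log_table", name, value))


fake_run = RecordingRun()
log_example_metrics(fake_run)
for call in fake_run.calls:
    print(call)
```

In a real experiment you would pass the `run` returned by `experiment.start_logging()` or `Run.get_context()` instead of the stand-in.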

In [None]:
# Creates an experiment, then logs the number of observations (rows) in a CSV file

from azureml.core import Experiment
import pandas as pd

# Create an Azure ML experiment in your workspace
experiment = Experiment(workspace = ws, name = 'my-experiment')

# Start logging data from the experiment
run = experiment.start_logging()

# load the dataset and count the rows
data = pd.read_csv('data.csv')
row_count = (len(data))

# Log the row count
run.log('observations', row_count)

# Complete the experiment
run.complete()

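##### Log Metrics:<br>
View the logged metrics with either of these:
<ul>
    <li><code>from azureml.widgets import RunDetails</code><br>
        <code>RunDetails(run).show()</code></li>
    <li><code>import json</code><br>
        <code>metrics = run.get_metrics()</code><br>
        <code>print(json.dumps(metrics, indent=2))</code></li>
</ul>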

##### Log Output files:<br>
An experiment can also generate output files. Often these are trained machine learning models, but you can save any sort of file and make it available as an output of your experiment run. The output files of an experiment are saved in its outputs folder.
 <br>
<br>Upload a file to the run's outputs, then list the run's files:
<ul>
    <li><code>run.upload_file(name='outputs/sample.csv', path_or_stream='./sample.csv')</code></li>
    <li><code>import json</code><br>
        <code>f = run.get_file_names()</code><br>
        <code>print(json.dumps(f, indent=2))</code></li>
</ul>

##### Script files:<br>
An experiment script is just a Python code file that contains the code you want to run in the experiment. To access the experiment run context (which is needed to log metrics), the script must import the `azureml.core.Run` class and call its `get_context` method.

In [None]:
# experiment.py file
from azureml.core import Run
import pandas as pd
import matplotlib.pyplot as plt
import os

# Get the experiment run context
run = Run.get_context()

# load the diabetes dataset
data = pd.read_csv('data.csv')

# Count the rows and log the result
row_count = (len(data))
run.log('observations', row_count)

# Save a sample of the data
os.makedirs('outputs', exist_ok=True)
data.sample(100).to_csv("outputs/sample.csv", index=False, header=True)

# Complete the run
run.complete()


To run a script as an experiment, you must define a script configuration that defines the script to be run and the Python environment in which to run it. This is implemented by using a `ScriptRunConfig` object.
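
The source directory given to `ScriptRunConfig` needs to contain the script, and a script that reads `data.csv` by relative path typically needs the data file there as well. A minimal sketch of preparing such a folder with the standard library; the folder name, the tiny stand-in CSV, and the placeholder script body are assumptions, not part of the original notebook.

```python
import os
import tempfile

# Hypothetical folder name; a temp dir keeps the sketch from touching your project.
experiment_folder = os.path.join(tempfile.mkdtemp(), "experiment_files")
os.makedirs(experiment_folder, exist_ok=True)

# Stand-in data file (in practice you would copy your own data.csv here).
with open(os.path.join(experiment_folder, "data.csv"), "w") as f:
    f.write("id,value\n1,10\n2,20\n")

# Placeholder script body; in practice this is the experiment.py file shown above.
script_path = os.path.join(experiment_folder, "experiment.py")
with open(script_path, "w") as f:
    f.write("print('experiment script placeholder')\n")

print(sorted(os.listdir(experiment_folder)))
```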

In [None]:
from azureml.core import Experiment, ScriptRunConfig

# Create a script config for the experiment.py file
# (experiment_folder is assumed to be defined already: the path of the folder containing experiment.py)
script_config = ScriptRunConfig(source_directory=experiment_folder,
                                script='experiment.py')

# submit the experiment
experiment = Experiment(workspace = ws, name = 'my-experiment')
run = experiment.submit(config=script_config)
run.wait_for_completion(show_output=True)

<hr>
<h3>Training Script</h3>
You can use a `ScriptRunConfig` to run a script-based experiment that trains a machine learning model.

<br>For example, if your training code is in a script named model.py, you point the `ScriptRunConfig` at that script and submit it as an experiment.

In [None]:
from azureml.core import Experiment, ScriptRunConfig, Environment
from azureml.core.conda_dependencies import CondaDependencies

# Create a Python environment for the experiment
sklearn_env = Environment("sklearn-env")

# Ensure the required packages are installed
packages = CondaDependencies.create(conda_packages=['scikit-learn','pip'],
                                    pip_packages=['azureml-defaults'])
sklearn_env.python.conda_dependencies = packages

# Create a script config (this is where you point to your training script, model.py)
script_config = ScriptRunConfig(source_directory='training_folder',
                                script='model.py',
                                environment=sklearn_env) 

# Submit the experiment
experiment = Experiment(workspace=ws, name='training-experiment')
run = experiment.submit(config=script_config)
run.wait_for_completion()

You can also pass arguments to the training script.

In [None]:
# Create a script config
script_config = ScriptRunConfig(source_directory='training_folder',
                                script='model.py',
                                arguments = ['--reg-rate', 0.1],
                                environment=sklearn_env)

**Question:** You want to use a script-based experiment to train a PyTorch model, setting the batch size and learning rate hyperparameters to specified values each time the experiment runs. What should you do?<br>
**Answer:** Add arguments for batch size and learning rate to the script, and set them in the `arguments` property of the `ScriptRunConfig`.
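
On the script side, values given in `arguments` arrive as ordinary command-line arguments, so the script can read them with `argparse`. A minimal sketch, assuming hypothetical argument names `--batch-size` and `--learning-rate` and made-up defaults; the original notebook does not show this part of model.py.

```python
import argparse


def parse_hyperparameters(argv=None):
    """Parse hyperparameters from the command line (argv=None reads sys.argv)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--batch-size", type=int, default=32, dest="batch_size")
    parser.add_argument("--learning-rate", type=float, default=0.01, dest="learning_rate")
    # parse_known_args tolerates extra arguments instead of exiting with an error
    args, _unknown = parser.parse_known_args(argv)
    return args


# Simulates a script config with arguments=['--batch-size', 64, '--learning-rate', 0.001]
args = parse_hyperparameters(["--batch-size", "64", "--learning-rate", "0.001"])
print(args.batch_size, args.learning_rate)
```

Inside the training script you would call `parse_hyperparameters()` with no arguments and use the returned values when building the model.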



<h4>Retrieving Model</h4>
After an experiment run has completed, you can use the run object's `get_file_names` method to list the files generated. Standard practice is for scripts that train models to save them in the run's outputs folder.
<br>
You can also use the run object's `download_file` and `download_files` methods to download output files to the local file system.

In [None]:
# "run" is a reference to a completed experiment run

# List the files generated by the experiment
for file in run.get_file_names():
    print(file)

# Download a named file
run.download_file(name='outputs/model.pkl', output_file_path='model.pkl')

<h4>Register a model</h4>
Model registration enables you to track multiple versions of a model and retrieve models for inference. Registering a model with the same name as an existing model automatically creates a new version of the model, starting with 1 and increasing in units of 1.
<br>
There are two ways to register a model: from a local file, or from a reference to the run that trained it.<br>

To register a model from a local file, you can use the `register` method of the `Model` class.

In [None]:
from azureml.core import Model

model = Model.register(workspace=ws,
                       model_name='classification_model',
                       model_path='model.pkl', # local path
                       description='A classification model',
                       tags={'data-format': 'CSV'},
                       model_framework=Model.Framework.SCIKITLEARN,
                       model_framework_version='0.20.3')

In [None]:
# Alternatively, register the model from a reference to the run that trained it
run.register_model( model_name='classification_model',
                    model_path='outputs/model.pkl', # run outputs path
                    description='A classification model',
                    tags={'data-format': 'CSV'},
                    model_framework=Model.Framework.SCIKITLEARN,
                    model_framework_version='0.20.3')

**Question:** You have run an experiment to train a model. You want the model to be stored in the workspace, and available to other experiments and published services. What should you do?<br>
**Answer:** Register the model in the workspace.

#### Viewing the Model
<code>from azureml.core import Model
<br>
for model in Model.list(ws):
    # Get model name and auto-generated version
    print(model.name, 'version:', model.version)</code>