# Training the Autoencoder

Because SageMaker has native support for TensorFlow, training the autoencoder is simple: we only need to provide the training script and point a [TensorFlow Estimator](https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/using_tf.html#train-a-model-with-tensorflow) to the script and the data. TensorFlow (and all its dependencies) does not even have to be available to the notebook itself, because it is only used inside the script.

## Script

The script is an edited version of the code created in the original notebook. The main difference is that the hyperparameters (and any other settings) are passed to it on the command line and parsed with the `argparse` module (any other CLI argument parser would work as well).

Another slight difference (also handled through parameters and environment variables) is that SageMaker defines specific directories where the data is made available to the script and where the trained model should be saved. As a result, the script doesn't need to know about S3 or any data movement outside its environment; it is completely self-contained.
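As a rough illustration of the directory handling described above, the sketch below resolves both locations from the environment. It assumes the `SM_CHANNEL_TRAIN` and `SM_MODEL_DIR` environment variables that SageMaker's framework containers set. The function name `parse_directories`, the argument names `--train` and `--model-output-dir`, and the local fallback paths are hypothetical and are not taken from `train_ae.py`.

```python
import argparse
import os


def parse_directories(argv=None, environ=None):
    """Resolve the data and model directories from CLI arguments or the environment."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser()
    # SM_CHANNEL_TRAIN points at the local copy of the "train" input channel.
    # The fallback "data/train" is a hypothetical path for local runs.
    parser.add_argument("--train", type=str,
                        default=environ.get("SM_CHANNEL_TRAIN", "data/train"))
    # SM_MODEL_DIR is the directory whose contents are uploaded after training.
    # The fallback "model" is a hypothetical path for local runs.
    parser.add_argument("--model-output-dir", type=str,
                        default=environ.get("SM_MODEL_DIR", "model"))
    # parse_known_args ignores any arguments this sketch does not declare.
    args, _ = parser.parse_known_args(argv)
    return args
```

With defaults taken from the environment, the same script can run unchanged inside the training container and on a local machine, where the variables are simply absent and the fallbacks apply.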

In [1]:
!pygmentize src/train_ae.py

import os
import sagemaker
import numpy as np
import argparse
from tensorflow.keras import layers, Input, models
from pathlib import Path
import logging


def parse_arguments():
    parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument("--num-epochs", type=int, default=15)
    parser.add_argument("--batch-size", type=int, default=1024)
    parse

## Run the Training

Because this is a training job, there are some additional parameters to pass to it. This example only defines the number of epochs and the batch size, but any hyperparameter could be passed the same way. [SageMaker Hyperparameter Optimization](https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html) (SageMaker Automatic Model Tuning) uses the same mechanism to optimize trained models over their possible hyperparameter ranges.
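To make the mechanism concrete, here is a minimal sketch of how a hyperparameter dictionary ends up as command-line arguments that the script's parser can read. The helper `to_cli_args` is hypothetical and only approximates what the training container does; it is not a SageMaker API. The parser mirrors the two arguments visible in the script, and the values 5 and 256 are arbitrary.

```python
import argparse


def to_cli_args(hyperparameters):
    """Flatten a hyperparameter dict into ["--name", "value", ...] (hypothetical helper)."""
    argv = []
    for name, value in hyperparameters.items():
        argv.extend([f"--{name}", str(value)])
    return argv


def parse_arguments(argv=None):
    """Parse the two hyperparameters used in this example."""
    parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument("--num-epochs", type=int, default=15)
    parser.add_argument("--batch-size", type=int, default=1024)
    # Ignore any extra arguments that this sketch does not declare.
    args, _ = parser.parse_known_args(argv)
    return args


argv = to_cli_args({"num-epochs": 5, "batch-size": 256})
args = parse_arguments(argv)
print(argv)
print(args.num_epochs, args.batch_size)
```

Note that `argparse` converts the dashes in the option names to underscores, so `--num-epochs` is read back as `args.num_epochs`.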

## Input Data Parameters

In [4]:
bucket = "sagemaker-us-east-1-160951647621"
data_path = "wafer-data-processing-2020-10-04-21-48-20-207/output/autoencoder/train"

inputs = {
    "train": f"s3://{bucket}/{data_path}"
}
print(inputs)

{'train': 's3://sagemaker-us-east-1-160951647621/wafer-data-processing-2020-10-04-21-48-20-207/output/autoencoder/train'}


## Instance and Hyperparameters

In [5]:
train_instance_type = "ml.p3.2xlarge"
hyperparameters = {
    #"max-rows": 1024,
    "num-epochs": 15,
    "batch-size": 1024,
}

## Estimator Creation

In [6]:
import sagemaker
from sagemaker.tensorflow import TensorFlow

estimator = TensorFlow(entry_point='src/train_ae.py',
                       base_job_name='train-autoencoder',
                       train_instance_type=train_instance_type,
                       train_instance_count=1,
                       hyperparameters=hyperparameters,
                       role=sagemaker.get_execution_role(), # Passes to the container the AWS role that you are using on this notebook
                       framework_version='2.1.0',
                       py_version='py3',
                       script_mode=True)

## Model Training

In [7]:
estimator.fit(inputs)

's3_input' class will be renamed to 'TrainingInput' in SageMaker Python SDK v2.
'create_image_uri' will be deprecated in favor of 'ImageURIProvider' class in SageMaker Python SDK v2.


2020-10-05 21:46:59 Starting - Starting the training job...
2020-10-05 21:47:03 Starting - Launching requested ML instances......
2020-10-05 21:48:08 Starting - Preparing the instances for training......
2020-10-05 21:49:16 Downloading - Downloading input data
2020-10-05 21:49:16 Training - Downloading the training image.........
2020-10-05 21:50:48 Training - Training image download completed. Training in progress.
2020-10-05 21:50:52,198 sagemaker-containers INFO     Imported framework sagemaker_tensorflow_container.training
2020-10-05 21:50:52,630 sagemaker-containers INFO     Invoking user script

Training Env:

{
    "additional_framework_parameters": {},
    "channel_input_dirs": {
        "train": "/opt/ml/input/data/train"
    },
    "current_host": "algo-1",
    "framework_module": "sagemaker_tensorflow_container.training:main",
    "hosts": [
        "algo-1"
    ],
    "hyperparameters": {
        "batch-size": 1024,
        "num-epochs": 15,
 

The log message "Your model will NOT be servable with SageMaker TensorFlow Serving container" is not a concern here. We saved the models as Keras `.h5` files, and we'll use them for data processing, not for inference. SageMaker compresses the saved models into a `model.tar.gz` archive and makes it available at this location:

In [8]:
estimator.model_data

's3://sagemaker-us-east-1-160951647621/train-autoencoder-2020-10-05-21-46-58-914/output/model.tar.gz'
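Once the archive has been copied from S3 to local disk (the download itself is not shown here), the `.h5` files can be unpacked with the standard library. The sketch below is one possible way to do that. The function name `extract_model_archive` and the file names `encoder.h5` and `decoder.h5` are hypothetical, since the actual names depend on how `train_ae.py` saves the models. The demonstration builds a stand-in archive with placeholder bytes so that it runs without AWS access.

```python
import tarfile
import tempfile
from pathlib import Path


def extract_model_archive(archive_path, target_dir):
    """Unpack a model.tar.gz archive and return the extracted .h5 paths, sorted."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(target)
    return sorted(target.rglob("*.h5"))


# Self-contained demonstration using a stand-in archive, not real model files.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    file_names = ("encoder.h5", "decoder.h5")  # hypothetical file names
    for name in file_names:
        (tmp / name).write_bytes(b"placeholder")
    archive_path = tmp / "model.tar.gz"
    with tarfile.open(archive_path, "w:gz") as archive:
        for name in file_names:
            archive.add(tmp / name, arcname=name)
    extracted = extract_model_archive(archive_path, tmp / "unpacked")
    extracted_names = [path.name for path in extracted]
    print(extracted_names)
```

Real `.h5` files extracted this way could then be loaded with `tensorflow.keras.models.load_model`, assuming TensorFlow is installed where the data processing runs.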

With the encoder and decoder ready for use, we can proceed to [data augmentation](data_aug.ipynb).