# Train Model
#### tensorflow_p36 environment

ref:  https://github.com/aws-samples/amazon-sagemaker-script-mode/blob/master/tf-eager-script-mode/tf-eager-sm-scriptmode.ipynb

Note: AWS tutorials tend to call the data used to evaluate the model after training 'test'. Most books call it 'val' (validation) or 'eval' (model evaluation). This notebook uses 'val', so wherever the AWS example says 'test', read 'val' here.

## Step 3 - SageMaker HOSTED Training
At this point, you have a working training script (train.py), so you can have SageMaker run it on hosted (not local) training instances.  




In [None]:
import sagemaker
from sagemaker.tensorflow import TensorFlow

## Data
SageMaker will pull the data from S3. This is much faster than putting it in your Docker image. However, it is somewhat confusing because the MobileNet software (and utilities) look for the data path in the config file. We need to merge the two approaches:
- allow SageMaker to pull from S3
- AND, we want to continue leveraging the config design pattern
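
One way to reconcile the two, sketched here under assumptions: in script mode SageMaker copies each input channel into the training container and exposes its local directory in an environment variable named `SM_CHANNEL_<NAME>`, so channels called `train` and `val` become `SM_CHANNEL_TRAIN` and `SM_CHANNEL_VAL`. The training script could read those variables and write the paths into the config text before training starts. The placeholder tokens `PATH_TO_TRAIN_DATA` and `PATH_TO_VAL_DATA`, the `*.tfrecord` pattern, and the helper name `fill_data_paths` are hypothetical; they are not taken from the actual train.py or config file.

```python
import os

# Hypothetical placeholder tokens, assumed to appear in the pipeline config
# wherever a data directory is expected, mapped to the channel variables.
PLACEHOLDERS = {
    "PATH_TO_TRAIN_DATA": "SM_CHANNEL_TRAIN",
    "PATH_TO_VAL_DATA": "SM_CHANNEL_VAL",
}


def fill_data_paths(config_text, environ=None):
    """Return config_text with each placeholder replaced by its channel directory."""
    environ = os.environ if environ is None else environ
    for token, env_name in PLACEHOLDERS.items():
        if env_name not in environ:
            raise KeyError("missing environment variable: " + env_name)
        config_text = config_text.replace(token, environ[env_name])
    return config_text


# Example with a fake environment, so it also runs outside SageMaker.
fake_env = {
    "SM_CHANNEL_TRAIN": "/opt/ml/input/data/train",
    "SM_CHANNEL_VAL": "/opt/ml/input/data/val",
}
template = 'input_path: "PATH_TO_TRAIN_DATA/*.tfrecord"'
print(fill_data_paths(template, fake_env))
```

The config file stays the single place where paths are declared, and only the directory part changes between a local run and a hosted run.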

The other challenge is working with tarballs versus tfrecord files.

In [None]:
s3_prefix = 'cfaanalyticsresearch-sagemaker'

traindata_s3_prefix = '{}/datasets/cfa_products/train'.format(s3_prefix)
valdata_s3_prefix = '{}/datasets/cfa_products/val'.format(s3_prefix)
print(traindata_s3_prefix)
print(valdata_s3_prefix)

## TIP
It is wise to check that your path is valid before continuing!  
Copy the printed value and paste it into the following form. You can run these AWS CLI commands in a new cell.  

! aws s3 ls s3://cfaanalyticsresearch-sagemaker/datasets/cfa_products/train/  
! aws s3 ls s3://cfaanalyticsresearch-sagemaker/datasets/cfa_products/val/
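
The same check can be done from Python. The sketch below assumes a boto3 S3 client is passed in; `list_objects_v2` and the `KeyCount` field of its response are part of boto3, while the helper names `split_s3_uri` and `prefix_has_objects` are made up for illustration.

```python
def split_s3_uri(uri):
    """Split 's3://bucket/some/prefix/' into ('bucket', 'some/prefix/')."""
    if not uri.startswith("s3://"):
        raise ValueError("not an S3 URI: " + uri)
    bucket, _, prefix = uri[len("s3://"):].partition("/")
    return bucket, prefix


def prefix_has_objects(s3_client, uri):
    """Return True if at least one object exists under the given S3 URI."""
    bucket, prefix = split_s3_uri(uri)
    # MaxKeys=1 keeps the request cheap: we only need to know whether anything is there.
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)
    return response.get("KeyCount", 0) > 0


# Usage inside the notebook (needs AWS credentials, so it is left commented out):
# import boto3
# s3 = boto3.client("s3")
# print(prefix_has_objects(s3, "s3://cfaanalyticsresearch-sagemaker/datasets/cfa_products/train/"))
```

A `False` result means the prefix is empty or misspelled, which is worth catching before paying for a training instance.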

### Copy data from local (SageMaker instance) to S3
If you ran the TrainModel_Step1 notebook, the data was moved to:
- code/tfrecords/train 
- code/tfrecords/val

In [None]:
! pwd

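# Reuse one session for both uploads.
# Note: upload_data() writes to the session's default bucket unless bucket= is given;
# key_prefix is only the key path inside that bucket.
session = sagemaker.Session()
train_s3 = session.upload_data(path='./code/tfrecords/train/', key_prefix=traindata_s3_prefix)
val_s3 = session.upload_data(path='./code/tfrecords/val/', key_prefix=valdata_s3_prefix)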

inputs = {'train':train_s3, 'val': val_s3}

print(inputs)

In [None]:
model_dir = '/opt/ml/model'     # path inside the SageMaker training container (Docker)
                                # this is a SAGEMAKER convention - don't confuse it with the model_dir
                                # that we have inside our code
# ml.p2.xlarge  ~ $1/hr (approximate)
# ml.p3.2xlarge ~ $3/hr (approximate)
# this is a short, well-controlled training run, so the faster instance makes sense
# if you are still developing, use the p2
train_instance_type = 'ml.p3.2xlarge'

hyperparameters = {'pipeline_config_path' : 'sagemaker_mobilenet_v1_ssd_retrain.config',
                   'num_train_steps' : '502',
                   'num_eval_steps' : '10'
                  }

# SageMaker Execution Role
role = sagemaker.get_execution_role()
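
For reference, in script mode SageMaker passes each hyperparameter to the entry point as a command-line argument (for example `--num_train_steps 502`), along with `--model_dir`. The following is a minimal sketch of how train.py might parse them. The actual train.py is not shown in this notebook, so the function name `parse_args` and the default values are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the hyperparameters that SageMaker passes as command-line arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--pipeline_config_path", type=str, required=True)
    # Values arrive as strings; type=int converts them. Defaults are assumptions.
    parser.add_argument("--num_train_steps", type=int, default=500)
    parser.add_argument("--num_eval_steps", type=int, default=10)
    parser.add_argument("--model_dir", type=str, default="/opt/ml/model")
    # parse_known_args tolerates extra arguments that this sketch does not declare.
    args, _unknown = parser.parse_known_args(argv)
    return args


# Simulate the arguments built from the hyperparameters dict in this notebook.
args = parse_args([
    "--pipeline_config_path", "sagemaker_mobilenet_v1_ssd_retrain.config",
    "--num_train_steps", "502",
    "--num_eval_steps", "10",
])
print(args.pipeline_config_path, args.num_train_steps, args.num_eval_steps)
```

This is why the hyperparameter values can be given as strings in the dict: they are serialized onto the command line either way, and the script decides their types.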

In [None]:
estimator = TensorFlow(entry_point='train.py',
                       source_dir='code',
                       model_dir=model_dir,
                       train_instance_type=train_instance_type,
                       train_instance_count=1,
                       hyperparameters=hyperparameters,
                       role=role,
                       base_job_name='cfa-products-mobilenet-v1-ssd',  # job names allow only letters, digits and hyphens (no underscores)
                       framework_version='1.13',
                       py_version='py3',
                       script_mode=True)

In [None]:
estimator.fit(inputs)

In [None]:
##
# data location
# config files
# job name