# AWS SageMaker Training and Deploying
## Cyclone Kenneth 2019-04-25
### Part II

In this Part II notebook, we upload the training data generated in the previous notebook to AWS S3, kick off an AWS SageMaker object detection training job, and monitor the results. By the end of this notebook, you will have trained your own OSM-based CNN object detector!

![](assets/happycloud.png)



A couple of things worth noting:

🤔 ML models are not super useful unless they are scaled across a large amount of data

🤔 To effectively scale across data, you need to be efficient

🤔 Because we will be passing sensitive data to this notebook in order to scale our cloud compute through SageMaker, we will use papermill to run this notebook from within Python. Papermill wraps the notebook so that we can supply variables at run time.

e.g.

``` python
import papermill as pm
pm.execute_notebook(
    'osm_ml_training_pt2.ipynb',
    'osm_ml_training_pt2_out.ipynb',
    parameters=dict(sage_bucket='', my_bucket='', role=''),
)
```

In [9]:
import sagemaker
from sagemaker import get_execution_role
from sagemaker.amazon.amazon_estimator import get_image_uri

We will use 'papermill' (https://github.com/nteract/papermill) to pass sensitive variables to this Jupyter notebook. Things like passwords, cloud locations, etc., should be parameterized as a best practice and never stored in a repo (especially a public-facing one).
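
One way to keep those values out of the notebook is to read them from environment variables in the calling script and hand them to papermill. The sketch below is only an illustration: the environment variable names (`SAGE_BUCKET`, `MY_BUCKET`, `SAGEMAKER_ROLE`) and the helpers `load_parameters` and `run_notebook` are hypothetical; only the parameter names `sage_bucket`, `my_bucket`, and `role` come from this notebook.

```python
import os

# Hypothetical environment variable names, mapped to this notebook's parameters.
ENV_TO_PARAM = {
    "SAGE_BUCKET": "sage_bucket",
    "MY_BUCKET": "my_bucket",
    "SAGEMAKER_ROLE": "role",
}


def load_parameters(env=None):
    """Build the papermill parameters dict from environment variables.

    Raises KeyError naming every missing or empty variable, so a run never
    starts with blank values.
    """
    env = os.environ if env is None else env
    missing = [name for name in ENV_TO_PARAM if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(sorted(missing)))
    return {param: env[name] for name, param in ENV_TO_PARAM.items()}


def run_notebook(env=None):
    """Execute this notebook with parameters taken from the environment."""
    import papermill as pm  # imported lazily so load_parameters works without it

    pm.execute_notebook(
        "osm_ml_training_pt2.ipynb",
        "osm_ml_training_pt2_out.ipynb",
        parameters=load_parameters(env),
    )
```

Keep in mind that papermill writes the injected values into the output notebook, so `osm_ml_training_pt2_out.ipynb` should stay out of version control too.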

In [2]:
sage_bucket = ''          # S3 bucket that holds the training data and model output
my_bucket = ''            # key prefix inside sage_bucket for this project
prefix = my_bucket        # this is your model prefix
sessname = ''
nclass = 1
epochs = 2
mini_batch_size = 2
lr = 0.001
lr_scheduler_factor = 0.1
momentum = 0.9
weight_decay = 0.0005
overlap = 0.5
nms_thresh = 0.45
image_shape = 256
label_width = 150
n_train_samples = 16551
network = 'resnet-50'
optim = 'sgd'
role = ''

In [10]:
import boto3
s3 = boto3.client('s3')

s3.upload_file('rec/val.rec', sage_bucket, my_bucket+'/validation/val.rec')
s3.upload_file('rec/train.rec', sage_bucket, my_bucket+'/train/train.rec')

In [11]:
sess = sagemaker.Session()
bucket = sess.default_bucket()
#sage_bucket='sagemaker-us-east-2-771575179338'
#my_bucket='geohack_sbs'
training_image = get_image_uri(sess.boto_region_name, 'object-detection', repo_version="latest")


In [None]:
s3_train_data = 's3://{}/{}'.format(sage_bucket, my_bucket+'/train/')
s3_validation_data = 's3://{}/{}'.format(sage_bucket, my_bucket+'/validation/')

s3_output_location = 's3://{}/{}/output'.format(sage_bucket, my_bucket)

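# Argument names follow SageMaker Python SDK v1; v2 renamed them
# (instance_count, instance_type, volume_size, max_run).
od_model = sagemaker.estimator.Estimator(training_image,
                                         role,
                                         train_instance_count=1,
                                         train_instance_type='ml.p2.xlarge',
                                         train_volume_size=50,        # GB of attached storage
                                         train_max_run=360000,        # seconds (100 hours)
                                         input_mode='File',
                                         output_path=s3_output_location,
                                         sagemaker_session=sess)
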
od_model.set_hyperparameters(base_network=network,
                             use_pretrained_model=1,
                             num_classes=nclass,
                             mini_batch_size=mini_batch_size,
                             epochs=epochs,
                             learning_rate=lr,
                             lr_scheduler_step='3,6',
                             lr_scheduler_factor=lr_scheduler_factor,
                             optimizer=optim,
                             momentum=momentum,
                             weight_decay=weight_decay,
                             overlap_threshold=overlap,
                             nms_threshold=nms_thresh,
                             image_shape=image_shape,
                             label_width=label_width,
                             num_training_samples=n_train_samples)

train_data = sagemaker.session.s3_input(s3_train_data, distribution='FullyReplicated', 
                        content_type='application/x-recordio', s3_data_type='S3Prefix')
validation_data = sagemaker.session.s3_input(s3_validation_data, distribution='FullyReplicated', 
                             content_type='application/x-recordio', s3_data_type='S3Prefix')
data_channels = {'train': train_data, 'validation': validation_data}
od_model.fit(inputs=data_channels, logs=True)    

  

So now you are training!!! This will take a little while. We are only training for a very small number of epochs (2!), so we don't expect a really robust model. Depending on the quality and amount of training data, many hundreds of epochs may be required.

To level set, this model will be CRAPPY. But that is ok. You now have the basic tools required to set up and improve upon your own problem.
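
The two-epoch run also interacts with the learning-rate schedule. Here is a rough sketch of the arithmetic, assuming that `lr_scheduler_step` lists the epochs at which the learning rate is multiplied by `lr_scheduler_factor` and that epochs are counted from zero. The helper names `steps_per_epoch` and `lr_per_epoch` are hypothetical, and the real training job may round or index differently.

```python
import math


def steps_per_epoch(n_train_samples, mini_batch_size):
    """Approximate number of weight updates in one pass over the data."""
    return math.ceil(n_train_samples / mini_batch_size)


def lr_per_epoch(lr, lr_scheduler_step, lr_scheduler_factor, epochs):
    """Learning rate used in each epoch under a step schedule.

    Assumption: the rate is multiplied by lr_scheduler_factor at the start of
    every epoch listed in lr_scheduler_step (comma-separated, zero-based).
    """
    decay_epochs = {int(s) for s in lr_scheduler_step.split(",") if s.strip()}
    rates = []
    current = lr
    for epoch in range(epochs):
        if epoch in decay_epochs:
            current *= lr_scheduler_factor
        rates.append(current)
    return rates


# Values taken from the parameters cell of this notebook.
print(steps_per_epoch(16551, 2))
print(lr_per_epoch(0.001, "3,6", 0.1, epochs=2))
# A longer run, to see where the schedule would start to matter.
print(lr_per_epoch(0.001, "3,6", 0.1, epochs=8))
```

With `epochs=2`, neither scheduled step (3 or 6) is ever reached, so the learning rate stays at 0.001 for the whole run. The schedule only starts to matter once `epochs` is raised.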

🤔 What are the big considerations as a data scientist?

🤔 What could we do to improve our model?

🤔 How could we evaluate the quality of our data?


In [None]:
# The endpoint is billed for as long as it is running, so delete it when you are done.
object_detector = od_model.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')

#response = object_detector.predict(data)

# Tears down the SageMaker endpoint and endpoint configuration
#object_detector.delete_endpoint()

# Deletes the SageMaker model
#object_detector.delete_model()
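
Once the endpoint is up, the output of the `predict` call needs a little post-processing before it is useful. The sketch below assumes the response is JSON shaped like `{"prediction": [[class_id, score, xmin, ymin, xmax, ymax], ...]}` with box coordinates normalized to the 0-1 range. The helper `parse_detections`, the `min_score` default, and the sample response are all hypothetical and only illustrate the idea.

```python
import json


def parse_detections(response_body, image_width, image_height, min_score=0.2):
    """Turn a detection response into pixel-space boxes above a score threshold.

    Assumption: the body is JSON shaped like
    {"prediction": [[class_id, score, xmin, ymin, xmax, ymax], ...]}
    with coordinates normalized to the 0-1 range.
    """
    if isinstance(response_body, bytes):
        response_body = response_body.decode("utf-8")
    detections = json.loads(response_body)["prediction"]

    boxes = []
    for class_id, score, xmin, ymin, xmax, ymax in detections:
        if score < min_score:
            continue  # drop low-confidence detections
        boxes.append({
            "class_id": int(class_id),
            "score": score,
            "box": (
                round(xmin * image_width),
                round(ymin * image_height),
                round(xmax * image_width),
                round(ymax * image_height),
            ),
        })
    # Highest-confidence detections first.
    return sorted(boxes, key=lambda b: b["score"], reverse=True)


# Synthetic response for illustration only, not real model output.
fake_response = json.dumps({"prediction": [
    [0.0, 0.91, 0.10, 0.20, 0.50, 0.60],
    [0.0, 0.05, 0.00, 0.00, 0.10, 0.10],
]})
print(parse_detections(fake_response, image_width=256, image_height=256))
```

In the notebook, `response_body` would be whatever `object_detector.predict(data)` returns for an image payload; how the request content type is set depends on the SageMaker SDK version. With a model trained for only two epochs, expect to experiment with `min_score` before any boxes look sensible.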
