# Training Job with Encrypted Static Assets

In the [notebook about creating a training job in VPC mode](https://github.com/aws/amazon-sagemaker-examples/blob/master/sagemaker-fundamentals/create-training-job/create_training_job_vpc.ipynb) you learnt how to create a SageMaker training job with network isolation. Network isolation enables you to protect your data and model from being intercepted by cyber pirates. 

![pirate](assets/pirate.jpg)

Another way to protect your static assets is to encrypt them before moving them from location A to location B. In this notebook, you will walk through a few encryption techniques with the help of AWS Key Management Service [(AWS KMS)](https://docs.aws.amazon.com/kms/latest/developerguide/overview.html).

Encryption is a widely used technology. In addition to the introductory material above, you can find many free lectures online.

## Symmetric Ciphers
We will focus on symmetric ciphers in this notebook. To quote the GNU Privacy Handbook:

> A symmetric cipher is a cipher that uses the same key for both encryption and decryption. Two parties communicating using a symmetric cipher must agree on the key beforehand. Once they agree, the sender encrypts a message using the key, sends it to the receiver, and the receiver decrypts the message using the key. As an example, the German Enigma is a symmetric cipher, and daily keys were distributed as code books. Each day, a sending or receiving radio operator would consult his copy of the code book to find the day's key. Radio traffic for that day was then encrypted and decrypted using the day's key. Modern examples of symmetric ciphers include 3DES, Blowfish, and IDEA.
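
To make the "same key for both directions" idea concrete, here is a minimal toy sketch using only the Python standard library. It uses a one-time XOR pad, which is our own illustrative choice and not something this notebook relies on later; the names `xor_bytes`, `message`, and `key` are hypothetical. Real systems use vetted ciphers such as AES, so treat this purely as an illustration.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of `data` with the matching byte of `key`.

    Applying the function twice with the same key returns the original
    data, which is what makes this a (toy) symmetric cipher.
    """
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"attack at dawn"

# The shared secret both parties agree on beforehand (hypothetical example).
key = secrets.token_bytes(len(message))

ciphertext = xor_bytes(message, key)    # sender encrypts with the key
recovered = xor_bytes(ciphertext, key)  # receiver decrypts with the same key

assert recovered == message
```

Note that a pad like this must never be reused for a second message; that restriction is one reason practical symmetric ciphers are built differently.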

## Environment to run this notebook
You can run this notebook on your local machine or an EC2 instance as an IAM user, or on a SageMaker Notebook Instance under a SageMaker service role. To avoid confusion, we will assume you are running it as an IAM user.

## Permissions
You will need to attach the following permissions to the IAM user:

* IAMFullAccess 
* AWSKeyManagementServicePowerUser
* AmazonEC2ContainerRegistryFullAccess

## Outline of this notebook

* Generate a symmetric customer master key (CMK)
* Allow your SageMaker service role to use the CMK
* Generate a data key from the CMK
* Encrypt some data with the data key and upload the encrypted data to S3
* Create a SageMaker service role
* Build a training image 
* Create a SageMaker training job using the encrypted data
* Verify that the data retrieved from S3 is encrypted and that SageMaker needs your data key to decrypt it

The process of using a data key to encrypt your data, instead of using the master key directly, is called [**envelope encryption**](https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#enveloping).
You can use the master key directly to encrypt small payloads (KMS limits direct encryption to 4 KB), but by using a data key you encrypt your data locally and the master key never leaves KMS, which reduces the risk of a [man-in-the-middle attack](https://en.wikipedia.org/wiki/Man-in-the-middle_attack).
We will discuss the use of the data key in detail later.

![envelope-encryption](assets/envelope-encryption.jpg)
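
As a rough sketch of the key handling behind envelope encryption, the two helpers below wrap the KMS `generate_data_key` and `decrypt` calls. The helper names (`new_data_key`, `recover_data_key`) and the dictionary layout are our own assumptions for illustration, not code from this notebook. The client is passed in as a parameter, so in practice you would supply `boto3.client("kms")`.

```python
from typing import Any, Dict


def new_data_key(kms_client: Any, cmk_id: str) -> Dict[str, bytes]:
    """Ask KMS for a fresh 256-bit data key protected by the given CMK.

    KMS returns the key twice: once in plaintext and once encrypted
    under the CMK.
    """
    resp = kms_client.generate_data_key(KeyId=cmk_id, KeySpec="AES_256")
    return {
        "plaintext": resp["Plaintext"],       # use locally, then discard
        "encrypted": resp["CiphertextBlob"],  # safe to store with the data
    }


def recover_data_key(kms_client: Any, encrypted_key: bytes) -> bytes:
    """Ask KMS to decrypt a stored data key so the data can be decrypted."""
    resp = kms_client.decrypt(CiphertextBlob=encrypted_key)
    return resp["Plaintext"]


# Usage sketch (assumes boto3 is installed and AWS credentials are set up;
# "<your-cmk-id>" is a placeholder):
#   import boto3
#   kms = boto3.client("kms")
#   key = new_data_key(kms, "<your-cmk-id>")
#   ...encrypt your data locally with key["plaintext"]...
#   same_key = recover_data_key(kms, key["encrypted"])
```

The design point is that only the encrypted copy of the data key is persisted. Anyone who later needs the plaintext key has to call KMS and be authorized to use the CMK.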