# Preparing for SageMaker
## Code
### Uploading local code
Keep all of your code in a single folder. The name of this folder is passed as the `source_dir` argument.

### Importing third party modules
You can add a `requirements.txt` file for external packages. SageMaker installs them as long as the file is accompanied by a `setup.py`.

### Importing custom modules
For internal modules, you need to explicitly pass the path of each module through the `dependencies` argument.

For example, if you want to access a module called `ssd300` in the scripts folder, you need to pass it to SageMaker as:
```python
dependencies=['scripts/ssd300']
```
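A mistyped dependency path usually only surfaces once the container starts, so it can help to check the folders before building the estimator. The following is a minimal sketch; `resolve_dependencies` and its `root` parameter are hypothetical names introduced here, not part of the SageMaker SDK.

```python
from pathlib import Path


def resolve_dependencies(paths, root="."):
    """Return each dependency path as a string, failing early if a folder is missing.

    Hypothetical helper: the returned list is meant to be passed as `dependencies=`.
    """
    resolved = []
    for rel in paths:
        candidate = Path(root) / rel
        if not candidate.is_dir():
            raise FileNotFoundError(f"dependency folder not found: {candidate}")
        resolved.append(str(candidate))
    return resolved
```

For the example above, `resolve_dependencies(['scripts/ssd300'])` would return the list to pass as `dependencies`, or raise if `scripts/ssd300` does not exist relative to the current directory.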

## Data
### Local data
For local data stored in a folder whose path is held in `data_dir`, specify it as:
```python
inputs = {'training': f'file://{data_dir}'}
```
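Because a `file://` URI needs an absolute path, a small wrapper can make the `inputs` dict less error-prone. This is a sketch only; `local_inputs` and its `channel` parameter are hypothetical names, and the channel defaults to `training` to match the example above.

```python
import os


def local_inputs(data_dir, channel="training"):
    """Build a local-mode `inputs` dict from a folder path (hypothetical helper)."""
    absolute = os.path.abspath(data_dir)  # relative paths would yield a malformed URI
    return {channel: f"file://{absolute}"}
```
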
### Remote data
For remote data, you need to upload it to S3 and then point SageMaker to that bucket.
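One way to do this is with `Session.upload_data` from the SageMaker Python SDK, which returns the S3 URI of the uploaded data. The sketch below is an assumption about how you might wire it up: `upload_training_data` and `s3_inputs` are hypothetical helper names, the `key_prefix` value is arbitrary, and omitting `bucket` relies on the session's default bucket.

```python
def upload_training_data(local_path, key_prefix="data"):
    """Upload a local file or folder to S3 and return its S3 URI.

    Requires AWS credentials; uses the session's default bucket (assumption).
    """
    import sagemaker  # imported lazily so s3_inputs works without the SDK installed

    sess = sagemaker.Session()
    return sess.upload_data(path=local_path, key_prefix=key_prefix)


def s3_inputs(s3_uri, channel="training"):
    """Build an `inputs` dict from an S3 URI (hypothetical helper)."""
    if not s3_uri.startswith("s3://"):
        raise ValueError(f"expected an s3:// URI, got: {s3_uri}")
    return {channel: s3_uri}
```

The result of `s3_inputs(upload_training_data(data_dir))` can then be passed to `fit` in place of the `file://` inputs used for local mode.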

## Training Code
### Command line arguments
All training scripts must be able to accept the following arguments:
```python
import argparse
import os
p = argparse.ArgumentParser()
p.add_argument('--model-dir', type=str, default=os.environ.get('SM_MODEL_DIR'))  # for TensorFlow
p.add_argument('--model_dir', type=str, default=os.environ.get('SM_MODEL_DIR'))  # for PyTorch
p.add_argument('--data_dir', type=str, default=os.environ.get('SM_CHANNEL_TRAINING'))
```
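Put together, an argument parser for a training script might look like the sketch below. It is not the parser used by `train_ssd300.py`: the function name `parse_args`, the `--epochs` flag, the fallback paths, and the choice of the underscore spelling `--model_dir` are all assumptions made for illustration. It uses `parse_known_args` so that extra hyperparameters passed on the command line do not abort the script.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse SageMaker-style arguments, falling back to the container defaults."""
    p = argparse.ArgumentParser()
    p.add_argument('--model_dir', type=str,
                   default=os.environ.get('SM_MODEL_DIR', '/opt/ml/model'))
    p.add_argument('--data_dir', type=str,
                   default=os.environ.get('SM_CHANNEL_TRAINING',
                                          '/opt/ml/input/data/training'))
    p.add_argument('--epochs', type=int, default=1)  # hypothetical hyperparameter
    # Ignore flags this parser does not declare instead of exiting with an error.
    args, _unknown = p.parse_known_args(argv)
    return args
```
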
### Accessing uploaded data
Each `SM_CHANNEL_XX` environment variable holds the location of one input channel, where `XX` is the upper-cased channel name. For example, if you passed inputs with:
```python
inputs = {'training': training_data, 'test': test_data}
```
then you can access that data through `os.environ.get('SM_CHANNEL_TRAINING')` and `os.environ.get('SM_CHANNEL_TEST')`.

By default, each `SM_CHANNEL_XX` points to a subfolder of `/opt/ml/input/data/`. That means `SM_CHANNEL_TRAINING` is equivalent to `/opt/ml/input/data/training`, and `SM_CHANNEL_TEST` is equivalent to `/opt/ml/input/data/test`.
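The naming rule can be captured in a few lines. This is a sketch under the assumption that the channel name maps to the environment variable by upper-casing it, as in the two examples above; `channel_dir` is a hypothetical helper name.

```python
import os


def channel_dir(name):
    """Return the data folder for an input channel, e.g. 'training' or 'test'."""
    env_var = f"SM_CHANNEL_{name.upper()}"
    # Fall back to the default container location when the variable is unset.
    return os.environ.get(env_var, f"/opt/ml/input/data/{name}")
```
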

### Writing artifacts
Any artifacts you want to keep, such as model checkpoints, must be written to `SM_MODEL_DIR`.
```python
p.add_argument('-o', '--output_path', default=os.environ.get('SM_MODEL_DIR'))
```
The default location for `SM_MODEL_DIR` is `/opt/ml/model`.
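As an illustration, the sketch below writes a small JSON file into the model directory. The helper name `save_artifact`, the JSON format, and the file name in the usage note are assumptions; the same pattern applies to checkpoints or any other file the training script produces.

```python
import json
import os


def save_artifact(payload, filename, model_dir=None):
    """Write `payload` as JSON under the model directory and return the file path."""
    model_dir = model_dir or os.environ.get("SM_MODEL_DIR", "/opt/ml/model")
    os.makedirs(model_dir, exist_ok=True)  # the folder may not exist outside SageMaker
    path = os.path.join(model_dir, filename)
    with open(path, "w") as f:
        json.dump(payload, f)
    return path
```

Calling `save_artifact({'epochs': 2}, 'run_config.json')` inside the container would write to `/opt/ml/model/run_config.json` unless `SM_MODEL_DIR` says otherwise.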

In [9]:
import sagemaker
import os
from sagemaker.tensorflow import TensorFlow

sess = sagemaker.Session()
role = "SageMakerRole"

In [10]:
git_config = {'repo': 'https://github.com/mynameisvinn/SSD300', 
              'branch': 'master'}

In [11]:
tf_estimator = TensorFlow(entry_point='train_ssd300.py', 
                          role=role,
                          source_dir="scripts",
                          instance_count=1, 
                          instance_type='local',
                          framework_version='1.12.0', 
                          py_version='py3',
                          script_mode=True,
                          dependencies=['scripts/ssd300'],
                          hyperparameters={
                              'epochs': 2,
                              'batch_size': 1,
                              'data_def_dir': '/opt/ml/input/data/training/tooth_id_v1.3',
                              'reload_data_path': '/opt/ml/input/data/training/image_label_sample_data.npy',
                              'exp_name': 'myexperiment',
                              'model_type': 'tooth-id',
                              'steps_per_epoch': 1,
                          }
                         )

In [12]:
data_dir = os.path.join(os.getcwd(), 'for_vin')
f'file://{data_dir}'

'file:///Users/mynameisvinn/Dropbox/Temp/ml_dental_ssd300/for_vin'

In [13]:
inputs = {'training': f'file://{data_dir}'}
tf_estimator.fit(inputs) 

Building with native build. Learn about native build in Compose here: https://docs.docker.com/go/compose-native-build/
Creating tj6lg3zzpl-algo-1-mi3m8 ... 
Creating tj6lg3zzpl-algo-1-mi3m8 ... done
Attaching to tj6lg3zzpl-algo-1-mi3m8
tj6lg3zzpl-algo-1-mi3m8 | 2021-02-15 19:41:51,014 sagemaker-containers INFO     Imported framework sagemaker_tensorflow_container.training
tj6lg3zzpl-algo-1-mi3m8 | 2021-02-15 19:41:51,027 sagemaker-containers INFO     No GPUs detected (normal if no gpus installed)
tj6lg3zzpl-algo-1-mi3m8 | 2021-02-15 19:41:51,707 sagemaker-containers INFO     Installing module with the following command:
tj6lg3zzpl-algo-1-mi3m8 | /usr/bin/python -m pip install -U . -r requirements.txt
tj6lg3zzpl-algo-1-mi3m8 | Processing /opt/ml/code
tj6lg3zzpl-algo-1-mi3m8 | Collecting matplotlib==3.3.2 (from -r requirements.txt (line 1))
tj6lg3zzpl-algo-1-mi3m8 |   Downloading https://files.pythonhosted.org/packages/

    100% |████████████████████████████████| 204kB 3.4MB/s ta 0:00:01
tj6lg3zzpl-algo-1-mi3m8 | Collecting GitPython>=2.0.8 (from neptune_client==0.4.121->-r requirements.txt (line 2))
tj6lg3zzpl-algo-1-mi3m8 |   Downloading https://files.pythonhosted.org/packages/fb/67/47a04d8a9d7f94645676fe683f1ee3fe9be01fe407686c180768a92abaac/GitPython-3.1.13-py3-none-any.whl (159kB)
    100% |████████████████████████████████| 163kB 8.3MB/s ta 0:00:01
tj6lg3zzpl-algo-1-mi3m8 | Collecting packaging (from neptune_client==0.4.121->-r requirements.txt (line 2))
tj6lg3zzpl-algo-1-mi3m8 |   Downloading https://files.pythonhosted.org/packages/3e/89/7ea760b4daa42653ece2380531c90f64788d979110a2ab51049d92f408af/packaging-20.9-py2.py3-none-any.whl (40kB)
    100% |████████████████████████████████| 40kB 2.4MB/s ta 0:00:01
tj6lg3zzpl-algo-1-mi3m8 | Collecting msgpack (from bravado->neptune_client==0.4.121->-r requirements.txt (line 2))

tj6lg3zzpl-algo-1-mi3m8 |   Downloading https://files.pythonhosted.org/packages/4d/70/fd441df751ba8b620e03fd2d2d9ca902103119616f0f6cc42e6405035062/pyrsistent-0.17.3.tar.gz (106kB)
    100% |████████████████████████████████| 112kB 4.2MB/s ta 0:00:01
tj6lg3zzpl-algo-1-mi3m8 | Collecting webcolors; extra == "format" (from jsonschema[format]>=2.5.1->bravado-core>=5.16.1->bravado->neptune_client==0.4.121->-r requirements.txt (line 2))
tj6lg3zzpl-algo-1-mi3m8 |   Downloading https://files.pythonhosted.org/packages/12/05/3350559de9714b202e443a9e6312937341bd5f79f4e4f625744295e7dd17/webcolors-1.11.1-py3-none-any.whl
tj6lg3zzpl-algo-1-mi3m8 | Collecting strict-rfc3339; extra == "format" (from jsonschema[format]>=2.5.1->bravado-core>=5.16.1->bravado->neptune_client==0.4.121->-r requirements.txt (line 2))
tj6lg3zzpl-algo-1-mi3m8 |   Downloading https://files.pythonhosted.org/packages/56/e4/879ef1dbd6ddea1c77c0078cd59b503368b0456bcca7d063a

tj6lg3zzpl-algo-1-mi3m8 | Using TensorFlow backend.
tj6lg3zzpl-algo-1-mi3m8 | 2021-02-15 19:42:22,574:INFO:__main__:Initializing training module...
tj6lg3zzpl-algo-1-mi3m8 | 2021-02-15 19:42:22,574:INFO:ssd300.train_eval.trainer:Loading image and label data...
tj6lg3zzpl-algo-1-mi3m8 | 2021-02-15 19:42:22,574:INFO:ssd300.train_eval.trainer:Pre-existing data available, reading data...
tj6lg3zzpl-algo-1-mi3m8 | 2021-02-15 19:42:22,628:INFO:ssd300.train_eval.trainer:Load data complete
tj6lg3zzpl-algo-1-mi3m8 | 2021-02-15 19:42:22,628:INFO:ssd300.train_eval.trainer:Loading ssd300 model...
tj6lg3zzpl-algo-1-mi3m8 | __________________________________________________________________________________________________
tj6lg3zzpl-algo-1-mi3m8 | Layer (type)                    Output Shape         Param #     Connected to
tj6lg3zzpl-algo-1-mi3m8 | input_1 (InputLayer)            (None, 300, 300, 3)

tj6lg3zzpl-algo-1-mi3m8 | Epoch 1/2
tj6lg3zzpl-algo-1-mi3m8 |
tj6lg3zzpl-algo-1-mi3m8 | Epoch 00001: val_loss improved from inf to 457.85492, saving model to /opt/ml/model/checkpoint_epoch-01_loss-149.1058_val_loss-457.8549.h5
tj6lg3zzpl-algo-1-mi3m8 | Epoch 2/2
tj6lg3zzpl-algo-1-mi3m8 |
tj6lg3zzpl-algo-1-mi3m8 | Epoch 00002: val_loss improved from 457.85492 to 30.49664, saving model to /opt/ml/model/checkpoint_epoch-02_loss-1182.4841_val_loss-30.4966.h5
tj6lg3zzpl-algo-1-mi3m8 | savvvvvving
tj6lg3zzpl-algo-1-mi3m8 | 2021-02-15 19:42:50,429:INFO:ssd300.train_eval.trainer:savvvvvving
tj6lg3zzpl-algo-1-mi3m8 | Traceback (most recent call last):
tj6lg3zzpl-algo-1-mi3m8 |   File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
tj6lg3zzpl-algo-1-mi3m8 |     "__main__", mod_spec)
tj6lg3zzpl-algo-1-mi3m8 |   File "/usr/lib/python3.6/runpy.py", line 85, in _run_code

RuntimeError: Failed to run: ['docker-compose', '-f', '/private/var/folders/xb/tv5r7gc92ql_wr439ljm2wcr0000gn/T/tmpme2h_lmg/docker-compose.yaml', 'up', '--build', '--abort-on-container-exit'], Process exited with code: 1