## Hyperparameter Tuning with SageMaker

1. OS level setting
2. Prepare Requirements
3. Hyperparameter Tuning

**Reference**

* [Hyperparameter Tuning using SageMaker PyTorch Container](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/hyperparameter_tuning/pytorch_mnist/hpo_pytorch_mnist.ipynb)

## OS level setting

Install the required compiler packages.

```
sudo yum install gcc72-c++.x86_64
sudo yum install clang
```

For `g++`, install the same version as `gcc` and link it so that the `g++` command resolves to that version.
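
To confirm that the two compilers agree, a small check like the following can be run from the notebook. This is a minimal sketch, not part of the original setup: the helper names `compiler_version` and `versions_agree` are hypothetical, and it assumes both compilers accept the `-dumpversion` flag, which may print only the major version on newer GCC releases.

```python
import re
import subprocess


def compiler_version(command):
    """Return the output of `<command> -dumpversion`, or None if it cannot be run."""
    try:
        result = subprocess.run(
            [command, "-dumpversion"],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()


def versions_agree(version_a, version_b):
    """Compare the major/minor components that both version strings report."""
    parts_a = re.findall(r"\d+", version_a)[:2]
    parts_b = re.findall(r"\d+", version_b)[:2]
    shared = min(len(parts_a), len(parts_b))
    return shared > 0 and parts_a[:shared] == parts_b[:shared]


gcc = compiler_version("gcc")
gxx = compiler_version("g++")
if gcc is None or gxx is None:
    print("gcc or g++ could not be run")
elif versions_agree(gcc, gxx):
    print("gcc {} and g++ {} agree".format(gcc, gxx))
else:
    print("version mismatch: gcc {} vs g++ {}".format(gcc, gxx))
```

If the versions differ, relinking `g++` to the matching build is the step described above.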

## Prepare Requirements

In [1]:
!git pull origin master
!pip install pipenv

From https://github.com/icoxfog417/allennlp-sagemaker-tuning
 * branch            master     -> FETCH_HEAD
Already up-to-date.
en-core-web-sm 2.1.0 requires spacy>=2.1.0, which is not installed.
You are using pip version 10.0.1, however version 19.0.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.


In [2]:
! export PIPENV_VENV_IN_PROJECT=1 && cd ../ && pipenv install --python=3.6

Virtualenv already exists!
Removing existing virtualenv…
Creating a virtualenv for this project…
Pipfile: /home/ec2-user/SageMaker/allennlp-sagemaker-tuning/Pipfile
Using /home/ec2-user/anaconda3/envs/JupyterSystemEnv/bin/python (3.6.8) to create virtualenv…
Creating virtual environment...
Using base prefix '/home/ec2-user/anaconda3/envs/JupyterSystemEnv'
New python executable in /home/ec2-user/SageMaker/allennlp-sagemaker-tuning/.venv/bin/python
Installing setuptools, pip, wheel...
done.
Running virtualenv with interpreter /home/ec2-user/anaconda3/envs/JupyterSystemEnv/bin/python
✔ Successfully created virtual environment!
Virtualenv location: /home/ec2-user/SageMaker/allennlp-sagemaker-tuning/.venv
  $ pipenv --rm and rebuilding th

In [13]:
import os
import sys


def set_pythonpath():
    """Make the project root and its pipenv virtualenv importable from this notebook."""
    python_version = "python{}.{}".format(sys.version_info.major,
                                          sys.version_info.minor)
    # The virtualenv is in ../.venv because PIPENV_VENV_IN_PROJECT=1 was set at install time.
    venv_dir = "../.venv/lib/{}/site-packages".format(python_version)
    lib_dir = os.path.join(os.path.realpath("."), venv_dir)
    project_dir = os.path.join(os.path.realpath("."), "../")
    sys.path.append(lib_dir)
    sys.path.append(project_dir)

set_pythonpath()

In [14]:
def execute_example():
    from example.train import train as train_fn
    
    root = 'https://raw.githubusercontent.com/allenai/allennlp/master/tutorials/tagger/'
    train_data_path = root + 'training.txt'
    validation_data_path = root + 'validation.txt'

    embedding_dim = 6
    hidden_dim = 6
    num_epochs = 1

    train_fn(train_data_path, validation_data_path,
             embedding_dim, hidden_dim, num_epochs=num_epochs)

In [15]:
execute_example()

2it [00:00, 2296.36it/s]
2it [00:00, 6636.56it/s]
100%|██████████| 4/4 [00:00<00:00, 26092.09it/s]
ERROR:allennlp.common.util:unable to check gpu_memory_mb(), continuing
Traceback (most recent call last):
  File "/home/ec2-user/SageMaker/allennlp-sagemaker-tuning/notebooks/../example/../.venv/lib/python3.6/site-packages/allennlp/common/util.py", line 379, in gpu_memory_mb
    encoding='utf-8')
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['nvidia-smi', '--query-gpu=memory.used', '--format=csv,nounits,noheader']' returned non-zero exit status 9.
accuracy: 0.2222, loss: 1.2204 ||: 100%|██████████| 1/1 [00:00<00:00, 202.52it/s]
accuracy: 0.2222, loss: 1.2095 ||: 100%|██████████| 1/1 [00:00<00:00, 381.75it/s]


['V', 'V', 'V', 'V', 'V']
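
The traceback above comes from AllenNLP's GPU memory check: the `nvidia-smi` query exited with a non-zero status, the error was logged, and training continued, as the accuracy lines show. The sketch below repeats the same query outside of training to see whether it works on the current instance. It is an illustration only; `gpu_query_works` is a hypothetical helper, and treating a failed query as "no usable GPU" is an assumption.

```python
import shutil
import subprocess


def gpu_query_works():
    """Return True only if `nvidia-smi` is on PATH and the memory query exits with status 0."""
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,nounits,noheader"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return False
    return True


print("GPU query available:", gpu_query_works())
```

When this prints `False`, the logged error is expected and can likely be ignored for CPU-only runs.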


## Hyperparameter Tuning

### Create Session

In [19]:
import sagemaker
from sagemaker.tuner import IntegerParameter, CategoricalParameter, ContinuousParameter, HyperparameterTuner


sagemaker_session = sagemaker.Session()
bucket = "sagemaker.tech-sketch.jp"
prefix = "allennlp_test"
role = sagemaker.get_execution_role()

### Upload data

In [21]:
from allennlp.common.file_utils import cached_path


root = "https://raw.githubusercontent.com/allenai/allennlp/master/tutorials/tagger/"
urls = [(root + file_name) for file_name in ("training.txt", "validation.txt")]
paths = [cached_path(u) for u in urls]
s3_paths = []

for path in paths:
    s3_path = sagemaker_session.upload_data(path=path, bucket=bucket, key_prefix=prefix)
    print("input spec (in this case, just an S3 path): {}".format(s3_path))
    s3_paths.append(s3_path)

input spec (in this case, just an S3 path): s3://sagemaker.tech-sketch.jp/allennlp_test/c3e1f451545a79cf7582dec24d072db6f5bb0d1ae24a924d03c9944516e16b60.47b1193282cbd926a1b602cc6d5a22324cfab24e669ca04f1ff4851a35c73393
input spec (in this case, just an S3 path): s3://sagemaker.tech-sketch.jp/allennlp_test/a377491818b2bbd2f0561346da1d8d25f29bbc1c8df640eaf6ee125071d18d16.22d6cc9ff0fe67add48c843670f9b158a2cd4d4527d8d3b9587a7c48ff356e2f
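
The long object names in the output above appear because `cached_path` stores each download under a hash-derived file name, and the upload keeps the local file name under the given prefix. The sketch below shows how such a URI is composed; `expected_s3_uri` and the example path are hypothetical, and the layout `s3://<bucket>/<key_prefix>/<file name>` is inferred from the printed output rather than taken from the SDK source.

```python
import os


def expected_s3_uri(local_path, bucket, key_prefix):
    """Compose the S3 URI for a single file uploaded under key_prefix."""
    file_name = os.path.basename(local_path)
    return "s3://{}/{}/{}".format(bucket, key_prefix, file_name)


# Hypothetical local path, used only to illustrate the composition.
example_path = "/tmp/cache/training.txt"
print(expected_s3_uri(example_path, "sagemaker.tech-sketch.jp", "allennlp_test"))
# prints: s3://sagemaker.tech-sketch.jp/allennlp_test/training.txt
```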
