## Training Models with AWS SageMaker

Amazon SageMaker is a managed service for training and deploying ML models. This tutorial is a quickstart guide to training FARM models on SageMaker.

### Prerequisites
* An AWS IAM role with access to SageMaker and S3.
* A training script to execute. You can find examples at https://github.com/deepset-ai/FARM/tree/master/examples.
* A directory containing at least a `requirements.txt` file that lists the dependencies (including FARM) needed to run the training script.
* Cleaned/processed training data uploaded to an S3 bucket.
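
Before launching a job, it can help to sanity-check the source directory locally. The helper below is a minimal sketch, not part of FARM or the SageMaker SDK: `check_source_dir` is a hypothetical name, and it assumes the training script lives inside the same directory as `requirements.txt`. The FARM check is deliberately simple and only looks for the string `farm` on a non-comment line.

```python
from pathlib import Path


def check_source_dir(source_dir, entry_point):
    """Return a list of problems found in the directory (empty if none).

    Hypothetical helper: assumes `entry_point` is a path relative to
    `source_dir`.
    """
    root = Path(source_dir)
    problems = []

    if not (root / entry_point).is_file():
        problems.append(f"missing training script: {entry_point}")

    requirements = root / "requirements.txt"
    if not requirements.is_file():
        problems.append("missing requirements.txt")
    else:
        # Ignore blank lines and comments, then look for a FARM entry.
        lines = [ln.strip() for ln in requirements.read_text().splitlines()]
        deps = [ln for ln in lines if ln and not ln.startswith("#")]
        if not any("farm" in dep.lower() for dep in deps):
            problems.append("requirements.txt does not list farm")

    return problems
```

An empty list means the directory has the layout this tutorial expects; any returned strings describe what is missing.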

In [1]:
from sagemaker.pytorch.estimator import PyTorch

In [None]:
role = "arn:aws:iam::xxxxxxxxxxxx:role/service-role/AmazonSageMaker-ExecutionRole-20191204Txxxxxx"

The `Estimator` class is a high-level abstraction for handling a training task on SageMaker. The `PyTorch` estimator builds a container with a specific version of PyTorch, installs the dependencies listed in the `requirements.txt` file in `source_dir`, and executes the training script specified by `entry_point`.

In [None]:
estimator = PyTorch(
    entry_point="<path_to_train_script>",
    source_dir="<path>",  # directory containing the requirements.txt file
    framework_version="1.3.1",  # PyTorch version
    train_instance_count=1,  # renamed to instance_count in SageMaker Python SDK v2
    train_instance_type="ml.p3.8xlarge",  # renamed to instance_type in SDK v2
    role=role,  # IAM role to assume for execution
)
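
Inside the container, the training script has to find the data and decide where to save the model. The sketch below shows one common way to do that, by reading the `SM_CHANNEL_TRAIN` and `SM_MODEL_DIR` environment variables that SageMaker sets during training. The function name `parse_args`, the local fallback paths, and the `--epochs` option are assumptions for illustration; they are not taken from the FARM examples.

```python
import argparse
import os


def parse_args(argv=None, environ=os.environ):
    """Parse script arguments, defaulting to SageMaker's container paths.

    The fallbacks ("data/train", "saved_models") are hypothetical local
    paths so the same script can also run outside SageMaker.
    """
    parser = argparse.ArgumentParser()
    # Directory where the "train" input channel is mounted.
    parser.add_argument("--train", default=environ.get("SM_CHANNEL_TRAIN", "data/train"))
    # Directory whose contents SageMaker uploads to S3 after training.
    parser.add_argument("--model-dir", default=environ.get("SM_MODEL_DIR", "saved_models"))
    # Hypothetical hyperparameter, shown only as an example.
    parser.add_argument("--epochs", type=int, default=1)
    # Ignore any extra arguments the platform may pass to the script.
    args, _unknown = parser.parse_known_args(argv)
    return args
```

A training script would call `parse_args()` at startup, point FARM's data loading at `args.train`, and save the trained model under `args.model_dir`.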

The `fit()` method starts a training job. It takes the S3 path of the training data as an argument. The `wait` argument specifies whether the call should block until the training job finishes.

In [None]:
estimator.fit(inputs={"train": "s3://<path-to-train-data>"}, wait=False)
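
With `wait=False` the call returns as soon as the job is submitted, so the job status has to be checked separately. One possible approach is to poll the SageMaker API, as in the sketch below. `wait_for_training_job` is a hypothetical helper; it assumes `client` behaves like a boto3 SageMaker client, whose `describe_training_job` response includes a `TrainingJobStatus` field.

```python
import time

# Statuses after which a training job will not change any more.
TERMINAL_STATES = {"Completed", "Failed", "Stopped"}


def wait_for_training_job(client, job_name, poll_seconds=60, sleep=time.sleep):
    """Poll until the training job reaches a terminal state; return that state.

    `client` is assumed to be a boto3 SageMaker client (or a stand-in with
    the same `describe_training_job` method). `sleep` is injectable so the
    loop can be tested without real delays.
    """
    while True:
        response = client.describe_training_job(TrainingJobName=job_name)
        status = response["TrainingJobStatus"]
        if status in TERMINAL_STATES:
            return status
        sleep(poll_seconds)
```

In a real session, `client` would typically be created with `boto3.client("sagemaker")`, and the job name should be available from the estimator (for example via `estimator.latest_training_job.name`, depending on the SDK version).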