## Amazon SageMaker Initialization

Initialize the notebook instance. Get the AWS Region and a SageMaker execution role.

### SageMaker role

The following code cell defines `role`, the IAM role ARN used to create and run SageMaker training and hosting jobs. This is the same IAM role that was used to create this SageMaker notebook instance.

`role` must have permission to create a SageMaker training job and host a model. For granular policies you can use to grant these permissions, see [Amazon SageMaker Roles](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html). If you do not require fine-grained permissions, you can use the IAM managed policy `AmazonSageMakerFullAccess` to complete this demo.

In [None]:
%%time
! python3 -m pip install --upgrade sagemaker
import sagemaker
from sagemaker import get_execution_role
from sagemaker.estimator import Estimator
import boto3

sagemaker_session = sagemaker.Session()
bucket = sagemaker_session.default_bucket()

role = (
    get_execution_role()
)  # provide a pre-existing role ARN as an alternative to creating a new role
role_name = role.split("/")[-1]  # the role name is the last path segment of the ARN
print(f"SageMaker Execution Role: {role}")
print(f"The name of the Execution role: {role_name}")

client = boto3.client("sts")
account = client.get_caller_identity()["Account"]
print(f"AWS account: {account}")

session = boto3.session.Session()
region = session.region_name
print(f"AWS region: {region}")

To verify that the role above has the required permissions:

1. Go to the [IAM console](https://console.aws.amazon.com/iam/home).
2. Select **Roles**.
3. Enter the role name in the search box to search for that role. 
4. Select the role.
5. Use the **Permissions** tab to verify that this role has the required permissions attached.
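
The console check above can also be approximated from code. The following is a minimal sketch, not part of the original notebook: the helper name `attached_policy_names` is hypothetical, and it assumes the caller is allowed to call `iam:ListAttachedRolePolicies`. It lists only managed policies attached to the role; inline policies are not included.

```python
def attached_policy_names(iam_client, role_name):
    """Return the names of the managed policies attached to an IAM role.

    `iam_client` is expected to behave like boto3.client("iam").
    Hypothetical helper for illustration only.
    """
    names = []
    # list_attached_role_policies is paginated, so walk every page.
    paginator = iam_client.get_paginator("list_attached_role_policies")
    for page in paginator.paginate(RoleName=role_name):
        names.extend(policy["PolicyName"] for policy in page["AttachedPolicies"])
    return names


# Usage in this notebook (requires AWS credentials):
# import boto3
# print(attached_policy_names(boto3.client("iam"), role.split("/")[-1]))
```

If `AmazonSageMakerFullAccess` or an equivalent custom policy does not appear in the result, check the **Permissions** tab in the console for inline policies before concluding that permissions are missing.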

## Configure SageMaker PyTorch Estimator function options

In the following code cells, you can update the estimator configuration to use a different instance type, instance count, distribution strategy, and hyperparameters. You also pass the entry point of the training script.

**Instance types**

For this experiment, we recommend using any of the following instance types:

1. `ml.p4d.24xlarge`
1. `ml.p4de.24xlarge`

**Instance count**

You should use at least 2 instances.

In [None]:
instance_type = "ml.p4d.24xlarge"  # Other supported instance types: ml.p3.16xlarge, ml.p3dn.24xlarge
instance_count = 8  # You can use 2, 4, 8, etc.
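
To get a feel for the scale these two settings imply, the sketch below computes the total number of GPUs in the job. It is an illustration only: the `GPUS_PER_INSTANCE` table and the `total_gpus` helper are hypothetical names, and the result equals the number of training processes only under the assumption that the job runs one process per GPU.

```python
# GPUs per instance for the instance types mentioned in this notebook.
GPUS_PER_INSTANCE = {
    "ml.p4d.24xlarge": 8,
    "ml.p4de.24xlarge": 8,
    "ml.p3.16xlarge": 8,
    "ml.p3dn.24xlarge": 8,
}


def total_gpus(instance_type, instance_count):
    """Return the total GPU count across all instances of the job."""
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    return GPUS_PER_INSTANCE[instance_type] * instance_count


print(total_gpus("ml.p4d.24xlarge", 8))
```

With the defaults in this notebook (8 instances of `ml.p4d.24xlarge`), the job spans 64 GPUs.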

**Distribution strategy**

To use the SageMaker distributed data parallel (SMDDP) collectives, set the `distribution` parameter of the estimator to enable `smdistributed` `dataparallel`.

In [None]:
dist_strategy = {"smdistributed": {"dataparallel": {"enabled": True}}}

**Create the estimator function and pass the parameters**

Use all parameters from previous sections to configure the estimator function.

In [None]:
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="run_dataloader.py",
    role=role,
    image_uri="<YOUR_DLC>",  # Replace with the URI of your Deep Learning Container (DLC) image.
    source_dir=".",
    instance_count=instance_count,
    instance_type=instance_type,
    py_version="py39",
    sagemaker_session=sagemaker_session,
    debugger_hook_config=False,
    distribution=dist_strategy,
)

## Start the SageMaker training job

The last step before launching the training job is to assign it a name. The name identifies the training job, so you can find it easily in the [SageMaker console](https://console.aws.amazon.com/sagemaker/).

In [None]:
job_name = f"pt-smddp-test-{instance_count}-p4d"
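
SageMaker training job names must be unique within an AWS account and Region, so rerunning the notebook with a fixed name can fail. One common workaround, sketched below as an assumption rather than as part of the original notebook, is to treat the name as a prefix and append a UTC timestamp. The helper `unique_job_name` is hypothetical; its checks reflect the naming rules as understood here (letters, digits, and hyphens, starting and ending with an alphanumeric character, at most 63 characters).

```python
import re
import time

# Letters and digits separated by hyphens; starts and ends alphanumeric.
_JOB_NAME_RE = re.compile(r"[a-zA-Z0-9]+(-+[a-zA-Z0-9]+)*")
_MAX_JOB_NAME_LENGTH = 63


def unique_job_name(prefix):
    """Append a UTC timestamp to `prefix` and validate the resulting name."""
    suffix = time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())
    name = f"{prefix}-{suffix}"
    if len(name) > _MAX_JOB_NAME_LENGTH or not _JOB_NAME_RE.fullmatch(name):
        raise ValueError(f"Invalid training job name: {name!r}")
    return name


# Example: job_name = unique_job_name(f"pt-smddp-test-{instance_count}-p4d")
```

The timestamp has one-second resolution, so two submissions within the same second would still collide.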

In [None]:
# Submit SageMaker training job
estimator.fit(job_name=job_name)