# Distributed Training Pipeline


This notebook defines a distributed training pipeline **blueprint** that can be used to pre-process the training data, and compile and train the model using any machine learning framework. The use of this blueprint is illustrated through examples.

## Create AWS CloudFormation Stack

This notebook must be opened in an [Amazon SageMaker Notebook instance](https://docs.aws.amazon.com/sagemaker/latest/dg/nbi.html) created via the [AWS CloudFormation stack](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/stacks.html) template [cfn-sagemaker-notebook.yaml](./cfn-sagemaker-notebook.yaml).

When you use the [template](./cfn-sagemaker-notebook.yaml) to create the CloudFormation stack, specify the name of an existing [Amazon S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/creating-bucket.html) in the `S3BucketName` template parameter, and the path to an existing [S3 folder](https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-folders.html#create-folder) in that bucket in the `FSxS3ImportPrefix` template parameter. The `S3BucketName` bucket must be located in the same AWS region as your CloudFormation stack. The default value of `FSxS3ImportPrefix` is `sagemaker`; we highly recommend that you keep this default and ensure that `sagemaker` exists as a top-level folder in your `S3BucketName` bucket.

If you have not already created the CloudFormation stack, close this notebook, [create the CloudFormation stack using the AWS management console](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-console-create-stack.html), and open this notebook in the SageMaker Notebook instance created via the CloudFormation stack.

## Prerequisites

To avoid running out of disk space while building Docker container images later in this notebook, configure a new Docker overlay file-system by executing the following steps.

#### Create new Docker overlay file-system directory:

    sudo mkdir -p /home/ec2-user/SageMaker/docker-overlay
    sudo chown root:root /home/ec2-user/SageMaker/docker-overlay

#### Update the Docker overlay file-system configuration 

Add the following content to `/etc/docker/daemon.json`:

    {
      "data-root": "/home/ec2-user/SageMaker/docker-overlay",
      "runtimes": {
          "nvidia": {
              "path": "nvidia-container-runtime",
              "runtimeArgs": []
          }
      }
    }
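
If `/etc/docker/daemon.json` already holds other settings, they should survive the change. The following is a minimal sketch of such a merge, assuming the existing file is valid JSON. The helper name `merge_docker_data_root` is hypothetical, and writing the result back to `/etc/docker/daemon.json` would still require root privileges.

```python
import json


def merge_docker_data_root(existing_json: str, data_root: str) -> str:
    """Return daemon.json text with "data-root" set, keeping all other keys."""
    # An empty or missing file is treated as an empty configuration
    config = json.loads(existing_json) if existing_json.strip() else {}
    config["data-root"] = data_root
    return json.dumps(config, indent=2)


# Example: an existing configuration that only declares the nvidia runtime
existing = '{"runtimes": {"nvidia": {"path": "nvidia-container-runtime", "runtimeArgs": []}}}'
merged = merge_docker_data_root(existing, "/home/ec2-user/SageMaker/docker-overlay")
print(merged)
```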

#### Restart Docker daemon

    sudo systemctl stop docker
    sudo systemctl start docker
    sudo systemctl status docker.service


## Initialize SageMaker Session

Let us specify the `s3_bucket` and `s3_prefix` that we will use throughout the notebook. The `s3_bucket` and `s3_prefix` must be the same as the `S3BucketName` and `FSxS3ImportPrefix` CloudFormation stack parameters, respectively.

In [None]:
!pip3 install sagemaker

import boto3
import sagemaker
from sagemaker import get_execution_role

s3_bucket = None # must be same as CloudFormation parameter S3BucketName
s3_prefix = 'sagemaker' # must be same as CloudFormation parameter FSxS3ImportPrefix

role = get_execution_role() # you may provide a pre-existing role ARN here
print(f"SageMaker Execution Role: {role}")

session = boto3.session.Session()
aws_region = session.region_name
print(f"AWS Region: {aws_region}")

sagemaker_session = sagemaker.session.Session(boto_session=session)

try:
    s3_client = boto3.client('s3')
    response = s3_client.get_bucket_location(Bucket=s3_bucket)
    bucket_region = response['LocationConstraint']
    bucket_region = 'us-east-1' if bucket_region is None else bucket_region
    
    print(f"Bucket region: {bucket_region}")
    
    s3_client.head_object(Bucket=s3_bucket, Key=f"{s3_prefix}/")
    print(f"Using S3 folder: s3://{s3_bucket}/{s3_prefix}/ in this notebook")
except Exception:
    print(f"Access Error: Check if '{s3_bucket}' S3 bucket is in '{aws_region}' region, and {s3_prefix} path exists")

sts = boto3.client("sts")
aws_account_id = sts.get_caller_identity()["Account"]

print(f"AWS Account Id: {aws_account_id}")
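
The cell above prints the bucket region but does not compare it with the notebook's region. A minimal sketch of that comparison follows, assuming the `LocationConstraint` value has already been read from `get_bucket_location`. The helper names are hypothetical; the handling of `None` (us-east-1) and of the legacy `EU` value (eu-west-1) reflects how S3 reports those regions.

```python
from typing import Optional


def bucket_region_from_constraint(location_constraint: Optional[str]) -> str:
    """Map an S3 LocationConstraint value to an AWS region name."""
    if location_constraint is None:
        return "us-east-1"  # S3 reports no constraint for us-east-1 buckets
    if location_constraint == "EU":
        return "eu-west-1"  # legacy alias that S3 may return for eu-west-1
    return location_constraint


def check_bucket_region(location_constraint: Optional[str], notebook_region: str) -> str:
    """Raise ValueError when the bucket is not in the notebook's region."""
    bucket_region = bucket_region_from_constraint(location_constraint)
    if bucket_region != notebook_region:
        raise ValueError(
            f"Bucket is in '{bucket_region}', but the notebook runs in '{notebook_region}'"
        )
    return bucket_region


print(check_bucket_region("us-west-2", "us-west-2"))
```

In the notebook, this could be called as `check_bucket_region(response['LocationConstraint'], aws_region)`.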

## Discover Attached EFS File-system

When you create this SageMaker Notebook instance via the [cfn-sagemaker-notebook.yaml](./cfn-sagemaker-notebook.yaml) CloudFormation script, an EFS file-system is automatically created and attached to this SageMaker Notebook instance. Below, we discover the EFS file-system attached to this SageMaker Notebook instance.

In [None]:
import re
notebook_attached_efs=!df -kh | grep 'fs-' | sed 's/\(fs-[0-9a-z]*\)\.efs\..*/\1/'

efs_enabled = False
if notebook_attached_efs and re.match(r'fs-[0-9a-z]+', notebook_attached_efs[0]):
    efs_enabled=True
    print(f"SageMaker notebook has attached EFS: {notebook_attached_efs}")
else:
    print("No EFS file-system is attached to this notebook")
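
The `grep` and `sed` pipeline above can also be expressed in plain Python, which makes the matching rule easier to test. This is a sketch under the assumption that the EFS mount appears in the `df` output as `fs-<id>.efs.<region>...`. The sample output and the mount path in it are made up for illustration, and `find_efs_id` is a hypothetical helper.

```python
import re
from typing import Optional

# Matches the file-system id in an EFS DNS name such as fs-0123.efs.us-west-2.amazonaws.com
EFS_ID_PATTERN = re.compile(r"\b(fs-[0-9a-z]+)\.efs\.")


def find_efs_id(df_output: str) -> Optional[str]:
    """Return the first EFS file-system id found in `df` output, or None."""
    for line in df_output.splitlines():
        match = EFS_ID_PATTERN.search(line)
        if match:
            return match.group(1)
    return None


# Illustrative `df -kh` output; values are placeholders
sample_df = (
    "Filesystem      Size  Used Avail Use% Mounted on\n"
    "/dev/nvme0n1p1  135G   90G   46G  67% /\n"
    "fs-0123456789abcdef0.efs.us-west-2.amazonaws.com:/  8.0E  1.2G  8.0E   1% /mnt/efs\n"
)
print(find_efs_id(sample_df))
```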


## Discover Attached FSx for Lustre File-system

When you create this SageMaker Notebook instance via the [cfn-sagemaker-notebook.yaml](./cfn-sagemaker-notebook.yaml) CloudFormation script, an FSx for Lustre file-system is automatically created and attached to this SageMaker Notebook instance, and `s3://S3BucketName/FSxS3ImportPrefix` is automatically imported to the FSx for Lustre file-system. Below, we discover the FSx for Lustre file-system attached to this SageMaker Notebook instance.

In [None]:
import boto3

def fsx_file_systems(fsx_client):
    """Generator for listing FSx file systems"""

    next_token = None
    while True:
        if next_token:
            resp = fsx_client.describe_file_systems(NextToken=next_token)
        else:
            resp = fsx_client.describe_file_systems()
            
        file_systems = resp['FileSystems']
        for fs in file_systems:
            yield fs

        try:
            next_token = resp['NextToken']
        except KeyError:
            break

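fsx_file_system_id = None
fsx_mount_name = None

notebook_attached_fsx = !df -kh | grep '@tcp:/' \
    | sed 's/\([0-9a-zA-Z\.]*\)@tcp:\/\([a-zA-Z0-9]*\).*/\1 \2/'

# Guard against an empty result when no Lustre file-system is mounted
if notebook_attached_fsx and len(notebook_attached_fsx[0].split()) > 1:
    fsx_mount_name = notebook_attached_fsx[0].split()[1]

fsx_client = boto3.client("fsx")

if fsx_mount_name:
    for fsx_fs in fsx_file_systems(fsx_client):
        # Only FSx for Lustre file systems include a LustreConfiguration
        mount_name = fsx_fs.get('LustreConfiguration', {}).get('MountName')
        if mount_name == fsx_mount_name:
            fsx_file_system_id = fsx_fs['FileSystemId']
            break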
        
if fsx_file_system_id:
    print(f"FSx for Lustre file-system is attached: {fsx_file_system_id}")
else:
    print("No FSx for Lustre file-system is attached")
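
The `NextToken` pagination used by the generator above can be exercised without AWS access. The sketch below uses a hypothetical `FakeFsxClient` that serves two canned pages in place of `boto3.client("fsx")`. The file-system ids and mount name are placeholders, and `iter_file_systems` and `find_lustre_id` are illustrative names rather than part of this notebook.

```python
class FakeFsxClient:
    """Hypothetical stand-in for boto3.client('fsx') that serves two pages."""

    def __init__(self):
        self._pages = {
            None: {
                "FileSystems": [
                    {
                        "FileSystemId": "fs-aaa",
                        "FileSystemType": "LUSTRE",
                        "LustreConfiguration": {"MountName": "abcdefgh"},
                    }
                ],
                "NextToken": "page-2",
            },
            "page-2": {
                "FileSystems": [{"FileSystemId": "fs-bbb", "FileSystemType": "WINDOWS"}]
            },
        }

    def describe_file_systems(self, **kwargs):
        return self._pages[kwargs.get("NextToken")]


def iter_file_systems(fsx_client):
    """Yield every file system, following NextToken until it is absent."""
    kwargs = {}
    while True:
        resp = fsx_client.describe_file_systems(**kwargs)
        yield from resp["FileSystems"]
        next_token = resp.get("NextToken")
        if not next_token:
            break
        kwargs["NextToken"] = next_token


def find_lustre_id(fsx_client, mount_name):
    """Return the id of the Lustre file system with the given mount name."""
    for fs in iter_file_systems(fsx_client):
        # Non-Lustre file systems carry no LustreConfiguration entry
        if fs.get("LustreConfiguration", {}).get("MountName") == mount_name:
            return fs["FileSystemId"]
    return None


print(find_lustre_id(FakeFsxClient(), "abcdefgh"))
```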


## Define SageMaker File-system Channels

Next, define SageMaker data channels for the FSx for Lustre and EFS file-systems.

### Define Amazon FSx Lustre Data Channel

Next, we define the *fsx* data channel using the FSx for Lustre file-system.

In [None]:
from sagemaker.inputs import FileSystemInput

data_channels = None

if fsx_file_system_id:
    file_system_type = 'FSxLustre'
    file_system_access_mode = 'rw'
    
    # file_system_directory_path below must match the FSxS3ImportPrefix parameter value
    # in the CloudFormation stack you used to create this notebook server. Default value is: 'sagemaker'
    file_system_directory_path = "sagemaker"
    
    fsx = FileSystemInput(file_system_id=fsx_file_system_id,
                           file_system_type=file_system_type,
                           directory_path=f"/{fsx_mount_name}/{file_system_directory_path}",
                           file_system_access_mode=file_system_access_mode)

    data_channels = {'fsx': fsx}
    print(data_channels)
else:
    print("Fatal Error: FSx for Lustre file-system is not available.")
    print("Did you create the SageMaker Notebook via the CloudFormation stack?")

### Define Amazon EFS Data Channel 

Next, we define the *efs* data channel using the EFS file-system.

In [None]:
from sagemaker.inputs import FileSystemInput

if efs_enabled:
    # Specify EFS file system id.
    efs_file_system_id = notebook_attached_efs[0]
    print(f"EFS file-system-id: {efs_file_system_id}")

    file_system_access_mode = "rw"

    # Specify your file system type
    file_system_type = "EFS"

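    efs = FileSystemInput(
        file_system_id=efs_file_system_id,
        file_system_type=file_system_type,
        directory_path="/",
        file_system_access_mode=file_system_access_mode,
    )

    # data_channels is None if the FSx for Lustre channel was not defined above
    data_channels = data_channels or {}
    data_channels['efs'] = efs
    print(data_channels)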
else:
    print("Fatal Error: EFS file-system is not available.")
    print("Did you create the SageMaker Notebook via the CloudFormation stack?")
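
Each file-system channel ends up as a `FileSystemDataSource` entry in the training job's input data configuration. The following sketch builds that structure in plain Python so the values can be checked before a job is submitted. The helper names and the sample ids are hypothetical, and the validation rules are assumptions drawn from the values used in this notebook.

```python
def fsx_directory_path(mount_name: str, import_prefix: str) -> str:
    """Build the Lustre directory path: /<mount name>/<import prefix>."""
    return f"/{mount_name}/{import_prefix.strip('/')}"


def file_system_channel(channel_name, file_system_id, file_system_type,
                        directory_path, access_mode="ro"):
    """Return a channel dict shaped like a CreateTrainingJob input channel."""
    if file_system_type not in ("EFS", "FSxLustre"):
        raise ValueError(f"Unsupported file system type: {file_system_type}")
    if access_mode not in ("ro", "rw"):
        raise ValueError(f"Unsupported access mode: {access_mode}")
    if not directory_path.startswith("/"):
        raise ValueError("directory_path must be absolute")
    return {
        "ChannelName": channel_name,
        "DataSource": {
            "FileSystemDataSource": {
                "FileSystemId": file_system_id,
                "FileSystemType": file_system_type,
                "DirectoryPath": directory_path,
                "FileSystemAccessMode": access_mode,
            }
        },
    }


# Placeholder id and mount name, for illustration only
channel = file_system_channel(
    "fsx", "fs-0123456789abcdef0", "FSxLustre",
    fsx_directory_path("abcdefgh", "sagemaker"), access_mode="rw",
)
print(channel["DataSource"]["FileSystemDataSource"]["DirectoryPath"])
```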

## Select Security Group and Subnet

Here, we automatically select a VPC security group and subnet created via the CloudFormation stack. We will use these to run our processing and training jobs, so they have access to the EFS and FSx for Lustre file-systems available via the VPC subnet.

**Note:** For maximum performance, by default, we choose the single subnet used by the FSx for Lustre file-system. However, if that subnet does not have the Amazon EC2 instance capacity for you to launch your processing or training jobs, you may edit the `subnets` variable below to specify your subnet list. To see all available subnets, see `Subnets` under the `Outputs` tab of your CloudFormation stack. To use `trn1` instances, you must specify only one subnet in the `subnets` list.

In [None]:
import boto3

security_group_ids=None
subnets=None

if fsx_file_system_id:
    fsx_client = boto3.client("fsx")
    ec2_client = boto3.client('ec2')
    
    response = fsx_client.describe_file_systems(FileSystemIds=[fsx_file_system_id])
    file_system=response['FileSystems'][0]
    subnets = file_system['SubnetIds']
    network_interface_ids = file_system['NetworkInterfaceIds']
         
    response = ec2_client.describe_network_interfaces(
        NetworkInterfaceIds=network_interface_ids)
    network_interface = response['NetworkInterfaces'][0]
    groups = network_interface['Groups']
    security_group_ids = [ x['GroupId'] for x in groups ]
   
subnets = list(set(subnets)) if isinstance(subnets, list) else None
security_group_ids = list(set(security_group_ids)) if isinstance(security_group_ids, list) \
                        else None

print(f"Subnets: {subnets}")
print(f"Security groups: {security_group_ids}")
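
The cell above reads security groups from the first network interface only. If the file-system's interfaces ever carried different groups, a union across all of them would be needed. A minimal sketch of that union follows, operating on data shaped like the `NetworkInterfaces` list returned by `describe_network_interfaces`. The helper name and the sample ids are hypothetical.

```python
def security_groups_from_interfaces(network_interfaces):
    """Collect unique security group ids across all interfaces, in first-seen order."""
    group_ids = []
    for interface in network_interfaces:
        for group in interface.get("Groups", []):
            group_id = group["GroupId"]
            if group_id not in group_ids:
                group_ids.append(group_id)
    return group_ids


# Illustrative response fragment with placeholder ids
sample_interfaces = [
    {"NetworkInterfaceId": "eni-1", "Groups": [{"GroupId": "sg-a", "GroupName": "fsx"}]},
    {"NetworkInterfaceId": "eni-2", "Groups": [{"GroupId": "sg-a"}, {"GroupId": "sg-b"}]},
]
print(security_groups_from_interfaces(sample_interfaces))
```

Keeping first-seen order, rather than passing the ids through `set`, makes the printed result stable between runs.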

## Specify Pipeline Spec

We specify a *pipeline spec* file that we wish to build and execute in this notebook. Each pipeline spec has a `name`, a `version`, a relative path to its `container` directory, and a list of pipeline `steps`. Each step in the pipeline has a `name`, a `description`, and a `config` file that defines the step.

For example, we specify [neuronx_nemo_megatron/llama2_7b/pipeline.yaml](./examples/neuronx_nemo_megatron/llama2_7b/pipeline.yaml) pipeline, below. Complete the [prerequisites](./examples/neuronx_nemo_megatron/llama2_7b/README.md) for this pipeline before proceeding further.

In [None]:
import yaml
import json

pipeline_file="examples/neuronx_nemo_megatron/llama2_7b/pipeline.yaml"
with open(pipeline_file, "r") as pf:
    pipeline_spec=yaml.safe_load(pf)

print(json.dumps(pipeline_spec, indent=2))
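
Before building anything, it can help to check that the loaded spec has the fields this notebook reads later. The following is a sketch of such a check on a plain dictionary. `validate_pipeline_spec` is a hypothetical helper, the sample values are placeholders, and the rule that `steps` must be non-empty is an assumption.

```python
def validate_pipeline_spec(spec: dict) -> list:
    """Return a list of problems found in a pipeline spec (empty if none)."""
    pipeline = spec.get("pipeline")
    if not isinstance(pipeline, dict):
        return ["missing 'pipeline' object"]

    errors = []
    for key in ("name", "version"):
        if not pipeline.get(key):
            errors.append(f"pipeline.{key} is required")
    if not (pipeline.get("container") or pipeline.get("image")):
        errors.append("either pipeline.container or pipeline.image is required")

    steps = pipeline.get("steps") or []
    if not steps:
        errors.append("pipeline.steps must list at least one step")
    for index, step in enumerate(steps):
        for key in ("name", "description", "config"):
            if not step.get(key):
                errors.append(f"pipeline.steps[{index}].{key} is required")
    return errors


# Placeholder spec, for illustration only
sample_spec = {
    "pipeline": {
        "name": "example",
        "version": "1",
        "container": "example-container",
        "steps": [{"name": "train", "description": "Train model", "config": "train.yaml"}],
    }
}
print(validate_pipeline_spec(sample_spec))
```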

## Maybe Build and Push Pipeline Container Image to ECR

Next, if `pipeline.container` is specified, we build the pipeline spec container image and push it to Amazon ECR. The first build on this notebook instance may take several minutes. The build log is written to `build.log` in the pipeline spec container directory.

**Note:** Either `pipeline.image` or `pipeline.container` must be specified.

In [None]:
%%time
import os, subprocess, stat

container_spec = pipeline_spec['pipeline'].get('container', None)
if container_spec:
    container_path = os.path.join("containers", container_spec)
    with open(os.path.join(container_path, "build.log"), "w") as logfile:
        print(f"Building and pushing {container_path} to ECR; see log file: {container_path}/build.log")
        container_build_script = os.path.join(container_path, "build_tools", "build_and_push.sh")

        # Make sure the build script is executable before running it
        st = os.stat(container_build_script)
        os.chmod(container_build_script, st.st_mode | stat.S_IEXEC)
        subprocess.check_call([container_build_script, aws_region], stdout=logfile, stderr=subprocess.STDOUT)

        image_tag = !cat {container_path}/build_tools/set_env.sh \
            | grep 'IMAGE_TAG' | sed 's/.*IMAGE_TAG=\(.*\)/\1/'

        image_name = !cat {container_path}/build_tools/set_env.sh \
            | grep 'IMAGE_NAME' | sed 's/.*IMAGE_NAME=\(.*\)/\1/'

        ecr_image_uri=f"{aws_account_id}.dkr.ecr.{aws_region}.amazonaws.com/{image_name[0]}:{image_tag[0]}"
else:
    ecr_image_uri = pipeline_spec['pipeline'].get('image', None)

assert ecr_image_uri, "No 'container' or 'image' specified in pipeline spec"
print(ecr_image_uri)
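
The image name and tag are read from `set_env.sh` with `grep` and `sed`, which would also match any other line that merely mentions `IMAGE_TAG` or `IMAGE_NAME`. A stricter parse in Python is sketched below, assuming the script contains assignments of the form `IMAGE_NAME=...` or `export IMAGE_NAME=...`. The sample script text, account id, and helper names are hypothetical.

```python
import re
from typing import Optional


def read_shell_var(script_text: str, name: str) -> Optional[str]:
    """Return the value assigned to `name` in a shell script, or None."""
    pattern = re.compile(rf"^\s*(?:export\s+)?{re.escape(name)}=(.*)$", re.MULTILINE)
    match = pattern.search(script_text)
    if not match:
        return None
    # Drop surrounding whitespace and quotes around the value
    return match.group(1).strip().strip("'\"")


def ecr_uri(account_id: str, region: str, image_name: str, image_tag: str) -> str:
    """Compose the ECR image URI in the same form used by this notebook."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{image_name}:{image_tag}"


# Illustrative set_env.sh content; the real file may differ
sample_env = "#!/bin/bash\nexport IMAGE_NAME=my-pipeline-image\nexport IMAGE_TAG='v1.0'\n"
name = read_shell_var(sample_env, "IMAGE_NAME")
tag = read_shell_var(sample_env, "IMAGE_TAG")
print(ecr_uri("123456789012", "us-west-2", name, tag))
```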


## Maybe Download HuggingFace Model Snapshot

If a `huggingface` object is specified in the pipeline spec, the HuggingFace model snapshot with revision `huggingface.revision` is downloaded and uploaded to the `s3_bucket` under the `s3_prefix`. If `huggingface.tensors` is not set, or is set to `false`, model tensors are **not** downloaded from the model snapshot.

**Note: You must specify `token` below if the `huggingface.model` requires a HuggingFace token for downloading from the HuggingFace hub.**

In [None]:
import os
import subprocess

hf_spec = pipeline_spec.get('huggingface', None)
hf_model_id = None
hf_model_revision = None

if hf_spec:
    token = None # Specify HuggingFace token, if required by model
   
    hf_model_id = hf_spec.get('model', None)
    assert hf_model_id, "'huggingface.model' is required"

    hf_model_revision = hf_spec.get('revision', None)
    assert hf_model_revision, "'huggingface.revision' is required"

    tensors = hf_spec.get('tensors', None)
    if not tensors:
        print(f"Downloading {hf_model_id}:{hf_model_revision} from HuggingFace without tensors")
    else:
        print(f"Downloading {hf_model_id}:{hf_model_revision} from HuggingFace Hub with tensors")
   
    s3_model_prefix = f"{s3_prefix}/huggingface/models/{hf_model_id}/{hf_model_revision}"  # folder where model checkpoint will go
    print(f"s3_model_prefix: {s3_model_prefix}")

    try:
        s3_client.head_object(Bucket=s3_bucket, Key=f"{s3_model_prefix}/config.json")
        print(f"Skipping download; HuggingFace model already exists at s3://{s3_bucket}/{s3_model_prefix}/")
    except Exception:
        subprocess.check_output("pip install huggingface-hub", shell=True, stderr=subprocess.STDOUT)
        from huggingface_hub import snapshot_download
        from tempfile import TemporaryDirectory
        from pathlib import Path

        print(f"Downloading HuggingFace model snapshot: {hf_model_id}, revision: {hf_model_revision}")
        with TemporaryDirectory(suffix="model", prefix="hf", dir=".") as cache_dir:
            ignore_patterns = ["*.msgpack", "*.h5"] if tensors else [ "*.msgpack", "*.h5", "*.bin", "*.safetensors"]
            # snapshot_download returns the local path of the downloaded snapshot,
            # which also works when the revision is a branch or tag name
            model_snapshot_path = snapshot_download(repo_id=hf_model_id,
                revision=hf_model_revision,
                cache_dir=cache_dir,
                ignore_patterns=ignore_patterns,
                token=token)
            print(f"model_snapshot_path: {model_snapshot_path}")

            print("Uploading snapshot to S3...")
            for root, dirs, files in os.walk(model_snapshot_path):
                for file in files:
                    full_path = os.path.join(root, file)
                    with open(full_path, 'rb') as data:
                        key = f"{s3_model_prefix}/{full_path[len(model_snapshot_path)+1:]}"
                        s3_client.upload_fileobj(data, s3_bucket, key)

            print("Snapshot uploaded to S3.")
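
To see which files the `tensors` setting keeps or drops, and how each local file maps to an S3 key, the selection logic can be tried on a list of file names. This sketch approximates the pattern filtering with `fnmatch` from the standard library, which is an assumption about how the patterns behave. The helper names, file names, and prefix are illustrative.

```python
import fnmatch
import os
import posixpath

ALWAYS_IGNORED = ["*.msgpack", "*.h5"]
WEIGHT_PATTERNS = ["*.bin", "*.safetensors"]


def ignore_patterns_for(tensors: bool) -> list:
    """Mirror the notebook's choice of ignore patterns."""
    return list(ALWAYS_IGNORED) if tensors else ALWAYS_IGNORED + WEIGHT_PATTERNS


def files_to_upload(filenames, tensors: bool) -> list:
    """Return the file names that survive the ignore patterns."""
    patterns = ignore_patterns_for(tensors)
    return [f for f in filenames
            if not any(fnmatch.fnmatchcase(f, p) for p in patterns)]


def s3_key(s3_model_prefix: str, snapshot_root: str, full_path: str) -> str:
    """Build the S3 key for a file located under the snapshot directory."""
    relative = os.path.relpath(full_path, snapshot_root)
    # S3 keys always use forward slashes, regardless of the local OS
    return posixpath.join(s3_model_prefix, *relative.split(os.sep))


sample_files = ["config.json", "tokenizer.json", "model-00001-of-00002.safetensors",
                "pytorch_model.bin", "flax_model.msgpack", "tf_model.h5"]
print(files_to_upload(sample_files, tensors=False))
```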

## Pipeline Steps

Next, we process the steps in the pipeline spec. Each step in the pipeline spec has its `config` defined in a YAML file. We use each step's `config` file to generate a step script, which is launched at run time by the [launch.py](./launch.py) script. For each step, we define a [SageMaker Pipeline](https://docs.aws.amazon.com/sagemaker/latest/dg/pipelines-sdk.html) step, as shown below.

In [None]:
import os
import yaml
from script_builder import build_script
import shutil
from sagemaker.workflow.pipeline_context import PipelineSession
from sagemaker.workflow.steps import TrainingStep
from sagemaker.estimator import Estimator

pipeline_name = pipeline_spec['pipeline']['name']
pipeline_version = pipeline_spec['pipeline']['version']
release_name=f"{pipeline_name}-{pipeline_version}"

pipeline_dir = os.path.dirname(os.path.abspath(pipeline_file))
pipeline_step_specs = pipeline_spec['pipeline']['steps']

s3_output_path = f"s3://{s3_bucket}/{s3_prefix}/{release_name}"
environment = {"RELEASE_NAME": release_name}
if hf_model_id:
    environment["HF_MODEL_ID"] = hf_model_id
if hf_model_revision:
    environment["HF_MODEL_REVISION"] = hf_model_revision

shutil.copy("launch.py", pipeline_dir)

template_file = "train_script.j2"

pipeline_steps = []

for pipeline_step_spec in pipeline_step_specs:
    
    pipeline_step_config = pipeline_step_spec['config']
    print(f"Processing {pipeline_step_config}")

    # safe load yaml config file
    with open(os.path.join(pipeline_dir, pipeline_step_config), "r") as config_file:
        config = yaml.safe_load(config_file)

        pipeline_step_script = pipeline_step_config.replace(".yaml", ".sh")
        build_script(template_path="templates", template_file=template_file, 
            output_file=os.path.join(pipeline_dir, pipeline_step_script),
            **config)
        
        train_config = config.get('train') or {}  # tolerate a missing 'train' section
        distribution = train_config.get('distribution', None)
        hyperparameters = {"script_file": pipeline_step_script, 
                           "torch_distributed": "true" if distribution == "torch" else "false"}
        max_run = train_config.get('max_run', 360000)
        resources = config.get('resources', {})
        step_estimator = Estimator(
            image_uri=ecr_image_uri,
            role=role,
            entry_point="launch.py",
            source_dir=pipeline_dir,
            instance_count=resources.get('instance_count', 1),
            instance_type=resources.get('instance_type', "ml.m5.4xlarge"), 
            volume_size=resources.get('volume_size', 200),
            hyperparameters=hyperparameters,
            environment=environment,
            max_run=max_run,
            distribution = distribution if isinstance(distribution, dict) else None,
            subnets=subnets,
            security_group_ids=security_group_ids,
            output_path=s3_output_path
        )

        training_step = TrainingStep(
            name=pipeline_step_spec['name'],
            description=pipeline_step_spec['description'],
            estimator=step_estimator,
            inputs=data_channels,
            depends_on=pipeline_steps[-1:].copy(),
        )
        pipeline_steps.append(training_step)
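
The way a step's YAML config turns into estimator settings can be isolated in a small function that needs no AWS access. The sketch below mirrors the defaults used in the loop above. `step_settings` is a hypothetical helper, and the sample config values are placeholders.

```python
def step_settings(step_config_file: str, config: dict) -> dict:
    """Derive estimator settings from a step config, using the notebook's defaults."""
    train_config = config.get("train") or {}
    resources = config.get("resources") or {}
    distribution = train_config.get("distribution")
    return {
        "hyperparameters": {
            # The generated step script has the config file's name with a .sh suffix
            "script_file": step_config_file.replace(".yaml", ".sh"),
            "torch_distributed": "true" if distribution == "torch" else "false",
        },
        # Only dictionary-valued distributions are passed to the estimator
        "distribution": distribution if isinstance(distribution, dict) else None,
        "instance_count": resources.get("instance_count", 1),
        "instance_type": resources.get("instance_type", "ml.m5.4xlarge"),
        "volume_size": resources.get("volume_size", 200),
        "max_run": train_config.get("max_run", 360000),
    }


sample_config = {
    "train": {"distribution": "torch"},
    "resources": {"instance_count": 4, "instance_type": "ml.trn1.32xlarge"},
}
print(step_settings("pretrain.yaml", sample_config))
```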



## Create SageMaker Pipeline Definition

Now that we have created all the pipeline steps, we are ready to create the SageMaker Pipeline definition.

In [None]:
import json
from sagemaker.workflow.pipeline import Pipeline

pipeline = Pipeline(
    name=release_name,
    steps=pipeline_steps
)
pipeline.upsert(role_arn=role)

definition = json.loads(pipeline.definition())
print(json.dumps(definition, indent=2))
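
Because each step depends on the one before it, the printed definition should describe a linear chain. A short sketch for checking this from the definition JSON follows, assuming each entry under `Steps` has a `Name` and an optional `DependsOn` list. The sample definition, its step names, and the helper names are hypothetical.

```python
import json


def step_dependencies(definition_json: str) -> dict:
    """Map each step name to the list of step names it depends on."""
    definition = json.loads(definition_json)
    return {step["Name"]: list(step.get("DependsOn", []))
            for step in definition.get("Steps", [])}


def is_linear_chain(dependencies: dict) -> bool:
    """True when every step depends only on the step listed just before it."""
    names = list(dependencies)
    return all(
        dependencies[name] == ([names[index - 1]] if index else [])
        for index, name in enumerate(names)
    )


# Trimmed-down, illustrative definition
sample_definition = json.dumps({
    "Version": "2020-12-01",
    "Steps": [
        {"Name": "pretokenize", "Type": "Training", "Arguments": {}},
        {"Name": "train", "Type": "Training", "Arguments": {}, "DependsOn": ["pretokenize"]},
    ],
})
deps = step_dependencies(sample_definition)
print(deps, is_linear_chain(deps))
```

In the notebook, `step_dependencies(pipeline.definition())` would be the corresponding call.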

## Start Pipeline Execution

Having defined the pipeline, now we can start a pipeline run.

In [None]:
execution = pipeline.start()

## Describe Pipeline Execution

Next, we describe the pipeline execution.

In [None]:
execution.describe()

## List Pipeline Execution Steps

Next, we list the pipeline execution steps. You can run this cell at any time after starting the pipeline execution to get the latest status of the pipeline steps.

In [None]:
execution.list_steps()
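
The list returned above can be long for multi-step pipelines. A minimal sketch for condensing it follows, assuming each entry is a dictionary with `StepName`, `StepStatus`, and, for failed steps, `FailureReason`. The helper name and the sample entries are illustrative.

```python
from collections import Counter


def summarize_steps(steps: list) -> dict:
    """Count steps by status and collect failure reasons by step name."""
    counts = Counter(step["StepStatus"] for step in steps)
    failed = {
        step["StepName"]: step.get("FailureReason", "")
        for step in steps
        if step["StepStatus"] == "Failed"
    }
    return {"counts": dict(counts), "failed": failed}


# Placeholder entries shaped like pipeline execution steps
sample_steps = [
    {"StepName": "pretokenize", "StepStatus": "Succeeded"},
    {"StepName": "train", "StepStatus": "Failed", "FailureReason": "ClientError: example"},
]
print(summarize_steps(sample_steps))
```

Steps that have not started yet do not appear in the list, so an empty `failed` entry does not by itself mean the pipeline has finished.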

## Conclusion

This concludes the notebook. If you wish, you may create and try your own examples.

When you delete the CloudFormation stack, all resources created by the stack are deleted, except for the EFS file-system. You can reuse the EFS file-system the next time you create a new stack, and the content previously stored on it will be available in your new stack.

The content stored on the FSx for Lustre file-system is automatically exported to the S3 bucket. Please verify your content from FSx for Lustre file-system has been successfully exported to your S3 bucket, before you delete the stack.

