# IAM Role

In this notebook, we review the concept of IAM role and set up one to be used for later notebooks in `/sagemaker-fundamentals`. An IAM role is an identity associated with your AWS account that has pre-configured permission policies that determine what this role can do and with respect to your AWS resources. For example, you can define a role that can do *everything* to your S3 resource but nothing else. Such an role is useful if it is *assumed* by a program that synchronize your data from your local machine to an S3 bucket. You can be certain that this program would not accidently create EC2 instances that incur higher costs. 

One application of IAM role is: it grants permissions to AWS services (e.g. SageMaker) to procure resources on your behalf. When you use an AWS service(e.g. SageMaker), you can define a role that the service can assume on your behalf to access the AWS resources. For example, SageMaker service needs to access S3 buckets, EC2 instances, Elastic Container Registries etc. To avoid incurring too much compute cost, you can define a role that is able to create only low cost EC2 instances. When SageMaker assumes this role and runs your ML workflow, you can estimate the upper bound of the compute cost based on the EC2 policy of the role. 

For more extensive readings on IAM role, refer to the [AWS documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_terms-and-concepts.html#iam-term-role-chaining)

## Environments to run this notebook

1. If you are running this notebook from an EC2 instance, then you need to make sure `AWS_PROFILE` environment variable is set to `default`. 

2. If you are running this notebook on your local machine, you will need to install and configure aws command line interface. Follow [this link](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html) to do so. 

I do not recommend to run it on an SageMaker Notebook Instance or Studio, because they automatically created an IAM role for you and the whole point of this notebook is to create one manually from scratch. 

Moreover, we will use `boto3` to interface with AWS API, make sure you have it installed. 

In [154]:
!pip install -Uq boto3 

## Create an IAM Role

When you create an IAM role you need to specify
1. Which AWS entities (users or services) you trust to assume this role
2. What permissions this role has

1 is determined by a *trust policy* and 2 is determined by a *permission* policy. 

The entity that you entrust to assume the role are refered as *Principal* 

In [84]:
import boto3  # your python gate way to all aws services
import pprint # print readable dictionary
import json

pp = pprint.PrettyPrinter(indent=1)
iam = boto3.client('iam')

In [85]:
# get the ARN of the user
user_arn = boto3.client('sts').get_caller_identity()['Arn']

def create_execution_role(role_name="basic-role"):
    """Create an service role to procure services on your behalf
    
    Args:
        role_name (str): name of the role
    
    Return:
        dict
    """    
    # if the role already exists, delete it
    
    # Note: you need to make sure the role is not
    # used in production, because the code below
    # will delete the role and create a new one
    role = None
    for rol in iam.list_roles()['Roles']:
        if rol['RoleName'] == role_name:
            # detach policy from the role before deleting it
            role = boto3.resource('iam').Role(role_name)
            
            for p in role.attached_policies.all():
                role.detach_policy(PolicyArn=p.arn)
            break
    
    # Trust policy document
    trust_relation_policy_doc = {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": user_arn, # Allow user to take this role
            "Service": [
              "sagemaker.amazonaws.com" # Allow SageMaker to take the role
            ],
          },
          "Action": "sts:AssumeRole",
        }
      ]
    }
    
    if role is not None:
        iam.delete_role(RoleName=role.name)
    
    res = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(trust_relation_policy_doc)
    )
    return res

The trust policy above says we entrust the user of current boto3 session (most likely yourself) and SageMaker to assume this role. 

In [86]:
role_res = create_execution_role()
pp.pprint(role_res)

{'ResponseMetadata': {'HTTPHeaders': {'content-length': '861',
                                      'content-type': 'text/xml',
                                      'date': 'Fri, 26 Feb 2021 23:17:52 GMT',
                                      'x-amzn-requestid': 'f058c3a1-5a78-415f-b867-9afe133dd349'},
                      'HTTPStatusCode': 200,
                      'RequestId': 'f058c3a1-5a78-415f-b867-9afe133dd349',
                      'RetryAttempts': 0},
 'Role': {'Arn': 'arn:aws:iam::688520471316:role/basic-role',
          'AssumeRolePolicyDocument': {'Statement': [{'Action': 'sts:AssumeRole',
                                                      'Effect': 'Allow',
                                                      'Principal': {'AWS': 'arn:aws:iam::688520471316:user/hongshan',
                                                                    'Service': ['sagemaker.amazonaws.com']}}],
                                       'Version': '2012-10-17'},
          'CreateDa

Now, let's give the role some permissions. The dictionary below is an example of policy permission. It says: allow the role to list buckets under the AWS account. 

In [None]:
basic_s3_permission = {
    "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:ListBuckets" # service: API
                ],
                "Resource": [
                    "arn:aws:s3:::*" 
                ]
            }
        ]
    }

In [147]:
def attach_permission(role_name, policy_name, policy_doc):
    """Attach a basic permission policy to the role"""

    # Create the policy
    # If the policy with policy name $policy_name already exists,
    # then we need to delete it first
    
    # Note: you need to make sure that you do not have a policy 
    # with $policy_name in production, because we will delete it
    # and create a new one with the policy document given by 
    # $policy_doc
    
    policy = None
    for p in iam.list_policies()['Policies']:
        if p['PolicyName']==policy_name:
            # Before we delete the policy, we need to detach it
            # from all IAM entities 
            policy = boto3.resource('iam').Policy(p['Arn'])
            
            # 1. detach from all groups
            for grp in policy.attached_groups.all():
                policy.detach_group(GroupName=grp.name)
                
            # 2. detach from all users
            for usr in policy.attached_users.all():
                policy.detach_user(UserName=usr.name)
            
            # 3. detach from all roles
            for rol in policy.attached_roles.all():
                policy.detach_role(RoleName=rol.name)
                
            break
    
    if policy is not None:
        iam.delete_policy(PolicyArn=policy.arn)   
    
    # create a new policy
    policy = iam.create_policy(
        PolicyName=policy_name,
        PolicyDocument=json.dumps(policy_doc))['Policy']
    
    # attach the policy to the role
    res = iam.attach_role_policy(
        RoleName=role_name,
        PolicyArn=policy['Arn']
        )
    return res

In [148]:


perm_res = attach_permission(
    role_name=role_res['Role']['RoleName'],
    policy_name='Basic',
    policy_doc=basic
    )

pp.pprint(perm_res)

MalformedPolicyDocumentException: An error occurred (MalformedPolicyDocument) when calling the CreatePolicy operation: Syntax errors in policy.

## Test your role

Now, you have created an execution role `basic-role` with the following permission:
```
basic_s3_permission = {
    "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:DeleteObject", 
                    "s3:ListBucket"
                ],
                "Resource": [
                    "arn:aws:s3:::*"
                ]
            }
        ]
    }
```
Let's test that it can indeed perform those actions on your behalf.

Assume a role in boto3

https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html#assume-role-provider


In [26]:
import os
os.environ['AWS_PROFILE']


'default'

## Create a profile for the role

To be able to assume the role we just created, we need to create a profile for it so that `boto3` can create a session using this profile. 

To create a profile for the role `basic-role`, open `~/.aws/config` with your favorite editor and add the following 

```
[profile basic-role]
role_arn = <ARN of basic-role>
```

In [79]:
import boto3
sess = boto3.session.Session(
    profile_name="basic-role",
    aws_access_key_id="AKIA2ATYEUMKG7SJWULB",
    aws_secret_access_key="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",
)


s3 = sess.resource('s3')

# ARN and a role session name.
assumed_role_object=sts_client.assume_role(
    RoleArn="arn:aws:iam::account-of-role-to-assume:role/name-of-role",
    RoleSessionName="AssumeRoleSession1"
)


In [109]:
obj = boto3.client('sts').assume_role(
    RoleArn='arn:aws:iam::688520471316:role/basic-role',
    RoleSessionName='xyz'
)

In [110]:
cred = obj['Credentials']#['SessionToken']
print(cred)

{'AccessKeyId': 'ASIA2ATYEUMKOE4JBZEN', 'SecretAccessKey': 'i0ebY615Oul2WqM8P8f1GuVQoXTqNEzokZPF26w+', 'SessionToken': 'FwoGZXIvYXdzEEkaDM+uAA+FHgqsbQDj7SKnAWpOMkDTt5fGHgDdSWMziUh8ujeR2z96FtoM/Qw4SLx+rPvDgjYfof1DCZJNgQwceyQ22c2eDZ5bZx5mfO2Bkzcz9f4XeeYr0+YddydrYOytw8PL/TRlr6DlH1CHO7h/M60MR+zst6QxtRuj2eiqny13XJY7yOaQw4AkAIhUkUohXW3ws+SJ09AGJ/rUREqvt8HhI1v3RdBoT2uWtnSdV5uN4KtZiOdZKK2H5oEGMi2kC2mDEXNn2vqlbAjfwa6JVq7Ea6bUXf+FnzizvbXoE+zrDW5ZptkiXZIHps0=', 'Expiration': datetime.datetime(2021, 2, 27, 0, 26, 37, tzinfo=tzlocal())}


In [139]:
basic_s3= {
    "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                     "s3:ListBuckets"
                ],
                "Resource": [
                    "arn:aws:s3:::"
                ]
            }
        ]
    }

attach_permission('basic-role', 'basic_s3', basic_s3)

obj = boto3.client('sts').assume_role(
    RoleArn='arn:aws:iam::688520471316:role/basic-role',
    RoleSessionName='xyz'
)

cred=obj['Credentials']

sess = boto3.session.Session(
    aws_access_key_id=cred['AccessKeyId'],
    aws_secret_access_key=cred['SecretAccessKey'],
    aws_session_token=cred['SessionToken']
    )
#help(sess)

s3 = sess.client('s3')

s3.list_buckets()

{'ResponseMetadata': {'RequestId': '4SHVE5TCWD0TQMSC',
  'HostId': 'fBvbOtH6yYNFMN/gUhveJZdBcv1BmbtPy2EU/G2iAl+hp3D1r6omlY6VPV1zvhZGhl0RTWsEHm0=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amz-id-2': 'fBvbOtH6yYNFMN/gUhveJZdBcv1BmbtPy2EU/G2iAl+hp3D1r6omlY6VPV1zvhZGhl0RTWsEHm0=',
   'x-amz-request-id': '4SHVE5TCWD0TQMSC',
   'date': 'Fri, 26 Feb 2021 23:42:49 GMT',
   'content-type': 'application/xml',
   'transfer-encoding': 'chunked',
   'server': 'AmazonS3'},
  'RetryAttempts': 0},
 'Buckets': [{'Name': '688520471316-sagemaker-us-west-2',
   'CreationDate': datetime.datetime(2020, 7, 16, 23, 56, 32, tzinfo=tzlocal())},
  {'Name': 'a2i-demo-bucket-f33bea39-fa37-4a9e-8d32-2fbb370abdcc',
   'CreationDate': datetime.datetime(2019, 10, 23, 21, 21, 1, tzinfo=tzlocal())},
  {'Name': 'amazon-braket-889863f741ec',
   'CreationDate': datetime.datetime(2021, 2, 18, 23, 9, 39, tzinfo=tzlocal())},
  {'Name': 'aws-athena-query-results-688520471316-us-east-1',
   'CreationDate': datetime.dateti

In [140]:
s3.list_objects(Bucket='aws-use-case-churn')

ClientError: An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied

In [98]:
s3 = boto3.client('s3')
s3.list_buckets()

{'ResponseMetadata': {'RequestId': 'MFM1H9NJPVJ3YV1R',
  'HostId': 'UJoHxN9W3bx23kKEwtUnMsAtFyyp5QoFh9nNx6kz5LcwFlRUAGkEO19VpltbiMDsc9Kg3ocmS0U=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amz-id-2': 'UJoHxN9W3bx23kKEwtUnMsAtFyyp5QoFh9nNx6kz5LcwFlRUAGkEO19VpltbiMDsc9Kg3ocmS0U=',
   'x-amz-request-id': 'MFM1H9NJPVJ3YV1R',
   'date': 'Fri, 26 Feb 2021 23:19:53 GMT',
   'content-type': 'application/xml',
   'transfer-encoding': 'chunked',
   'server': 'AmazonS3'},
  'RetryAttempts': 0},
 'Buckets': [{'Name': '688520471316-sagemaker-us-west-2',
   'CreationDate': datetime.datetime(2020, 7, 16, 23, 56, 32, tzinfo=tzlocal())},
  {'Name': 'a2i-demo-bucket-f33bea39-fa37-4a9e-8d32-2fbb370abdcc',
   'CreationDate': datetime.datetime(2019, 10, 23, 21, 21, 1, tzinfo=tzlocal())},
  {'Name': 'amazon-braket-889863f741ec',
   'CreationDate': datetime.datetime(2021, 2, 18, 23, 9, 39, tzinfo=tzlocal())},
  {'Name': 'aws-athena-query-results-688520471316-us-east-1',
   'CreationDate': datetime.dateti

In [3]:
import boto3
sts = boto3.client('sts')
print(sts.get_caller_identity())

{'UserId': 'AIDA2ATYEUMKFHKQCSLHK', 'Account': '688520471316', 'Arn': 'arn:aws:iam::688520471316:user/hongshan', 'ResponseMetadata': {'RequestId': 'c71324ab-3fb3-4e4c-aa67-cbeb88bb11b3', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': 'c71324ab-3fb3-4e4c-aa67-cbeb88bb11b3', 'content-type': 'text/xml', 'content-length': '405', 'date': 'Fri, 26 Feb 2021 22:39:37 GMT'}, 'RetryAttempts': 0}}


To assume a role, an application calls the AWS STS AssumeRole API operation and passes the ARN of the role to use. The operation creates a new session with temporary credentials. This session has the same permissions as the identity-based policies for that role.


In [10]:
sts_client = boto3.client('sts')

assumed_role_object=sts_client.assume_role(
    RoleArn=role_res["Role"]["Arn"],
    RoleSessionName="AssumeRoleSession1"
)

ClientError: An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:iam::688520471316:user/hongshan is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::688520471316:role/basic-role

### Attach policy 

- Trust relationship
- Permissions

see some example policies

resources for policies
https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies.html

resources for ARN format
https://docs.aws.amazon.com/quicksight/latest/APIReference/qs-arn-format.html

```
arn:<partion>:<service>:<region>:<account-id>:<resource-type>/<resource-id>
```

### Attach policy 

- Trust relationship
- Permissions

see some example policies

resources for policies
https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies.html

resources for ARN format
https://docs.aws.amazon.com/quicksight/latest/APIReference/qs-arn-format.html

```
arn:<partion>:<service>:<region>:<account-id>:<resource-type>/<resource-id>
```



In [32]:
sts_client.get_caller_identity()

{'UserId': 'AROA2ATYEUMKIU3KQG7TC:botocore-session-1614303096',
 'Account': '688520471316',
 'Arn': 'arn:aws:sts::688520471316:assumed-role/RL/botocore-session-1614303096',
 'ResponseMetadata': {'RequestId': '9e8c9a66-c79e-40e3-aadb-0f3c515cd2e6',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '9e8c9a66-c79e-40e3-aadb-0f3c515cd2e6',
   'content-type': 'text/xml',
   'content-length': '463',
   'date': 'Fri, 26 Feb 2021 02:25:49 GMT'},
  'RetryAttempts': 0}}

## How Amazon SageMaker Runs Your Container
https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-training-algo-dockerfile.html


In [75]:
import boto3
sm_boto3 = boto3.client('sagemaker')
s3 = boto3.client('s3')

# try to create a local training job
training_job_name = 'test-training-job-{}'.format(current_time())


bucket='688520471316-sagemaker-us-west-2'

# put data here
data_path="s3://{}/{}/data".format(bucket, training_job_name)

# upload data to s3
train_file="boston_train.csv"
s3.upload_file(
    Filename=train_file, 
    Bucket=bucket, 
    Key='{}/data/{}'.format(training_job_name, train_file))

# location that SageMaker saves the model artifacts
output_path="s3://{}/{}/output".format(bucket, training_job_name)

algorithm_specification = {
    'TrainingImage': "688520471316.dkr.ecr.us-west-2.amazonaws.com/test:latest",
    'TrainingInputMode': 'File',
}


role_arn = "arn:aws:iam::688520471316:role/RL"
input_data_config = [
    {
        'ChannelName': 'train',
            'DataSource':{
                'S3DataSource':{
                    'S3DataType': 'S3Prefix',
                    'S3Uri': data_path,
                    'S3DataDistributionType': 'FullyReplicated',
                }
        }
        
    },
    {
        'ChannelName': 'test',
        'DataSource':{
            'S3DataSource': {
                'S3DataType': 'S3Prefix',
                'S3Uri': data_path,
                'S3DataDistributionType': 'FullyReplicated',
            }
        }
    }
]


output_data_config = {
    'S3OutputPath': output_path
}

resource_config = {
    'InstanceType': 'ml.m5.large',
    'InstanceCount':1,
    'VolumeSizeInGB':10
}

stopping_condition={
    'MaxRuntimeInSeconds':120,
    'MaxWaitTimeInSeconds': 123
}

enable_network_isolation=False

res = sm_boto3.create_training_job(
    TrainingJobName=training_job_name,
    #HyperParameters=hyperparameters,
    AlgorithmSpecification=algorithm_specification,
    RoleArn=role_arn,
    InputDataConfig=input_data_config,
    OutputDataConfig=output_data_config,
    ResourceConfig=resource_config,
    StoppingCondition=stopping_condition,
    EnableNetworkIsolation=enable_network_isolation,
    EnableManagedSpotTraining=True, # set it to False if do not want managed spot training
)



In [73]:
import pprint
pp = pprint.PrettyPrinter(indent=1)


res = sm_boto3.describe_training_job(
    TrainingJobName=training_job_name)

pp.pprint(res)

{'AlgorithmSpecification': {'EnableSageMakerMetricsTimeSeries': False,
                            'TrainingImage': '688520471316.dkr.ecr.us-west-2.amazonaws.com/test:latest',
                            'TrainingInputMode': 'File'},
 'CreationTime': datetime.datetime(2021, 2, 24, 1, 57, 13, 385000, tzinfo=tzlocal()),
 'EnableInterContainerTrafficEncryption': False,
 'EnableManagedSpotTraining': False,
 'EnableNetworkIsolation': False,
 'InputDataConfig': [{'ChannelName': 'train',
                      'CompressionType': 'None',
                      'DataSource': {'S3DataSource': {'S3DataDistributionType': 'FullyReplicated',
                                                      'S3DataType': 'S3Prefix',
                                                      'S3Uri': 's3://688520471316-sagemaker-us-west-2/test-training-job-2021-02-24-01-57-13/data'}},
                      'RecordWrapperType': 'None'},
                     {'ChannelName': 'test',
                      'CompressionType':

In [50]:
training_job_name

'test-training-job-2021-02-24-00-08-16'

In [1]:
# describe-training-job look at the parameters of an successful training job


res = sm.describe_training_job(
    TrainingJobName=training_
)

In [5]:
import pprint
pp = pprint.PrettyPrinter(indent=1)
pp.pprint(res)


{'AlgorithmSpecification': {'EnableSageMakerMetricsTimeSeries': False,
                            'MetricDefinitions': [{'Name': 'train:mae',
                                                   'Regex': '.*\\[[0-9]+\\].*#011train-mae:([-+]?[0-9]*\\.?[0-9]+(?:[eE][-+]?[0-9]+)?).*'},
                                                  {'Name': 'validation:aucpr',
                                                   'Regex': '.*\\[[0-9]+\\].*#011validation-aucpr:([-+]?[0-9]*\\.?[0-9]+(?:[eE][-+]?[0-9]+)?).*'},
                                                  {'Name': 'train:merror',
                                                   'Regex': '.*\\[[0-9]+\\].*#011train-merror:([-+]?[0-9]*\\.?[0-9]+(?:[eE][-+]?[0-9]+)?).*'},
                                                  {'Name': 'train:gamma-nloglik',
                                                   'Regex': '.*\\[[0-9]+\\].*#011train-gamma-nloglik:([-+]?[0-9]*\\.?[0-9]+(?:[eE][-+]?[0-9]+)?).*'},
                                         

In [76]:
r = sm_boto3.describe_training_job(
    TrainingJobName='my-awesome-training-job')
print(r['FailureReason'])

AlgorithmError: framework error: 
Traceback (most recent call last):
  File "/miniconda3/lib/python3.7/site-packages/sagemaker_containers/_trainer.py", line 84, in train
    entrypoint()
  File "/miniconda3/lib/python3.7/site-packages/sagemaker_sklearn_container/training.py", line 39, in main
    train(environment.Environment())
  File "/miniconda3/lib/python3.7/site-packages/sagemaker_sklearn_container/training.py", line 35, in train
    runner_type=runner.ProcessRunnerType)
  File "/miniconda3/lib/python3.7/site-packages/sagemaker_training/entry_point.py", line 92, in run
    files.download_and_extract(uri=uri, path=environment.code_dir)
  File "/miniconda3/lib/python3.7/site-packages/sagemaker_training/files.py", line 131, in download_and_extract
    s3_download(uri, dst)
  File "/miniconda3/lib/python3.7/site-packages/sagemaker_training/files.py", line 167, in s3_download
    s3.Bucket(bucket).download_file(key, dst)
  File "/miniconda3/lib/python3.7/site-packages/boto3/s3/inject.p