# Using SageMaker Studio Lab with AWS Resources

[![Open In Studio Lab](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/aws/studio-lab-examples/blob/main/connect-to-aws/Access_AWS_from_Studio_Lab.ipynb)

This notebook follows the boto3 quickstart guide:
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html

### Step 0. Install the AWS CLI and boto3, and configure them with your AWS credentials
You will also create and paste in your SageMaker execution role.

In [None]:
%pip install boto3

In [None]:
%pip install awscli

In [None]:
!mkdir -p ~/.aws

---
# Exercise Caution on Using AWS Credentials
The next step should only be undertaken by professionals who are already comfortable using AWS access keys and secret keys. These credentials are like the keys to a car: if someone else gets hold of them, even inadvertently, they can take your vehicle. While there are additional AWS permissions you can apply to restrict access, the basic concept still stands. Under no circumstances should you share these credentials publicly.

See the following page to get started with your AWS credentials.
https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html 

That being said, if you are handling your keys carefully, you can in fact access your AWS account from Studio Lab. We'll walk through that here.

In [None]:
%%writefile ~/.aws/credentials

[default]
aws_access_key_id =  < paste your access key here, run this cell, then delete the cell >
aws_secret_access_key = < paste your secret key here, run this cell, then delete the cell > 

In [None]:
%%writefile ~/.aws/config

[default]
region=us-east-1
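Before moving on, it can help to confirm that the credentials file was written as intended. The following is a minimal sketch using only the standard library; the helper name `check_credentials_file` is hypothetical, and the permission check assumes a POSIX file system. The demo runs against a temporary file, not your real `~/.aws/credentials`.

```python
import configparser
import os
import stat
import tempfile
from pathlib import Path

REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def check_credentials_file(path, profile="default"):
    """Return a list of human-readable problems found in an AWS credentials file."""
    path = Path(path)
    if not path.is_file():
        return [f"{path} does not exist"]

    # Disable interpolation so '%' characters in a value are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)
    if not parser.has_section(profile):
        return [f"profile [{profile}] is missing"]

    problems = []
    for key in REQUIRED_KEYS:
        value = parser.get(profile, key, fallback="").strip()
        if not value or value.startswith("<"):
            problems.append(f"{key} is empty or still a placeholder")

    # Credentials should be readable by the owner only (assumes POSIX permissions).
    mode = stat.S_IMODE(path.stat().st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append(f"file mode {oct(mode)} is accessible to others; consider chmod 600")
    return problems


# Demo on a throwaway file that still contains one placeholder value.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "credentials"
    demo.write_text(
        "[default]\n"
        "aws_access_key_id = < paste your access key here >\n"
        "aws_secret_access_key = EXAMPLEKEY\n"
    )
    os.chmod(demo, 0o600)
    demo_problems = check_credentials_file(demo)

print(demo_problems)
```

To check the real file, you could call `check_credentials_file(Path.home() / ".aws" / "credentials")`; an empty list means none of these particular problems were found, not that the keys are valid.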

In [None]:
%pip install --upgrade sagemaker

If you are already used to working with SageMaker in your own AWS account, copy and paste the ARN of your execution role into the cell that sets the role variable. If you are new to this, follow the steps at the link here to create one.

https://docs.aws.amazon.com/glue/latest/dg/create-an-iam-role-sagemaker-notebook.html

Please note, in order to complete this you will need to have already created this SageMaker IAM Execution role.

In [21]:
import sagemaker

# Create a SageMaker execution role via the AWS SageMaker console, then paste its ARN here.
# Note: the value below is a role name rather than a full ARN; replace it with your own role's ARN.
role = 'AmazonSageMaker-ExecutionRole-20220328T145246'
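A role name and a role ARN are easy to confuse. As a rough sketch, the check here tests whether a string has the general shape of an IAM role ARN. The helper names, the example account ID `123456789012`, and the `/service-role/` path are illustrative assumptions, and the pattern checks format only, not whether the role exists.

```python
import re

# IAM role ARN shape: arn:<partition>:iam::<12-digit account id>:role/<optional path/><role name>
ROLE_ARN_PATTERN = re.compile(r"arn:aws[a-z-]*:iam::\d{12}:role/[\w+=,.@/-]+")


def is_role_arn(value):
    """Return True if value has the shape of an IAM role ARN."""
    return bool(ROLE_ARN_PATTERN.fullmatch(value))


def role_arn_from_name(account_id, role_name, path="/"):
    """Build a role ARN in the standard 'aws' partition from its parts."""
    return f"arn:aws:iam::{account_id}:role{path}{role_name}"


role_name = "AmazonSageMaker-ExecutionRole-20220328T145246"
# Hypothetical account ID and path, used only to show the expected format.
example_arn = role_arn_from_name("123456789012", role_name, path="/service-role/")

print(is_role_arn(role_name))    # a bare role name is not an ARN
print(is_role_arn(example_arn))
print(example_arn)
```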

### Step 1. Copy your local data to your preferred S3 bucket, or vice versa 
This notebook will assume you already have access to a training dataset relevant for language translation. If you don't, please step through this notebook to create the relevant train files locally.
- https://github.com/aws/studio-lab-examples/blob/main/natural-language-processing/NLP_Disaster_Recovery_Translation.ipynb 

We'll demonstrate copying that data up to your AWS account via the CLI here, but you can also upload it through the console UI or with boto3.

In [22]:
bucket_name = 'sagemaker-studio-195566616656-cr8w4ma55a9'
train_file_name = 'train.json'
s3_data_path = 's3://{}/data/{}'.format(bucket_name, train_file_name)

In [6]:
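# sync copies a whole directory, so the destination is the data/ prefix rather than a single file path
!aws s3 sync ./notebooks/data/ s3://{bucket_name}/data/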


The user-provided path ./notebooks/data/ does not exist.


In [23]:
import boto3

# Let's use Amazon S3
s3 = boto3.resource('s3')

In [24]:
# Print out bucket names
for bucket in s3.buckets.all():
    print(bucket.name)

mydata-bucket-39009
sagemaker-studio-rlawwl08e4
sagemaker-us-east-1-808640880671
studio-lab-300-220
studio-lab-300-221


In [None]:
# Upload a new file; the context manager closes the file handle once the upload finishes
with open('Untitled.ipynb', 'rb') as data:
    s3.Bucket(bucket_name).put_object(Key='Untitled.ipynb', Body=data)
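As an alternative to `put_object`, the S3 client's `upload_file` method takes a local file path directly. The sketch here wraps it in a small hypothetical helper, `upload_file_to_s3`, that accepts the client as a parameter; the `_RecordingClient` stand-in and the bucket name `my-example-bucket` are assumptions made so the example runs without touching AWS.

```python
import tempfile
from pathlib import Path


def upload_file_to_s3(local_path, bucket, key_prefix="", s3_client=None):
    """Upload one local file and return the s3:// URI it was written to."""
    local_path = Path(local_path)
    if not local_path.is_file():
        raise FileNotFoundError(f"{local_path} does not exist")

    if s3_client is None:
        import boto3  # imported lazily so the helper can also be exercised with a stub
        s3_client = boto3.client("s3")

    # Join the optional prefix and the file name into an object key.
    key = f"{key_prefix.strip('/')}/{local_path.name}".lstrip("/")
    s3_client.upload_file(Filename=str(local_path), Bucket=bucket, Key=key)
    return f"s3://{bucket}/{key}"


class _RecordingClient:
    """Hypothetical stand-in that records calls instead of contacting S3."""

    def __init__(self):
        self.calls = []

    def upload_file(self, Filename, Bucket, Key):
        self.calls.append((Bucket, Key))


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "train.json"
    sample.write_text("{}")
    stub = _RecordingClient()
    uri = upload_file_to_s3(sample, "my-example-bucket", "data", s3_client=stub)

print(uri)
```

Against a real account you would omit `s3_client`, so the helper builds a boto3 client from the credentials configured in Step 0.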

## Amazon DynamoDB

By following this guide, you will learn how to use the DynamoDB.ServiceResource and DynamoDB.Table resources in order to create tables, write items to tables, modify existing items, retrieve items, and query/filter the items in the table.

In [25]:
import boto3

In [26]:
# Get the service resource.
dynamodb = boto3.resource('dynamodb')

In [27]:
# Create the DynamoDB table.
table = dynamodb.create_table(
    TableName='users-data',
    KeySchema=[
        {
            'AttributeName': 'username',
            'KeyType': 'HASH'
        },
        {
            'AttributeName': 'last_name',
            'KeyType': 'RANGE'
        }
    ],
    AttributeDefinitions=[
        {
            'AttributeName': 'username',
            'AttributeType': 'S'
        },
        {
            'AttributeName': 'last_name',
            'AttributeType': 'S'
        },
    ],
    ProvisionedThroughput={
        'ReadCapacityUnits': 5,
        'WriteCapacityUnits': 5
    }
)

ResourceInUseException: An error occurred (ResourceInUseException) when calling the CreateTable operation: Table already exists: users-data
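The `ResourceInUseException` above appears whenever the cell is re-run after the table has been created. One way to make the cell safe to re-run is to fall back to the existing table, as in this sketch. The helper name `get_or_create_table` is hypothetical, and the `_Fake*` classes are offline stand-ins that only mimic the calls used here. Note that DynamoDB also raises this error while a table is still being created or deleted.

```python
def get_or_create_table(dynamodb, **create_kwargs):
    """Create the table, or return the existing one if the name is already taken."""
    try:
        table = dynamodb.create_table(**create_kwargs)
    except Exception as exc:
        # botocore's ClientError carries the service error code in exc.response.
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code != "ResourceInUseException":
            raise
        table = dynamodb.Table(create_kwargs["TableName"])
    table.wait_until_exists()
    return table


# Offline stand-ins (illustration only) for the DynamoDB service resource.
class _FakeError(Exception):
    response = {"Error": {"Code": "ResourceInUseException"}}


class _FakeTable:
    def __init__(self, name):
        self.name = name

    def wait_until_exists(self):
        pass


class _FakeDynamoDB:
    def __init__(self):
        self.created = set()

    def create_table(self, TableName, **_):
        if TableName in self.created:
            raise _FakeError(TableName)
        self.created.add(TableName)
        return _FakeTable(TableName)

    def Table(self, name):
        return _FakeTable(name)


fake = _FakeDynamoDB()
first = get_or_create_table(fake, TableName="users-data")
second = get_or_create_table(fake, TableName="users-data")  # second call hits the fallback
print(first.name, second.name)
```

With a real resource you would pass `boto3.resource('dynamodb')` together with the same `KeySchema`, `AttributeDefinitions`, and `ProvisionedThroughput` arguments used in the cell above. Catching `botocore.exceptions.ClientError` instead of `Exception` would be the more specific choice when botocore is importable.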

In [28]:
# Wait until the table exists.
table.wait_until_exists()

In [29]:
# Print out some data about the table.
print(table.item_count)

0


The create_table call creates a table named users-data whose primary key consists of the hash (partition) key username and the range (sort) key last_name. The method returns a DynamoDB.Table resource on which you can call additional methods.

In [47]:
# Instantiate a table resource object without actually
# creating a DynamoDB table. Note that the attributes of this table
# are lazy-loaded: a request is not made nor are the attribute
# values populated until the attributes
# on the table resource are accessed or its load() method is called.
table = dynamodb.Table('users-data')

# Print out some data about the table.
# This will cause a request to be made to DynamoDB and its attribute
# values will be set based on the response.
print(table.creation_date_time)

2022-05-05 21:53:25.363000+00:00


Once you have a DynamoDB.Table resource you can add new items to the table using DynamoDB.Table.put_item():

In [50]:
table.put_item(
   Item={
        'username': 'janedoe',
        'first_name': 'Mak',
        'last_name': 'Doe',
        'age': 25,
        'account_type': 'standard_user',
        'IMX': 66
    }
)

{'ResponseMetadata': {'RequestId': '32LAH48MGIP791RMMKJ21UEUTJVV4KQNSO5AEMVJF66Q9ASUAAJG',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'server': 'Server',
   'date': 'Thu, 05 May 2022 22:09:45 GMT',
   'content-type': 'application/x-amz-json-1.0',
   'content-length': '2',
   'connection': 'keep-alive',
   'x-amzn-requestid': '32LAH48MGIP791RMMKJ21UEUTJVV4KQNSO5AEMVJF66Q9ASUAAJG',
   'x-amz-crc32': '2745614147'},
  'RetryAttempts': 0}}

In [51]:
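# Getting an item
# The key must match the full primary key used in put_item (username and last_name).
response = table.get_item(
    Key={
        'username': 'janedoe',
        'last_name': 'Doe'
    }
)
# When no item matches, the response has no 'Item' entry, so response['Item']
# raises KeyError: 'Item' (this happened here when last_name was set to 'Mak').
item = response.get('Item')
print(item)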

In [46]:
# Updating an item
table.update_item(
    Key={
        'username': 'janedoe',
        'last_name': 'Doe'
    },
    UpdateExpression='SET age = :val1',
    ExpressionAttributeValues={
        ':val1': 26
    }
)

{'ResponseMetadata': {'RequestId': 'NTFRN9THJ4L0BTPGL97PNJ0F1NVV4KQNSO5AEMVJF66Q9ASUAAJG',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'server': 'Server',
   'date': 'Thu, 05 May 2022 22:08:04 GMT',
   'content-type': 'application/x-amz-json-1.0',
   'content-length': '2',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'NTFRN9THJ4L0BTPGL97PNJ0F1NVV4KQNSO5AEMVJF66Q9ASUAAJG',
   'x-amz-crc32': '2745614147'},
  'RetryAttempts': 0}}

In [12]:
# Deleting an item
table.delete_item(
    Key={
        'username': 'janedoe',
        'last_name': 'Doe'
    }
)

{'ResponseMetadata': {'RequestId': 'AQROGE2NR1C60O6TQRUTSQPUDNVV4KQNSO5AEMVJF66Q9ASUAAJG',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'server': 'Server',
   'date': 'Thu, 05 May 2022 21:54:57 GMT',
   'content-type': 'application/x-amz-json-1.0',
   'content-length': '2',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'AQROGE2NR1C60O6TQRUTSQPUDNVV4KQNSO5AEMVJF66Q9ASUAAJG',
   'x-amz-crc32': '2745614147'},
  'RetryAttempts': 0}}

## Batch writing
If you are loading a lot of data at a time, you can make use of DynamoDB.Table.batch_writer() so you can both speed up the process and reduce the number of write requests made to the service.

This method returns a handle to a batch writer object that will automatically handle buffering and sending items in batches. In addition, the batch writer will also automatically handle any unprocessed items and resend them as needed. All you need to do is call put_item for any items you want to add, and delete_item for any items you want to delete:

In [13]:
with table.batch_writer() as batch:
    batch.put_item(
        Item={
            'account_type': 'standard_user',
            'username': 'johndoe',
            'first_name': 'John',
            'last_name': 'Doe',
            'age': 25,
            'address': {
                'road': '1 Jefferson Street',
                'city': 'Los Angeles',
                'state': 'CA',
                'zipcode': 90001
            }
        }
    )
    batch.put_item(
        Item={
            'account_type': 'super_user',
            'username': 'janedoering',
            'first_name': 'Jane',
            'last_name': 'Doering',
            'age': 40,
            'address': {
                'road': '2 Washington Avenue',
                'city': 'Seattle',
                'state': 'WA',
                'zipcode': 98109
            }
        }
    )
    batch.put_item(
        Item={
            'account_type': 'standard_user',
            'username': 'bobsmith',
            'first_name': 'Bob',
            'last_name':  'Smith',
            'age': 18,
            'address': {
                'road': '3 Madison Lane',
                'city': 'Louisville',
                'state': 'KY',
                'zipcode': 40213
            }
        }
    )
    batch.put_item(
        Item={
            'account_type': 'super_user',
            'username': 'alicedoe',
            'first_name': 'Alice',
            'last_name': 'Doe',
            'age': 27,
            'address': {
                'road': '1 Jefferson Street',
                'city': 'Los Angeles',
                'state': 'CA',
                'zipcode': 90001
            }
        }
    )

The batch writer can also handle a very large number of writes to the table.

In [18]:
with table.batch_writer() as batch:
    for i in range(50):
        batch.put_item(
            Item={
                'account_type': 'anonymous',
                'username': 'user' + str(i),
                'first_name': 'unknown',
                'last_name': 'unknown'
            }
        )
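To make the buffering idea concrete: the underlying BatchWriteItem operation accepts at most 25 put or delete requests per call, so a long stream of writes has to be split into groups. The sketch here illustrates that splitting with a hypothetical `chunked` helper; it is not boto3's actual implementation, which also handles retries of unprocessed items.

```python
from itertools import islice

MAX_BATCH_SIZE = 25  # BatchWriteItem accepts at most 25 put/delete requests per call


def chunked(items, size=MAX_BATCH_SIZE):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# The same 50 anonymous users written by the loop above, as plain dictionaries.
items = [{"username": f"user{i}", "last_name": "unknown"} for i in range(50)]
batch_sizes = [len(batch) for batch in chunked(items)]
print(batch_sizes)  # 50 items are grouped into two batches of 25
```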

In [15]:
from boto3.dynamodb.conditions import Key, Attr
from pprint import pprint

In [16]:
response = table.query(
    KeyConditionExpression=Key('username').eq('johndoe')
)
items = response['Items']
pprint(items)

[{'account_type': 'standard_user',
  'address': {'city': 'Los Angeles',
              'road': '1 Jefferson Street',
              'state': 'CA',
              'zipcode': Decimal('90001')},
  'age': Decimal('25'),
  'first_name': 'John',
  'last_name': 'Doe',
  'username': 'johndoe'}]
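As the output shows, boto3 returns DynamoDB numbers as `Decimal` values, which `json.dumps` cannot serialize directly. The sketch here converts them to plain `int` or `float` values; the helper name `to_plain` is hypothetical, and converting to `float` can lose precision, so treat it as a convenience for display or export rather than for exact arithmetic.

```python
import json
from decimal import Decimal


def to_plain(value):
    """Recursively replace Decimal values with int or float."""
    if isinstance(value, Decimal):
        # Whole numbers become int; everything else becomes float.
        return int(value) if value == value.to_integral_value() else float(value)
    if isinstance(value, dict):
        return {key: to_plain(item) for key, item in value.items()}
    if isinstance(value, (list, tuple, set)):
        # DynamoDB sets arrive as Python sets; lists are used here so JSON can encode them.
        return [to_plain(item) for item in value]
    return value


# A trimmed copy of the item returned by the query above.
item = {
    "username": "johndoe",
    "age": Decimal("25"),
    "address": {"city": "Los Angeles", "zipcode": Decimal("90001")},
}
plain = to_plain(item)
print(json.dumps(plain, sort_keys=True))
```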
