# S3 buckets with the boto3 library

The examples used to guide this notebook can be found [here](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-examples.html). These are examples from the boto3 documentation that demonstrate basic S3 functionality.

The logging library documentation can be found [here](https://docs.python.org/3/library/logging.html). The library records events for applications and libraries. Each message is emitted with a severity level, such as WARNING or ERROR. In this notebook, it logs each ClientError at the ERROR level.

The boto3 library documentation can be found [here](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html). This library is the AWS SDK (Amazon Web Services Software Development Kit) for Python, which allows developers to create, configure, and manage AWS services from Python code.

The os library documentation can be found [here](https://docs.python.org/3/library/os.html). The os library provides a portable interface to operating system functionality. This notebook uses it to work with file paths and to list directory contents.

The specific documentation for exception handling with the boto3 library can be found [here](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/error-handling.html). The ClientError is the most common kind of error. It is raised whenever an AWS service returns an error response to a request made by a boto3 client.
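The error-handling guide describes a ClientError as carrying the parsed service response in its `response` attribute, with the error code and message stored under the `Error` key. The sketch below reads those two fields. The helper `error_code_and_message` and the stand-in class `FakeClientError` are hypothetical names, used only so the snippet runs without AWS credentials.

```python
def error_code_and_message(error):
    """Return (code, message) from an exception shaped like botocore's ClientError."""
    details = getattr(error, "response", {}).get("Error", {})
    return details.get("Code", "Unknown"), details.get("Message", "")


class FakeClientError(Exception):
    """Stand-in for ClientError (assumption: same `response` layout)."""

    def __init__(self, code, message):
        super().__init__(message)
        self.response = {"Error": {"Code": code, "Message": message}}


code, message = error_code_and_message(
    FakeClientError("NoSuchBucket", "The specified bucket does not exist")
)
print(code)     # NoSuchBucket
print(message)  # The specified bucket does not exist
```

Inside an `except ClientError as e:` block, the same helper could be called on `e` to branch on the error code instead of only logging the message.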

In [1]:
#Imports
import logging
import boto3
import os
import requests
import json
from botocore.exceptions import ClientError

In [2]:
#Variable names
bucket_name = 'python-bucket112113114115'
filename = 'hello_world.rtf'

## Create buckets

This function, and every later function in this notebook, follows the same format. A boto3 S3 client is created and assigned to a variable, and a method on that client performs the action. Each function includes exception handling: if the AWS service responds with an error, boto3 raises a ClientError. The try/except block catches the ClientError, logs the error message, and returns False (or None) so the caller can tell that the action did not complete.
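The `ERROR:root:` prefix that appears in the outputs further down comes from the logging module's default record layout (level, logger name, message). As a small illustration, the sketch below applies that same layout to a separate logger and captures the result in memory. The logger name `s3_demo` is a hypothetical choice for this example.

```python
import io
import logging

# Send log records to an in-memory buffer so the formatted text can be inspected.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
# logging.BASIC_FORMAT is the default "%(levelname)s:%(name)s:%(message)s" layout.
handler.setFormatter(logging.Formatter(logging.BASIC_FORMAT))

logger = logging.getLogger("s3_demo")  # hypothetical logger name
logger.addHandler(handler)
logger.propagate = False  # keep the demo record out of the root logger

logger.error("An error occurred (NoSuchBucket)")
print(buffer.getvalue().strip())  # ERROR:s3_demo:An error occurred (NoSuchBucket)
```

Calling `logging.error(e)` directly, as the notebook functions do, logs through the root logger, which is why the name shown in the notebook outputs is `root`.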

In [3]:
#Define function to create bucket
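def create_bucket(bucket_name, region='us-east-2'):
    
    try:
        #Create the client in the target region so it matches the LocationConstraint
        s3_client = boto3.client('s3', region_name=region)
        location = {'LocationConstraint': region}
        s3_client.create_bucket(Bucket=bucket_name,
                                CreateBucketConfiguration=location)
    except ClientError as e:
        #Log the error returned by AWS and signal failure to the caller
        logging.error(e)
        return False
    return True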

In [4]:
#Create the bucket
create_bucket(bucket_name)

True

In [5]:
#Try running a second time to observe error message
create_bucket(bucket_name)

ERROR:root:An error occurred (BucketAlreadyOwnedByYou) when calling the CreateBucket operation: Your previous request to create the named bucket succeeded and you already own it.


False

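I tried to create the same bucket twice to see what kind of error message is returned by AWS. The message is informative: it names the error code (BucketAlreadyOwnedByYou), the operation that failed (CreateBucket), and the reason.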
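A related detail of create_bucket is the region handling. As far as I understand the S3 API, us-east-1 is the default region and is not accepted as a LocationConstraint value, so the CreateBucketConfiguration argument is left out for that region. The sketch below builds the keyword arguments accordingly. The helper name `create_bucket_kwargs` and the bucket name `example-bucket` are hypothetical.

```python
def create_bucket_kwargs(bucket_name, region=None):
    """Build keyword arguments for s3_client.create_bucket (hypothetical helper)."""
    kwargs = {"Bucket": bucket_name}
    # Assumption: us-east-1 is the S3 default and must not be sent as a constraint.
    if region and region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


print(create_bucket_kwargs("example-bucket", "us-east-2"))
print(create_bucket_kwargs("example-bucket", "us-east-1"))
```

Inside create_bucket, the call would then read `s3_client.create_bucket(**create_bucket_kwargs(bucket_name, region))`.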

## List buckets

I can retrieve the list of buckets owned by my AWS account.

In [6]:
#Retrieve list of buckets
s3 = boto3.client('s3')
response = s3.list_buckets()
response

{'ResponseMetadata': {'RequestId': 'C62A4DR0ER84TPYN',
  'HostId': 'jjakzKInetJQGHM35Xo+UYR2R3eaJoU5I7NGdtgvWFVEEFhZy3jgEOFpLANIXaltHS/mIHvVSu4=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amz-id-2': 'jjakzKInetJQGHM35Xo+UYR2R3eaJoU5I7NGdtgvWFVEEFhZy3jgEOFpLANIXaltHS/mIHvVSu4=',
   'x-amz-request-id': 'C62A4DR0ER84TPYN',
   'date': 'Wed, 04 Jan 2023 16:28:06 GMT',
   'content-type': 'application/xml',
   'transfer-encoding': 'chunked',
   'server': 'AmazonS3'},
  'RetryAttempts': 0},
 'Buckets': [{'Name': 'cdk-hnb659fds-assets-454840057477-us-east-2',
   'CreationDate': datetime.datetime(2022, 12, 12, 22, 21, 35, tzinfo=tzutc())},
  {'Name': 'createbucketstack-testbuckete6e05abe-u3f8puhr36vt',
   'CreationDate': datetime.datetime(2022, 12, 12, 22, 46, 40, tzinfo=tzutc())},
  {'Name': 'python-bucket112113114115',
   'CreationDate': datetime.datetime(2023, 1, 4, 16, 28, 5, tzinfo=tzutc())}],
 'Owner': {'ID': '45511f3aa385088f994d52aa08800b66fed3a434b1c293ad12e3d5c28934d2cf'}}

In [7]:
#Look at keys of dictionary
response.keys()

dict_keys(['ResponseMetadata', 'Buckets', 'Owner'])

In [8]:
#Look inside buckets
response['Buckets']

[{'Name': 'cdk-hnb659fds-assets-454840057477-us-east-2',
  'CreationDate': datetime.datetime(2022, 12, 12, 22, 21, 35, tzinfo=tzutc())},
 {'Name': 'createbucketstack-testbuckete6e05abe-u3f8puhr36vt',
  'CreationDate': datetime.datetime(2022, 12, 12, 22, 46, 40, tzinfo=tzutc())},
 {'Name': 'python-bucket112113114115',
  'CreationDate': datetime.datetime(2023, 1, 4, 16, 28, 5, tzinfo=tzutc())}]

The Buckets key holds a list of dictionaries, one for each bucket owned by the account. The name of each bucket can be accessed by looping through the list and extracting the Name value from each dictionary.

In [9]:
for bucket in response['Buckets']:
    print('-----------------')
    print()
    print(bucket['Name'])
    print()

-----------------

cdk-hnb659fds-assets-454840057477-us-east-2

-----------------

createbucketstack-testbuckete6e05abe-u3f8puhr36vt

-----------------

python-bucket112113114115
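As an alternative to the print loop, the same list of dictionaries can be reduced with a list comprehension or searched with `max`. The sketch below runs against a small sample shaped like the list_buckets response; the bucket names and dates in it are illustrative, not taken from my account.

```python
from datetime import datetime, timezone

# Sample shaped like the list_buckets() response (values are illustrative).
sample_response = {
    "Buckets": [
        {"Name": "example-bucket-a",
         "CreationDate": datetime(2022, 12, 12, tzinfo=timezone.utc)},
        {"Name": "example-bucket-b",
         "CreationDate": datetime(2023, 1, 4, tzinfo=timezone.utc)},
    ]
}

# Collect every bucket name in one expression.
bucket_names = [bucket["Name"] for bucket in sample_response["Buckets"]]
# Find the most recently created bucket by comparing CreationDate values.
newest = max(sample_response["Buckets"], key=lambda bucket: bucket["CreationDate"])

print(bucket_names)    # ['example-bucket-a', 'example-bucket-b']
print(newest["Name"])  # example-bucket-b
```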



## Upload files

I can upload files to my S3 bucket using the upload_file method. This method uses the S3 Transfer Manager, which automatically handles multipart uploads for large files. It also transfers files in parallel and supports retries.

In [10]:
def upload_file(bucket, filename, object_name=None):
    
    if object_name is None:
        object_name = os.path.basename(filename)
        
    s3_client = boto3.client('s3')
    try:
        #upload_file returns None, so there is no response to keep
        s3_client.upload_file(filename, bucket, object_name)
    except ClientError as e:
        logging.error(e)
        return False
    return True

In [11]:
upload_file(bucket_name, filename)

True
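When object_name is not given, upload_file falls back to the base name of the local path, so any directory components are dropped from the object key. The sketch below isolates that fallback. The helper name `default_object_name` and the example paths are hypothetical.

```python
import os


def default_object_name(filename, object_name=None):
    """Mirror the fallback in upload_file: use the file's base name as the key."""
    if object_name is None:
        object_name = os.path.basename(filename)
    return object_name


print(default_object_name("reports/2023/hello_world.rtf"))       # hello_world.rtf
print(default_object_name("hello_world.rtf", "docs/hello.rtf"))  # docs/hello.rtf
```

Passing an explicit object_name is the way to keep a prefix such as `docs/` in the key.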

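I can upload files to my S3 bucket using the put_object method as well. This method makes a single low-level API request, so it doesn't handle multipart uploads automatically. The file's contents are passed through the Body parameter; without it, S3 would store an empty object under the given key.

In [12]:
#Define function to put object in bucket
def put_object(bucket, filename):
    
    try:
        s3_client = boto3.client('s3')
        #Open the file in binary mode and send its contents as the object body
        with open(filename, 'rb') as f:
            s3_client.put_object(Bucket=bucket, Key=filename, Body=f)
        
    except ClientError as e:
        logging.error(e)
        return False
    return True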

In [13]:
#Put object in bucket
put_object(bucket_name, filename)

True

## Download files

I can also download files from S3 using the download_file method. The method requires three arguments: the bucket name, the object key, and the local path to save the file to.

In [14]:
#Define function to download objects from bucket
def download_file(bucket, obj, file):
    
    s3_client = boto3.client('s3')
    try:
        s3_client.download_file(bucket, obj, file)
        
    except ClientError as e:
        logging.error(e)
        return False
    return True

In [15]:
download_file(bucket_name, filename, 'downloaded_file.rtf')

True

In [16]:
os.listdir()

['hello_world.rtf',
 's3-examples.ipynb',
 '.gitignore',
 '.ipynb_checkpoints',
 '.git',
 'downloaded_file.rtf']

The directory listing shows that the object was downloaded from the S3 bucket and saved under the chosen filename ('downloaded_file.rtf').
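A directory listing only shows that a file with that name exists. One way to check that its contents match the local original is to compare checksums. The sketch below does this with SHA-256 from the standard library; `file_sha256` and `files_match` are hypothetical helper names.

```python
import hashlib


def file_sha256(path, chunk_size=8192):
    """Hash a file in chunks so large files are never fully loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(path_a, path_b):
    """Return True when both files have identical contents."""
    return file_sha256(path_a) == file_sha256(path_b)
```

In this notebook the check would be `files_match('hello_world.rtf', 'downloaded_file.rtf')`.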

## Presigned URLs

I can generate presigned URLs to allow people to temporarily access and download objects from my S3 bucket. The object can be accessed by entering the URL into a browser. The object can also be accessed programmatically by making an HTTP GET request to the URL.

Presigned URLs can also be used to grant temporary permission to perform other actions on S3 buckets and the objects they hold. For example, a presigned URL can allow a user to upload an object to a bucket.

In [17]:
def generate_presigned_url(bucket, obj, expiration=3600):
    
    s3_client = boto3.client('s3')
    try:
        response = s3_client.generate_presigned_url('get_object',
                                                    Params={'Bucket': bucket,
                                                            'Key': obj},
                                                    ExpiresIn=expiration)
        
    except ClientError as e:
        logging.error(e)
        return None
    return response

In [18]:
generate_presigned_url(bucket_name, filename)

'https://python-bucket112113114115.s3.amazonaws.com/hello_world.rtf?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAWTZUUP2CR6JQV6KL%2F20230104%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20230104T162806Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=188bf6e10cdfdb02e4b0b974ad47b77528326d130ad3dc239dc81fd4dd636953'

Opening the link in a new tab automatically downloads the file to my computer. I can also access the object using the requests library.

In [19]:
url = generate_presigned_url(bucket_name, filename)
response = requests.get(url)
response

<Response [200]>
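The query string of the presigned URL carries the signing time (X-Amz-Date) and the lifetime in seconds (X-Amz-Expires), so the expiry time can be worked out without calling AWS. The sketch below parses those two parameters with the standard library. The function name `presigned_url_expiry` and the sample URL, including its placeholder signature, are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlparse


def presigned_url_expiry(url):
    """Return the UTC time at which a SigV4 presigned URL stops working."""
    query = parse_qs(urlparse(url).query)
    signed_at = datetime.strptime(query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ")
    lifetime = timedelta(seconds=int(query["X-Amz-Expires"][0]))
    return signed_at.replace(tzinfo=timezone.utc) + lifetime


# Illustrative URL with the same parameter layout as the one generated above.
example_url = (
    "https://example-bucket.s3.amazonaws.com/hello_world.rtf"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-Date=20230104T162806Z"
    "&X-Amz-Expires=3600"
    "&X-Amz-SignedHeaders=host"
    "&X-Amz-Signature=placeholder"
)
print(presigned_url_expiry(example_url))  # 2023-01-04 17:28:06+00:00
```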

## Bucket policy

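I can set, retrieve, and delete bucket policies. The policy below allows anyone (Principal '*') to read the objects in the bucket through the s3:GetObject action. It is serialized to a JSON string before being passed to put_bucket_policy.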

In [20]:
bucket_policy = {
    'Version': '2012-10-17',
    'Statement': [{
        'Sid': 'AddPerm',
        'Effect': 'Allow',
        'Principal': '*',
        'Action': ['s3:GetObject'],
        'Resource': 'arn:aws:s3:::{}/*'.format(bucket_name)
    }]
}

bucket_policy = json.dumps(bucket_policy)

s3.put_bucket_policy(Bucket=bucket_name, Policy=bucket_policy)

{'ResponseMetadata': {'RequestId': '0BRQCEVHYXFNY978',
  'HostId': 'tzmvmpvYrzvDVDKvtIgzY2Ak0l17Up9w1z40O9kkGM4disiqHiIUXrHnKUxA08T3NSJrKrz1urA=',
  'HTTPStatusCode': 204,
  'HTTPHeaders': {'x-amz-id-2': 'tzmvmpvYrzvDVDKvtIgzY2Ak0l17Up9w1z40O9kkGM4disiqHiIUXrHnKUxA08T3NSJrKrz1urA=',
   'x-amz-request-id': '0BRQCEVHYXFNY978',
   'date': 'Wed, 04 Jan 2023 16:28:07 GMT',
   'server': 'AmazonS3'},
  'RetryAttempts': 0}}

In [21]:
s3 = boto3.client('s3')
result = s3.get_bucket_policy(Bucket=bucket_name)
result

{'ResponseMetadata': {'RequestId': '1K8QZPV1DMMAE99N',
  'HostId': '88md+k1BRvyMtBizD4+zS7liovD/8dxHE6hMrX8644TWFrzn1D3SVaJWAaVNA3u2a5gwbA+WB9s=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amz-id-2': '88md+k1BRvyMtBizD4+zS7liovD/8dxHE6hMrX8644TWFrzn1D3SVaJWAaVNA3u2a5gwbA+WB9s=',
   'x-amz-request-id': '1K8QZPV1DMMAE99N',
   'date': 'Wed, 04 Jan 2023 16:28:08 GMT',
   'content-type': 'application/json',
   'server': 'AmazonS3',
   'content-length': '167'},
  'RetryAttempts': 0},
 'Policy': '{"Version":"2012-10-17","Statement":[{"Sid":"AddPerm","Effect":"Allow","Principal":"*","Action":"s3:GetObject","Resource":"arn:aws:s3:::python-bucket112113114115/*"}]}'}

In [22]:
print(result['Policy'])

{"Version":"2012-10-17","Statement":[{"Sid":"AddPerm","Effect":"Allow","Principal":"*","Action":"s3:GetObject","Resource":"arn:aws:s3:::python-bucket112113114115/*"}]}
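Because get_bucket_policy returns the policy as a JSON string, it has to be parsed before individual statements can be inspected. The sketch below parses the policy shown above and lists the statements that grant access to everyone. The helper names `is_public` and `public_statement_ids` are hypothetical, and the wildcard check covers only the two principal forms named in the comment.

```python
import json

# The policy string printed above, split across lines for readability.
policy_text = (
    '{"Version":"2012-10-17","Statement":[{"Sid":"AddPerm","Effect":"Allow",'
    '"Principal":"*","Action":"s3:GetObject",'
    '"Resource":"arn:aws:s3:::python-bucket112113114115/*"}]}'
)


def is_public(statement):
    """True for Allow statements whose principal is '*' or {'AWS': '*'}."""
    principal = statement.get("Principal")
    return statement.get("Effect") == "Allow" and principal in ("*", {"AWS": "*"})


def public_statement_ids(policy_json):
    """Return the Sid of every statement that grants access to everyone."""
    policy = json.loads(policy_json)
    return [s.get("Sid") for s in policy.get("Statement", []) if is_public(s)]


print(public_statement_ids(policy_text))  # ['AddPerm']
```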


In [23]:
s3.delete_bucket_policy(Bucket=bucket_name)

{'ResponseMetadata': {'RequestId': '1K8PZWXF2P3QY417',
  'HostId': 'oz2rkYNu8VjMELoUiQ20M41gEacw9jdh7hHYxljCwuB7oYBsYTfJMmGYXibxNKiBMB24UWASPLs=',
  'HTTPStatusCode': 204,
  'HTTPHeaders': {'x-amz-id-2': 'oz2rkYNu8VjMELoUiQ20M41gEacw9jdh7hHYxljCwuB7oYBsYTfJMmGYXibxNKiBMB24UWASPLs=',
   'x-amz-request-id': '1K8PZWXF2P3QY417',
   'date': 'Wed, 04 Jan 2023 16:28:08 GMT',
   'server': 'AmazonS3'},
  'RetryAttempts': 0}}

## Access control list

I can retrieve the access control list (ACL) for any bucket in my account.

In [24]:
result = s3.get_bucket_acl(Bucket=bucket_name)
result

{'ResponseMetadata': {'RequestId': '1K8RQYV91BT6F7EV',
  'HostId': 'hAtKXM6/GXMD8dbUUIvOjc1Eg7LKStBoeQnMV1poQ1fF3D5MwniI3j7SX0ZfM8cXaGnQAzPNzzY=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amz-id-2': 'hAtKXM6/GXMD8dbUUIvOjc1Eg7LKStBoeQnMV1poQ1fF3D5MwniI3j7SX0ZfM8cXaGnQAzPNzzY=',
   'x-amz-request-id': '1K8RQYV91BT6F7EV',
   'date': 'Wed, 04 Jan 2023 16:28:08 GMT',
   'content-type': 'application/xml',
   'transfer-encoding': 'chunked',
   'server': 'AmazonS3'},
  'RetryAttempts': 0},
 'Owner': {'ID': '45511f3aa385088f994d52aa08800b66fed3a434b1c293ad12e3d5c28934d2cf'},
 'Grants': [{'Grantee': {'ID': '45511f3aa385088f994d52aa08800b66fed3a434b1c293ad12e3d5c28934d2cf',
    'Type': 'CanonicalUser'},
   'Permission': 'FULL_CONTROL'}]}
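The Grants list is the part of the ACL response that says who holds which permission. The sketch below flattens it into tuples for easier reading. It runs against a sample dictionary shaped like the response above; the ID value is a placeholder and `summarize_grants` is a hypothetical helper name.

```python
def summarize_grants(acl_response):
    """Return (type, identity, permission) tuples from a get_bucket_acl-style response."""
    summary = []
    for grant in acl_response.get("Grants", []):
        grantee = grant["Grantee"]
        # Canonical users are identified by ID; predefined groups by URI.
        identity = grantee.get("ID") or grantee.get("URI") or "unknown"
        summary.append((grantee["Type"], identity, grant["Permission"]))
    return summary


# Sample shaped like the ACL response (the ID is a placeholder).
sample_acl = {
    "Owner": {"ID": "example-owner-id"},
    "Grants": [
        {"Grantee": {"ID": "example-owner-id", "Type": "CanonicalUser"},
         "Permission": "FULL_CONTROL"}
    ],
}
print(summarize_grants(sample_acl))
# [('CanonicalUser', 'example-owner-id', 'FULL_CONTROL')]
```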

## Static website

I can configure an S3 bucket to host a static website, then retrieve and delete that configuration.

In [25]:
website_configuration = {
    'ErrorDocument': {'Key': 'error.html'},
    'IndexDocument': {'Suffix': 'index.html'}
}

s3.put_bucket_website(Bucket=bucket_name,
                      WebsiteConfiguration=website_configuration)

{'ResponseMetadata': {'RequestId': '1K8TS7E8CAVQ020V',
  'HostId': 'lOqOJLCjsxwlfLMAtMj6dCP+iqKd3ENu9yroYXm4ASQUVaH8cR2yv4zDCIXXr+6sFSIgKTmLaGU=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amz-id-2': 'lOqOJLCjsxwlfLMAtMj6dCP+iqKd3ENu9yroYXm4ASQUVaH8cR2yv4zDCIXXr+6sFSIgKTmLaGU=',
   'x-amz-request-id': '1K8TS7E8CAVQ020V',
   'date': 'Wed, 04 Jan 2023 16:28:08 GMT',
   'server': 'AmazonS3',
   'content-length': '0'},
  'RetryAttempts': 0}}

In [26]:
result = s3.get_bucket_website(Bucket=bucket_name)
result

{'ResponseMetadata': {'RequestId': '1K8K71NMME45SGTY',
  'HostId': 'Ahkv4qdvsu4JbkItdqymL+smpTVoinVs67k8YHlKRIfMtc1UBwxO40uMEzZkf9QXRVVjM3u820o=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amz-id-2': 'Ahkv4qdvsu4JbkItdqymL+smpTVoinVs67k8YHlKRIfMtc1UBwxO40uMEzZkf9QXRVVjM3u820o=',
   'x-amz-request-id': '1K8K71NMME45SGTY',
   'date': 'Wed, 04 Jan 2023 16:28:08 GMT',
   'content-type': 'application/xml',
   'transfer-encoding': 'chunked',
   'server': 'AmazonS3'},
  'RetryAttempts': 0},
 'IndexDocument': {'Suffix': 'index.html'},
 'ErrorDocument': {'Key': 'error.html'}}

In [27]:
s3.delete_bucket_website(Bucket=bucket_name)

{'ResponseMetadata': {'RequestId': '1K8ZJTN5WYT3B6DQ',
  'HostId': 'WZ1fhYi46i1p/0a4nQFYJyhNVwRtzwzx4He9BVGc0ITVcVW+hmiPYD3OlVqqRjFfHMBFKxhpn/U=',
  'HTTPStatusCode': 204,
  'HTTPHeaders': {'x-amz-id-2': 'WZ1fhYi46i1p/0a4nQFYJyhNVwRtzwzx4He9BVGc0ITVcVW+hmiPYD3OlVqqRjFfHMBFKxhpn/U=',
   'x-amz-request-id': '1K8ZJTN5WYT3B6DQ',
   'date': 'Wed, 04 Jan 2023 16:28:08 GMT',
   'server': 'AmazonS3'},
  'RetryAttempts': 0}}

## Delete object

I can delete objects from the bucket.

In [28]:
#Define function to delete object from bucket
def delete_object(bucket, filename):
    
    try:
        s3_client = boto3.client('s3')
        s3_client.delete_object(Bucket=bucket, Key=filename)
        
    except ClientError as e:
        logging.error(e)
        return False
    return True

In [29]:
#Delete object
delete_object(bucket_name, filename)

True

In [30]:
#Delete object.. again?
delete_object(bucket_name, filename)

True

I tried to delete the same object twice. S3 does not return an error when asked to delete an object that does not exist, so the second call also succeeds, which I found interesting.

## Delete bucket

I can delete buckets from the notebook as well.

In [31]:
#Define function to delete bucket
def delete_bucket(bucket):
    
    try:
        s3_client = boto3.client('s3')
        s3_client.delete_bucket(Bucket=bucket)
        
    except ClientError as e:
        logging.error(e)
        return False
    return True

In [32]:
#Delete bucket
delete_bucket(bucket_name)

True

In [33]:
#Attempt to delete bucket a second time
delete_bucket(bucket_name)

ERROR:root:An error occurred (NoSuchBucket) when calling the DeleteBucket operation: The specified bucket does not exist


False

Attempting to delete the same bucket twice does return an error. The message informs me that my specified bucket does not exist.
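One more case worth knowing about: S3 only deletes a bucket that is empty, so a bucket that still holds objects has to be cleared first. The sketch below shows one way to do that for an unversioned bucket, using the client's list_objects_v2 and delete_objects methods and following continuation tokens between pages. The function name `empty_bucket` is hypothetical, and the client is passed in as an argument (for example `boto3.client('s3')`).

```python
def empty_bucket(s3_client, bucket):
    """Delete every object in an unversioned bucket; return the number deleted."""
    deleted = 0
    kwargs = {"Bucket": bucket}
    while True:
        page = s3_client.list_objects_v2(**kwargs)
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if keys:
            # One listing page holds at most 1000 keys, the cap delete_objects accepts.
            s3_client.delete_objects(Bucket=bucket, Delete={"Objects": keys})
            deleted += len(keys)
        if not page.get("IsTruncated"):
            return deleted
        # More pages remain: continue listing from where this page ended.
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
```

Calling `empty_bucket(s3, bucket_name)` before `delete_bucket(bucket_name)` would cover the case where uploaded objects are still present. Buckets with versioning enabled would also need their object versions removed, which this sketch does not attempt.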