**Amazon Simple Storage Service (Amazon S3)** is an object storage service that offers scalability, data availability, security, and performance.


Amazon S3 is designed for 99.999999999% (11 9's) of durability, and stores data for millions of applications for companies all around the world.


An **Amazon S3 bucket** is a storage location to hold files. S3 files are referred to as **objects**.



**Boto3** is the Amazon Web Services (AWS) Software Development Kit (SDK) for Python, which allows Python developers to write software that makes use of services like Amazon S3 and Amazon EC2.


## FIRST: create an AWS account and an account user profile

1)    create an AWS account, if you don't have one yet, at aws.amazon.com

2)    log in through the **AWS Management Console** option in the **My Account** drop-down list

3)    go to **Services** -> **IAM** -> **Users** and click **Add user**

4)    give the user a *name*, then, under **AWS access type**, check the **Programmatic access** option

5)    on the *next* page (Permissions), select **Attach existing policies directly** from the *Set permissions* options, then check the **AmazonS3FullAccess** policy

6)    skip the *next* page (Tags)

7)    on the *next* page (Review), check that everything is set as intended and click **Create user**

8)    download the user credentials file by clicking **Download .csv**. This downloads a file called credentials.csv, which contains the access key ID and the secret access key of the new user; you will need both to follow this notebook.

9)    to run this notebook, upload it and the image file 'eveningsky_image.jpeg' to JupyterLab (in **Services** -> **Amazon SageMaker** -> **SageMaker Studio**)

10)    be sure to fill in your user account credentials and preferred AWS region in the lines of code below that are marked with a '## TODO' comment

## Create an Amazon S3 bucket

The name of an Amazon S3 bucket must be unique across all regions of the AWS platform.
The bucket name should be between 3 and 63 characters long, and can contain only lower-case characters, numbers, periods, and dashes. It should not contain underscores.
The bucket can be located in a specific region to minimize latency or to address regulatory requirements.
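
The naming rules above can be checked before calling AWS. The following is a minimal sketch that covers only a subset of the rules (length, allowed characters, first and last character, no adjacent periods); the helper name is_valid_bucket_name is made up for this example and is not part of boto3.

```python
import re

# Subset of the S3 naming rules: 3-63 characters; lowercase letters, digits,
# periods and hyphens only; first and last character a letter or digit.
_BUCKET_NAME_RE = re.compile(r'[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]')


def is_valid_bucket_name(name):
    """Return True if name passes the basic checks described above."""
    if not _BUCKET_NAME_RE.fullmatch(name):
        return False
    # Two adjacent periods are also rejected by S3.
    return '..' not in name


print(is_valid_bucket_name('workingwiths3-demo'))   # True
print(is_valid_bucket_name('Working_With_S3'))      # False
```

A name that passes these checks can still be rejected by S3, for example because another account already owns it.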

In [None]:
import logging
import boto3
from botocore.exceptions import ClientError

In [None]:
import uuid

## define the name of the bucket using a unique identifier
prefix = "workingwiths3-"
unique_id = uuid.uuid4() ## generates a universally unique identifier (36 characters long as a string)
bucket_name = "".join([prefix, str(unique_id)])

print(f'Bucket Name: {bucket_name}')

Bucket Name: workingwiths3-7aeac0f7-1fa0-47df-b068-1c5b14eb1897


### Method 1:

hard-coding **access key ID** and **secret access key** and **region name**

In [None]:
# hard-code the access key ID, secret access key and region name
AWS_ACCESS_KEY_ID = 'YOUR_AWS_ACCESS_KEY_ID' ## TODO
AWS_SECRET_ACCESS_KEY = 'YOUR_AWS_SECRET_ACCESS_KEY' ## TODO

# choose an AWS region (note: 'us-east-1' is a valid region_name, but it must not be sent as a LocationConstraint when creating a bucket)
my_region = 'YOUR_PREFERRED_REGION' ## TODO

s3_client = boto3.client('s3',
                         region_name=my_region,
                         aws_access_key_id=AWS_ACCESS_KEY_ID,
                         aws_secret_access_key=AWS_SECRET_ACCESS_KEY)


In [None]:
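## create an S3 bucket
## note: 'us-east-1' is the default location and must not be passed as a LocationConstraint
if my_region == 'us-east-1':
    s3_client.create_bucket(Bucket=bucket_name)
else:
    location = {'LocationConstraint':my_region}
    s3_client.create_bucket(Bucket=bucket_name,
                            CreateBucketConfiguration=location)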

### Method 2:

providing **access key ID** and **secret access key** information through a shared credential file

In [None]:
import os

# create the .aws directory that holds the user credentials
# note: boto3 looks for ~/.aws/credentials by default, so this works when the notebook's working directory is your home directory
try:
    os.mkdir('./.aws')
except OSError as e:
    print(e)

# create a file to save your user credentials
with open('./.aws/credentials', 'w') as f:
    f.write("[default]\n")
    f.write("aws_access_key_id = YOUR_AWS_ACCESS_KEY_ID\n") ## TODO
    f.write("aws_secret_access_key = YOUR_AWS_SECRET_ACCESS_KEY\n") ## TODO


[Errno 17] File exists: './.aws'
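
As an alternative to writing the file line by line, the same INI layout can be produced with the standard configparser module. This is only a sketch: the helper write_credentials_file, the temporary directory and the placeholder key values are assumptions made for illustration. The AWS_SHARED_CREDENTIALS_FILE environment variable is the setting boto3 reads to find a credentials file outside the default location.

```python
import configparser
import os
import tempfile


def write_credentials_file(path, access_key_id, secret_access_key, profile='default'):
    """Write a minimal shared credentials file with a single profile."""
    parser = configparser.ConfigParser()
    parser[profile] = {
        'aws_access_key_id': access_key_id,
        'aws_secret_access_key': secret_access_key,
    }
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, 'w') as f:
        parser.write(f)
    # Restrict the file to the current user (limited effect on Windows).
    os.chmod(path, 0o600)
    return path


# Hypothetical usage with placeholder values and a temporary directory.
demo_dir = tempfile.mkdtemp()
demo_path = write_credentials_file(os.path.join(demo_dir, '.aws', 'credentials'),
                                   'YOUR_AWS_ACCESS_KEY_ID',
                                   'YOUR_AWS_SECRET_ACCESS_KEY')

# Point boto3 at the file when it does not live in ~/.aws
os.environ['AWS_SHARED_CREDENTIALS_FILE'] = demo_path
```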


In [None]:
## os.remove('./.aws/credentials') ## run this command line if you have to delete the credentials file

In [None]:
## create an S3 client. note: boto3 reads your credentials from the file in the default directory
s3_client = boto3.client('s3')

# retrieve your AWS region from your session using boto3
session = boto3.session.Session()
current_region = session.region_name

print(current_region)

us-east-1


In [None]:
## create an S3 bucket
## note: omit the LocationConstraint for 'us-east-1', because S3 does not accept it as a constraint value
if current_region != 'us-east-1':
    location = {'LocationConstraint':current_region}
    ## create the bucket
    s3_client.create_bucket(Bucket=bucket_name,
                           CreateBucketConfiguration=location)
else:
    s3_client.create_bucket(Bucket=bucket_name)

## Listing Buckets

In [None]:
response = s3_client.list_buckets()

for bucket in response['Buckets']:
    print(bucket['Name'])

sagemaker-studio-yzwjzmdqpls
workingwiths3-dfa1789a-9aba-4e12-bb33-2aa3bd00639f
workingwiths3-f205b4eb-7d44-45d9-86d6-cd8aa5ff180d


## Uploading files


In [None]:
file_name = 'eveningsky_image.jpeg'
s3_object_name = file_name

response = s3_client.upload_file(Filename=file_name, Bucket=bucket_name, Key=s3_object_name)

## Upload a file as a file object using a file handle

In [None]:
s3_object_name = 'eveningsky_image_fileobj_method.jpeg'

## upload the file through an open file handle (the file must be opened in binary mode)
with open(file_name, 'rb') as f:
    s3_client.upload_fileobj(Fileobj=f, Bucket=bucket_name, Key=s3_object_name)

## Extra Args
Both upload_file and upload_fileobj accept an optional ExtraArgs parameter that can be used for various purposes.

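For example, the ACL setting assigns a canned access control list to the uploaded object; 'public-read' makes the object readable by anyone.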


In [None]:
s3_object_name = 'eveningsky_image_public.jpeg'

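## note: this call fails if ACLs are disabled on the bucket or if Block Public Access rejects public ACLs
response = s3_client.upload_file(Filename='eveningsky_image.jpeg', Bucket=bucket_name, Key=s3_object_name,
                                ExtraArgs={'ACL':'public-read'})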
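
ExtraArgs can carry other settings besides the ACL, such as the content type and user-defined metadata. The sketch below only builds the dictionary; the helper build_extra_args, the metadata values and the object key in the commented call are made up for illustration.

```python
def build_extra_args(content_type=None, metadata=None, acl=None):
    """Collect optional upload settings into an ExtraArgs dictionary."""
    extra_args = {}
    if content_type is not None:
        extra_args['ContentType'] = content_type   # stored as the object's Content-Type
    if metadata:
        extra_args['Metadata'] = dict(metadata)    # user-defined key/value pairs
    if acl is not None:
        extra_args['ACL'] = acl                    # canned ACL such as 'public-read'
    return extra_args


extra_args = build_extra_args(content_type='image/jpeg',
                              metadata={'source': 'notebook'})
print(extra_args)

# Hypothetical call, using the client and bucket created earlier:
# s3_client.upload_file(Filename='eveningsky_image.jpeg', Bucket=bucket_name,
#                       Key='eveningsky_image_with_metadata.jpeg', ExtraArgs=extra_args)
```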

## Downloading files

The methods provided by the AWS SDK for Python to download files are similar to those provided to upload files.


The download_file method accepts the names of the bucket and object to download and the filename to save the file to.

In [None]:
s3_object_name = 'eveningsky_image.jpeg'

s3_client.download_file(Bucket=bucket_name, Key=s3_object_name, Filename='eveningsky_image_downloaded.jpeg')

In [None]:
with open('eveningsky_image_written.jpeg', 'wb') as f:
    s3_client.download_fileobj(Bucket=bucket_name, Key=s3_object_name, Fileobj=f)

## File transfer configuration


When uploading, downloading, or copying a file or S3 object, the AWS SDK for Python automatically manages retries and multipart and non-multipart transfers.

The management operations are performed by using reasonable default settings that are well-suited for most scenarios. To handle a special case, the default settings can be configured to meet requirements.

## Multipart transfers

Multipart transfers occur when the file size exceeds the value of the multipart_threshold attribute, which defaults to 8 MB.


In [None]:
from boto3.s3.transfer import TransferConfig

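GB = 1024**3 ## one gigabyte in bytes

## raise the multipart threshold to 5 GB, so smaller files such as this image are sent in a single request
config = TransferConfig(multipart_threshold=5*GB)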

s3_client.upload_file(Filename='eveningsky_image.jpeg', Bucket=bucket_name, Key='eveningsky_image_multi_transfer.jpeg', Config=config)
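
To get a feel for the threshold, the arithmetic can be sketched without calling AWS. This is a rough model, not the SDK's actual logic: the function plan_transfer is made up for this example, the 8 MB defaults mirror the TransferConfig defaults for multipart_threshold and multipart_chunksize, and treating a file exactly at the threshold as multipart is an assumption.

```python
import math

MB = 1024 ** 2  # one megabyte in bytes


def plan_transfer(size_bytes, multipart_threshold=8 * MB, multipart_chunksize=8 * MB):
    """Return (is_multipart, number_of_parts) for a file of the given size."""
    # Assumption: files at or above the threshold are split into parts.
    if size_bytes < multipart_threshold:
        return False, 1
    return True, math.ceil(size_bytes / multipart_chunksize)


print(plan_transfer(3 * MB))     # (False, 1)
print(plan_transfer(100 * MB))   # (True, 13)
```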


## Presigned URLs

A user who does not have AWS credentials or permission to access an S3 object can be granted temporary access by using a presigned URL.

A presigned URL is generated by an AWS user who has access to the object. The generated URL is then given to the unauthorized user. The presigned URL can be entered in a browser or used by a program or HTML webpage. The credentials used by the presigned URL are those of the AWS user who generated the URL.

A presigned URL remains valid for a limited period of time which is specified when the URL is generated.

In [None]:
s3_object_name = 'eveningsky_image_multi_transfer.jpeg'
response_presigned_url = s3_client.generate_presigned_url(ClientMethod='get_object',
                                                         Params={'Bucket':bucket_name,
                                                                'Key':s3_object_name},
                                                         ExpiresIn=3600)

print(response_presigned_url)
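
The validity period is encoded in the query string of the presigned URL. The sketch below reads it back, assuming the URL carries either the Signature Version 4 parameters X-Amz-Date and X-Amz-Expires or the older Signature Version 2 parameter Expires. The helper name presigned_url_expiry and the example URL (with a fake signature) are made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlparse


def presigned_url_expiry(url):
    """Return the UTC time at which a presigned URL stops being valid."""
    query = parse_qs(urlparse(url).query)
    if 'X-Amz-Expires' in query:
        # Signature Version 4: signing time plus a lifetime in seconds.
        signed_at = datetime.strptime(query['X-Amz-Date'][0], '%Y%m%dT%H%M%SZ')
        signed_at = signed_at.replace(tzinfo=timezone.utc)
        return signed_at + timedelta(seconds=int(query['X-Amz-Expires'][0]))
    # Signature Version 2: absolute expiry as a Unix timestamp.
    return datetime.fromtimestamp(int(query['Expires'][0]), tz=timezone.utc)


# Hypothetical, shortened URL for illustration only (the signature is fake).
example_url = ('https://example-bucket.s3.amazonaws.com/eveningsky_image.jpeg'
               '?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20220312T163000Z'
               '&X-Amz-Expires=3600&X-Amz-Signature=abc123')
print(presigned_url_expiry(example_url))
```

Applied to the URL printed above, the function would show when the one-hour window set by ExpiresIn=3600 ends.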

## Bucket policies

An S3 bucket can have an optional policy that grants access permissions to other AWS accounts or AWS Identity and Access Management (IAM) users. Bucket policies are defined using the same JSON format as a resource-based IAM policy.

## Retrieve a Bucket Policy

Note: if the specified bucket doesn't have a bucket policy, the call raises a ClientError with the error code **'NoSuchBucketPolicy'**

In [None]:
result = s3_client.get_bucket_policy(Bucket=bucket_name)
print(result['Policy'])

ClientError: An error occurred (NoSuchBucketPolicy) when calling the GetBucketPolicy operation: The bucket policy does not exist

## Set a bucket policy

A bucket's policy can be set by calling the put_bucket_policy method.

The policy is defined in the same JSON format as an IAM policy. 



### Policy Format

The **Sid (statement ID)** is an optional identifier that you provide for the policy statement. You can assign a Sid value to each statement in a statement array.

The **Effect** element is required and specifies whether the statement results in an allow or an explicit deny. Valid values for Effect are Allow and Deny.

By default, access to resources is denied. 

Use the **Principal** element in a policy to specify the principal that is allowed or denied access to a resource.

You can specify any of the following principals in a policy:

- AWS account and root user
- IAM users
- Federated users (using web identity or SAML federation)
- IAM roles
- Assumed-role sessions
- AWS services
- Anonymous users


The **Action** element describes the specific action or actions that will be allowed or denied. 

We specify a value using a service namespace as an action prefix (iam, ec2, sqs, sns, s3, etc.) followed by the name of the action to allow or deny.

The **Resource** element specifies the object or objects that the statement covers. We specify a resource using an ARN. Amazon Resource Names (ARNs) uniquely identify AWS resources.

Let's define a policy that enables any user to retrieve any object stored in the bucket identified by the bucket_name variable.

In [None]:
import json

bucket_policy = {
    'Version':'2012-10-17',
    'Statement':[
        {
            'Sid':'AddPerm',
            'Effect':'Allow',
            'Principal':'*',
            'Action':['s3:GetObject'],
            'Resource':f'arn:aws:s3:::{bucket_name}/*'
        }
    ]
}

bucket_policy = json.dumps(bucket_policy)

s3_client.put_bucket_policy(Bucket=bucket_name, Policy=bucket_policy)

{'ResponseMetadata': {'RequestId': 'T32WTX1ZC65SNN77',
  'HostId': 'bxr9HRthoSb/cY87KTj/Vj+sJroVM7UbmwWxUWCp2oyeVFgeir+JsjdMezcZwWI8MD78YWBRht0=',
  'HTTPStatusCode': 204,
  'HTTPHeaders': {'x-amz-id-2': 'bxr9HRthoSb/cY87KTj/Vj+sJroVM7UbmwWxUWCp2oyeVFgeir+JsjdMezcZwWI8MD78YWBRht0=',
   'x-amz-request-id': 'T32WTX1ZC65SNN77',
   'date': 'Sat, 12 Mar 2022 16:31:41 GMT',
   'server': 'AmazonS3'},
  'RetryAttempts': 0}}
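
As a variation, the Principal element can name a single IAM user instead of '*'. The sketch below only builds the JSON document; the helper name, the Sid value, the bucket name, the account ID and the user name are placeholders made up for illustration.

```python
import json


def read_only_policy_for_user(bucket_name, user_arn):
    """Build a policy that lets one IAM user read every object in a bucket."""
    policy = {
        'Version': '2012-10-17',
        'Statement': [{
            'Sid': 'AllowUserRead',
            'Effect': 'Allow',
            'Principal': {'AWS': user_arn},
            'Action': ['s3:GetObject'],
            'Resource': f'arn:aws:s3:::{bucket_name}/*',
        }],
    }
    return json.dumps(policy)


# Placeholder account ID and user name, for illustration only.
policy_json = read_only_policy_for_user(
    'example-bucket', 'arn:aws:iam::123456789012:user/ExampleUser')
print(policy_json)
```

The resulting string could be passed to put_bucket_policy in the same way as the policy above.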

## Retrieve the previously set bucket policy

In [None]:
result = s3_client.get_bucket_policy(Bucket=bucket_name)
print(result['Policy'])

{"Version":"2012-10-17","Statement":[{"Sid":"AddPerm","Effect":"Allow","Principal":"*","Action":"s3:GetObject","Resource":"arn:aws:s3:::workingwiths3-dfa1789a-9aba-4e12-bb33-2aa3bd00639f/*"}]}


## Delete a bucket policy


In [None]:
s3_client.delete_bucket_policy(Bucket=bucket_name)

{'ResponseMetadata': {'RequestId': 'E2FMB8N96JE6MF6G',
  'HostId': 'FoA63iPBGLwa/fye8zqjfZ7Xk78z+6CGDSfoBUc7G6TVqwOOTFE1Bh9eNKYhAvJhPuAstKPr71I=',
  'HTTPStatusCode': 204,
  'HTTPHeaders': {'x-amz-id-2': 'FoA63iPBGLwa/fye8zqjfZ7Xk78z+6CGDSfoBUc7G6TVqwOOTFE1Bh9eNKYhAvJhPuAstKPr71I=',
   'x-amz-request-id': 'E2FMB8N96JE6MF6G',
   'date': 'Sat, 12 Mar 2022 16:31:56 GMT',
   'server': 'AmazonS3'},
  'RetryAttempts': 0}}

## CORS Configuration

Cross Origin Resource Sharing (CORS) enables client web applications in one domain to access resources in another domain. An S3 bucket can be configured to enable cross-origin requests. The configuration defines rules that specify the allowed origins, HTTP methods (GET, PUT, etc.), and other elements.
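
A single rule can be sketched as a plain dictionary. The helper build_cors_rule and the example origin are assumptions made for illustration; the method check reflects the HTTP methods that S3 CORS rules accept (GET, PUT, POST, DELETE and HEAD).

```python
ALLOWED_CORS_METHODS = {'GET', 'PUT', 'POST', 'DELETE', 'HEAD'}


def build_cors_rule(origins, methods, allowed_headers=('*',), max_age_seconds=3000):
    """Build one CORS rule, rejecting HTTP methods that S3 CORS does not accept."""
    unknown = set(methods) - ALLOWED_CORS_METHODS
    if unknown:
        raise ValueError(f'unsupported CORS methods: {sorted(unknown)}')
    return {
        'AllowedOrigins': list(origins),
        'AllowedMethods': list(methods),
        'AllowedHeaders': list(allowed_headers),
        'MaxAgeSeconds': max_age_seconds,
    }


# Hypothetical origin; replace it with the domain of your web application.
rule = build_cors_rule(['https://www.example.com'], ['GET', 'PUT'])
cors_configuration_example = {'CORSRules': [rule]}
print(cors_configuration_example)
```

Restricting AllowedOrigins to a known domain is usually a tighter choice than the wildcard '*'.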

## Retrieve a bucket CORS configuration

Retrieve a bucket's CORS configuration by calling the AWS SDK for Python get_bucket_cors method.

Note: if the specified bucket doesn't have a CORS configuration, the call raises a ClientError with the error code **'NoSuchCORSConfiguration'**


In [None]:
response = s3_client.get_bucket_cors(Bucket=bucket_name)

print(response['CORSRules'])

ClientError: An error occurred (NoSuchCORSConfiguration) when calling the GetBucketCors operation: The CORS configuration does not exist

## Set Bucket CORS

In [None]:
cors_configuration = {
    'CORSRules':[{
        'AllowedHeaders':['Authorization'],
        'AllowedMethods':['GET','PUT'],
        'AllowedOrigins':['*'],
        'ExposeHeaders':['GET','PUT'], ## note: ExposeHeaders normally lists response header names (e.g. 'ETag'), not HTTP methods
        'MaxAgeSeconds':3000
    }]
}

s3_client.put_bucket_cors(Bucket=bucket_name, CORSConfiguration=cors_configuration)

{'ResponseMetadata': {'RequestId': 'CJ2BYRWJRK2FRXG2',
  'HostId': 'QpTK/LYx/qlEnpphLZMF8Edh9fCtLZPDcuG8pbPqQ+a3WEUp52MCnfmDLMAd2Db86ph9zi0D7Ro=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amz-id-2': 'QpTK/LYx/qlEnpphLZMF8Edh9fCtLZPDcuG8pbPqQ+a3WEUp52MCnfmDLMAd2Db86ph9zi0D7Ro=',
   'x-amz-request-id': 'CJ2BYRWJRK2FRXG2',
   'date': 'Sat, 12 Mar 2022 16:32:14 GMT',
   'server': 'AmazonS3',
   'content-length': '0'},
  'RetryAttempts': 0}}

## Retrieve the previously set bucket CORS configuration

In [None]:
response = s3_client.get_bucket_cors(Bucket=bucket_name)

print(response['CORSRules'])

[{'AllowedHeaders': ['Authorization'], 'AllowedMethods': ['GET', 'PUT'], 'AllowedOrigins': ['*'], 'ExposeHeaders': ['GET', 'PUT'], 'MaxAgeSeconds': 3000}]


## Deleting an object

In [None]:
s3_client.delete_object(Bucket=bucket_name, Key='eveningsky_image_multi_transfer.jpeg')

{'ResponseMetadata': {'RequestId': 'EQWGXJ24JY4CVYAX',
  'HostId': 'lRoUy4OY8pmAp0o+A6edQQ4tobBhH0ZboDQqanw/93U4peEx5sc8eMqZZ1+5+dnKrrhCNl5wIP4=',
  'HTTPStatusCode': 204,
  'HTTPHeaders': {'x-amz-id-2': 'lRoUy4OY8pmAp0o+A6edQQ4tobBhH0ZboDQqanw/93U4peEx5sc8eMqZZ1+5+dnKrrhCNl5wIP4=',
   'x-amz-request-id': 'EQWGXJ24JY4CVYAX',
   'date': 'Sat, 12 Mar 2022 16:32:41 GMT',
   'server': 'AmazonS3'},
  'RetryAttempts': 0}}

## Delete a bucket

Only empty buckets can be deleted, so all objects must be removed first.

In [None]:
## delete all the objects in a bucket
response = s3_client.delete_objects(
    Bucket=bucket_name,
    Delete={
        'Objects': [
            {
                'Key': 'eveningsky_image.jpeg',
            },
            {
                'Key': 'eveningsky_image_fileobj_method.jpeg',
            },
            {
                'Key': 'eveningsky_image_public.jpeg',
            }
        ]
    }
)


In [None]:
print(response['Deleted'])

[{'Key': 'eveningsky_image_fileobj_method.jpeg'}, {'Key': 'eveningsky_image.jpeg'}, {'Key': 'eveningsky_image_public.jpeg'}]
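
Listing every key by hand does not scale to larger buckets. The sketch below lists the objects page by page and deletes each page in one request; the helper name empty_bucket is made up for this example, and the sketch ignores object versions, so it is not sufficient for versioned buckets.

```python
def empty_bucket(s3_client, bucket):
    """Delete every object in a bucket and return the number of deleted keys.

    Versioned buckets also hold object versions, which this sketch ignores.
    """
    deleted = 0
    paginator = s3_client.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket):
        # A page holds at most 1000 keys, the per-request limit of delete_objects.
        keys = [{'Key': obj['Key']} for obj in page.get('Contents', [])]
        if not keys:
            continue
        response = s3_client.delete_objects(Bucket=bucket, Delete={'Objects': keys})
        deleted += len(response.get('Deleted', []))
    return deleted
```

With the client and bucket from this notebook, the call would be empty_bucket(s3_client, bucket_name), followed by delete_bucket.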


In [None]:
## delete the emptied bucket
response = s3_client.delete_bucket(Bucket=bucket_name)

## **References**

[1]    Working with AWS S3 Buckets using Python & boto3. Guided Project, Coursera.

[2]    AWS Documentation:

https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html

[3]    Python, Boto3, and AWS S3: Demystified.

https://realpython.com/python-boto3-aws-s3/

[4]    Boto3 Documentation:

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html

[5]    Working with s3 in Python using boto3.

https://hands-on.cloud/working-with-s3-in-python-using-boto3/

[6]    AWS Documentation: Error Responses. List of Error Codes.

https://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html

[7]    Boto3 Docs: Credentials.

https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html#guide-credentials

[8]    Access control list (ACL) overview.

https://docs.aws.amazon.com/AmazonS3/latest/userguide/acl-overview.html

[9]    Example IAM identity-based policies.

https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_examples.html#iam-policy-example-s3

