# S3 and boto3


## Introduction

>IMPORTANT: Although Amazon S3 is included in the AWS free tier, the free tier has limits, and you will be charged for any usage beyond them. For the first 12 months, the free tier includes 5 GB of storage, 20,000 GET requests (retrieving data from S3) and 2,000 PUT requests (sending data to S3). Pricing beyond this usage is described at the following [link](https://aws.amazon.com/s3/pricing/). If you require S3 for any projects on the course, one will be provided for you. If you use your own AWS account, remember to delete any AWS resources after use.

>Amazon Simple Storage Service (Amazon S3) stores files as objects inside buckets, and is commonly used as the storage layer of a data lake. To learn more about data lakes, check this [website](https://en.wikipedia.org/wiki/Data_lake).

Under the free tier, you can store up to 5 GB at no cost; beyond that, S3 Standard storage costs roughly $0.023 per GB per month, depending on the region. For more information on the pricing, visit this [page](https://aws.amazon.com/es/s3/pricing/?nc=sn&loc=4).

## Creating an S3 Bucket

Here, we create an S3 bucket to upload our files. First, go to the AWS [dashboard](https://aws.amazon.com). In the search bar, type 'S3', and click on the first option:

<p align="center"> 
    <img src="images/aws_search_S3.png" width="500"/>
</p>

In the next window, click on 'Create bucket':

<p align="center">
    <img src="images/create_bucket_button.png" width="500"/>
</p>

Set a name for your bucket, and choose a region (any region from the US usually works; however, ensure that you use the same region in the subsequent steps).
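
The same bucket can also be created from Python once your credentials are configured (see the sections below). The following is a minimal sketch, assuming boto3 is installed; the helper name `create_bucket_kwargs` and the bucket name in the usage comment are hypothetical. It highlights one detail about regions: for `us-east-1`, the `CreateBucketConfiguration` argument must be omitted, whereas any other region has to be passed as a `LocationConstraint`.

```python
def create_bucket_kwargs(bucket_name, region):
    """Build the keyword arguments for boto3's S3 `create_bucket` call."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default region and must NOT be passed as a
    # LocationConstraint; every other region has to be stated explicitly.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


# Usage (requires boto3 and configured credentials; names are placeholders):
# import boto3
# s3_client = boto3.client("s3", region_name="eu-west-2")
# s3_client.create_bucket(**create_bucket_kwargs("my-example-bucket", "eu-west-2"))
```

Bucket names are shared across all AWS accounts, so the call fails if the name is already taken.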

## Creating an IAM User

An Identity and Access Management (IAM) user provides the credentials that allow us to interact with AWS resources from outside the console.

To create an IAM user, go to the AWS dashboard, type "IAM" in the search bar, and click the first option:

<p align="center">
    <img src="images/IAM.png" width="500"/>
</p>

Next, click "Users" in the left-hand menu, followed by "Add User":

<p align="center">
    <img src="images/IAM_User.png" width="500"/>
</p>

Then, enter a user name of your choice, and click "Next".

On the permissions page, select "Attach existing policies directly", check the "AdministratorAccess" box, and click "Next".

<p align="center">
    <img src="images/Policies.png" width="500"/>
</p>

On the next screen, review your selections and click "Create User".

Now that you have created your IAM user, you need to create an access key pair for programmatic access:

- Click on your user in the IAM users tab.

- Select the “Security Credentials” tab. 

<p align="center">
    <img src="images/security_credentials.png" width="500"/>
</p>

- Scroll down to "Create Access Key" and click the button.

- On the subsequent screen, select "Command Line Interface (CLI)", navigate to the bottom of the page and click "I understand".

- Click "Next".

<p align="center">
    <img src="images/i_understand.png" width="500"/>
</p>

- On the next page, give the keypair a description, and click "Create Access Key".

- The following screen will display your access key ID and secret access key. Be sure to click the "Download .CSV" button, as the secret access key will not be shown again.

<p align="center">
    <img src="images/copy_keypair_2.png" width="500"/>
</p>





## Downloading and Configuring the AWS CLI


To enable communication between your computer and your AWS resources, your credentials must be configured correctly. The `awscli` package lets you easily store the credentials and settings that your computer needs to connect to AWS services.

To install the `awscli` package, follow the instructions at this [link](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html).


Once the installation is complete, configure the AWS CLI by running the `aws configure` command in the terminal.
When prompted, enter the access key ID and secret access key exactly as they appear in the .csv file you downloaded in the previous step.

When asked to provide the region name, go to your S3 bucket and retrieve the AWS Region of your bucket. The region name should be similar to `us-east-1`.

When asked to provide the output format, you can press Enter to accept the default.

Now, your computer should be ready to use boto3.
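
`aws configure` saves your answers in two plain-text files in your home directory: `~/.aws/credentials` (the keys) and `~/.aws/config` (the region and output format). The sketch below checks that the text of a credentials file contains the two keys that boto3 looks for. The function name `missing_credential_keys` is hypothetical, and the sample contents are fake placeholders rather than real keys.

```python
import configparser

REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def missing_credential_keys(credentials_text, profile="default"):
    """Return the required keys that are absent from the given profile."""
    parser = configparser.ConfigParser()
    parser.read_string(credentials_text)
    if not parser.has_section(profile):
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS if not parser.has_option(profile, key)]


# Fake example of what ~/.aws/credentials looks like after `aws configure`.
sample = """
[default]
aws_access_key_id = AKIAEXAMPLEKEYID
aws_secret_access_key = exampleSecretAccessKey
"""

print(missing_credential_keys(sample))  # []
```

To check your own file, pass it the result of `(Path.home() / ".aws" / "credentials").read_text()`, with `Path` imported from `pathlib`.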

<details>
  <summary> <font size=+1> Things to note if you are on Google Colab </font></summary>
  
  If you are using Google Colab, you need to install the `awscli` package as you would on your local machine. The only difference is that the configuration is not kept between sessions, so you will need to enter it again each time you start a new session.
  
  To install awscli, type `!pip install awscli` in a new cell.
  
  Thereafter, in another cell, type `!aws configure`, and follow the instructions above.

</details>
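
In environments such as Colab, where the configuration is lost between sessions, an alternative is to set the standard AWS environment variables for the current session; boto3 and the AWS CLI both read them automatically. The following is a minimal sketch: the helper name `set_aws_env` is hypothetical, and the values shown are placeholders, not real keys.

```python
import os


def set_aws_env(access_key_id, secret_access_key, region):
    """Expose AWS credentials to boto3 and the AWS CLI for this session only."""
    os.environ["AWS_ACCESS_KEY_ID"] = access_key_id
    os.environ["AWS_SECRET_ACCESS_KEY"] = secret_access_key
    os.environ["AWS_DEFAULT_REGION"] = region


# Placeholders only: in a notebook, prefer getpass.getpass() so that the
# secret is typed in at run time instead of being saved in a cell.
set_aws_env("AKIAEXAMPLEKEYID", "exampleSecretAccessKey", "us-east-1")
```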

Confirm that your installation is working using `aws s3 ls`. The output should be similar to that shown in the figure below:

<p align="center">
    <img src="images/AWSCLI_ls.png" width="500"/>
</p>


## boto3 for Using AWS Resources in Python

boto3 is a library that allows us to work with AWS from a Python script. In this example, we will simply upload files to, download files from, and explore S3 buckets. Note, however, that this library can also be used to manage other resources, such as `EC2`, `RDS` and `DynamoDB`. For more information, check out boto3's documentation [here](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html).

First, install boto3 by typing `pip install boto3` in the terminal. Note that `boto3` relies on the AWS credentials you configured in the steps described above.

We start by creating a client, which tells boto3 that we intend to use the S3 service:

```python
import boto3

s3_client = boto3.client('s3')
```
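
A quick way to confirm that the client has picked up your credentials is to list your buckets, which mirrors what `aws s3 ls` does on the command line. In this sketch, the client is passed in as an argument so that the function is easy to reuse; `bucket_names` is a hypothetical helper name.

```python
def bucket_names(s3_client):
    """Return the names of all buckets visible to the given S3 client."""
    response = s3_client.list_buckets()
    return [bucket["Name"] for bucket in response["Buckets"]]


# Usage (requires boto3 and configured credentials):
# import boto3
# print(bucket_names(boto3.client("s3")))
```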



Now, upload a file to your bucket:

```python
# s3_client.upload_file(file_name, bucket, object_name)
# upload_file returns None and raises an exception if the upload fails.
s3_client.upload_file('cat_0.jpg', 'cat-scraper', 'cat.jpg')
```


Here, *file_name* is the path to the local file you want to upload, *bucket* is the name of your S3 bucket, and *object_name* is the name (key) that the file will have once uploaded.
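
A small wrapper can make uploads more convenient by checking that the local file exists and by defaulting the object name to the file's base name. This is only a sketch of one possible approach: `upload` is a hypothetical helper, and the client is passed in as an argument.

```python
import os


def upload(s3_client, file_name, bucket, object_name=None):
    """Upload a local file, using its base name as the key if none is given."""
    if not os.path.isfile(file_name):
        raise FileNotFoundError(f"No such file: {file_name}")
    if object_name is None:
        object_name = os.path.basename(file_name)
    s3_client.upload_file(file_name, bucket, object_name)
    return object_name


# Usage (requires boto3 and configured credentials):
# import boto3
# upload(boto3.client("s3"), "images/cat_0.jpg", "cat-scraper")  # key: cat_0.jpg
```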


We encourage you to practice this to improve your understanding.

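Next, list the contents of a bucket:

```python
import boto3

s3 = boto3.resource('s3')

# Change the bucket name to match yours.
my_bucket = s3.Bucket('pokemon-sprites')

# Each item is an object summary; its key is the object's name in the bucket.
for obj in my_bucket.objects.all():
    print(obj.key)
```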
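
A bucket can hold many objects, so it is often useful to list only the keys under a given prefix. S3 has no real folders: a key such as `zubat/front.png` simply contains a slash. The following minimal sketch uses the resource API's `objects.filter(Prefix=...)`; `keys_with_prefix` is a hypothetical helper name.

```python
def keys_with_prefix(bucket, prefix):
    """Return the keys in a boto3 Bucket resource that start with `prefix`."""
    return [obj.key for obj in bucket.objects.filter(Prefix=prefix)]


# Usage (requires boto3 and configured credentials):
# import boto3
# my_bucket = boto3.resource("s3").Bucket("pokemon-sprites")
# print(keys_with_prefix(my_bucket, "zubat/"))
```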

Once you have viewed the contents, you can download the files:

```python
import boto3

s3_client = boto3.client('s3')

# Of course, change the bucket, object and file names to match yours.
# s3_client.download_file(bucket, object_name, file_name)
s3_client.download_file('pokemon-sprites', 'zubat/front.png', 'zubat.png')
```
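
When downloading several objects, it can help to keep the key's folder-like layout on disk. The sketch below shows one way to do this; `download_to_dir` is a hypothetical helper, and the client is passed in as an argument.

```python
from pathlib import Path


def download_to_dir(s3_client, bucket, key, dest_dir):
    """Download one object into dest_dir, mirroring the key's folder layout."""
    local_path = Path(dest_dir) / key
    # download_file does not create missing directories, so make them first.
    local_path.parent.mkdir(parents=True, exist_ok=True)
    s3_client.download_file(bucket, key, str(local_path))
    return local_path


# Usage (requires boto3 and configured credentials):
# import boto3
# download_to_dir(boto3.client("s3"), "pokemon-sprites", "zubat/front.png", "sprites")
# The file is saved as sprites/zubat/front.png
```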


## Making the Files Public


In your S3 bucket, disable the 'Block all public access' option:

<p align="center">
    <img src="images/disable.PNG" width="500"/>
</p>

Once you have created the bucket, you can access it in the bucket list. 

To make the objects public, go to http://awspolicygen.s3.amazonaws.com/policygen.html, which will help you create the necessary policy.

- In 'Select Type of Policy', select S3 Bucket Policy.
- In 'Principal', type `*`.
- In 'Actions', select 'GetObject'.
- In 'Amazon Resource Name (ARN)', type `arn:aws:s3:::{your_bucket_name}/*`, replacing `{your_bucket_name}` with the name of your bucket.
- Click on Add Statement.
- Click on Generate Policy, and copy the text.

<p align="center">
    <img src="images/Policy_public.png" width="500"/>
</p>
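
For reference, the policy produced by these steps should look roughly like the dictionary built below. This is a sketch rather than the generator's exact output: the function name `public_read_policy` and the `Sid` label are hypothetical, and the generator adds its own identifiers.

```python
import json


def public_read_policy(bucket_name):
    """Build a bucket policy that lets anyone read (GetObject) every object."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",  # Free-form label, chosen here
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


print(json.dumps(public_read_policy("pokemon-sprites"), indent=2))
```

With boto3, such a policy can also be attached from Python using `s3_client.put_bucket_policy(Bucket=bucket_name, Policy=json.dumps(policy))`, provided that 'Block all public access' is disabled for the bucket.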

Go back to your bucket, and go to the Permissions tab. In 'Bucket Policy', click Edit. Paste the text you copied, and save the changes.<br> 
Now, your bucket should be publicly accessible, and the files should be available to download. 

In your bucket, select the file you want to download, and copy the Object URL.

<p align="center"> <img src="images/URL_public.png" width="500"></p>

Open a Python editor or notebook, and use the requests library to download the image from the URL you just copied. See the example below:

```python
import requests

# Change this to your URL
url = 'https://pokemon-sprites.s3.amazonaws.com/blastoise/front.png'

response = requests.get(url)
response.raise_for_status()  # Stop here if the request failed (e.g. 403 or 404)
with open('blastoise.png', 'wb') as f:
    f.write(response.content)
```

Now, you should be able to see the file in your current working directory.
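
The Object URL follows a predictable pattern, so it can be built in code instead of being copied from the console each time. The following sketch assumes the `https://<bucket>.s3.amazonaws.com/<key>` form used in the example above; the console may instead show a region-specific host, and `public_object_url` is a hypothetical helper name.

```python
from urllib.parse import quote


def public_object_url(bucket, key):
    """Build the virtual-hosted-style URL for a public S3 object."""
    # quote() percent-encodes characters such as spaces but keeps "/".
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"


print(public_object_url("pokemon-sprites", "blastoise/front.png"))
# https://pokemon-sprites.s3.amazonaws.com/blastoise/front.png
```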

## Conclusion

At this point, you should have a good understanding of how to
- create an Amazon S3 bucket.
- create an IAM user and an access key pair.
- download and configure the AWS CLI.
- upload files to the bucket.
- list and download files from the bucket.
- make the files in the bucket public.