# AWS S3 Buckets: Uploading and Downloading

AWS S3 buckets provide scalable, secure, reliable data storage in the AWS cloud.

AWS provides software to interact with the S3 API:
  - [AWS CLI](https://aws.amazon.com/cli/) for access from the command line
  - [boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html) for programmatic Python access

S3 buckets may be **public** or **private**.

---

## 1 Public S3 Buckets

### 1.1 Downloading and Uploading from the command line with `AWS CLI`
- Accessing a public bucket does not require authentication
- Unauthenticated requests must include the `--no-sign-request` argument

The `aws s3 cp` command works for downloading or uploading from/to S3:

`aws s3 cp source_path destination_path`
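Outside a notebook, the same command can be assembled in Python before handing it to `subprocess`. The sketch below is illustrative only: `build_s3_cp_command` is a hypothetical helper name, and running the result assumes the `aws` executable is on your `PATH`. It only builds and prints the argument list.

```python
import shlex


def build_s3_cp_command(source_path, destination_path, anonymous=False):
    """Build the argument list for `aws s3 cp` (hypothetical helper)."""
    command = ["aws", "s3", "cp"]
    if anonymous:
        # Skip request signing for unauthenticated access to public buckets
        command.append("--no-sign-request")
    command += [str(source_path), str(destination_path)]
    return command


# Download: the source is an s3:// URI, the destination is a local path
download = build_s3_cp_command(
    "s3://asf-jupyter-data-west/S3_example/example.txt",
    "example.txt",
    anonymous=True,
)
print(shlex.join(download))

# To actually run the copy (requires the AWS CLI to be installed):
# import subprocess
# subprocess.run(download, check=True)
```

Swapping the two path arguments turns the download into an upload, exactly as on the command line.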

---

#### **1.1.1 Attempt downloading an example file from a public S3 bucket without using the `--no-sign-request` argument**

- This will cause an error unless you have previously configured the aws-cli with your AWS credentials

In [None]:
!aws s3 cp s3://asf-jupyter-data-west/S3_example/example.txt example.txt

#### **1.1.2 Run the `aws s3 cp` command again, adding the `--no-sign-request` argument**

- Look for the downloaded file in the file browser

In [None]:
!aws s3 cp --no-sign-request s3://asf-jupyter-data-west/S3_example/example.txt example.txt

---
#### **1.1.3 Swap the source and destination paths to upload the file to the same location**

- Uploading the file will overwrite the copy in the S3 bucket

In [None]:
!aws s3 cp --no-sign-request example.txt s3://asf-jupyter-data-west/S3_example/example.txt 

---
#### **1.1.4 There is an 8MB file size limit when uploading to S3 if not authenticated, even to a public bucket, using the `--no-sign-request` argument.**

- By default, files larger than 8MB are sent as a multi-part upload, which anonymous users do not have privileges to perform
- This size limitation **does not apply when downloading** from S3
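To see in advance whether a file would hit this limit, you can compare its size to the threshold. This is a minimal sketch under two assumptions that are not stated in this notebook: that the client's default threshold is 8 MiB (8 × 1024 × 1024 bytes), and that files at or above it are split into parts. The helper names are hypothetical.

```python
from pathlib import Path

# Assumed default multipart threshold: 8 MiB
DEFAULT_MULTIPART_THRESHOLD = 8 * 1024 * 1024


def needs_multipart(size_bytes, threshold=DEFAULT_MULTIPART_THRESHOLD):
    """Return True if an upload of this size would be split into parts.

    Assumes sizes at or above the threshold trigger a multipart upload.
    """
    return size_bytes >= threshold


def can_upload_anonymously(path, threshold=DEFAULT_MULTIPART_THRESHOLD):
    """Check a local file's size against the threshold (hypothetical helper)."""
    return not needs_multipart(Path(path).stat().st_size, threshold)


# GNU `head -c 9MB` writes 9,000,000 bytes, which exceeds 8 MiB (8,388,608 bytes)
print(needs_multipart(9_000_000))
print(needs_multipart(1_000_000))
```

Calling `can_upload_anonymously("too_big_file")` after creating the file in the next step should report that it is too large for an anonymous upload.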

**Create a 9MB file and try to upload it:**

In [None]:
!head -c 9MB /dev/urandom > too_big_file

In [None]:
!aws --no-sign-request s3 cp too_big_file s3://asf-jupyter-data-west/S3_example/too_big_file 

---

### 1.2 Downloading and Uploading in Python with `boto3`

You can build S3 access into your Python scripts and automated workflows with `boto3`.

#### **1.2.1 To prepare for the following steps, delete the local copy of `example.txt`, which we previously downloaded**

In [None]:
from pathlib import Path

example_path = Path.cwd()/"example.txt"
example_path.unlink(missing_ok=True)

#### **1.2.2 Run the following code cell to attempt downloading `example.txt` with `boto3` from a public S3 bucket without declaring an unsigned signature or providing credentials** 

In [None]:
import boto3

s3 = boto3.resource('s3')

bucket_name = "asf-jupyter-data-west"
bucket = s3.Bucket(bucket_name)

# objects in an S3 bucket are identified by keys
example_key = "S3_example/example.txt"

bucket.download_file(example_key, example_path) 

That didn't work; unless AWS credentials are already configured, the download fails with a `NoCredentialsError`. We need to either provide our AWS credentials or declare ourselves anonymous users.

---
#### **1.2.3 Try this again, but declare yourself anonymous by providing the config signature version: `UNSIGNED`**

It may take a few moments for the file to appear in the file browser. You can hit the file browser's refresh button to make it appear sooner.

In [None]:
import boto3
from botocore import UNSIGNED
from botocore.config import Config

s3 = boto3.resource('s3', config=Config(signature_version=UNSIGNED))

bucket_name = "asf-jupyter-data-west"
bucket = s3.Bucket(bucket_name)

bucket.download_file(example_key, example_path)

#### **1.2.4 Upload `example.txt` with `boto3`**

In [None]:
bucket.upload_file(example_path, example_key)

#### **1.2.5 Create a 9MB file and try uploading it**

In [None]:
too_big_path = Path.cwd()/"too_big_file.txt"
!head -c 9MB /dev/urandom > $too_big_path

In [None]:
too_big_key = "S3_example/too_big_file.txt"

bucket.upload_file(too_big_path, too_big_key)

Once again, we cannot upload this 9MB file because we cannot initiate a multi-part upload as an anonymous user.

---

### 1.3 Configuring the AWS CLI with your AWS credentials

- You will need an `AWS Access Key ID` and `AWS Secret Access Key` from an AWS IAM user with permission to access any needed private S3 buckets
  - https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html#Using_CreateAccessKey

These credentials are discoverable by both `aws-cli` and `boto3` and will allow you to:
- upload files larger than 8MB to public S3 buckets (without setting any special permissions)
- access private buckets (if your AWS IAM user has permission)

---

#### **1.3.1 Open a launcher, drag this notebook tab into split-screen mode, and open a terminal:**

<img src="https://opensarlab-docs.asf.alaska.edu/opensarlab-notebook-assets/workshops/nisar_early_adopters/open_terminal.gif" width=75%/>

---

#### **1.3.2 Configure AWS**

- Enter `aws configure` in the terminal
  - Enter your AWS Access Key ID
  - Enter your AWS Secret Access Key
  - Enter a region ("us-west-2" for ASF data access)
  - Enter "json" as an output format
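To confirm that the keys were saved without printing them, you can inspect the credentials file that `aws configure` writes. This sketch assumes the default location `~/.aws/credentials` and the `default` profile; `profile_has_keys` is a hypothetical helper name.

```python
import configparser
from pathlib import Path


def profile_has_keys(credentials_file, profile="default"):
    """Return True if the profile contains both access-key fields
    (hypothetical helper). The key values are never printed."""
    parser = configparser.ConfigParser()
    parser.read(credentials_file)  # a missing file is silently skipped
    if not parser.has_section(profile):
        return False
    section = parser[profile]
    return ("aws_access_key_id" in section
            and "aws_secret_access_key" in section)


# Assumed default location written by `aws configure`
default_credentials_file = Path.home() / ".aws" / "credentials"
print(profile_has_keys(default_credentials_file))
```

The region and output format entered in the same dialog are stored separately, in `~/.aws/config`.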

<img src="https://opensarlab-docs.asf.alaska.edu/opensarlab-notebook-assets/workshops/nisar_early_adopters/aws_creds.gif" width=75%/>