# Notebook: S3 Listing, Searching, Uploading and Downloading

AWS S3 buckets provide object storage in the AWS cloud.

 - **This notebook will throw some errors for demonstration purposes. Do not be alarmed.**

AWS provides software to interact with the S3 API:
  - [AWS CLI](https://aws.amazon.com/cli/) for access from the command line
    - There are 2 major versions of awscli, v1 and v2, which use different authentication paradigms.
      - This notebook covers awscli v1
      - awscli v2 content will be added at a later date
  - [boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html) for programmatic Python access

S3 buckets may be **public** or **private**. Request and data-transfer charges are billed to the bucket owner's account by default, or to the account of the user accessing the data when the bucket is configured as requester-pays.

---
## **0 Listing and Searching Public S3 Buckets**

```
Public:
- Does not require authentication
- If not authenticating, requires the `--no-sign-request` argument
  
Private:
- Requires permissions and authentication
```

**(AWS CLI) List public bucket contents without credentials**

Will error if credentials are not configured

In [None]:
!aws s3 ls s3://asf-jupyter-data-west/S3_example --recursive

**(AWS CLI) List public bucket as an anonymous user with the `--no-sign-request` argument**

In [None]:
!aws s3 ls --no-sign-request s3://asf-jupyter-data-west/S3_example --recursive

**(boto3) List public bucket contents without credentials**

Will error if credentials are not configured

In [None]:
import boto3

s3 = boto3.resource('s3')

bucket_name = "asf-jupyter-data-west"
bucket = s3.Bucket(bucket_name)

for obj in bucket.objects.all():
    print(obj.key)

**(boto3) List public bucket as an anonymous user with a config containing an unsigned signature**

In [None]:
import boto3
from botocore import UNSIGNED
from botocore.config import Config

s3 = boto3.resource('s3', config=Config(signature_version=UNSIGNED))

bucket_name = "asf-jupyter-data-west"
bucket = s3.Bucket(bucket_name)

for obj in bucket.objects.all():
    print(obj.key)


**(boto3) Search using a prefix**

In [None]:
import boto3
from botocore import UNSIGNED
from botocore.config import Config

s3 = boto3.resource('s3', config=Config(signature_version=UNSIGNED))

bucket_name = 'asf-jupyter-data-west'
bucket = s3.Bucket(bucket_name)

prefix = 'S3_example'

for obj in bucket.objects.filter(Prefix=prefix):
    print(f's3://{bucket_name}/{obj.key}')
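
The AWS CLI addresses objects with `s3://bucket/key` URIs, while boto3 expects the bucket name and the key (or prefix) as separate arguments. The sketch below shows one way to convert between the two; `split_s3_uri` is a hypothetical helper name, not part of boto3 or the AWS CLI.

```python
def split_s3_uri(uri):
    """Split an 's3://bucket/key' URI into (bucket_name, key_or_prefix)."""
    scheme = "s3://"
    if not uri.startswith(scheme):
        raise ValueError(f"not an S3 URI: {uri!r}")
    # Everything up to the first "/" is the bucket; the rest is the key or prefix
    bucket_name, _, key = uri[len(scheme):].partition("/")
    if not bucket_name:
        raise ValueError(f"missing bucket name in: {uri!r}")
    return bucket_name, key


bucket_name, prefix = split_s3_uri("s3://asf-jupyter-data-west/S3_example")
print(bucket_name, prefix)  # asf-jupyter-data-west S3_example
```

The returned values can be passed to `s3.Bucket(bucket_name)` and `bucket.objects.filter(Prefix=prefix)` as in the cell above.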

---
## **1 Downloading from S3 Buckets**

```
Public:
- Does not require authentication
- If not authenticating, requires the `--no-sign-request` argument
  
Private:
- Requires permissions and authentication
  
Requester-pays:
- Requires authentication
- Requires the `--request-payer requester` argument (AWS CLI) or `RequestPayer='requester'` (boto3)
```
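
This notebook does not include a requester-pays bucket, so the following is only a sketch of how the `RequestPayer` argument could be passed with boto3. It assumes `bucket` is a boto3 `s3.Bucket` resource created with signed credentials (requester-pays requests cannot be anonymous); `download_requester_pays` and the bucket name in the usage comment are hypothetical.

```python
def download_requester_pays(bucket, key, filename):
    """Download `key` from a requester-pays bucket to the local `filename`.

    `bucket` is expected to be a boto3 s3.Bucket resource backed by
    configured AWS credentials.
    """
    # ExtraArgs forwards RequestPayer to the underlying download request,
    # acknowledging that the caller's account is billed for the transfer.
    bucket.download_file(key, str(filename), ExtraArgs={"RequestPayer": "requester"})
    return filename


# Usage (assumes configured credentials; the bucket name is hypothetical):
#   import boto3
#   bucket = boto3.resource("s3").Bucket("some-requester-pays-bucket")
#   download_requester_pays(bucket, "S3_example/example.txt", "example.txt")
```
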
The `aws s3 cp` command handles both downloads from S3 and uploads to S3:

`aws s3 cp source_path destination_path`


**(AWS CLI) Download a file from public bucket contents without credentials**

Will error if credentials are not configured


In [None]:
!aws s3 cp s3://asf-jupyter-data-west/S3_example/example.txt example.txt

**(AWS CLI) Download a file from a public bucket as an anonymous user with the `--no-sign-request` argument**

Look for the downloaded file in the file browser

In [None]:
!aws s3 cp --no-sign-request s3://asf-jupyter-data-west/S3_example/example.txt example.txt

**Delete the downloaded file**

In [None]:
from pathlib import Path

example_path = Path.cwd()/"example.txt"
example_path.unlink(missing_ok=True)

**(boto3) Download a file from a public S3 bucket without credentials**

Will error if credentials are not configured

In [None]:
import boto3

s3 = boto3.resource('s3')

bucket_name = "asf-jupyter-data-west"
bucket = s3.Bucket(bucket_name)

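# S3 objects are identified by keys (the object's full path within the bucket)
example_key = "S3_example/example.txt"

# example_path was defined in the cleanup cell above;
# str() is used because some boto3 versions only accept a string filename
bucket.download_file(example_key, str(example_path))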

**(boto3) Download a file as an anonymous user with a config containing an unsigned signature**

It may take a few moments for the file to appear in the file browser.

In [None]:
import boto3
from botocore import UNSIGNED
from botocore.config import Config

s3 = boto3.resource('s3', config=Config(signature_version=UNSIGNED))

bucket_name = "asf-jupyter-data-west"
bucket = s3.Bucket(bucket_name)

# example_key and example_path were defined in the cells above
bucket.download_file(example_key, str(example_path))

---

## **2 Configuring awscli v1 to add your AWS credentials**

- You will need an `AWS Access Key ID` and `AWS Secret Access Key` belonging to an AWS IAM user with permissions to access any needed private S3 buckets
  - https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html#Using_CreateAccessKey

These credentials are discoverable by both `aws-cli` and `boto3` and will allow you to:
- upload files to public S3 buckets if you have an access key for the account and your IAM user has permissions
- access private buckets if you have an access key for the account and your IAM user has permissions

---

#### **Open a launcher, drag this notebook tab into split-screen mode, and open a terminal:**

<img src="https://opensarlab-docs.asf.alaska.edu/opensarlab-notebook-assets/workshops/nisar_early_adopters/open_terminal.gif" width=75%/>

---

#### **Configure your default AWS profile**

- Enter `aws configure` in the terminal
  - Enter your AWS Access Key ID
  - Enter your AWS Secret Access Key
  - Enter a region ("us-west-2" for ASF data access)
  - Enter "json" as an output format

<img src="https://opensarlab-docs.asf.alaska.edu/opensarlab-notebook-assets/workshops/nisar_early_adopters/aws_creds.gif" width=75%/>

---

#### **Configure a non-default AWS profile**

Adding profiles will allow you to access buckets in different AWS accounts using different sets of credentials

**Repeat the steps above, changing the following:**
- change `aws configure` to `aws configure --profile your_profile_name`
    - replace `your_profile_name` with your chosen profile name
    
This will produce a `~/.aws/credentials` file that looks something like this:

```
[default]
aws_access_key_id = key_id_for_account_1
aws_secret_access_key = access_key_for_account_1

[profile_name_for_account_2]
aws_access_key_id = key_id_for_account_2
aws_secret_access_key = access_key_for_account_2
```
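
To confirm which profiles ended up in the credentials file without printing any secrets, the section names can be read with the standard-library `configparser`. This is a minimal sketch: `list_profiles` is a hypothetical helper name, and it assumes the default `~/.aws/credentials` location shown above. boto3 exposes similar information through `boto3.Session().available_profiles`.

```python
import configparser
from pathlib import Path


def list_profiles(credentials_path=None):
    """Return the profile names found in an AWS credentials file."""
    if credentials_path is None:
        credentials_path = Path.home() / ".aws" / "credentials"
    parser = configparser.ConfigParser()
    # read() silently skips a missing file, so the result is then an empty list
    parser.read(credentials_path)
    # Only the section names are returned, never the key values
    return parser.sections()


print(list_profiles())
```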

---
## **3 Uploading to an S3 Bucket**

- Swapping the source and destination paths in our previous awscli download command will upload the file to the same location
- Even though the bucket is public, you will not be able to write to it unless the IAM user owning your Access Key has permission
- Uploading the file will overwrite the copy in the S3 bucket
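
Because an upload silently replaces any existing object with the same key, it can be useful to check for the key first. The sketch below is one possible approach, not the only one: `key_exists` is a hypothetical helper, `client` is assumed to be a boto3 S3 client (for example `boto3.client('s3')`), and the caller is assumed to have permission to list the bucket.

```python
def key_exists(client, bucket_name, key):
    """Return True if an object with exactly this key exists in the bucket."""
    # Listing with the key as the prefix returns the exact key first (if present),
    # because it sorts before any longer key that shares the same prefix.
    response = client.list_objects_v2(Bucket=bucket_name, Prefix=key, MaxKeys=1)
    # "Contents" is absent from the response when nothing matches the prefix
    return any(item["Key"] == key for item in response.get("Contents", []))


# Usage (assumes configured credentials with list permission on the bucket):
#   import boto3
#   client = boto3.client("s3")
#   if key_exists(client, "asf-jupyter-data-west", "S3_example/example.txt"):
#       print("Uploading will overwrite the existing object")
```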

**(AWS CLI) Upload a file with your default AWS profile**

In [None]:
!aws s3 cp example.txt s3://asf-jupyter-data-west/S3_example/example.txt 

**(AWS CLI) Upload a file with a non-default profile**

If you added your credentials under a profile, you will need to use the `--profile` argument and provide your profile name

In [None]:
!aws s3 --profile osl cp example.txt s3://asf-jupyter-data-west/S3_example/example.txt 

**(boto3) Upload a file with your default AWS profile**

In [None]:
import boto3

s3 = boto3.resource('s3')

bucket_name = "asf-jupyter-data-west"
bucket = s3.Bucket(bucket_name)

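# S3 objects are identified by keys (the object's full path within the bucket)
example_key = "S3_example/example.txt"

# str() is used because some boto3 versions only accept a string filename
bucket.upload_file(str(example_path), example_key)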

**(boto3) Upload a file with a non-default profile**
- `session = boto3.Session(profile_name='your_profile_name')`

In [None]:
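import boto3

bucket_name = "asf-jupyter-data-west"

# create a session bound to the named profile, then an S3 client from that session
session = boto3.Session(profile_name='osl')
s3 = session.client('s3')

# S3 objects are identified by keys (the object's full path within the bucket)
example_key = "S3_example/example.txt"

# the client's upload_file takes the local filename, the bucket name, then the key
s3.upload_file(str(example_path), bucket_name, example_key)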

