## Compare Cloud Data Connector with Boto3

This notebook walks through uploading and downloading a text file with Cloud Data Connector and with boto3, the AWS SDK for Python.

### Prerequisites

1. Install Cloud Data Connector.
2. Configure your AWS credentials in environment variables.

Before running the notebook, configure environment variables and set your credentials as follows:

```
$ export AWS_ACCESS_KEY_ID=<your_key_id>
$ export AWS_SECRET_ACCESS_KEY=<your_secret_key>
```

For this example, create an AWS S3 bucket and set the bucket name in the `BUCKET_NAME` environment variable as follows:

```
$ export BUCKET_NAME=<your_bucket_name>
```

You can also store `BUCKET_NAME`, `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` in a `.env` file in the same directory as this notebook.
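
If the values in a `.env` file are not picked up automatically, they can be loaded into the process environment before the other cells run. A minimal sketch using only the standard library, assuming the file holds plain `KEY=VALUE` lines; `load_env_file` is a hypothetical helper, not part of Cloud Data Connector or boto3, and a dedicated package such as python-dotenv handles quoting and other edge cases more thoroughly:

```python
import os


def load_env_file(path=".env"):
    """Copy KEY=VALUE pairs from a file into os.environ (hypothetical helper).

    Variables that are already set in the environment are left unchanged.
    Returns a dict with the pairs parsed from the file.
    """
    loaded = {}
    if not os.path.exists(path):
        return loaded
    with open(path, encoding="utf-8") as env_file:
        for raw_line in env_file:
            line = raw_line.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            name = name.strip()
            # Drop surrounding whitespace and simple quote characters.
            value = value.strip().strip("'\"")
            loaded[name] = value
            # setdefault keeps any value that was exported in the shell.
            os.environ.setdefault(name, value)
    return loaded
```

Calling `load_env_file()` in the first cell would make the three variables available to the rest of the notebook, while values already exported in the shell keep priority.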

### Specify the key for the file you will upload to the AWS S3 bucket

In AWS S3, files are stored as objects in buckets, and each object is identified by a key. S3 supports the folder concept as a means of grouping objects, so you can include a folder name in the key to say where the file goes. For example, a key can take the form `dir_name/file_name`.

The cell below defines the key for this example. The file will be saved in the `1937` folder under the name `hello_world.txt`.

In [None]:
dir_name = "1937"
file_name = "hello_world.txt"
key = f"{dir_name}/{file_name}"
print(key)
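
The S3 console treats `/` in a key as the folder separator, whatever operating system the notebook runs on. A small sketch of building and splitting such a key with the standard `posixpath` module, reusing the names from the cell above:

```python
import posixpath

dir_name = "1937"
file_name = "hello_world.txt"

# posixpath always joins with "/", which matches the key convention
# even when the notebook runs on Windows.
key = posixpath.join(dir_name, file_name)

# The "folder" is only a prefix of the key; splitting recovers both parts.
prefix, object_name = posixpath.split(key)
print(key, prefix, object_name)
```

Using `os.path.join` instead would produce a backslash-separated key on Windows, which S3 would store as a single object name rather than as a file inside a folder.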

### Prepare data
Create a downloads directory to save downloaded files.

In [None]:
import os
download_dir = 'downloads'
# exist_ok=True makes the cell safe to re-run when the directory already exists
os.makedirs(download_dir, exist_ok=True)

Create an uploads directory to hold the files you will upload.

In [None]:
uploads_dir = 'uploads'
os.makedirs(uploads_dir, exist_ok=True)

Create a text file in the uploads directory and write a plain text string to it.

In [None]:
file_text = "Hello World!"
file_path = f"{uploads_dir}/{file_name}"
with open(file_path, "w", encoding="UTF-8") as f:
    f.write(file_text)

Read the name of the bucket to upload to from the `BUCKET_NAME` environment variable.

In [None]:
try:
    aws_bucket_name = os.environ["BUCKET_NAME"]
except KeyError:
    # Stop here: the later cells cannot run without a bucket name.
    raise RuntimeError("Environment variable BUCKET_NAME is not set; export it and re-run this cell.") from None
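
The lookup above can also be wrapped in a small helper that fails fast with a clear message, so later cells never run with an undefined bucket name. A minimal sketch; `require_env` is a hypothetical helper, not part of Cloud Data Connector or boto3:

```python
import os


def require_env(name):
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name)
    # Treat an empty string the same as a missing variable.
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; "
            "export it or add it to the .env file before running the notebook."
        )
    return value
```

With this helper, the cell reduces to `aws_bucket_name = require_env("BUCKET_NAME")`, and the same call can check `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` up front.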

### Upload file with Cloud Data Connector

Import `Connector`, `Downloader` and `Uploader` from the `cloud_data_connector.aws` package. Create a `Connector` to get an S3 client. By default, the `connect` function reads the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` values from environment variables.

Getting an S3 client takes two lines of code.

In [None]:
from cloud_data_connector.aws import Connector, Downloader, Uploader
s3_client = Connector().connect()

The next step is to create an `Uploader`, passing the S3 client returned by `connect` as its parameter, and to call `upload` with the bucket name, the file path, and the key.

Uploading a file to the S3 bucket takes one line of code.

In [None]:
Uploader(s3_client).upload(aws_bucket_name, file_path, key)

### Download file with Cloud Data Connector

Download the file `hello_world.txt` and save it as `downloads/hello_world_cloud_data_connector.txt`.

Set the local file name.

In [None]:
cloud_data_connector_file_name = "hello_world_cloud_data_connector.txt"

Create a `Downloader` for `s3_client` and call `download` with the bucket name, the key, and the destination path.

Downloading a file takes one line of code.

In [None]:
Downloader(s3_client).download(aws_bucket_name, key, f"{download_dir}/{cloud_data_connector_file_name}")

Inspect the downloaded file.

In [None]:
!cat ./{download_dir}/{cloud_data_connector_file_name}
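
The `!cat` shell escape relies on a Unix `cat` command, so it may not work on Windows. A portable alternative, sketched with the standard library (`show_text_file` is a hypothetical helper name):

```python
from pathlib import Path


def show_text_file(path):
    """Print and return the contents of a text file (hypothetical helper)."""
    text = Path(path).read_text(encoding="utf-8")
    print(text)
    return text
```

In this notebook it would be called as `show_text_file(f"{download_dir}/{cloud_data_connector_file_name}")`.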

### Upload file with boto3

With the boto3 package, getting an S3 client also takes two lines of code.

In [None]:
import boto3
s3_client = boto3.client('s3')

The S3 client has an `upload_file` function that accepts a file path, a bucket name, and an object name (the key).

To upload the file `hello_world.txt` to the bucket named in `BUCKET_NAME`, boto3 needs six lines of code, counting the imports and the error handling.

In [None]:
from botocore.exceptions import ClientError
import logging
try:
    s3_client.upload_file(file_path, aws_bucket_name, key)
except ClientError as e:
    logging.error(e)

### Download file with boto3

Your bucket now contains the object `1937/hello_world.txt`, which you can download. boto3 provides the `download_file` function for this; its parameters are the bucket name, the key, and the local file name.

Download the `1937/hello_world.txt` file and save it as `hello_world_boto3.txt` in the downloads directory.

In [None]:
boto3_file_name = "hello_world_boto3.txt"

Downloading a file to your local directory with boto3 also takes one line of code.

In [None]:
s3_client.download_file(aws_bucket_name, key, f"{download_dir}/{boto3_file_name}")

Inspect the downloaded file.

In [None]:
!cat ./{download_dir}/{boto3_file_name}
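
To confirm that both libraries returned the same data that was uploaded, the downloaded files can be compared with the original. A minimal sketch using the standard `filecmp` module; `files_match` is a hypothetical helper name:

```python
import filecmp


def files_match(first_path, second_path):
    """Return True when both files have identical contents."""
    # shallow=False forces a comparison of the file contents instead of
    # relying only on the os.stat() signatures (type, size, mtime).
    return filecmp.cmp(first_path, second_path, shallow=False)
```

For example, `files_match(file_path, f"{download_dir}/{boto3_file_name}")` and `files_match(file_path, f"{download_dir}/{cloud_data_connector_file_name}")` should both return `True` if the round trips succeeded.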