# RDA S3 Demo

***This notebook demonstrates the basic functionality of `rda_s3.py`, a command-line utility for interacting with an object store.***
***It covers basic object actions as well as more powerful operations.***

***Note: This is a work in progress***

In [1]:
# Notebook reminder: a leading '!' runs the rest of the line as a shell command
! ls -l | grep s3

-rwxr-xr-x  1 rpconroy  CIT\Domain Users  22251 Mar 26 08:49 rda_s3.py
-rw-r--r--  1 rpconroy  CIT\Domain Users   2700 Mar 26 08:58 rda_s3_demo.ipynb


## Tool Help

In [2]:
# Main Help
! ./rda_s3.py

usage: rda_s3 [-h] [--noprint] [--prettyprint] [--use_local_config]
              {list_buckets,lb,delete,dl,get_object,go,upload_mult,um,upload,ul,disk_usage,du,list_objects,lo,get_metadata,gm}
              ...

CLI to interact with s3.
Note: To use optional arguments, put them before sub-command.

optional arguments:
  -h, --help            show this help message and exit
  --noprint, -np        Do not print result of actions.
  --prettyprint, -pp    Pretty print result
  --use_local_config, -ul
                        Use your local credentials. (~/.aws/credentials)

Actions:
  {list_buckets,lb,delete,dl,get_object,go,upload_mult,um,upload,ul,disk_usage,du,list_objects,lo,get_metadata,gm}
                        Use `tool [command] -h` for more info on command
    list_buckets (lb)   lists Buckets
    delete (dl)         Delete objects
    get_object (go)     Pull object from store
    upload_mult (um)    Upload multiple objects.
    upload (ul)         Upload 

In [3]:
# Sub-Action help
! ./rda_s3.py um -h

usage: rda_s3 upload_mult [-h] --bucket <bucket> --local_dir <directory> [--key_prefix <prefix>]
                          [--recursive] [--dry_run] [--ignore [<ignore str> [<ignore str> ...]]]
                          [--metadata <dict str, or path to script>]

Upload multiple objects.

optional arguments:
  -h, --help            show this help message and exit
  --bucket <bucket>, -b <bucket>
                        Destination bucket.
  --local_dir <directory>, -ld <directory>
                        Directory to search for files.
  --key_prefix <prefix>, -kp <prefix>
                        Prepend this string to key
  --recursive, -r       recursively search directory
  --dry_run, -dr        Does not upload files.
  --ignore [<ignore str> [<ignore str> ...]], -i [<ignore str> [<ignore str> ...]]
                        directory to search for files
  --metadata <dict str, or path to script>, -md <dict str, or path to script>
                        Optionally p
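
To make the `upload_mult` options more concrete, here is a minimal sketch of how a dry run might map local files to object keys. It is not taken from `rda_s3.py`: the name `plan_uploads` is hypothetical, and it assumes that each `--ignore` string is matched as a substring of the relative path and that `--key_prefix` is prepended verbatim to the key.

```python
import os


def plan_uploads(local_dir, key_prefix="", recursive=False, ignore=()):
    """Return (local_path, key) pairs without uploading anything.

    Hypothetical helper; the real upload_mult logic may differ.
    """
    plan = []
    for root, dirs, files in os.walk(local_dir):
        dirs.sort()  # visit subdirectories in a stable order
        for name in sorted(files):
            path = os.path.join(root, name)
            # The key is the path relative to local_dir, using '/' separators.
            rel = os.path.relpath(path, local_dir).replace(os.sep, "/")
            # Assumption: an ignore string excludes any path that contains it.
            if any(text in rel for text in ignore):
                continue
            plan.append((path, key_prefix + rel))
        if not recursive:
            break  # os.walk yields local_dir first; stop before descending
    return plan
```

For example, `plan_uploads("data", key_prefix="demo/", recursive=True, ignore=[".tmp"])` would list every file under `data` except those whose relative path contains `.tmp`, each paired with a key that starts with `demo/`.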

## Credentials

By default, credentials are found in `~/.aws/credentials`.

Set these environment variables to use different credentials:

`AWS_SHARED_CREDENTIALS_FILE` : Path of credential file.

`AWS_PROFILE` : Name of the profile to use (defaults to `default`).
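
The lookup these two variables control can be sketched with the standard library alone. This is only an illustration of the precedence (environment variable first, then the default), not the code that boto3 or `rda_s3.py` actually runs; `resolve_credentials` is a hypothetical name.

```python
import configparser
import os


def resolve_credentials(environ=None):
    """Return (path, profile, keys) chosen by the two environment variables."""
    env = os.environ if environ is None else environ
    path = env.get("AWS_SHARED_CREDENTIALS_FILE",
                   os.path.expanduser("~/.aws/credentials"))
    profile = env.get("AWS_PROFILE", "default")

    # interpolation=None keeps any '%' in a secret from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # a missing file is silently skipped
    if not parser.has_section(profile):
        raise KeyError(f"profile {profile!r} not found in {path}")
    return path, profile, dict(parser[profile])
```

Calling `resolve_credentials()` with no arguments reads the real environment; passing a plain dict makes it easy to try different settings without exporting anything.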

In [4]:
! cat ~/.aws/credentials

[default]
aws_access_key_id = AK06XKLYCIANHSDVOSL6
aws_secret_access_key = bN3i7jcp3avElZ1/cSI1zTloy50Lbqcwz04ajD5Q

[profile Riley]
aws_access_key_id = DummyKey
aws_secret_access_key = DummySecret


In [5]:
! ./rda_s3.py -pp --use_local_config lb

[
    {
        "Name": "rda-data",
        "CreationDate": "2020-03-06 17:48:04+00:00"
    },
    {
        "Name": "rda-decsdata",
        "CreationDate": "2020-03-09 16:50:37+00:00"
    },
    {
        "Name": "rda-requests",
        "CreationDate": "2020-03-06 16:35:01+00:00"
    },
    {
        "Name": "rda-test-chifan",
        "CreationDate": "2020-03-06 16:47:08+00:00"
    },
    {
        "Name": "rda-test-dattore",
        "CreationDate": "2020-03-06 16:47:24+00:00"
    },
    {
        "Name": "rda-test-davestep",
        "CreationDate": "2020-03-06 16:46:43+00:00"
    },
    {
        "Name": "rda-test-rpconroy",
        "CreationDate": "2020-03-06 16:46:29+00:00"
    },
    {
        "Name": "rda-test-schuster",
        "CreationDate": "2020-03-06 16:43:25+00:00"
    },
    {
        "Name": "rda-test-tcram",
        "CreationDate": "2020-03-06 16:44:35+00:00"
    },
    {
        "Name": "rda-test-zji",
        "CreationDate": "202

In [1]:
# This will fail: the 'profile Riley' entry holds dummy credentials
! export AWS_PROFILE='profile Riley' && ./rda_s3.py lb 

Traceback (most recent call last):
  File "./rda_s3.py", line 723, in <module>
    main(*sys.argv[1:])
  File "./rda_s3.py", line 714, in main
    ret = do_action(args)
  File "./rda_s3.py", line 670, in do_action
    return prog(**args_dict)
  File "./rda_s3.py", line 288, in list_buckets
    response = client.list_buckets()['Buckets']
  File "/anaconda3/lib/python3.6/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/anaconda3/lib/python3.6/site-packages/botocore/client.py", line 661, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidAccessKeyId) when calling the ListBuckets operation: The AWS Access Key Id you provided does not exist in our records.
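
When the tool is shelled out rather than imported, a failure like this one is only visible as text. The sketch below pulls the error code out of captured output, assuming the final line keeps the `An error occurred (<code>) when calling the <operation> operation: <message>` shape shown above. `parse_client_error` is a hypothetical helper, not part of `rda_s3.py`.

```python
import re

# Pattern for the last line of the traceback shown above.
_CLIENT_ERROR = re.compile(
    r"An error occurred \((?P<code>[^)]+)\) when calling the "
    r"(?P<operation>\w+) operation: (?P<message>.*)"
)


def parse_client_error(stderr_text):
    """Return a dict with code, operation and message, or None if absent."""
    # Scan from the bottom, since the error is the last traceback line.
    for line in reversed(stderr_text.strip().splitlines()):
        match = _CLIENT_ERROR.search(line)
        if match:
            return match.groupdict()
    return None
```

A caller could then branch on `parse_client_error(text)["code"]`, for instance to prompt for a different profile when the code is `InvalidAccessKeyId`.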


In [3]:
! ./rda_s3.py lb -bo 

['rda-data', 'rda-decsdata', 'rda-requests', 'rda-test-chifan', 'rda-test-dattore', 'rda-test-davestep', 'rda-test-rpconroy', 'rda-test-schuster', 'rda-test-tcram', 'rda-test-zji']


### Two Python cells, demonstrating that rda_s3 can also be used as a Python module

In [13]:
import rda_s3 as rs
buckets = rs.list_buckets()
for i in buckets: 
    print(i['Name'], 'Created on the', i['CreationDate'].minute, 'minute')

rda-data Created on the 48 minute
rda-decsdata Created on the 50 minute
rda-requests Created on the 35 minute
rda-test-chifan Created on the 47 minute
rda-test-dattore Created on the 47 minute
rda-test-davestep Created on the 46 minute
rda-test-rpconroy Created on the 46 minute
rda-test-schuster Created on the 43 minute
rda-test-tcram Created on the 44 minute
rda-test-zji Created on the 46 minute
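
Because `CreationDate` is a `datetime`, the result can be processed with ordinary Python. A minimal sketch, using three entries copied from the bucket listing earlier in this notebook so that it runs without credentials; `oldest_first` is a hypothetical helper, and `timezone.utc` stands in for the timezone object the module returns.

```python
from datetime import datetime, timezone

# Sample entries copied from the earlier listing (not fetched live).
buckets = [
    {"Name": "rda-data",
     "CreationDate": datetime(2020, 3, 6, 17, 48, 4, tzinfo=timezone.utc)},
    {"Name": "rda-decsdata",
     "CreationDate": datetime(2020, 3, 9, 16, 50, 37, tzinfo=timezone.utc)},
    {"Name": "rda-requests",
     "CreationDate": datetime(2020, 3, 6, 16, 35, 1, tzinfo=timezone.utc)},
]


def oldest_first(bucket_list):
    """Return the buckets sorted by creation time, oldest first."""
    return sorted(bucket_list, key=lambda b: b["CreationDate"])


for bucket in oldest_first(buckets):
    print(bucket["Name"], bucket["CreationDate"].isoformat())
```

With live data, the same function could be applied directly to the list returned by `rs.list_buckets()`.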


In [14]:
import rda_s3 as rs
buckets = rs.main('-np', 'lb')
print(buckets)

[{'Name': 'rda-data', 'CreationDate': datetime.datetime(2020, 3, 6, 17, 48, 4, tzinfo=tzutc())}, {'Name': 'rda-decsdata', 'CreationDate': datetime.datetime(2020, 3, 9, 16, 50, 37, tzinfo=tzutc())}, {'Name': 'rda-requests', 'CreationDate': datetime.datetime(2020, 3, 6, 16, 35, 1, tzinfo=tzutc())}, {'Name': 'rda-test-chifan', 'CreationDate': datetime.datetime(2020, 3, 6, 16, 47, 8, tzinfo=tzutc())}, {'Name': 'rda-test-dattore', 'CreationDate': datetime.datetime(2020, 3, 6, 16, 47, 24, tzinfo=tzutc())}, {'Name': 'rda-test-davestep', 'CreationDate': datetime.datetime(2020, 3, 6, 16, 46, 43, tzinfo=tzutc())}, {'Name': 'rda-test-rpconroy', 'CreationDate': datetime.datetime(2020, 3, 6, 16, 46, 29, tzinfo=tzutc())}, {'Name': 'rda-test-schuster', 'CreationDate': datetime.datetime(2020, 3, 6, 16, 43, 25, tzinfo=tzutc())}, {'Name': 'rda-test-tcram', 'CreationDate': datetime.datetime(2020, 3, 6, 16, 44, 35, tzinfo=tzutc())}, {'Name': 'rda-test-zji', 'CreationDate': datetime.datetime(2020, 3, 6, 16
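
The returned list holds `datetime` objects, which `json.dumps` cannot serialize on its own. A minimal sketch of one way to get text similar to the `-pp` output: passing `default=str` is an assumption about a convenient approach, not necessarily how `rda_s3.py` formats its results. `to_pretty_json` is a hypothetical name, and `timezone.utc` stands in for the `tzutc()` seen in the result.

```python
import json
from datetime import datetime, timezone

# One entry copied from the result above (not fetched live).
buckets = [
    {"Name": "rda-data",
     "CreationDate": datetime(2020, 3, 6, 17, 48, 4, tzinfo=timezone.utc)},
]


def to_pretty_json(result):
    """Serialize a result, converting non-JSON values such as datetime via str()."""
    return json.dumps(result, indent=4, default=str)


print(to_pretty_json(buckets))
```

The `default` callback is only invoked for values the encoder cannot handle, so strings and numbers in the result pass through unchanged.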

# Actions

#### Get object (download)

In [21]:
! ls -l
! ./rda_s3.py go -b rda-test-rpconroy -k test.test
! echo "\nAfter pull\n"
! ls -l

total 96
-rw-r--r--  1 rpconroy  CIT\Domain Users   1251 Mar 16 07:44 __init__.py
drwxr-xr-x  4 rpconroy  CIT\Domain Users    128 Mar 26 09:11 __pycache__
-rwxr-xr-x  1 rpconroy  CIT\Domain Users  22378 Mar 26 09:40 rda_s3.py
-rw-r--r--  1 rpconroy  CIT\Domain Users  12994 Mar 26 09:42 rda_s3_demo.ipynb
-rw-r--r--  1 rpconroy  CIT\Domain Users    299 Mar 26 09:42 test.test

After pull

total 96
-rw-r--r--  1 rpconroy  CIT\Domain Users   1251 Mar 16 07:44 __init__.py
drwxr-xr-x  4 rpconroy  CIT\Domain Users    128 Mar 26 09:11 __pycache__
-rwxr-xr-x  1 rpconroy  CIT\Domain Users  22378 Mar 26 09:40 rda_s3.py
-rw-r--r--  1 rpconroy  CIT\Domain Users  12994 Mar 26 09:42 rda_s3_demo.ipynb
-rw-r--r--  1 rpconroy  CIT\Domain Users    299 Mar 26 09:42 test.test


#### Delete object

In [23]:
! ./rda_s3.py dl -b rda-test-rpconroy -k test.test

{"ResponseMetadata": {"RequestId": "425AE82765A91F49", "HostId": "fb4d6b40-74da-4f0c-be1c-12a6f173ea7bb8c382e6-4715-4823-8afe-6b03fdc5c5b6", "HTTPStatusCode": 204, "HTTPHeaders": {"x-amz-id-2": "fb4d6b40-74da-4f0c-be1c-12a6f173ea7bb8c382e6-4715-4823-8afe-6b03fdc5c5b6", "x-amz-request-id": "425AE82765A91F49", "date": "Thu, 26 Mar 2020 15:46:21 GMT", "server": "WD-ActiveScale"}, "RetryAttempts": 0}}
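
The delete action prints the raw response as JSON, and the part that matters is `HTTPStatusCode`: 204 (No Content) signals that the request was accepted. A minimal sketch of checking this from captured output; `delete_succeeded` is a hypothetical helper, and treating any 2xx status as success is an assumption.

```python
import json


def delete_succeeded(cli_output):
    """Return True if the printed response reports a 2xx HTTP status."""
    response = json.loads(cli_output)
    status = response.get("ResponseMetadata", {}).get("HTTPStatusCode")
    return isinstance(status, int) and 200 <= status < 300


# Trimmed copy of the response printed above.
sample = ('{"ResponseMetadata": {"RequestId": "425AE82765A91F49", '
          '"HTTPStatusCode": 204, "RetryAttempts": 0}}')
print(delete_succeeded(sample))
```

When using the module from Python instead of the shell, the same check could be applied to the returned dict without the `json.loads` step.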


#### List single object

In [25]:
! ./rda_s3.py lo -b rda-test-rpconroy test.test

[]


#### Upload object

In [27]:
! ./rda_s3.py ul -b rda-test-rpconroy -lf test.test -k test.test -md '{"myMetaKey":"myMetaValue"}'

Traceback (most recent call last):
  File "./rda_s3.py", line 726, in <module>
    main(*sys.argv[1:])
  File "./rda_s3.py", line 717, in main
    ret = do_action(args)
  File "./rda_s3.py", line 670, in do_action
    return prog(**args_dict)
  File "./rda_s3.py", line 452, in upload_object
    return client.upload_file(local_file, bucket, key, ExtraArgs=meta_dict)
  File "/anaconda3/lib/python3.6/site-packages/boto3/s3/inject.py", line 131, in upload_file
    extra_args=ExtraArgs, callback=Callback)
  File "/anaconda3/lib/python3.6/site-packages/boto3/s3/transfer.py", line 279, in upload_file
    future.result()
  File "/anaconda3/lib/python3.6/site-packages/s3transfer/futures.py", line 106, in result
    return self._coordinator.result()
  File "/anaconda3/lib/python3.6/site-packages/s3transfer/futures.py", line 265, in result
    raise self._exception
  File "/anaconda3/lib/python3.6/site-packages/s3transfer/tasks.py", line 126, in __call__
    return self._execute