<a href="https://colab.research.google.com/github/Mjboothaus/emmaus_walking/blob/master/Test_Scaleway_S3_Storage.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Scaleway Object Storage - Testing

The settings below follow this guide: https://www.simplecto.com/using-django-and-boto3-with-scaleway-object-storage/

* `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` can be obtained from the [credentials control panel](https://console.scaleway.com/project/credentials) under API Keys.
* `AWS_STORAGE_BUCKET_NAME` is the name of the bucket you create on the [object storage administration page](https://console.scaleway.com/object-storage/buckets).
* `AWS_DEFAULT_ACL` is set to `public-read` so that the objects can be pulled from a URL without any access keys or time-limited signatures.
* `AWS_S3_REGION_NAME` and `AWS_S3_ENDPOINT_URL` should be configured so that `boto3` knows to point to the Scaleway resources.

All of this is referenced in the Scaleway docs on Object Storage.

In [3]:
AWS_ACCESS_KEY_ID = 'SCWNAS0E0KKXVNMDW2KE'
AWS_SECRET_ACCESS_KEY = 'mysecretkey'
AWS_STORAGE_BUCKET_NAME = 'test-bucket-2047'
AWS_DEFAULT_ACL = 'public-read'
AWS_S3_REGION_NAME = 'fr-par'
AWS_S3_ENDPOINT_URL =  'https://s3.fr-par.scw.cloud'  # 'https://test-bucket-2047.s3.fr-par.scw.cloud'
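Because `AWS_DEFAULT_ACL` is `public-read`, an object in the bucket should be reachable over plain HTTPS without signing. A minimal sketch of building such a URL, assuming the virtual-hosted form `https://<bucket>.s3.<region>.scw.cloud/<key>` hinted at in the commented-out endpoint above. `public_object_url` is a hypothetical helper (not part of `boto3`), and the key in the example is made up for illustration.

```python
from urllib.parse import quote


def public_object_url(bucket, region, key):
    """Build the HTTPS URL of a public-read object (hypothetical helper)."""
    # Percent-encode the key but keep '/' so folder-like prefixes stay readable.
    return f"https://{bucket}.s3.{region}.scw.cloud/{quote(key, safe='/')}"


# Illustrative key only; it is not known to exist in the bucket.
print(public_object_url('test-bucket-2047', 'fr-par', 'emmaus-walking/GWW/sample file.fit'))
```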

Resources:

* https://www.scaleway.com/en/docs/object-storage-feature/
* https://www.scaleway.com/en/docs/how-to-migrate-object-storage-buckets-with-rclone/
* https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html

In [2]:
# !pip install boto3

In [3]:
import boto3

In [11]:
AWS_SECRET_ACCESS_KEY = 'XXXXXX'

In [6]:
s3 = boto3.client('s3', region_name=AWS_S3_REGION_NAME, 
                  endpoint_url=AWS_S3_ENDPOINT_URL, 
                  aws_access_key_id=AWS_ACCESS_KEY_ID,
                  aws_secret_access_key=AWS_SECRET_ACCESS_KEY)

In [7]:
s3

<botocore.client.S3 at 0x7f2ffaa05b90>

In [8]:
response = s3.list_buckets()

# Output the bucket names
print('Existing buckets:')
for bucket in response['Buckets']:
    print(f'  {bucket["Name"]}')

Existing buckets:
  test-bucket-2047


In [16]:
s3.list_objects(Bucket=AWS_STORAGE_BUCKET_NAME);
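The trailing `;` suppresses the cell output, so no keys are shown here, and a single `list_objects` call returns at most 1000 keys. A sketch of listing every key under a prefix with boto3's `list_objects_v2` paginator; `iter_keys` is a hypothetical helper name, and it takes the client as an argument so it does not depend on how the client was created.

```python
def iter_keys(s3_client, bucket, prefix=''):
    """Yield every object key in `bucket` that starts with `prefix`."""
    # The paginator follows continuation tokens automatically.
    paginator = s3_client.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Empty pages carry no 'Contents' entry at all.
        for obj in page.get('Contents', []):
            yield obj['Key']


# Usage with the client created above:
# for key in iter_keys(s3, AWS_STORAGE_BUCKET_NAME, 'emmaus-walking/'):
#     print(key)
```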

In [13]:
with open('sample_fit_file.fit', 'wb') as f:
    s3.download_fileobj(AWS_STORAGE_BUCKET_NAME, 'emmaus-walking/GWW/2020-05-18-110321-Walking-Michael and Ai Leen’s Apple\xa0Watch.fit', f)

In [14]:
!ls

sample_data  sample_fit_file.fit


In [20]:
from os import path, makedirs
from botocore.exceptions import ClientError
from boto3.exceptions import S3TransferFailedError

In [23]:
def download_s3_folder(s3_client, s3_folder, local_dir, aws_bucket, debug_en):
    """Download every object whose key starts with s3_folder into local_dir."""

    success = True

    print('[INFO] Downloading %s from bucket %s...' % (s3_folder, aws_bucket))

    def get_all_s3_objects(s3, **base_kwargs):
        continuation_token = None
        while True:
            list_kwargs = dict(MaxKeys=1000, **base_kwargs)
            if continuation_token:
                list_kwargs['ContinuationToken'] = continuation_token
            response = s3.list_objects_v2(**list_kwargs)
            yield from response.get('Contents', [])
            if not response.get('IsTruncated'):
                break
            continuation_token = response.get('NextContinuationToken')

    #s3_client = boto3.client('s3',
    #                         aws_access_key_id=aws_access_key_id,
    #                         aws_secret_access_key=aws_secret_access_key)

    all_s3_objects_gen = get_all_s3_objects(s3_client, Bucket=aws_bucket)

    for obj in all_s3_objects_gen:
        source = obj['Key']
        if source.startswith(s3_folder):
            destination = path.join(local_dir, source)
            if not path.exists(path.dirname(destination)):
                makedirs(path.dirname(destination))
            try:
                s3_client.download_file(aws_bucket, source, destination)
            except (ClientError, S3TransferFailedError) as e:
                print('[ERROR] Could not download file "%s": %s' % (source, e))
                success = False
            if debug_en:
                print('[DEBUG] Downloading: %s --> %s' % (source, destination))

    return success

In [24]:
download_s3_folder(s3, 'emmaus-walking', '.', AWS_STORAGE_BUCKET_NAME, True)

[INFO] Downloading emmaus-walking from bucket test-bucket-2047...


NotADirectoryError: ignored
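The traceback is not shown, so the cause of the `NotADirectoryError` is an assumption here. One common cause is a zero-byte "folder marker" key ending in `/` (for example `emmaus-walking/`), which `download_file` cannot write to a local path that ends in a slash. A sketch of a variant that skips such keys and lets the server filter by prefix; `plan_downloads` and `download_prefix` are hypothetical names.

```python
import os


def plan_downloads(keys, prefix, local_dir):
    """Map object keys under `prefix` to local paths, skipping folder markers."""
    plan = []
    for key in keys:
        # Keys ending in '/' are placeholder "folders", not downloadable files.
        if not key.startswith(prefix) or key.endswith('/'):
            continue
        plan.append((key, os.path.join(local_dir, *key.split('/'))))
    return plan


def download_prefix(s3_client, bucket, prefix, local_dir):
    """Download every real object under `prefix`; return the local paths written."""
    paginator = s3_client.get_paginator('list_objects_v2')
    keys = [obj['Key']
            for page in paginator.paginate(Bucket=bucket, Prefix=prefix)
            for obj in page.get('Contents', [])]
    written = []
    for key, destination in plan_downloads(keys, prefix, local_dir):
        # Create the parent directory before boto3 opens the target file.
        os.makedirs(os.path.dirname(destination) or '.', exist_ok=True)
        s3_client.download_file(bucket, key, destination)
        written.append(destination)
    return written


# Usage with the client created above:
# download_prefix(s3, AWS_STORAGE_BUCKET_NAME, 'emmaus-walking', '.')
```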

In [25]:
!pip install cloudpathlib[s3]

Collecting cloudpathlib[s3]
  Downloading https://files.pythonhosted.org/packages/07/a5/e94c7c83769db54a95f0a87286d4540f70f18d723c0f9cad230727c25dee/cloudpathlib-0.4.1-py3-none-any.whl (47kB)
Installing collected packages: cloudpathlib
Successfully installed cloudpathlib-0.4.1


In [29]:
from cloudpathlib import S3Client

In [32]:
client = S3Client(
                  aws_access_key_id=AWS_ACCESS_KEY_ID,
                  aws_secret_access_key=AWS_SECRET_ACCESS_KEY)

In [33]:
cp1 = client.CloudPath("s3://test-bucket-2047/")

In [36]:
cp1.is_dir()

True

In [37]:
for f in cp1.glob('**/*.*'):
    print(f)

ClientError: ignored
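The `S3Client` above is created without an endpoint, so it presumably sends requests to AWS rather than Scaleway, which would be consistent with the `ClientError`. A sketch, assuming a cloudpathlib release newer than the 0.4.1 installed above, in which `S3Client` accepts an `endpoint_url` argument; `make_scaleway_client` is a hypothetical helper.

```python
def make_scaleway_client(access_key, secret_key, endpoint_url):
    """Return a cloudpathlib S3Client pointed at an S3-compatible endpoint."""
    # Imported lazily so the helper can be defined before cloudpathlib is installed.
    from cloudpathlib import S3Client
    return S3Client(aws_access_key_id=access_key,
                    aws_secret_access_key=secret_key,
                    endpoint_url=endpoint_url)  # assumed to need a newer cloudpathlib


# Usage with the settings defined earlier:
# client = make_scaleway_client(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_S3_ENDPOINT_URL)
# for f in client.CloudPath(f"s3://{AWS_STORAGE_BUCKET_NAME}/").glob('**/*'):
#     print(f)
```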

In [26]:
from cloudpathlib import CloudPath

# dispatches to S3Path based on prefix
root_dir = CloudPath("s3://test-bucket-2047.fr-par.scw.cloud")

root_dir
#> S3Path('s3://test-bucket-2047.fr-par.scw.cloud')

S3Path('s3://test-bucket-2047.fr-par.scw.cloud')

In [27]:
root_dir

S3Path('s3://test-bucket-2047.fr-par.scw.cloud')

In [28]:
for f in root_dir.glob('**/*.*'):
    print(f)

NoCredentialsError: ignored

In [1]:
#!pip install s3fs

In [2]:
import s3fs

In [9]:
fs = s3fs.S3FileSystem(key=AWS_ACCESS_KEY_ID, secret=AWS_SECRET_ACCESS_KEY, url=AWS_S3_ENDPOINT_URL)

In [10]:
fs.ls(AWS_STORAGE_BUCKET_NAME)

TypeError: ignored
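`S3FileSystem` does not document a `url` parameter, which is a likely source of the `TypeError`; a custom endpoint is normally passed through `client_kwargs`. A sketch, where `scaleway_s3fs_options` is a hypothetical helper that only assembles the keyword arguments.

```python
def scaleway_s3fs_options(access_key, secret_key, endpoint_url):
    """Assemble keyword arguments for s3fs.S3FileSystem against a custom endpoint."""
    return {
        'key': access_key,
        'secret': secret_key,
        # Options forwarded to the underlying botocore client;
        # endpoint_url redirects requests away from AWS.
        'client_kwargs': {'endpoint_url': endpoint_url},
    }


# Usage with the settings defined earlier:
# fs = s3fs.S3FileSystem(**scaleway_s3fs_options(
#     AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_S3_ENDPOINT_URL))
# fs.ls(AWS_STORAGE_BUCKET_NAME)
```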

In [13]:
import fastai
import fastcore

ModuleNotFoundError: ignored

In [None]:
fastai.untar_data()