<a href="https://colab.research.google.com/github/Mjboothaus/personal-tan-lea-kuan/blob/main/notebooks/storage.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## S3 Object Storage (on Scaleway.com)

Scaleway offers 75 GB of free S3-compatible object storage.

https://www.simplecto.com/using-django-and-boto3-with-scaleway-object-storage/

* `ACCESS_KEY_ID` and `SECRET_ACCESS_KEY` can be obtained from the [credentials control panel](https://console.scaleway.com/project/credentials) under API Keys.
* `STORAGE_BUCKET_NAME` is the name of the bucket you create on the [object storage administration page](https://console.scaleway.com/object-storage/buckets).
* `DEFAULT_ACL` is set to public-read so that the objects can be pulled from a URL without any access keys or time-limited signatures.
* `S3_REGION_NAME` and `S3_ENDPOINT_URL` should be configured so that `boto3` knows to point to the Scaleway resources.

All of these are referenced in Scaleway's docs on Object Storage.
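As a rough illustration of how the settings above fit together, the sketch below collects them in one dict and derives the public URL that a `public-read` object would be served from. The helper names `scaleway_s3_settings` and `public_object_url` are hypothetical, and the endpoint pattern `https://s3.<region>.scw.cloud` is an assumption that should be checked against the Scaleway docs for your region.

```python
from urllib.parse import quote


def scaleway_s3_settings(region, bucket, access_key, secret_key):
    """Collect the settings described above in one dict (hypothetical helper)."""
    return {
        "ACCESS_KEY_ID": access_key,
        "SECRET_ACCESS_KEY": secret_key,
        "BUCKET_NAME": bucket,
        "DEFAULT_ACL": "public-read",
        "REGION_NAME": region,
        # Assumption: Scaleway endpoints follow https://s3.<region>.scw.cloud
        "ENDPOINT_URL": f"https://s3.{region}.scw.cloud",
    }


def public_object_url(cfg, key):
    """Build the unsigned URL for an object uploaded with a public-read ACL.

    Assumption: virtual-hosted-style addressing, <bucket>.s3.<region>.scw.cloud.
    """
    host = f"{cfg['BUCKET_NAME']}.s3.{cfg['REGION_NAME']}.scw.cloud"
    # quote() percent-encodes spaces and other unsafe characters, keeping "/"
    return f"https://{host}/{quote(key)}"
```

In this notebook the real values come from `dynaconf` rather than from a helper like this; the sketch only shows which value feeds which part of the URL.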

### Resources:
* https://www.scaleway.com/en/docs/object-storage-feature/
* https://www.scaleway.com/en/docs/how-to-migrate-object-storage-buckets-with-rclone/
* https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html

In [None]:
from dynaconf import settings
from pathlib import Path
import boto3


In [4]:
ACCESS_KEY_ID = settings.S3.ACCESS_KEY_ID
SECRET_ACCESS_KEY = settings.S3.SECRET_ACCESS_KEY
BUCKET_NAME = settings.S3.BUCKET_NAME
DEFAULT_ACL = settings.S3.DEFAULT_ACL
REGION_NAME = settings.S3.REGION_NAME
ENDPOINT_URL = settings.S3.ENDPOINT_URL

In [6]:
#from os import path, makedirs
from botocore.exceptions import ClientError
#from boto3.exceptions import S3TransferFailedError

In [11]:
s3 = boto3.client('s3', 
        region_name=REGION_NAME, 
        endpoint_url=ENDPOINT_URL, 
        aws_access_key_id=ACCESS_KEY_ID,
        aws_secret_access_key=SECRET_ACCESS_KEY)


#s3_session = boto3.Session(region_name=REGION_NAME)

#resource = s3_session.resource("s3",
#    endpoint_url=S3_URL,
#    aws_access_key_id=SCW_ACCESS_KEY_S3,
#    aws_secret_access_key=SCW_SECRET_KEY_S3
#)

In [None]:


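import s3fs

# Alternative file-system style interface. Bound to `fs` rather than `s3`
# so that it does not overwrite the boto3 client created above.
fs = s3fs.S3FileSystem(
    key=ACCESS_KEY_ID,
    secret=SECRET_ACCESS_KEY,
    client_kwargs={
        'endpoint_url': ENDPOINT_URL,
        'region_name': REGION_NAME
    }
)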
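The file-system interface can also be used for uploads. The following is a minimal sketch, assuming the object passed as `fs` behaves like an `s3fs.S3FileSystem` (it only relies on the `exists` and `put` methods that fsspec file systems provide); `upload_if_missing` is a hypothetical helper, not part of `s3fs`.

```python
def upload_if_missing(fs, local_path, bucket_name, key):
    """Upload local_path to bucket_name/key unless that path already exists.

    `fs` is expected to be an s3fs.S3FileSystem (or another fsspec file system).
    Returns True if an upload happened, False if the object was already there.
    """
    remote_path = f"{bucket_name}/{key}"
    if fs.exists(remote_path):
        return False
    fs.put(str(local_path), remote_path)
    return True
```

Called with the `S3FileSystem` instance created above, this would look like `upload_if_missing(<filesystem>, "gitpod.env", BUCKET_NAME, "gitpod.env.txt")`. Note that this sketch does not set an ACL on the uploaded object.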

In [24]:
def upload_file_to_S3(bucket_name, file_path, s3_bucket_filename, s3client):
    """Upload file_path to the bucket unless the key is already present."""
    # "Contents" is missing from the response when the bucket is empty
    obj_list = s3client.list_objects(Bucket=bucket_name)
    existing_keys = [obj["Key"] for obj in obj_list.get("Contents", [])]

    if s3_bucket_filename in existing_keys:
        print(f'file already exists: {s3_bucket_filename}')
    else:
        print(f'uploading file: {s3_bucket_filename}')
        with open(file_path, "rb") as f:
            # The bucket is passed by name; the client performs the upload
            s3client.upload_fileobj(f, bucket_name, s3_bucket_filename,
                                    ExtraArgs={'ACL': 'public-read'})

    return f"{bucket_name}/{s3_bucket_filename}"

In [25]:
upload_file_to_S3(BUCKET_NAME, "gitpod.env", "gitpod.env.txt", s3)
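The existence check in `upload_file_to_S3` relies on a single `list_objects` call, which returns at most 1,000 keys per response. For larger buckets, one option is to walk the listing with a paginator. The sketch below assumes a boto3-style client whose endpoint supports `list_objects_v2`; `iter_bucket_keys` and `key_exists` are hypothetical helper names.

```python
def iter_bucket_keys(s3client, bucket_name, prefix=""):
    """Yield every key in the bucket, following pagination page by page."""
    paginator = s3client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name, Prefix=prefix):
        # A page without matches has no "Contents" entry
        for obj in page.get("Contents", []):
            yield obj["Key"]


def key_exists(s3client, bucket_name, key):
    """Return True if an object with exactly this key exists in the bucket."""
    # Using the key as prefix narrows the listing; an exact match is still required
    return any(k == key for k in iter_bucket_keys(s3client, bucket_name, prefix=key))
```

With the client created earlier, `key_exists(s3, BUCKET_NAME, "gitpod.env.txt")` could replace the list comparison inside the upload function.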

In [12]:
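def download_s3_folder(s3, bucket_name, s3_folder, local_dir=None):
    """Download every object under the s3_folder prefix into local_dir."""
    filecount = 0
    files = []
    # Fall back to the current working directory when no target is given
    local_dir = Path(local_dir) if local_dir is not None else Path.cwd()
    local_dir.mkdir(parents=True, exist_ok=True)
    # "Contents" is missing from the response when nothing matches the prefix
    bucket_list = s3.list_objects(Bucket=bucket_name, Prefix=s3_folder).get('Contents', [])
    for s3_key in bucket_list:
        s3_object = s3_key['Key']
        filepath = local_dir/s3_object
        if not s3_object.endswith("/"):
            # Nested keys need their parent directories before downloading
            filepath.parent.mkdir(parents=True, exist_ok=True)
            s3.download_file(bucket_name, s3_object, filepath.as_posix())
            filecount += 1
            files.append(s3_object)
        else:
            # Keys ending in "/" are folder placeholders, not files
            filepath.mkdir(parents=True, exist_ok=True)
    return filecount, files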
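To make the key handling in `download_s3_folder` easier to follow, here is a small sketch that only plans the work, without touching S3 or the disk: keys ending in `/` are treated as folder placeholders, everything else as a file to download. `plan_downloads` is a hypothetical helper introduced for illustration.

```python
from pathlib import Path


def plan_downloads(keys, local_dir):
    """Split S3 keys into directories to create and (key, local path) pairs to fetch."""
    local_dir = Path(local_dir)
    dirs, files = [], []
    for key in keys:
        target = local_dir / key
        if key.endswith("/"):
            # Folder placeholder: only a local directory is needed
            dirs.append(target)
        else:
            files.append((key, target))
    return dirs, files
```

Running it on a list of keys before a real download is a cheap way to check where each object would land locally.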

In [15]:
# filecount, files = download_s3_folder(s3, BUCKET_NAME, REPO_NAME, Path.home()/'tmp')