# Day 2 - Working with S3

In this notebook we will use part of the S3 API to interact with [Minio](https://minio.io), a full-fledged *object storage* service that combines two protocols:
 * `s3://`, providing a multi-user service with per-user authentication for uploading and downloading files;
 * `http://`, providing a public service with per-operation authentication for uploading and downloading files.

The `s3` protocol is a **de facto standard** for *object storage*. It was first released by Amazon Web Services (AWS) in 2006 and has since been implemented by other providers, Minio included.

It provides a simple web interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web.

The `s3` protocol is a popular choice for storing and managing large amounts of unstructured data such as images, videos and log files. 
It offers a range of storage classes designed for different use cases, from data that needs frequent access to cold storage for archiving data at the lowest cost.

Minio, like object storage in general, is organized in ***buckets***.
A bucket is a logical container for stored objects. Rather than a hierarchy of files inside folders, it is a flat structure that stores objects together with their metadata.
Buckets are used to organize and manage objects in *object storage* systems.
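
To make the flat layout concrete, here is a small pure-Python sketch (no server involved) of why a `/` in an object key only looks like a folder: keys are plain strings, and a folder-like view is obtained by grouping them on a prefix and a delimiter. The key names and the `list_level` helper are made up for illustration.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Emulate a folder-like listing over a flat set of object keys.

    Returns (objects, common_prefixes) for the given prefix.
    """
    objects, prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter acts as a "sub-folder"
            prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(prefixes)


# Hypothetical keys stored in one bucket: there are no real directories.
keys = ["readme.txt", "test/test1.txt", "test/test2.txt", "logs/2024/app.log"]
print(list_level(keys))
print(list_level(keys, prefix="test/"))
```

The "folders" `test/` and `logs/` exist only as shared prefixes of the keys, which is why an object store can treat the bucket as a single flat namespace.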

Buckets can be created as needed and associated with policies that determine which actions users can perform on a bucket and on all the objects in it.
Examples of policies include replication to other storage services (for disaster recovery) and lifecycle policies.

In this notebook we will focus on the basics of S3, including bucket policies and metadata. 


## Accessing the Minio console

Go to `https://sosc.131.154.98.182.myip.cloud.infn.it/`, start your Minio application and log in with the user name and password provided.

## Accessing *Minio* via `s3` in Python with the boto3 library

The `boto3` library enables more complicated authorization patterns and makes it possible to develop applications that are independent of the object storage provider. In other words, if you develop your application with `boto3`, you can transparently migrate from a self-hosted Minio server to an AWS object storage solution. Create the S3 client by running the cell below.
Use `endpoint_url="http://localhost:9000"` as the endpoint to contact the Minio API.


In [3]:
import boto3
import json

# Replace the placeholders with the credentials you were given
miniouser="put_here_user"
miniopassword="put_here_password"

s3client = boto3.client('s3',
    aws_access_key_id=miniouser,
    aws_secret_access_key=miniopassword,
    endpoint_url="http://localhost:9000",
    region_name='default',)
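
As an alternative to typing credentials into the notebook, they can be read from environment variables. The sketch below is only a suggestion: the variable names `MINIO_USER` and `MINIO_PASSWORD` and the helper `minio_credentials` are assumptions, not something the course setup defines.

```python
import os


def minio_credentials(env=os.environ):
    """Return (user, password) read from the environment.

    MINIO_USER and MINIO_PASSWORD are assumed names; adapt them to your setup.
    """
    try:
        return env["MINIO_USER"], env["MINIO_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"Environment variable {missing} is not set") from None


# Demonstration with a plain dict standing in for the real environment
print(minio_credentials({"MINIO_USER": "alice", "MINIO_PASSWORD": "secret"}))
```

With the variables exported before the notebook starts, `miniouser, miniopassword = minio_credentials()` could replace the two placeholder assignments.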

Then run the following cells to perform some actions:

1. List your buckets

In [None]:
resp = s3client.list_buckets()
print(resp)

2. Create your own bucket (if you are allowed!)

In [None]:
bucket_name = 'bucket1'
s3bucket = s3client.create_bucket(Bucket=bucket_name)
resp = s3client.list_buckets()
print(resp)

3. Print only the bucket name(s)

In [None]:
resp = s3client.list_buckets()
for bucket in resp['Buckets']:
    print(bucket['Name'])

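4. Retrieve the policy of the specified bucket and compare it with what the Minio console shows. If the bucket has no policy yet, this call raises an error; in that case run step 5 first.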

In [None]:
bucket_name = 'bucket1'
resp = s3client.get_bucket_policy(Bucket=bucket_name,)
print(resp)
print(resp['Policy'])
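
The `Policy` field printed above is a JSON document stored as a string, so it can be parsed with `json.loads` for easier reading. The following is a minimal sketch: the `summarize_policy` helper and the `sample` policy are illustrative, not part of `boto3`.

```python
import json


def summarize_policy(policy_json):
    """Return one (effect, actions, resources) tuple per policy statement."""
    policy = json.loads(policy_json)
    summary = []
    for stmt in policy.get("Statement", []):
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        # Both fields may be a single string or a list of strings
        if isinstance(actions, str):
            actions = [actions]
        if isinstance(resources, str):
            resources = [resources]
        summary.append((stmt.get("Effect"), actions, resources))
    return summary


# Hypothetical policy string, shaped like the value of resp['Policy']
sample = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": ["s3:ListBucket"],
        "Resource": "arn:aws:s3:::bucket1",
    }],
})
for effect, actions, resources in summarize_policy(sample):
    print(effect, actions, resources)
```

In the notebook, `summarize_policy(resp['Policy'])` would give the same kind of overview for the real bucket.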

5. Create your own bucket policy

In [None]:
bucket_name = 'bucket1'
bucket_policy = {
    'Version': '2012-10-17',
    'Statement': [{
        'Sid': 'AddPerm',
        'Effect': 'Allow',
        'Principal': '*',
        'Action': ['s3:ListBucket'],
        'Resource': f'arn:aws:s3:::{bucket_name}'
    }]
}

# Convert the policy from a Python dict to a JSON string
bucket_policy = json.dumps(bucket_policy)

# Set the new policy
s3client.put_bucket_policy(Bucket=bucket_name, Policy=bucket_policy)
resp = s3client.get_bucket_policy(Bucket=bucket_name,)
print(resp)
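
The policy above only allows listing the bucket. Bucket-level actions such as `s3:ListBucket` apply to the bucket ARN, while object-level actions such as `s3:GetObject` apply to `arn:aws:s3:::<bucket>/*`. As a sketch of how the two combine, the hypothetical helper below builds a policy string that grants anonymous read-only access; use something like it only on buckets that are meant to be public.

```python
import json


def public_read_policy(bucket_name):
    """Build a JSON policy string allowing anyone to list and read a bucket."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:ListBucket"],
                # Bucket-level action: the resource is the bucket itself
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:GetObject"],
                # Object-level action: the resource is every key in the bucket
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }
    return json.dumps(policy)


print(public_read_policy("bucket1"))
```

The returned string can be passed as the `Policy` argument of `put_bucket_policy`, in the same way as `bucket_policy` in the cell above.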


6. Upload an object (first upload or create a text file named `test.txt` in the notebook's working directory)

In [None]:
bucket_name = 'bucket1'
# upload_file(local_path, bucket, object_key) returns None, so there is nothing to assign
s3client.upload_file('test.txt', bucket_name, 'test/test.txt')
resp = s3client.list_objects(Bucket=bucket_name)
print(resp)

7. List the objects in a bucket

In [None]:
bucket_name = 'bucket1'
resp = s3client.list_objects(Bucket=bucket_name)
# 'Contents' is missing from the response when the bucket is empty
for obj in resp.get('Contents', []):
    print(obj['Key'])
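
A single listing call returns at most 1000 keys, so larger buckets need pagination. One possible approach uses the `list_objects_v2` paginator that `boto3` clients provide; the `iter_keys` helper below is an illustrative name, and it takes the client as a parameter so it can be used with the `s3client` created earlier.

```python
def iter_keys(client, bucket, prefix=""):
    """Yield every object key in a bucket, following pagination."""
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page has no 'Contents' entry when nothing matches
        for obj in page.get("Contents", []):
            yield obj["Key"]


# Usage in the notebook (assumes the s3client defined earlier):
#   for key in iter_keys(s3client, "bucket1"):
#       print(key)
```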

8. List the metadata of each object

In [None]:
bucket_name = 'bucket1'
resp = s3client.list_objects(Bucket=bucket_name)
for obj in resp.get('Contents', []):
    print(obj['Key'])
    # head_object returns the object's system and user-defined metadata
    metadata = s3client.head_object(Bucket=bucket_name, Key=obj['Key'])
    print(metadata)

9. Add custom metadata

In [None]:
bucket_name = 'bucket1'
resp = s3client.list_objects(Bucket=bucket_name)
for obj in resp.get('Contents', []):
    print(obj['Key'])
    metadata = s3client.head_object(Bucket=bucket_name, Key=obj['Key'])
    print(metadata)
    # Start from the existing user metadata and add a new key/value pair
    new_meta = metadata['Metadata']
    new_meta['Costa'] = 'costa1'
    # Metadata cannot be edited in place: copy the object onto itself,
    # replacing its metadata with the updated dictionary
    s3client.copy_object(
        Bucket=bucket_name,
        Key=obj['Key'],
        CopySource=bucket_name + '/' + obj['Key'],
        Metadata=new_meta,
        MetadataDirective='REPLACE',
    )
    metadata = s3client.head_object(Bucket=bucket_name, Key=obj['Key'])
    print(metadata)
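
Because `MetadataDirective='REPLACE'` overwrites all user metadata, the existing entries have to be merged with the new ones before the copy, as the cell above does. The same pattern can be wrapped in a reusable function; `add_metadata` is a hypothetical helper name, and it uses the dictionary form of `CopySource` that `boto3` also accepts.

```python
def add_metadata(client, bucket, key, extra):
    """Merge `extra` into an object's user metadata and return the result."""
    current = client.head_object(Bucket=bucket, Key=key).get("Metadata", {})
    merged = {**current, **extra}  # new values win on key clashes
    client.copy_object(
        Bucket=bucket,
        Key=key,
        CopySource={"Bucket": bucket, "Key": key},  # copy the object onto itself
        Metadata=merged,
        MetadataDirective="REPLACE",
    )
    return merged


# Usage in the notebook (assumes the s3client defined earlier):
#   add_metadata(s3client, "bucket1", "test/test.txt", {"Costa": "costa1"})
```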

10. Delete all the objects in a bucket

In [None]:
bucket_name = 'bucket1'
resp = s3client.list_objects(Bucket=bucket_name)
for obj in resp.get('Contents', []):
    print(obj['Key'])
    s3client.delete_object(Bucket=bucket_name, Key=obj['Key'])
# List again to check that the bucket is now empty
resp = s3client.list_objects(Bucket=bucket_name)
print(resp)
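
Deleting objects one by one issues one request per key. When there are many objects, the `delete_objects` call can remove up to 1000 keys per request. The helper below is a sketch of that batching; the name `delete_keys` and its `batch_size` parameter are illustrative.

```python
def delete_keys(client, bucket, keys, batch_size=1000):
    """Delete the given keys in batches; return how many keys were submitted."""
    keys = list(keys)
    submitted = 0
    for start in range(0, len(keys), batch_size):
        batch = keys[start:start + batch_size]
        client.delete_objects(
            Bucket=bucket,
            # "Quiet" asks the server to report only the failures
            Delete={"Objects": [{"Key": k} for k in batch], "Quiet": True},
        )
        submitted += len(batch)
    return submitted


# Usage in the notebook (assumes the s3client defined earlier):
#   delete_keys(s3client, "bucket1", ["test/test.txt"])
```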

11. Delete a bucket (the bucket must be empty first)

In [None]:
bucket_name = 'bucket1'
s3bucket = s3client.delete_bucket(Bucket=bucket_name)
resp = s3client.list_buckets()
print(resp)