**FR** - Le présent notebook donne des exemples d'interaction avec un serveur MinIO qui expose un service S3 :
 - téléverser un fichier Zarr dans un contenant (bucket) 
 - synchroniser le contenu d'un répertoire local avec un bucket à l'aide de s3fs-fuse <br>
 
**EN** - This notebook provides a few examples of how to interact with a MinIO server that exposes an S3 service:
 - upload a Zarr file to a bucket
 - synchronize the contents of a local directory with a bucket using s3fs-fuse

# Upload Zarr

A Zarr "file" is really a collection of files and directories.  The example below is therefore more of a how-to for uploading a directory. <br> <br>
The specific case illustrated here is that of a MinIO service that has "certificate issues".  The most common such case is a test MinIO setup that uses a self-signed certificate.  This example instead uses the internal DNS name of one of the pods where the MinIO service is running (and its exposed `NodePort`), though the value is not shown explicitly because it is entered through `getpass`.  Of course, an external IP is necessary for production use.

In [None]:
# urllib3 is needed to disable certificate verification when the MinIO server was not set up with CA-issued certificates (e.g. self-signed certificates)
import urllib3
import pathlib
import getpass # getpass prompts for credentials and other sensitive information without echoing them in the notebook
from minio.api import Minio # Even though doc says "from minio import Minio" this will not work with MinIO 7.1.16 (current as of 09/2023)

In [None]:
access_key = getpass.getpass(prompt='MinIO access key: ')

In [None]:
secret_access_key = getpass.getpass(prompt='MinIO secret access key: ')

In [None]:
endpoint = getpass.getpass(prompt='MinIO endpoint, without http(s):// and with the port specified after a colon: ')
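
The endpoint typed at the prompt above must be of the form `host:port`, with no scheme. When all you have is a full URL, a small helper can strip the scheme and tell you what value to pass as `secure`. This is only a sketch: `split_endpoint` is a hypothetical helper (not part of the MinIO SDK), and the host names in the usage note are made up.

```python
from urllib.parse import urlsplit

def split_endpoint(url_or_endpoint):
    """Return (endpoint, secure), where endpoint is 'host:port' with no scheme."""
    text = url_or_endpoint.strip()
    if "://" not in text:
        # Already scheme-less: assume HTTPS, as in the client cell of this notebook
        return text.rstrip("/"), True
    parts = urlsplit(text)
    # netloc keeps 'host:port' and drops the scheme and any trailing path
    return parts.netloc, parts.scheme == "https"
```

For instance, `split_endpoint("https://minio.example.org:9000/")` yields an endpoint and a flag that can be passed to `Minio(endpoint, ..., secure=secure)`.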

In [None]:
client = Minio(endpoint,
    access_key=access_key,
    secret_key=secret_access_key,
    secure=True, # needed to ensure https; avoids "http not allowed on https" errors
    http_client=urllib3.PoolManager(cert_reqs='CERT_NONE') # explicitly skip certificate verification; not for a production server!
    ) 

In [None]:
# One can then interact with the object storage service, for example to list buckets
bucket_list = client.list_buckets()

In [None]:
bucket_list

In [None]:
zarr_file = "hrdps-2023091112-TT.zarr" # this is an example; `zarr_file` needs to be accessible locally

In [None]:
bucket_name = getpass.getpass(prompt='Bucket name: ')

In [None]:
# Given `client` and `bucket_name` defined above, and
# a Zarr file (a directory named after the dataset that contains the files and sub-directories that make up the Zarr "file")

def upload_zarr_directory(client, bucket_name, local_directory):
    try:
        # Validate the arguments
        assert isinstance(client, Minio), "client must be an instance of Minio"
        assert isinstance(bucket_name, str), "bucket_name must be a string"
        assert isinstance(local_directory, str), "local_directory must be a string"

        # Check if the bucket exists
        if not client.bucket_exists(bucket_name):
            raise ValueError("Bucket '{}' does not exist on the server".format(bucket_name))

        # Check that the Zarr directory exists locally (a Zarr "file" is a directory, not a single file)
        local_path = pathlib.Path(local_directory)
        if not local_path.is_dir():
            raise ValueError(f"{local_directory} is not a valid local directory.")

        # Upload every file, keeping the Zarr directory name as the object name prefix
        for file_path in local_path.glob('**/*'):
            if file_path.is_file():
                # as_posix() keeps forward slashes in object names on every platform
                object_name = f"{local_path.name}/{file_path.relative_to(local_path).as_posix()}"
                client.fput_object(bucket_name, object_name, str(file_path))

    except (AssertionError, ValueError) as e:
        # Report all validation failures uniformly as ValueError
        raise ValueError(str(e)) from e

In [None]:
# Use the function above to upload the whole Zarr directory
upload_zarr_directory(client, bucket_name, zarr_file)
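
Before uploading a large store, it can help to preview which object names would be created. The sketch below does a dry run with the standard library only: it walks a directory and pairs each file with an object name made of the directory name followed by the relative path. `plan_zarr_upload` and the `demo.zarr` layout are hypothetical, introduced only for illustration.

```python
import pathlib
import tempfile

def plan_zarr_upload(local_directory):
    """Return a sorted list of (object_name, local_path) pairs for a Zarr directory."""
    root = pathlib.Path(local_directory)
    if not root.is_dir():
        raise ValueError(f"{local_directory} is not a valid local directory.")
    plan = []
    for file_path in root.glob("**/*"):
        if file_path.is_file():
            # Object names always use forward slashes, whatever the local platform
            relative = file_path.relative_to(root).as_posix()
            plan.append((f"{root.name}/{relative}", str(file_path)))
    return sorted(plan)

# Usage on a throwaway directory that mimics a tiny Zarr store
with tempfile.TemporaryDirectory() as tmp:
    store = pathlib.Path(tmp) / "demo.zarr"
    (store / "TT").mkdir(parents=True)
    (store / ".zgroup").write_text("{}")
    (store / "TT" / "0.0").write_bytes(b"\x00")
    demo_plan = [name for name, _ in plan_zarr_upload(store)]
```

Note that hidden files such as `.zgroup` are included, because `pathlib` globbing does not skip dotfiles.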

# Synchronize local directory with a bucket

Users may want to keep a copy of a specific folder in a bucket, as a form of backup or as a way to access resources through-the-web (TTW).  The MinIO client (`mc`) can "[mirror](https://min.io/docs/minio/linux/reference/minio-mc/mc-mirror.html)" a local directory to a bucket, but this is a one-way process and it requires `mc` to be installed.  Installing `mc` is fairly easy (just [download one file](https://min.io/docs/minio/linux/reference/minio-mc.html#install-mc), make it executable and run it), but there is a more comprehensive solution: [FUSE](https://en.wikipedia.org/wiki/Filesystem_in_Userspace), and more specifically in our case [s3fs](https://linuxbeast.com/aws-operations/how-to-install-s3fs-and-mount-an-s3-bucket-on-ubuntu-20-04/).<br><br>
As the latter page puts it:

```
The use case for S3fs is for anyone who needs to access Amazon S3 storage in a more traditional file system interface. This can be especially useful for backing up data, archiving files, or sharing data between different systems. With S3fs, you can interact with S3 as if it were a local file system, making it much easier to automate data transfer and retrieval processes. S3fs is also useful for organizations that use Amazon S3 as their primary storage solution, as it provides a more seamless way to access and manage the data stored there.
```

> NOTE: MinIO implements the AWS S3 API.  As such, software like `s3fs`, which is designed primarily for working with AWS S3, also works with other S3 implementations such as MinIO.  FUSE can also be used to mount Azure object storage on a local directory, but one must use Microsoft's [blobfuse2](https://learn.microsoft.com/en-us/azure/storage/blobs/blobfuse2-what-is) to do so.

In [None]:
# This is shown in a notebook cell but should be run directly on the command line (indicated by the prompt sign `$`)
# `s3fs` must already be installed on your system
# $ s3fs destination_bucket_name local_directory -o passwd_file=path_to_your_creds_file -o url=url_to_your_minio_service -o use_path_request_style -o ssl_verify_hostname=0 -o no_check_certificate
# Command to verify your directory is indeed mounted onto a MinIO bucket :
# $ mount | grep s3fs
# s3fs on `local_directory` type fuse.s3fs (rw,nosuid,nodev,relatime,user_id=61144,group_id=61144)


Some of the options (`-o`):
- `passwd_file`: must contain one line structured as `access_key:secret_access_key`, that is, both credential items separated by a colon (`:`)
- `use_path_request_style`: apparently needed for MinIO
- `ssl_verify_hostname` and `no_check_certificate`: needed to bypass SSL issues
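
The credentials file and the mount command described above can also be prepared from Python. The following is a minimal sketch using only the standard library. `write_passwd_file` and `build_s3fs_command` are hypothetical helpers, and the owner-only file mode reflects the fact that `s3fs` typically refuses credential files that other users can read.

```python
import os
import pathlib

def write_passwd_file(path, access_key, secret_access_key):
    """Write 'access_key:secret_access_key' on one line, readable by the owner only."""
    passwd_path = pathlib.Path(path)
    # Create the file with mode 0600 so the secret is never readable by others
    fd = os.open(passwd_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"{access_key}:{secret_access_key}\n")
    os.chmod(passwd_path, 0o600)  # also tightens a file that already existed
    return passwd_path

def build_s3fs_command(bucket_name, local_directory, passwd_file, url):
    """Return the argument list for the s3fs invocation shown in this notebook."""
    return [
        "s3fs", bucket_name, str(local_directory),
        "-o", f"passwd_file={passwd_file}",
        "-o", f"url={url}",
        "-o", "use_path_request_style",
        "-o", "ssl_verify_hostname=0",
        "-o", "no_check_certificate",
    ]
```

The resulting list can be handed to `subprocess.run(..., check=True)`. Building a list rather than a single string avoids shell quoting issues with paths that contain spaces.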

> Note: SSL errors will most often go unnoticed.  The mount simply fails, with no indication of the failure.
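
Because a failed mount can be silent, it is worth checking programmatically that the directory really is an `s3fs` mount point, much like `mount | grep s3fs` does. The sketch below assumes a Linux system, where the kernel lists mounts in `/proc/mounts`. Both function names are hypothetical, and paths that contain spaces (escaped as `\040` in that file) are not handled.

```python
import os

def s3fs_mount_points(mounts_text):
    """Return the mount points of type fuse.s3fs found in /proc/mounts-style text."""
    mount_points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Columns: device, mount point, filesystem type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "fuse.s3fs":
            mount_points.append(fields[1])
    return mount_points

def is_s3fs_mounted(local_directory, mounts_file="/proc/mounts"):
    """Return True if local_directory is currently an s3fs mount point (Linux only)."""
    with open(mounts_file) as handle:
        mounted = s3fs_mount_points(handle.read())
    return os.path.realpath(local_directory) in mounted
```

Calling `is_s3fs_mounted("local_directory")` right after the `s3fs` command gives a quick yes or no answer before any data is written to what might otherwise be a plain local folder.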