
# Upload and download files from bucket storage!

Bucket storage is a good way of making your data accessible from just about
anywhere, and sharing it with others without having to grant them access to your
entire Kubeflow space. While Kubeflow data volumes can only be attached to one
notebook server at a time, you can interact with bucket storage from all your
servers without moving anything.

Under the hood, these examples are using the
[Minio Python SDK](https://github.com/minio/minio-py). We'll be using our
`daaas_storage` module to automatically configure the client, but take a look at
`daaas_storage.py` (in the same folder as this notebook) if you want to see how
that's done.

## Get connected

Getting connected to bucket storage is as easy as a call to
`daaas_storage.get_minimal_client()`. We also have premium storage for use
cases that require very high data throughput; for that, call
`daaas_storage.get_premium_client()` instead.

In [None]:
import daaas_storage

storage = daaas_storage.get_minimal_client()
# For premium storage, use this client instead:
# storage = daaas_storage.get_premium_client()

## Create your bucket

You have access to two buckets:

  * Personal: Your own bucket, visible only to you. You can create this bucket,
    and must name it using the form `first_name-last_name`
    (e.g. `blair-drummond`).
  * Shared: A bucket for sharing with others. You can write objects to paths
    prefixed using the form `first_name-last_name`
    (e.g. `blair-drummond/my-file.txt`). Everyone can read from this bucket.
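
As a rough sketch of the naming convention above, the two helpers below build a personal bucket name and a path in the shared bucket. Both function names are hypothetical (they are not part of `daaas_storage`), and lowercasing the names is an assumption based on the `blair-drummond` example.

```python
def personal_bucket_name(first_name: str, last_name: str) -> str:
    """Build a bucket name of the form first_name-last_name.

    Hypothetical helper. Lowercasing is an assumption taken from the
    `blair-drummond` example, not a documented rule.
    """
    return f"{first_name.strip().lower()}-{last_name.strip().lower()}"


def shared_object_path(first_name: str, last_name: str, filename: str) -> str:
    """Build an object path in the shared bucket, prefixed with your name."""
    return f"{personal_bucket_name(first_name, last_name)}/{filename}"


print(personal_bucket_name("Blair", "Drummond"))
print(shared_object_path("Blair", "Drummond", "my-file.txt"))
```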

First, we need to create your personal bucket.

In [None]:
# In your own notebook, you might just do something like:
# bucket = 'first_name-last_name'
bucket = input('Personal bucket name: ')

In [None]:
# If the bucket name does not follow the convention, this will raise an
# AccessDenied exception.

if not storage.bucket_exists(bucket):
    storage.make_bucket(bucket, storage._region)
    print(f'Created bucket: {bucket}')
else:
    print("Your bucket already exists. 👍")

## Upload a file

Now that your personal bucket exists you can upload your files! We can use
`example.txt` from the same folder as this notebook.

**Note:** Bucket storage doesn't actually have real directories, so you won't
find any functions for creating them. But some software will show you a
directory structure by looking at the slashes (`/`) in the file names. We'll use
this to put `example.txt` under an `examples/` faux directory.

In [None]:
# File we want to upload
LOCAL_FILE = 'example.txt'
# Desired location in the bucket
REMOTE_FILE = 'examples/Happy-DAaaS-Bird.txt'

storage.fput_object(bucket, REMOTE_FILE, LOCAL_FILE)
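
If your data is already in memory rather than in a file, the Minio client's `put_object` call accepts a stream and its length. The sketch below wraps that call; `upload_text` is a hypothetical helper name, and it assumes `client` is a Minio client such as the `storage` object created earlier.

```python
import io


def upload_text(client, bucket_name: str, object_name: str, text: str) -> int:
    """Upload a string as a UTF-8 text object and return the bytes sent.

    Hypothetical helper; `client` is assumed to be a Minio client.
    """
    data = text.encode("utf-8")
    # put_object needs a readable stream plus its exact length in bytes.
    client.put_object(
        bucket_name,
        object_name,
        io.BytesIO(data),
        len(data),
        content_type="text/plain",
    )
    return len(data)


# Example (needs a live connection, so it is left commented out):
# upload_text(storage, bucket, 'examples/note.txt', 'Hello from memory!')
```

Note that the length is counted in encoded bytes, not characters, which matters for non-ASCII text.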

## List objects

If you want to list the files in a bucket, you can do that with the storage
client too! Let's do that now and see the file we just uploaded. We'll add a
prefix to limit the results to files beginning with `examples/`, which is akin
to searching within a particular directory.

In [None]:
# List all object paths in bucket that begin with "examples/"
objects = storage.list_objects(bucket, prefix='examples/', recursive=True)

for obj in objects:
    print(f'Name: {obj.object_name}, Size: {obj.size} bytes')
    # Also available: bucket_name, last_modified, etag, content_type
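
To illustrate why a prefix behaves like a directory, here is a small pure-Python sketch of how a tool might derive a one-level "directory listing" from flat object names by splitting on `/`. The function `list_directory` and every sample name except `examples/Happy-DAaaS-Bird.txt` are made up for illustration; this is not how the Minio client is implemented.

```python
def list_directory(object_names, prefix=""):
    """Return (subdirectories, files) found directly under `prefix`.

    Hypothetical helper: object names are flat strings, so "directories"
    are inferred from the text before the next slash.
    """
    subdirs, files = set(), []
    for name in object_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if not rest:
            continue
        head, sep, _ = rest.partition("/")
        if sep:
            # Something follows a slash, so `head` acts as a subdirectory.
            subdirs.add(prefix + head + "/")
        else:
            files.append(name)
    return sorted(subdirs), sorted(files)


# Illustrative names only; the last two are invented for this example.
names = [
    "examples/Happy-DAaaS-Bird.txt",
    "examples/archive/old.txt",
    "notes.txt",
]
print(list_directory(names))
print(list_directory(names, prefix="examples/"))
```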

## Download a file

Finally, let's close the loop: download the file we just uploaded and print
its contents.

In [None]:
# Local path to save the downloaded copy to
DL_FILE = 'downloaded_example.txt'

storage.fget_object(bucket, REMOTE_FILE, DL_FILE)
with open(DL_FILE, 'r') as file:
    print(file.read())
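
If you would rather not write a local copy, the Minio client's `get_object` call returns a response you can read directly. The sketch below reads a small text object into memory; `read_object_text` is a hypothetical helper name, `client` is assumed to be a Minio client such as `storage`, and this approach is only sensible for objects that fit comfortably in memory.

```python
def read_object_text(client, bucket_name, object_name, encoding="utf-8"):
    """Read a whole object into memory and decode it as text.

    Hypothetical helper; `client` is assumed to be a Minio client.
    """
    response = client.get_object(bucket_name, object_name)
    try:
        return response.read().decode(encoding)
    finally:
        # Always hand the connection back, even if reading fails.
        response.close()
        response.release_conn()


# Example (needs a live connection, so it is left commented out):
# print(read_object_text(storage, bucket, REMOTE_FILE))
```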

## That's it!

You've seen how to upload, list, and download files. For more advanced usage,
check out the full API documentation for the
[Minio Python SDK](https://github.com/minio/minio-py).

And don't forget that you can also do all of this on the command line with
`mc`, the MinIO command-line client.