# S3Fs Notebook Example

S3Fs is a Pythonic file interface to S3. It builds on top of botocore.

The top-level class S3FileSystem holds connection information and allows typical file-system style operations like cp, mv, ls, du, glob, etc., as well as put/get of local files to/from S3.

The connection can be anonymous - in which case only publicly-available, read-only buckets are accessible - or via credentials explicitly supplied or in configuration files.

API Version 2021.06.0
https://buildmedia.readthedocs.org/media/pdf/s3fs/latest/s3fs.pdf

Note: If you get errors like `ModuleNotFoundError: No module named 's3fs'`, run `pip install s3fs` in a terminal and then restart your notebook.
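To find out whether the package is available before importing it, something like the following can help. This is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper name, not part of s3fs.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


if not is_installed("s3fs"):
    print("s3fs is missing; run `pip install s3fs` and restart the kernel.")
```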


In [82]:
import json
import os
import s3fs

Load the JSON credentials file; its contents are used to connect through `S3FileSystem`.

In [83]:
tenant = "standard"
with open(f'/vault/secrets/minio-{tenant}-tenant-1.json') as f:
    creds = json.load(f)


Calling `open()` on an `S3FileSystem` (typically using a context manager) provides an `S3File` for read or write access to a particular key. The object emulates the standard file protocol (`read`, `write`, `tell`, `seek`), so functions that expect a file can work with S3 objects.
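As an illustration of the file-like interface, here is a small sketch. `read_head` is a hypothetical helper, and `fs` is assumed to be an `S3FileSystem` instance such as the one created in the next cell; any object with a compatible `open()` would work.

```python
def read_head(fs, key, n_bytes=64):
    """Read the first `n_bytes` of an object through the file-like interface."""
    with fs.open(key, "rb") as f:
        head = f.read(n_bytes)   # behaves like a regular binary file read
        position = f.tell()      # current offset within the object
    return head, position


# Usage with a live connection (not run here):
# head, position = read_head(fs, REMOTE_FILE, 32)
```

Because only `open`, `read`, and `tell` are used, the same function also works on a local file system object that exposes the same interface.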

In [84]:
HOST = creds['MINIO_URL']
SECURE = HOST.startswith('https')
fs = s3fs.S3FileSystem(
    anon=False,
    use_ssl=SECURE,
    client_kwargs={
        "region_name": "us-east-1",
        "endpoint_url": HOST,
        "aws_access_key_id": creds['AWS_ACCESS_KEY_ID'],
        "aws_secret_access_key": creds['AWS_SECRET_ACCESS_KEY'],
    },
)
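The connection arguments can also be assembled by a small function that checks the credentials first. The sketch below is one possible arrangement, not the notebook's own method: `build_fs_kwargs` and `REQUIRED_KEYS` are hypothetical names, and it passes the credentials through the `key` and `secret` parameters of `S3FileSystem` rather than through `client_kwargs`.

```python
from urllib.parse import urlparse

# Keys this notebook reads from the credentials file.
REQUIRED_KEYS = ("MINIO_URL", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def build_fs_kwargs(creds, region="us-east-1"):
    """Turn a credentials dict into keyword arguments for s3fs.S3FileSystem."""
    missing = [k for k in REQUIRED_KEYS if k not in creds]
    if missing:
        raise KeyError(f"credentials file is missing: {', '.join(missing)}")

    url = creds["MINIO_URL"]
    return {
        "anon": False,
        # Compare the parsed scheme rather than the raw string prefix.
        "use_ssl": urlparse(url).scheme == "https",
        "key": creds["AWS_ACCESS_KEY_ID"],
        "secret": creds["AWS_SECRET_ACCESS_KEY"],
        "client_kwargs": {"region_name": region, "endpoint_url": url},
    }


# Usage (requires s3fs and a reachable endpoint; not run here):
# fs = s3fs.S3FileSystem(**build_fs_kwargs(creds))
```

Failing early on a missing key tends to give a clearer error than a rejected request later on.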

## Upload a file

Now that your personal bucket exists, you can upload files to it. We can use
`example.txt` from the same folder as this notebook.

**Note:** Bucket storage doesn't actually have real directories, so you won't
find any functions for creating them. But some software will show you a
directory structure by looking at the slashes (`/`) in the file names. We'll use
this to put `example.txt` under an `/s3fs-examples` faux directory.
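To make the idea of faux directories concrete, the sketch below derives a one-level "directory listing" from a flat list of keys purely by looking at the slashes. It is an illustration in plain Python, not how S3Fs implements `ls`; `list_prefix` and the sample keys are hypothetical.

```python
def list_prefix(keys, prefix=""):
    """Emulate a one-level directory listing over flat object keys."""
    if prefix and not prefix.endswith("/"):
        prefix += "/"
    entries = set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        # Keep only the first path segment after the prefix.
        head, sep, _ = key[len(prefix):].partition("/")
        # A trailing slash marks an entry that only exists as a key prefix.
        entries.add(prefix + head + ("/" if sep else ""))
    return sorted(entries)


# Hypothetical keys; only the slashes suggest any hierarchy.
keys = [
    "my-bucket/example.txt",
    "my-bucket/s3fs-examples/Happy-DAaaS-Bird.txt",
    "my-bucket/s3fs-examples/notes.txt",
]
print(list_prefix(keys, "my-bucket"))
```

The "directory" `my-bucket/s3fs-examples/` appears in the result even though no such object was ever created; it exists only because two keys share that prefix.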

In [85]:
# Desired location in the bucket.
# NB_NAMESPACE is the user's namespace, e.g. rohan-katkar.
LOCAL_FILE = 'example.txt'
REMOTE_FILE = os.environ['NB_NAMESPACE'] + '/s3fs-examples/Happy-DAaaS-Bird.txt'

fs.put(LOCAL_FILE, REMOTE_FILE)

## Check path exists in bucket

In [27]:
fs.exists(os.environ['NB_NAMESPACE']+'/s3fs-examples')

True

## List objects in bucket

In [28]:
fs.ls(os.environ['NB_NAMESPACE'])

['rohan-katkar/happy-bird.txt',
 'rohan-katkar/example-folder',
 'rohan-katkar/examples',
 'rohan-katkar/map-reduce-output-lw',
 'rohan-katkar/map-reduce-output',
 'rohan-katkar/s3fs-examples']

## List objects in path


In [29]:
objects = fs.ls(os.environ['NB_NAMESPACE'] + '/s3fs-examples')
for obj in objects:
    print(f'Name: {obj}')

Name: rohan-katkar/s3fs-examples/Happy-DAaaS-Bird.txt
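Calling `ls` with `detail=True` returns one dictionary per entry instead of a bare name. The sketch below formats such records; it assumes each record carries `name`, `size`, and `type` keys, and `format_listing` is a hypothetical helper.

```python
def format_listing(entries):
    """Format `ls(..., detail=True)`-style records as aligned text lines."""
    lines = []
    for entry in sorted(entries, key=lambda e: e["name"]):
        size = entry.get("size") or 0  # prefixes may report no size
        lines.append(f"{entry['type']:<9} {size:>10}  {entry['name']}")
    return lines


# Usage with a live connection (not run here):
# path = os.environ['NB_NAMESPACE'] + '/s3fs-examples'
# for line in format_listing(fs.ls(path, detail=True)):
#     print(line)
```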


## Download a file

S3Fs also offers `download(rpath, lpath[, recursive])`, but that method has issues, so this example uses the equivalent `get` method instead.

In [16]:
DL_FILE = 'downloaded_s3fsexample.txt'
fs.get(os.environ['NB_NAMESPACE'] + '/s3fs-examples/Happy-DAaaS-Bird.txt', DL_FILE)
with open(DL_FILE, 'r') as file:
    print(file.read())

                   ________________
                  /                \
                  |  Go DAaaS!!!!  |
                  | _______________/
                  |/
         ^____,      
         /`  `\    
        /   ^  >      
       /  / , /
  «^` // /=/ %
   ««.~ «_/ %
    ««\,___%
      ``\  \
         ^  ^
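To confirm that the downloaded copy matches the file that was uploaded, the two local files can be compared by hash. This is a minimal standard-library sketch; `sha256_of` and `files_match` are hypothetical helper names.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a local file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(path_a, path_b):
    """Return True if both local files have identical contents."""
    return sha256_of(path_a) == sha256_of(path_b)


# Usage after the upload and download cells (not run here):
# files_match(LOCAL_FILE, DL_FILE)
```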



# That's it!

You've seen how to upload, list, and download files. You can do more things! For
more advanced usage, check out the full API documentation for the
[S3Fs Python SDK](https://s3fs.readthedocs.io/en/latest/api.html).

And don't forget that you can also do all of this on the command line with `mc`.