# Reading and writing data on cloud object storage
Reading from and writing to cloud object storage (e.g. AWS S3, Google Cloud Storage, Azure Blob Storage) differs from working with a regular filesystem. Here we read from publicly readable buckets and write to an S3-API-compatible Pangeo@EOSC MinIO bucket. We use `fsspec`, which makes many types of data storage (including S3) look like filesystems.
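As a rough illustration of the "looks like a filesystem" idea, the sketch below uses the in-memory backend that ships with `fsspec`, so it runs without network access or credentials. The path `/demo/hello.txt` is made up for this example; the `open` and `ls` calls are the same ones used against S3 in the cells that follow.

```python
import fsspec

# "memory" is an in-process backend bundled with fsspec; it stands in for S3 here
mem = fsspec.filesystem("memory")

# Write a small text file, exactly as one would on a local disk
with mem.open("/demo/hello.txt", "wt") as f:
    f.write("hello from fsspec\n")

# List the "directory" and read the file back
listing = mem.ls("/demo", detail=False)
with mem.open("/demo/hello.txt", "rt") as f:
    content = f.read()

print(listing)
print(content)
```

Swapping `"memory"` for `"s3"` (plus the relevant options) changes where the bytes live, not the calls used to reach them.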

In [None]:
import fsspec
import pandas as pd
import os
import xarray as xr

List files on a publicly readable bucket

In [None]:
fs = fsspec.filesystem('s3', anon=True)

In [None]:
fs.ls('anaconda-public-datasets')

Read CSV from a publicly readable bucket

In [None]:
df = pd.read_csv(fs.open("s3://anaconda-public-datasets/iris/iris.csv"))
df

Write CSV to an S3 bucket

In [None]:
from dotenv import load_dotenv
_ = load_dotenv('/home/jovyan/dotenv/rsignell4.env')  # create AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY env vars
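Before opening an authenticated filesystem it can help to confirm that the credentials were actually loaded. The sketch below checks for the two variable names mentioned in the comment above without printing their values; the helper name `missing_credentials` is hypothetical and not part of any library.

```python
import os

# Variable names taken from the dotenv comment above
REQUIRED = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")

def missing_credentials(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]

missing = missing_credentials()
if missing:
    print("Missing credentials:", ", ".join(missing))
else:
    print("All required credentials are set")
```

Reporting only the names keeps the secret values out of the notebook output.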

In [None]:
username = os.environ['JUPYTERHUB_USER']
print(username)

In [None]:
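# skip_instance_cache=True builds a fresh filesystem object instead of reusing a cached one;
# use_listings_cache=False makes ls() query the server each time, so new files show up immediately
fs = fsspec.filesystem('s3', anon=False, skip_instance_cache=True, use_listings_cache=False,
                       endpoint_url='https://pangeo-eosc-minioapi.vm.fedcloud.eu')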

In [None]:
bucket = 'protocoast-school'  # bucket name without the s3:// prefix, which is added when URLs are built
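Paths on the bucket are built by string formatting, so it is easy to end up with a doubled `s3://` prefix or stray slashes. A small helper along these lines is one way to keep URL construction in one place. The name `make_s3_url` and the user `alice` are hypothetical and not part of the notebook.

```python
def make_s3_url(bucket, *parts):
    """Join a bucket name and key parts into one s3:// URL.

    Accepts the bucket with or without a leading 's3://' and ignores
    empty parts and surrounding slashes.
    """
    prefix = "s3://"
    if bucket.startswith(prefix):
        bucket = bucket[len(prefix):]
    bucket = bucket.strip("/")
    key = "/".join(p.strip("/") for p in parts if p and p.strip("/"))
    return f"{prefix}{bucket}/{key}" if key else f"{prefix}{bucket}"

# Both spellings of the bucket give the same URL
print(make_s3_url("protocoast-school", "alice", "testing", "iris.csv"))
print(make_s3_url("s3://protocoast-school/", "alice", "/testing/", "iris.csv"))
```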

In [None]:
fs.ls(bucket)

In [None]:
# Open the object in text mode and let pandas write the CSV into it
with fs.open(f"s3://{bucket}/{username}/testing/iris.csv", mode='wt') as f:
    df.to_csv(f)

List files on the restricted S3 bucket

In [None]:
fs.ls(f'{bucket}/{username}/testing/')

In [None]:
df = pd.read_csv(fs.open(f"s3://{bucket}/{username}/testing/iris.csv"))
df
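One detail of this round trip: `to_csv` writes the DataFrame index by default, so a plain `read_csv` returns it as an extra `Unnamed: 0` column. The sketch below reproduces the effect with a tiny made-up frame on `fsspec`'s in-memory filesystem (a stand-in for the bucket), and shows `index_col=0` as one way to restore the original columns.

```python
import fsspec
import pandas as pd

fs_mem = fsspec.filesystem("memory")  # stand-in for the S3 filesystem
path = "/roundtrip/iris.csv"          # made-up path for this example

# Two made-up rows, just enough to show the column layout
df_demo = pd.DataFrame({"sepal_length": [5.1, 4.9], "species": ["setosa", "setosa"]})

with fs_mem.open(path, mode="wt") as f:
    df_demo.to_csv(f)  # index is written as an unnamed first column

with fs_mem.open(path, mode="rt") as f:
    naive = pd.read_csv(f)                 # index comes back as a data column

with fs_mem.open(path, mode="rt") as f:
    restored = pd.read_csv(f, index_col=0)  # first column used as the index

print(list(naive.columns))
print(list(restored.columns))
```

Passing `index=False` to `to_csv` is the alternative when the index carries no information.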

The remaining examples use xarray, which follows the NetCDF data model.
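To make "the NetCDF data model" concrete, here is a minimal sketch of a dataset built by hand: a named dimension, a coordinate along it, a data variable with its own attributes, and global attributes. The values and the `units` attribute are synthetic and chosen only for illustration; they are not taken from the dataset opened below.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Coordinate values along the "time" dimension
time = pd.date_range("1982-10-01", periods=4, freq="D")

demo = xr.Dataset(
    # variable name -> (dimension names, data, variable attributes)
    data_vars={"T_20": ("time", np.array([15.2, 15.0, 14.7, 14.9]), {"units": "degree_C"})},
    coords={"time": time},
    attrs={"title": "synthetic example, not real data"},  # global attributes
)

# Label-based selection along a coordinate; both slice endpoints are included
subset = demo["T_20"].sel(time=slice("1982-10-02", "1982-10-03"))

print(demo)
print(subset.sizes)
```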

Read NetCDF data from a THREDDS OPeNDAP service

In [None]:
ds = xr.open_dataset('http://geoport.usgs.esipfed.org/thredds/dodsC/silt/usgs/Projects/stellwagen/CF-1.6/BUZZ_BAY/2651-A.cdf')
ds

Visualization interlude: plot a time range of data with hvplot

In [None]:
import hvplot.xarray  # importing this registers the .hvplot accessor on xarray objects

ds['T_20'].sel(time=slice('1982-10-01','1982-10-31')).hvplot(grid=True)

Write NetCDF data to an S3 bucket

In [None]:
local_file = '2651-A-v3.nc'
s3_url = f's3://{bucket}/{username}/testing/2651-A-v3.nc'

ds.to_netcdf(local_file, mode='w')

_ = fs.upload(local_file, s3_url)
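The cell above writes the file locally first and then copies it to the bucket. The same write-then-upload pattern can be tried without credentials by uploading to `fsspec`'s in-memory filesystem. In this sketch the file contents are placeholder bytes rather than real NetCDF data, and the target path is made up.

```python
import os
import tempfile

import fsspec

fs_mem = fsspec.filesystem("memory")       # stand-in for the S3 filesystem
remote_path = "/demo-bucket/testing/demo.nc"  # made-up target path

with tempfile.TemporaryDirectory() as tmp:
    local_path = os.path.join(tmp, "demo.nc")

    # Step 1: write the file to local disk (placeholder bytes, not real NetCDF)
    with open(local_path, "wb") as f:
        f.write(b"placeholder bytes, not a real NetCDF file")
    local_size = os.path.getsize(local_path)

    # Step 2: copy the local file to the remote filesystem
    fs_mem.upload(local_path, remote_path)

# Comparing sizes is a cheap sanity check that the whole file arrived
remote_size = fs_mem.size(remote_path)
print(local_size, remote_size)
```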

Read NetCDF data from the S3 bucket

In [None]:
xr.open_dataset(fs.open(s3_url))