# Using public data on NIRD via S3 and saving results to private S3 object storage

<div class="alert alert-success alert-info">
    <b>How to discover (spatial and temporal search and subsetting) Obs-CERES-EBAF model output prepared for obs4MIPs OBSERVATIONS dataset</b>
    <ul>
    <li>We show how to access anonymous (public) netCDF data on S3, select a geographical area, and store the result as zarr in private S3 object storage</li>
        <li>We do not address dask (chunking optimization, etc.)</li>
    </ul>
</div>

In [None]:
import s3fs
import xarray as xr

## Connect to bucket (anonymous login for public data only)

In [None]:
fs = s3fs.S3FileSystem(
    anon=True, client_kwargs={"endpoint_url": "https://climate.uiogeo-apps.sigma2.no/"}
)

### List bucket content

In [None]:
fs.ls("ESGF")

In [None]:
fs.ls("ESGF/obs4MIPs/CERES-EBAF/")

## Access data files
- reading netCDF files from object storage can be slow
- where available, prefer a cloud-optimized format such as zarr

In [None]:
s3path = "s3://ESGF/obs4MIPs/CERES-EBAF/*.nc"

In [None]:
remote_files = fs.glob(s3path)

In [None]:
remote_files

In [None]:
# Iterate through remote_files to create a fileset
fileset = [fs.open(file) for file in remote_files]

# Open all files as a single dataset, combining them by their coordinates
dset = xr.open_mfdataset(fileset, combine="by_coords")

In [None]:
dset

## Shift longitudes from the 0 to 360 range to -180 to 180 (convenient when subsetting)

In [None]:
dset = dset.assign_coords(lon=(((dset.lon + 180) % 360) - 180)).sortby("lon")
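
The arithmetic in the cell above can be checked on plain numbers. The following is a minimal sketch, not part of the original workflow: `wrap_longitude` and the sample values are hypothetical names and data chosen for illustration. It applies the same expression, `((lon + 180) % 360) - 180`, and then sorts the result, which mirrors the `sortby("lon")` step.

```python
def wrap_longitude(lon):
    """Map a longitude in degrees from the 0..360 convention to -180..180."""
    return ((lon + 180) % 360) - 180


# Hypothetical sample longitudes on a 0..360 grid
lons_0_360 = [0.5, 90.5, 179.5, 180.5, 270.5, 359.5]

# Wrap each value, then sort so the coordinate is monotonically increasing again
lons_wrapped = sorted(wrap_longitude(lon) for lon in lons_0_360)
print(lons_wrapped)
```

Values east of 180 degrees become negative (western) longitudes, which is why the coordinate has to be sorted again before a selection such as `lon=slice(-10, 50)` behaves as expected.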

## Plot a single time

In [None]:
!pip install cmaps

In [None]:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import cmaps

In [None]:
fig = plt.figure(figsize=(20, 10))

# We're using cartopy and are plotting in an Albers Equal Area projection
# (see the cartopy documentation)
ax = plt.subplot(
    1,
    1,
    1,
    projection=ccrs.AlbersEqualArea(central_longitude=20.0, central_latitude=40.0),
)
ax.coastlines(resolution="10m")

# custom colormap

lcmap = cmaps.BlueYellowRed

# We need to project our data to the Albers Equal Area projection and for this we use `transform`.
# We set the original data projection in transform (here PlateCarree).
# We plot only the selected area (latitude 50 to 90, longitude -10 to 50).
dset["rlut"].sel(time="2011-10-16").sel(lat=slice(50, 90), lon=slice(-10, 50)).plot(
    ax=ax, transform=ccrs.PlateCarree(), cmap=lcmap
)
ax.set_title(
    "Obs-CERES-EBAF model output prepared for obs4MIPs OBSERVATIONS\n ", fontsize=20
)
plt.savefig("Obs-CERES-EBAF_rlut_2011-10-16.png")

## Save results in zarr on NIRD for further analysis
- your credentials are in `$HOME/.aws/credentials`
- check with your instructor to get the secret access key (replace `XXXXXXXXXXXX` with the actual key)

```
[default]
aws_access_key_id=forces2021-work
aws_secret_access_key=XXXXXXXXXXXX
aws_endpoint_url=https://forces2021.uiogeo-apps.sigma2.no/
```
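
Before writing to the private bucket, it can help to confirm that the credentials file contains both keys. The following is a minimal sketch using only the Python standard library; `missing_credentials`, `REQUIRED_KEYS` and `sample` are hypothetical names introduced here, and the check only looks at the two key entries, not at the endpoint or at whether the key is valid.

```python
import configparser

# Keys this sketch assumes must be present in the profile
REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def missing_credentials(text, profile="default"):
    """Return the required keys that are absent or empty in the given profile."""
    # Disable interpolation so a '%' inside a secret key is read literally
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section(profile):
        return list(REQUIRED_KEYS)
    return [
        key
        for key in REQUIRED_KEYS
        if not parser.get(profile, key, fallback="").strip()
    ]


# Sample content mirroring the file layout shown above (placeholder secret)
sample = """
[default]
aws_access_key_id=forces2021-work
aws_secret_access_key=XXXXXXXXXXXX
"""
print(missing_credentials(sample))
```

To check the real file, pass its content instead of `sample`, for example the text read from `$HOME/.aws/credentials`. The function never prints the secret itself, only the names of missing keys.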

In [None]:
import fsspec

### Set the path to your group's location (ask your instructor)

In [None]:
target = fsspec.get_mapper(
    "s3://work/obs4MIPs_CERES-EBAFObs_rlut_rsut.zarr",
    client_kwargs={"endpoint_url": "https://forces2021.uiogeo-apps.sigma2.no/"},
)

In [None]:
dset.sel(lat=slice(50, 90), lon=slice(-10, 50)).to_zarr(
    store=target, mode="w", consolidated=True, compute=True
)

## Check what you have stored in s3

- we use https://forces2021.uiogeo-apps.sigma2.no/ as an endpoint
- we need to authenticate to access data (anon=False)

### Initialize the S3 file system

In [None]:
fsg = s3fs.S3FileSystem(
    anon=False,
    client_kwargs={"endpoint_url": "https://forces2021.uiogeo-apps.sigma2.no/"},
)

In [None]:
fsg.ls("work")

### Set path to s3 data

In [None]:
s3_path = "s3://work/obs4MIPs_CERES-EBAFObs_rlut_rsut.zarr"

### Create a mapping to the zarr store

In [None]:
store = s3fs.S3Map(root=s3_path, s3=fsg, check=False)

In [None]:
ds = xr.open_zarr(store=store, consolidated=True)

In [None]:
ds

### Plot TOA outgoing shortwave radiation
- Note that there is no need to select an area because the s3 dataset only covers the area of interest (selected when saving the dataset to s3 storage)

In [None]:
fig = plt.figure(figsize=(20, 10))

# We're using cartopy and are plotting in an Albers Equal Area projection
# (see the cartopy documentation)
ax = plt.subplot(
    1,
    1,
    1,
    projection=ccrs.AlbersEqualArea(central_longitude=20.0, central_latitude=40.0),
)
ax.coastlines(resolution="10m")

# custom colormap

lcmap = cmaps.BlueYellowRed

# We need to project our data to the Albers Equal Area projection and for this we use `transform`.
# We set the original data projection in transform (here PlateCarree).
# No area selection is needed: the stored dataset already covers only the area of interest.
ds["rsut"].sel(time="2011-10-16").plot(ax=ax, transform=ccrs.PlateCarree(), cmap=lcmap)
ax.set_title(
    "Obs-CERES-EBAF model output prepared for obs4MIPs OBSERVATIONS\n TOA outgoing shortwave Radiation\n"
    + str(ds.time.sel(time="2011-10-16").dt.strftime("%B %d, %Y, %r").values[0]),
    fontsize=20,
)
plt.savefig("Obs-CERES-EBAF_rsut_2011-10-16.png")
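
The title above formats the selected time with `strftime("%B %d, %Y, %r")`. As a minimal sketch of what that format string produces, the same directives can be tried on a standard-library `datetime`. The timestamp below is hypothetical; the actual time values stored in the dataset may differ.

```python
from datetime import datetime

# Hypothetical timestamp; the actual time values in the dataset may differ
stamp = datetime(2011, 10, 16, 12, 0, 0)

# %B = full month name, %d = zero-padded day of month, %Y = four-digit year,
# %r = 12-hour clock time in the current locale's format
label = stamp.strftime("%B %d, %Y, %r")
print(label)
```

Because `%B` and `%r` depend on the locale, the month name and the clock format can vary between systems.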