# I/O

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#I/O" data-toc-modified-id="I/O-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>I/O</a></span><ul class="toc-item"><li><span><a href="#Learning-Objectives" data-toc-modified-id="Learning-Objectives-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Learning Objectives</a></span></li><li><span><a href="#Reading-and-Writing-Files" data-toc-modified-id="Reading-and-Writing-Files-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Reading and Writing Files</a></span></li><li><span><a href="#Opening-xarray-datasets" data-toc-modified-id="Opening-xarray-datasets-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>Opening xarray datasets</a></span></li><li><span><a href="#Saving-xarray-datasets-as-netcdf-files" data-toc-modified-id="Saving-xarray-datasets-as-netcdf-files-1.4"><span class="toc-item-num">1.4&nbsp;&nbsp;</span>Saving xarray datasets as netcdf files</a></span></li><li><span><a href="#Multifile-datasets" data-toc-modified-id="Multifile-datasets-1.5"><span class="toc-item-num">1.5&nbsp;&nbsp;</span>Multifile datasets</a></span></li><li><span><a href="#Zarr" data-toc-modified-id="Zarr-1.6"><span class="toc-item-num">1.6&nbsp;&nbsp;</span>Zarr</a></span></li><li><span><a href="#Going-Further" data-toc-modified-id="Going-Further-1.7"><span class="toc-item-num">1.7&nbsp;&nbsp;</span>Going Further</a></span></li></ul></li></ul></div>

## Learning Objectives

- Write xarray objects to netCDF files
- Load xarray datasets from netCDF files
- Get a brief overview of Zarr

## Reading and Writing Files


Xarray supports direct serialization and I/O to several file formats, including pickle, netCDF, OPeNDAP (read-only), GRIB1/2 (read-only), Zarr, and HDF, by integrating with third-party libraries. Additional serialization formats for 1-dimensional data are available through pandas.

File types
- Pickle
- NetCDF 3/4
- Rasterio
- Zarr
- PyNIO

Interoperability
- Pandas
- Iris
- CDMS
- dask DataFrame
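
As a minimal sketch of two entries from the lists above, the cell below round-trips a dataset through pickle and through a pandas DataFrame. The dataset `ds_demo` and its variable names are made up for illustration and are not part of the tutorial data.

```python
import pickle

import numpy as np
import pandas as pd
import xarray as xr

# Small, made-up dataset used only for illustration.
ds_demo = xr.Dataset(
    {"temperature": (("time", "station"), np.arange(6.0).reshape(3, 2))},
    coords={
        "time": pd.date_range("2000-01-01", periods=3),
        "station": ["a", "b"],
    },
)

# Pickle round trip: serialize to bytes, then restore the object.
restored = pickle.loads(pickle.dumps(ds_demo))

# pandas round trip: a long-form DataFrame indexed by (time, station),
# converted back into a Dataset.
df = ds_demo.to_dataframe()
ds_from_df = xr.Dataset.from_dataframe(df)
```

Pickle is convenient for short-lived storage, but it ties the file to the Python and xarray versions that wrote it, so formats such as netCDF or Zarr are usually a better fit for data that must last.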


## Opening xarray datasets

Xarray's `open_dataset` and `open_mfdataset` are the primary functions for opening local or remote datasets in formats such as netCDF, GRIB, OPeNDAP, and HDF. The actual reading is done by third-party libraries (engines), for which xarray provides a common interface.

In [None]:
!ncdump -h ../../../data/rasm.nc

In [None]:
import xarray as xr
from glob import glob

In [None]:
ds = xr.open_dataset('../../../data/rasm.nc')
ds
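
Which engines are available depends on the libraries installed in the current environment. A minimal sketch for checking this uses `xr.backends.list_engines()`; the `engine='netcdf4'` value in the trailing comment is an assumption about what is installed, not something this notebook requires.

```python
import xarray as xr

# Mapping of engine name -> backend entrypoint for the installed libraries.
engines = xr.backends.list_engines()
print(sorted(engines))

# Any name listed above can be passed explicitly, for example
# (assuming the netCDF4 library is installed):
# ds = xr.open_dataset('../../../data/rasm.nc', engine='netcdf4')
```

When `engine` is omitted, xarray picks a suitable backend for the file on its own, so passing it explicitly is mainly useful when several backends could read the same file.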

## Saving xarray datasets as netcdf files

Xarray provides a high-level method, `to_netcdf`, for writing a Dataset or DataArray directly to a netCDF file.

In [None]:
ds.to_netcdf('../../../data/rasm_test.nc')
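
To check that a write succeeded, the file can be read back and compared with the original. The sketch below does this with a small made-up dataset and a temporary directory, so it does not touch the tutorial data; the file name `demo.nc` is hypothetical, and a netCDF backend (for example netCDF4 or SciPy) is assumed to be installed.

```python
import tempfile
from pathlib import Path

import numpy as np
import xarray as xr

# Made-up dataset standing in for `ds`.
ds_demo = xr.Dataset(
    {"temperature": (("y", "x"), np.arange(6.0).reshape(2, 3))},
    coords={"y": [10.0, 20.0], "x": [1.0, 2.0, 3.0]},
    attrs={"title": "round-trip demo"},
)

with tempfile.TemporaryDirectory() as tmpdir:
    path = Path(tmpdir) / "demo.nc"  # hypothetical file name
    ds_demo.to_netcdf(path)

    # The context manager closes the file handle on exit; .load() pulls
    # the lazily read values into memory before that happens.
    with xr.open_dataset(path) as opened:
        reloaded = opened.load()

# True when dimensions, coordinates, and values all match.
roundtrip_ok = reloaded.equals(ds_demo)
print(roundtrip_ok)
```

Opening the file inside a `with` block matters here because `open_dataset` reads lazily and otherwise keeps the file open.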

## Multifile datasets

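Xarray can read and write multifile datasets with the `open_mfdataset` and `save_mfdataset` functions. `open_mfdataset` combines several files into a single dataset and requires dask; with `combine="by_coords"`, the coordinate values in each file determine how the pieces are joined.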

In [None]:
# glob returns paths in arbitrary order; sort them for a reproducible listing
paths = sorted(glob('./data/19*.nc'))
paths

In [None]:
ds2 = xr.open_mfdataset(paths, combine="by_coords")
ds2
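
The cells above only read a multifile dataset. As a sketch of the writing side, the example below splits a made-up time series into one file per year with `save_mfdataset`, then reassembles it with `open_mfdataset`. The data, variable names, and file names are assumptions for illustration; dask and a netCDF backend are assumed to be installed.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd
import xarray as xr

# Made-up monthly data spanning two years.
time = pd.date_range("1990-01-01", periods=24, freq="MS")
ds_demo = xr.Dataset(
    {"temperature": ("time", np.arange(24.0))},
    coords={"time": time},
)

# Split into one dataset per calendar year.
years, yearly = zip(*ds_demo.groupby("time.year"))

with tempfile.TemporaryDirectory() as tmpdir:
    # One hypothetical file per year, e.g. 1990.nc and 1991.nc.
    paths = [str(Path(tmpdir) / f"{year}.nc") for year in years]
    xr.save_mfdataset(yearly, paths)

    # Reassemble the files along the time coordinate.
    with xr.open_mfdataset(paths, combine="by_coords") as combined:
        n_time = combined.sizes["time"]
        values = combined["temperature"].values

print(n_time)
```

Splitting by year is only one possible layout; any grouping works as long as the coordinates let `combine="by_coords"` put the pieces back in order.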

## Zarr

Zarr is a Python package providing an implementation of chunked, compressed, N-dimensional arrays. It can store arrays in memory, in files, and in cloud-based object storage such as Amazon S3 and Google Cloud Storage. Xarray's Zarr backend lets xarray use these capabilities.

In [None]:
# Save to a Zarr store; mode='w' overwrites any existing store at this path
ds.to_zarr('./data/rasm.zarr', mode='w')

In [None]:
!ls ./data/rasm.zarr

In [None]:
!du -h ./data/rasm.zarr

## Going Further
    
- Xarray I/O Documentation: http://xarray.pydata.org/en/latest/io.html

- Zarr Documentation: https://zarr.readthedocs.io/en/stable/



<div class="alert alert-block alert-success">
  <p>Previous: <a href="01_getting_started_with_xarray.ipynb">Getting Started with Xarray</a></p>
  <p>Next: <a href="03_indexing.ipynb">Indexing</a></p>
</div>