# File Input and Output in Xarray

Xarray supports direct serialization and I/O to several file formats, including pickle, netCDF, OPeNDAP (read-only), GRIB1/2 (read-only), and HDF, by integrating with third-party libraries. Additional serialization formats for 1-dimensional data are available through pandas.

File types
- Pickle
- NetCDF 3/4
- RasterIO
- Zarr
- PyNIO

Interoperability
- Pandas
- Iris
- CDMS
- dask DataFrame

### Tutorial Duration
10 minutes

### Going Further

Xarray I/O Documentation: http://xarray.pydata.org/en/latest/io.html

### Import library

In [25]:
%matplotlib inline

import glob
import pandas as pd
import xarray as xr
import os

### Function for creating a pandas DatetimeIndex for your raster files

In [26]:
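def dummytime(flist):
    """Build a DatetimeIndex from the YYYYMM token in each file name."""
    datetimecollect = []
    for eachfile in flist:
        # the second underscore-separated token of the file name holds year and month
        obj = os.path.basename(eachfile).split('_')[1]
        # pd.datetime is no longer available in current pandas; pd.to_datetime parses the same format
        datetimecollect.append(pd.to_datetime(obj, format='%Y%m'))
    return pd.DatetimeIndex(datetimecollect)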
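
To make the parsing step concrete, here is a minimal sketch of what happens to each file name. The file names below are hypothetical and only assume that the second underscore-separated token is a `YYYYMM` stamp, which is what the function above relies on.

```python
import os
import pandas as pd

# Hypothetical file names; only the "_YYYYMM_" token matters here
example_files = ["data/tmean_198010_v1.tif", "data/tmean_198001_v1.tif"]

# Take the second underscore-separated token of each base name
stamps = [os.path.basename(f).split("_")[1] for f in example_files]

# Parse the YYYYMM strings; each date falls on the first day of the month
index = pd.to_datetime(stamps, format="%Y%m")
print(index)
```

The resulting index keeps the order of the input list, so the dates are only chronological if the file list is.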


### Loading all your raster files 

In [27]:
os.chdir('../data')
os.getcwd()

'/mnt/d/UW_work/geohack18/Xarray/data'

In [28]:
filenames = glob.glob('*.tif')
filenames
dummytime(filenames)

DatetimeIndex(['1980-10-01', '1980-11-01', '1980-12-01', '1980-01-01',
               '1980-02-01', '1980-03-01', '1980-04-01', '1980-05-01',
               '1980-06-01', '1980-07-01', '1980-08-01', '1980-09-01'],
              dtype='datetime64[ns]', freq=None)

### Create time dimension for xarray dataset

In [29]:
time = xr.Variable('time', dummytime(filenames))

### Define chunk sizes for the x, y, and band dimensions

In [30]:
chunks = {'x': 5490, 'y': 5490, 'band': 1} # chunk size per dimension; set x and y to suit your rasters

### Concat data arrays along time dimension 

In [31]:

da = xr.concat([xr.open_rasterio(f, chunks=chunks) for f in filenames], dim=time)
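
The concatenation step can be tried without any raster files. The sketch below is an illustration under stated assumptions: `fake_raster` is a hypothetical helper that builds small in-memory arrays with the same `(band, y, x)` layout a single-band raster would have, and the dates are made up. It also shows `sortby`, which may be useful because `glob` does not guarantee chronological order.

```python
import numpy as np
import pandas as pd
import xarray as xr

def fake_raster(value):
    """Hypothetical stand-in for one single-band raster with dims (band, y, x)."""
    return xr.DataArray(
        np.full((1, 4, 3), value, dtype="float32"),
        dims=("band", "y", "x"),
        coords={"band": [1], "y": np.arange(4), "x": np.arange(3)},
    )

# One date per array, deliberately out of chronological order
time = xr.Variable("time", pd.DatetimeIndex(["1980-10-01", "1980-11-01", "1980-01-01"]))

# Concatenating along a new dimension puts that dimension first
stacked = xr.concat([fake_raster(v) for v in (10.0, 11.0, 1.0)], dim=time)
print(stacked.dims)

# Reorder the slices so that time increases monotonically
stacked_sorted = stacked.sortby("time")
print(stacked_sorted.time.values)
```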

### Export xarray dataset to netCDF format

In [32]:
da.to_netcdf('test.nc')

## Interoperability

Xarray objects include export methods that allow users to transform data from the Xarray data model to other data models such as Pandas, Iris, and CDMS.

Below is a quick example of how to export a time series from Xarray to Pandas.  

In [36]:
# select a single pixel and convert its time series to a pandas DataFrame
t_series = da.isel(x=200, y=200).to_pandas()
t_series.head()

band                1
time
1980-10-01  25.946098
1980-11-01    19.9055
1980-12-01  17.784698
1980-01-01    10.6084
1980-02-01   8.540501
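
The result above is a DataFrame rather than a Series because the selected pixel still carries the `band` dimension. The following sketch, using a small synthetic array rather than the tutorial data, shows how the number of remaining dimensions determines what `to_pandas` returns.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic cube with dims (time, band, y, x); the values are arbitrary
times = pd.date_range("1980-01-01", periods=3, freq="MS")
cube = xr.DataArray(
    np.arange(12, dtype="float64").reshape(3, 1, 2, 2),
    dims=("time", "band", "y", "x"),
    coords={"time": times, "band": [1]},
)

# Selecting one pixel leaves two dimensions: (time, band)
point = cube.isel(x=0, y=0)

# A 2-D DataArray becomes a pandas DataFrame (rows: time, columns: band)
as_frame = point.to_pandas()

# Dropping the length-1 band dimension leaves a 1-D array, which becomes a Series
as_series = point.squeeze("band", drop=True).to_pandas()

print(type(as_frame).__name__, type(as_series).__name__)
```

Whether to keep or drop the band dimension is a design choice: the DataFrame form preserves the band label as a column, while the Series form is often more convenient for a single-band time series.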


In [34]:
# export pandas dataframe to csv format
t_series.to_csv('test.csv')