## Remapping training data to the cubed sphere

The novel addition in DLWP-CS is the ability to train convolutional neural networks on data mapped to the cubed sphere. The remapping is performed offline, separately from model training and inference.

#### Required packages

For cubed-sphere remapping we use the TempestRemap library, which is available as a pre-compiled conda package. Let's start by installing it.

In [None]:
%conda install -c conda-forge tempest-remap
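
After installing, it can be useful to confirm that the TempestRemap command-line tools are visible on the `PATH` of the notebook's environment. The following is a minimal sketch using only the standard library; the helper name `find_tempest_executables` is hypothetical, and which of these executables DLWP actually calls is an assumption.

```python
import shutil

# Executables that ship with TempestRemap. Which of them DLWP invokes
# internally is an assumption, not something confirmed by this tutorial.
TEMPEST_EXECUTABLES = (
    "GenerateRLLMesh",
    "GenerateCSMesh",
    "GenerateOverlapMesh",
    "GenerateOfflineMap",
    "ApplyOfflineMap",
)


def find_tempest_executables(names=TEMPEST_EXECUTABLES):
    """Map each executable name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


missing = [name for name, path in find_tempest_executables().items() if path is None]
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All TempestRemap executables found.")
```

If some executables are reported missing, the conda environment that received the package may not be the one running the notebook kernel.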

Let's use the DLWP `CubeSphereRemap` class on the data we processed earlier.

In [None]:
from DLWP.remap import CubeSphereRemap

data_directory = '/home/disk/wave2/jweyn/Data/ERA5'
processed_file = '%s/tutorial_z500_t2m.nc' % data_directory
remapped_file = '%s/tutorial_z500_t2m_CS.nc' % data_directory

csr = CubeSphereRemap()

Generate the offline maps. Since we used 2-degree data, we have 91 latitude points and 180 longitude points. We are mapping to a cubed sphere with 48 points along each side of each cube face. Since data from CDS come with monotonically decreasing latitudes, we specify the `inverse_lat` option. In the future, I will add the capability to read the coordinates from a netCDF file.
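
The grid sizes quoted above follow from simple arithmetic, sketched below. The helper names are hypothetical, and the sketch assumes a regular global grid that includes both poles and treats longitude as periodic.

```python
def latlon_grid_shape(resolution_deg):
    """Number of (lat, lon) points on a regular global grid, poles included."""
    n_lat = int(round(180 / resolution_deg)) + 1  # -90 ... 90, both poles
    n_lon = int(round(360 / resolution_deg))      # periodic, so no repeated point
    return n_lat, n_lon


def cubed_sphere_points(res):
    """Total grid points on a cubed sphere with res x res points per face."""
    return 6 * res * res


n_lat, n_lon = latlon_grid_shape(2.0)
print(n_lat, n_lon, n_lat * n_lon)  # 91 180 16380
print(cubed_sphere_points(48))      # 13824
```

So `res=48` gives a cubed sphere with roughly as many points (13824) as the 2-degree latitude-longitude grid (16380).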

In [None]:
csr.generate_offline_maps(lat=91, lon=180, res=48, inverse_lat=True)

Apply the forward map, saving the result to a temporary file. We operate on the variable `predictors`, which is the only variable in the processed data. If this crashes, read the next cell.

In [None]:
csr.remap(processed_file, '%s/temp.nc' % data_directory, '--var', 'predictors')

By default, TempestRemap output has a single 1-dimensional spatial coordinate. We convert the file to 3-dimensional faces (face, height, width). A few other points here:  
- TempestRemap is very finicky about metadata in netCDF files and sometimes fails with a segmentation fault for no apparent reason. I've found that the most common cause is the string coordinate values in the `'varlev'` coordinate. If it crashes, try deleting those coordinate values with xarray and saving the processed data to a new file. The `coord_file` parameter of this function can then restore the coordinate in the final remapped file. Even if TempestRemap does not crash, it will probably delete the string coordinates, and sometimes the sample time coordinate as well, so it's a good idea to use this feature.  
- We also use the `chunking` parameter to save the data with chunking suited to reading the file during model training and evaluation.
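
The workaround of deleting the string coordinate values could be sketched with xarray as follows. This is an illustration on a small in-memory dataset, not the author's exact procedure: the helper name `strip_coord`, the array shapes, and the `'varlev'` labels are all hypothetical.

```python
import numpy as np
import xarray as xr


def strip_coord(ds, name="varlev"):
    """Return a dataset without the labels of coordinate `name`.

    The dimension itself is kept; only its coordinate values are removed.
    """
    return ds.drop_vars(name)


# Small in-memory stand-in for the processed file (hypothetical shapes and labels).
ds = xr.Dataset(
    {
        "predictors": (
            ("sample", "varlev", "lat", "lon"),
            np.zeros((2, 2, 3, 4), dtype="float32"),
        )
    },
    coords={"varlev": ("varlev", ["z/500", "t2m/0"])},
)

stripped = strip_coord(ds)
print("varlev" in stripped.coords)      # False
print(stripped["predictors"].dims)      # ('sample', 'varlev', 'lat', 'lon')

# The labels can be put back later from the original dataset.
restored = stripped.assign_coords(varlev=ds["varlev"])
print(list(restored["varlev"].values))
```

With real files, the same idea would be to open the processed file with `xr.open_dataset`, call `drop_vars('varlev')`, and write the result to a new file with `to_netcdf` before remapping; the original processed file is then passed as `coord_file`.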

In [None]:
csr.convert_to_faces('%s/temp.nc' % data_directory,
                     remapped_file,
                     coord_file=processed_file,
                     chunking={'sample': 1, 'varlev': 1})
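
To make the face conversion and the chunking choice concrete, here is a rough NumPy sketch of the shapes involved. It is not the DLWP implementation: the helper `to_faces` is hypothetical, and it assumes the 1-dimensional spatial coordinate is ordered face by face, which is not confirmed here for TempestRemap output. It also assumes that dimensions not named in `chunking` are stored whole.

```python
import numpy as np


def to_faces(flat, res):
    """Reshape (..., 6*res*res) to (..., 6, res, res), assuming face-major order."""
    if flat.shape[-1] != 6 * res * res:
        raise ValueError("last dimension must have 6 * res * res points")
    return flat.reshape(flat.shape[:-1] + (6, res, res))


res = 48
flat = np.zeros((3, 2, 6 * res * res), dtype="float32")  # (sample, varlev, points)
faces = to_faces(flat, res)
print(faces.shape)  # (3, 2, 6, 48, 48)

# With chunking={'sample': 1, 'varlev': 1}, one chunk would cover a single
# sample and a single variable/level, i.e. all six faces of one field.
chunk_shape = (1, 1) + faces.shape[2:]
chunk_bytes = int(np.prod(chunk_shape)) * faces.itemsize
print(chunk_shape, chunk_bytes)  # (1, 1, 6, 48, 48) 55296
```

Chunks of this kind mean that reading one sample of one variable touches a single contiguous block on disk, which suits sample-by-sample access during training.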

In [None]:
import os
os.remove('%s/temp.nc' % data_directory)