<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Create-xarray-dataset-with-dummy-data" data-toc-modified-id="Create-xarray-dataset-with-dummy-data-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Create xarray dataset with dummy data</a></span></li><li><span><a href="#Print-the-help-for-the-netCDF-writer-method:" data-toc-modified-id="Print-the-help-for-the-netCDF-writer-method:-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Print the help for the netCDF writer method:</a></span></li><li><span><a href="#No-encoding-options-are-set-by-default" data-toc-modified-id="No-encoding-options-are-set-by-default-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>No encoding options are set by default</a></span></li><li><span><a href="#Setting-fletcher32-to-True" data-toc-modified-id="Setting-fletcher32-to-True-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Setting <code>fletcher32</code> to <code>True</code></a></span></li><li><span><a href="#Write-to-disk-(test.nc)" data-toc-modified-id="Write-to-disk-(test.nc)-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Write to disk (<code>test.nc</code>)</a></span></li><li><span><a href="#Test-with-docker-OPeNDAP-via-hyrax" data-toc-modified-id="Test-with-docker-OPeNDAP-via-hyrax-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Test with docker OPeNDAP via hyrax</a></span></li><li><span><a href="#OTHER-CHECKS->>>" data-toc-modified-id="OTHER-CHECKS->>>-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>OTHER CHECKS &gt;&gt;&gt;</a></span></li></ul></div>

#### Create xarray dataset with dummy data

Make an xarray dataset with one variable. 

The data array has `100^3` values, hopefully large enough to exercise the `fletcher32` filter, which is the HDF5 checksum filter that netCDF-4 exposes as an encoding option.

In [1]:
from pandas import date_range, Timestamp
from numpy import dtype
import numpy as np  # needed for np.random and np.arange below
import xarray as xr

temp = 15 + 8 * np.random.randn(100, 100, 100)
x = np.arange(0, 100)
y = np.arange(0, 100)

ds = xr.Dataset(
    {
        "temp": (["x", "y", "time"], temp),
    },
    coords={
        "x": (["x"], x),
        "y": (["y"], y),
        "time": date_range("2000-01-01", periods=100),
    },
)

ds
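The `fletcher32` option named above refers to a Fletcher-32 checksum. As a rough illustration of what such a checksum computes, here is a minimal pure-Python sketch of the textbook variant (little-endian 16-bit words, sums taken modulo 65535). This is an assumption-laden illustration only: the implementation inside HDF5 may differ in byte order and padding details, and the function name `fletcher32` here is just a local helper, not a library call.

```python
def fletcher32(data: bytes) -> int:
    """Textbook Fletcher-32 checksum over little-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a single zero byte
    sum1 = sum2 = 0
    for i in range(0, len(data), 2):
        word = int.from_bytes(data[i:i + 2], "little")
        sum1 = (sum1 + word) % 65535  # running sum of the words
        sum2 = (sum2 + sum1) % 65535  # running sum of the running sums
    return (sum2 << 16) | sum1


payload = b"abcde"
checksum = fletcher32(payload)
print(hex(checksum))  # 0xf04fc729

# Changing a single byte changes the checksum, which is how corruption is detected.
corrupted = b"abcdf"
print(fletcher32(corrupted) != checksum)  # True
```

The point of the filter is only integrity checking: the checksum is stored next to each chunk and recomputed on read, so it does not change the data values themselves.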

#### Print the help for the netCDF writer method:

In [2]:
#help(ds.to_netcdf)

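The `scipy` and `h5netcdf` engines may restrict which encoding options are honored (the `scipy` engine writes netCDF-3 files, a format with no compression or checksum filters). I have not confirmed this. Here are the relevant options from the docstring: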

```python
    engine : {'netcdf4', 'scipy', 'h5netcdf'}, optional
        Engine to use when writing netCDF files. If not provided, the
        default engine is chosen based on available dependencies, with a
        preference for 'netcdf4' if writing to a file on disk.
    encoding : dict, optional
        Nested dictionary with variable names as keys and dictionaries of
        variable specific encodings as values, e.g.,
        ``{'my_variable': {'dtype': 'int16', 'scale_factor': 0.1,
        'zlib': True}, ...}``
```

#### No encoding options are set by default

Note that `encoding` is set per variable, either at export time (through the `encoding` argument) or as the `.encoding` attribute of the xarray variable:

In [3]:
ds.temp.encoding

{}

#### Setting `fletcher32` to `True` 

(along with zlib compression and 32-bit integer coordinates)

This is the recommended way to set encoding options. Test with the temperature variable:

In [4]:
# Set integers to 32 bits to agree with OPeNDAP: (xarray defaults to int64)
ds.time.encoding = {'dtype': dtype('int32')}
ds.x.encoding = {'dtype': dtype('int32')}
ds.y.encoding = {'dtype': dtype('int32')}

# Set temperature encoding to use problematic fletcher32 property:
ds.temp.encoding = {
    'zlib': True,
    'complevel': 4,
    'fletcher32': True,
}
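The same options could also be supplied at export time through the `encoding` argument quoted from the docstring earlier, instead of through each variable's `.encoding` attribute. The sketch below only builds the nested dictionary; `build_encoding` is a hypothetical helper written for this example, not part of xarray, and the variable names are the ones used in this notebook.

```python
def build_encoding(data_vars, int_coords, complevel=4, fletcher32=True):
    """Return a nested {variable: {option: value}} dict for Dataset.to_netcdf."""
    encoding = {
        name: {"zlib": True, "complevel": complevel, "fletcher32": fletcher32}
        for name in data_vars
    }
    # Store integer coordinates as 32-bit values on disk.
    encoding.update({name: {"dtype": "int32"} for name in int_coords})
    return encoding


encoding = build_encoding(["temp"], ["x", "y", "time"])
print(encoding["temp"])
print(encoding["time"])

# With the dataset from this notebook, the dict would be passed at export time:
# ds.to_netcdf("tmp/test-netcdf4.nc", engine="netcdf4", encoding=encoding)
```

Keeping the options in one dictionary makes it easy to write the same dataset twice, once with `fletcher32=True` and once with `fletcher32=False`, and compare how the server handles each file.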

#### Write to disk (`test.nc`)

http://xarray.pydata.org/en/stable/generated/xarray.Dataset.to_netcdf.html#xarray.Dataset.to_netcdf

Defaults:

* `engine=netcdf4`
* `encoding=None` (inherited from each variable's `.encoding` attribute)

In [5]:
#!mkdir tmp

In [6]:
ds.to_netcdf("tmp/test-netcdf4.nc", engine="netcdf4")
ds.to_netcdf("tmp/test-h5netcdf.nc", engine="h5netcdf")
ds.to_netcdf("tmp/test-scipy.nc", engine="scipy")

In [7]:
ds = None

#### Test with docker OPeNDAP via hyrax

https://hub.docker.com/r/opendap/hyrax

The test image can be pulled from the Docker Hub page linked above. Alternatively, build an image following the instructions in the GitHub repository linked from that page:

In [8]:
!docker run \
    -d --rm \
    --publish 8080:8080 \
    --volume ${PWD}/tmp:/usr/share/hyrax \
    opendap/hyrax:latest

740e97e775240c613bbebb89014b74b00b2bc836e4d5e18c2d5e793e68e8ca2e


In [9]:
!docker container ls

CONTAINER ID        IMAGE                  COMMAND              CREATED                  STATUS                  PORTS                                                    NAMES
740e97e77524        opendap/hyrax:latest   "/entrypoint.sh -"   Less than a second ago   Up Less than a second   8443/tcp, 10022/tcp, 0.0.0.0:8080->8080/tcp, 11002/tcp   nice_nobel


**Click link, cross fingers:**
[`http://localhost:8080/opendap`](http://localhost:8080/opendap)

```
The netCDF handler does not currently support 64 bit integers.
```
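One way to narrow down which file or variable triggers a message like this is to list every variable stored with a 64-bit integer type. The helper below is a hypothetical sketch: it accepts any mapping of names to objects with a `.dtype` attribute (for example the `variables` attribute of a `netCDF4.Dataset`), and the demo uses stand-in objects with made-up dtypes so it runs without a file.

```python
from types import SimpleNamespace


def find_int64_variables(variables):
    """Return the names of variables whose dtype is a 64-bit integer.

    `variables` is any mapping of name -> object with a `.dtype` attribute,
    such as `netCDF4.Dataset(path).variables`.
    """
    return [
        name
        for name, var in variables.items()
        if str(var.dtype) in ("int64", "uint64")
    ]


# Stand-in objects (hypothetical dtypes) so the sketch runs without a netCDF file.
fake_variables = {
    "temp": SimpleNamespace(dtype="float64"),
    "time": SimpleNamespace(dtype="int64"),
    "x": SimpleNamespace(dtype="int32"),
}
print(find_int64_variables(fake_variables))  # ['time']
```

Running the helper over each file in `tmp` would show whether the `int32` encodings set earlier actually reached the disk, or whether another file in the served directory is the source of the message.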

In [10]:
!docker stop 690c7a1aa1a9b1463925cb97a1f1b284643c490d8dff8f9c6c1c8144cb7e0341

Error response from daemon: No such container: 690c7a1aa1a9b1463925cb97a1f1b284643c490d8dff8f9c6c1c8144cb7e0341
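The error above comes from stopping a container ID that does not match the one printed by `docker run`. A minimal sketch of one way to avoid hardcoding the ID is to capture it from the output of `docker run -d` and reuse it. The helper names are hypothetical; the Docker flags are the same ones used in the cell above, and the calls that actually talk to Docker are left commented out.

```python
import os
import subprocess


def hyrax_run_command(data_dir, port=8080, image="opendap/hyrax:latest"):
    """Build the `docker run` argument list used in this notebook."""
    return [
        "docker", "run", "-d", "--rm",
        "--publish", f"{port}:8080",
        "--volume", f"{data_dir}:/usr/share/hyrax",
        image,
    ]


def start_container(command):
    """Run a detached container and return the ID that `docker run -d` prints."""
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout.strip()


def stop_container(container_id):
    """Stop the container with the given ID."""
    subprocess.run(["docker", "stop", container_id], check=True)


command = hyrax_run_command(os.path.abspath("tmp"))
print(" ".join(command))

# Requires a running Docker daemon, so these calls are left commented out:
# container_id = start_container(command)
# ... browse http://localhost:8080/opendap ...
# stop_container(container_id)
```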


#### OTHER CHECKS >>>

In [11]:
!ls tmp

ECCOv4r4_THETA_SALT_draft_v3_20200707 (2).nc
test-h5netcdf.nc
test-netcdf4.nc
test-scipy.nc
test.nc


In [12]:
from netCDF4 import Dataset
ecco = Dataset("tmp/ECCOv4r4_THETA_SALT_draft_v3_20200707 (2).nc", "r")
list(ecco.variables)

['time', 'depth', 'lat', 'lon', 'THETA', 'SALT']

In [13]:
for v in list(ecco.variables):
    display(ecco[v].filters())

{'zlib': True, 'shuffle': True, 'complevel': 5, 'fletcher32': True}

{'zlib': True, 'shuffle': True, 'complevel': 5, 'fletcher32': True}

{'zlib': True, 'shuffle': True, 'complevel': 5, 'fletcher32': True}

{'zlib': True, 'shuffle': True, 'complevel': 5, 'fletcher32': True}

{'zlib': True, 'shuffle': True, 'complevel': 5, 'fletcher32': True}

{'zlib': True, 'shuffle': True, 'complevel': 5, 'fletcher32': True}

Test a hunch about `scipy`:

In [14]:
ds = xr.load_dataset("tmp/test-scipy.nc")

# Same temperature encoding as before, but with fletcher32 disabled:
ds.temp.encoding = {
    'zlib': True,
    'complevel': 4,
    'fletcher32': False,
}

ds.temp.encoding

{'zlib': True, 'complevel': 4, 'fletcher32': False}

In [15]:
ds.to_netcdf("tmp/test-scipy.nc", engine="scipy")
ds = None

In [16]:
from netCDF4 import Dataset
test = Dataset("tmp/test-scipy.nc", "r")
list(test.variables)

['temp', 'x', 'y', 'time']

In [17]:
test['temp'].filters()
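Another way to follow up on the `scipy` hunch, without going through a netCDF library, is to look at the first bytes of each file: netCDF-4 files are HDF5 files and start with the HDF5 signature, while netCDF-3 files start with `CDF` followed by a version byte. The sketch below is a simplified check (it only inspects offset 0, and `netcdf_flavor` is a hypothetical helper); the demo files contain only signature bytes and are stand-ins, not real datasets.

```python
import os
import tempfile

HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"  # signature at the start of an HDF5 file


def netcdf_flavor(path):
    """Guess the on-disk format of a netCDF file from its leading bytes."""
    with open(path, "rb") as f:
        head = f.read(8)
    if head == HDF5_MAGIC:
        return "netCDF-4 (HDF5)"
    # Version byte: 1 = classic, 2 = 64-bit offset, 5 = 64-bit data (CDF-5).
    if head[:3] == b"CDF" and head[3:4] in (b"\x01", b"\x02", b"\x05"):
        return "netCDF-3 family"
    return "unknown"


# Stand-in files holding only signature bytes, so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "fake-classic.nc": b"CDF\x01" + bytes(4),
        "fake-hdf5.nc": HDF5_MAGIC,
    }
    for name, content in samples.items():
        path = os.path.join(tmp, name)
        with open(path, "wb") as f:
            f.write(content)
        print(name, "->", netcdf_flavor(path))
```

Applied to `tmp/test-netcdf4.nc`, `tmp/test-h5netcdf.nc` and `tmp/test-scipy.nc`, this would show directly which engines produced an HDF5-based file. Filters such as `fletcher32` only exist in the HDF5-based format.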