# Xncml Usage

In [1]:
import xncml
from pathlib import Path

## Modify an NcML document

``xncml`` can add or remove global and variable attributes, rename variables and global attributes, and remove variables and dimensions. It can also be used to create NcML files from scratch.

### Create an NcML Dataset object from a local NcML file

In [2]:
p = Path(xncml.__file__).parent.parent / "tests" / "data"
nc = xncml.Dataset(p / "exercise1.ncml")
nc

<?xml version="1.0" encoding="utf-8"?>
<netcdf location="nc/example1.nc" xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
	<dimension name="time" length="2" isUnlimited="true"></dimension>
	<dimension name="lat" length="3"></dimension>
	<dimension name="lon" length="4"></dimension>
	<attribute name="title" type="String" value="Example Data"></attribute>
	<variable name="rh" shape="time lat lon" type="int">
		<attribute name="long_name" type="String" value="relative humidity"></attribute>
		<attribute name="units" type="String" value="percent"></attribute>
	</variable>
	<variable name="T" shape="time lat lon" type="double">
		<attribute name="long_name" type="String" value="surface temperature"></attribute>
		<attribute name="units" type="String" value="C"></attribute>
	</variable>
	<variable name="lat" shape="lat" type="float">
		<attribute name="units" type="String" value="degrees_north"></attribute>
		<values>41.0 40.0 39.0</values>
	</variable>
	<variable name="lon" s

### Create an NcML Dataset that modifies a netCDF file

Here we create an empty NcML dataset from scratch. Modifying statements added to it will apply to the existing netCDF dataset identified by the `location` argument.

In [3]:
new = xncml.Dataset(location="nc/example1.nc")
new

<?xml version="1.0" encoding="utf-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" location="nc/example1.nc"></netcdf>
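
To make clear what this empty dataset amounts to, the sketch below builds the same bare `<netcdf>` wrapper by hand with the standard library's `xml.etree.ElementTree`. It is only an illustration of the resulting document, not how ``xncml`` builds it internally; `empty_ncml` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# NcML 2.2 namespace, as it appears in the output above
NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"


def empty_ncml(location):
    """Hypothetical helper: build a bare <netcdf> element pointing at `location`."""
    # Serialize the NcML namespace as the default (unprefixed) namespace
    ET.register_namespace("", NCML_NS)
    return ET.Element(f"{{{NCML_NS}}}netcdf", {"location": location})


root = empty_ncml("nc/example1.nc")
print(ET.tostring(root, encoding="unicode"))
```

The element has no children yet: every modification (rename, remove, add) would later appear as a child of this root.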

### Rename the variable `T` to `Temp`

In [4]:
nc.rename_variable('T', 'Temp')
nc

<?xml version="1.0" encoding="utf-8"?>
<netcdf location="nc/example1.nc" xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
	<dimension name="time" length="2" isUnlimited="true"></dimension>
	<dimension name="lat" length="3"></dimension>
	<dimension name="lon" length="4"></dimension>
	<attribute name="title" type="String" value="Example Data"></attribute>
	<variable name="rh" shape="time lat lon" type="int">
		<attribute name="long_name" type="String" value="relative humidity"></attribute>
		<attribute name="units" type="String" value="percent"></attribute>
	</variable>
	<variable name="Temp" shape="time lat lon" type="double" orgName="T">
		<attribute name="long_name" type="String" value="surface temperature"></attribute>
		<attribute name="units" type="String" value="C"></attribute>
	</variable>
	<variable name="lat" shape="lat" type="float">
		<attribute name="units" type="String" value="degrees_north"></attribute>
		<values>41.0 40.0 39.0</values>
	</variable>
	<variab
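
The `orgName` attribute in the output above is how NcML records a rename: the element carries the new `name` and points back to the original one. The sketch below reproduces that convention with `xml.etree.ElementTree`. It is an illustration under assumptions, not the ``xncml`` implementation: `rename_variable` here is a hypothetical stand-alone helper, and the inline NcML string is a trimmed stand-in for `exercise1.ncml`.

```python
import xml.etree.ElementTree as ET

NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
NS = {"nc": NCML_NS}

# Trimmed stand-in for the example file (assumption, not the real file)
NCML = f"""<netcdf xmlns="{NCML_NS}" location="nc/example1.nc">
  <variable name="T" shape="time lat lon" type="double"/>
</netcdf>"""


def rename_variable(root, old, new):
    """Hypothetical helper: rename a variable the way NcML expresses it."""
    for var in root.findall("nc:variable", NS):
        if var.get("name") == old:
            # Keep an earlier orgName if the variable was already renamed once
            var.attrib.setdefault("orgName", old)
            var.set("name", new)
            return var
    raise KeyError(old)


root = ET.fromstring(NCML)
renamed = rename_variable(root, "T", "Temp")
print(renamed.attrib)
```

The underlying netCDF file is never touched; only the NcML description changes.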

### Remove the variable `Temp` from the dataset

In [5]:
nc.remove_variable('Temp')
nc

<?xml version="1.0" encoding="utf-8"?>
<netcdf location="nc/example1.nc" xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
	<dimension name="time" length="2" isUnlimited="true"></dimension>
	<dimension name="lat" length="3"></dimension>
	<dimension name="lon" length="4"></dimension>
	<attribute name="title" type="String" value="Example Data"></attribute>
	<variable name="rh" shape="time lat lon" type="int">
		<attribute name="long_name" type="String" value="relative humidity"></attribute>
		<attribute name="units" type="String" value="percent"></attribute>
	</variable>
	<variable name="Temp" shape="time lat lon" type="double" orgName="T">
		<attribute name="long_name" type="String" value="surface temperature"></attribute>
		<attribute name="units" type="String" value="C"></attribute>
	</variable>
	<variable name="lat" shape="lat" type="float">
		<attribute name="units" type="String" value="degrees_north"></attribute>
		<values>41.0 40.0 39.0</values>
	</variable>
	<variab

### Remove the attribute `units` from the variable `Temp`

In [6]:
nc.remove_variable_attribute(variable='Temp', key='units')
nc

<?xml version="1.0" encoding="utf-8"?>
<netcdf location="nc/example1.nc" xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
	<dimension name="time" length="2" isUnlimited="true"></dimension>
	<dimension name="lat" length="3"></dimension>
	<dimension name="lon" length="4"></dimension>
	<attribute name="title" type="String" value="Example Data"></attribute>
	<variable name="rh" shape="time lat lon" type="int">
		<attribute name="long_name" type="String" value="relative humidity"></attribute>
		<attribute name="units" type="String" value="percent"></attribute>
	</variable>
	<variable name="Temp" shape="time lat lon" type="double" orgName="T">
		<attribute name="long_name" type="String" value="surface temperature"></attribute>
		<attribute name="units" type="String" value="C"></attribute>
		<remove name="units" type="attribute"></remove>
	</variable>
	<variable name="lat" shape="lat" type="float">
		<attribute name="units" type="String" value="degrees_north"></attribute>
		<va
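
As the output shows, removing an attribute does not delete anything from the document: a `<remove name="units" type="attribute">` element is appended to the variable instead. A minimal sketch of that convention with `xml.etree.ElementTree` follows. The helper name `mark_attribute_removed` and the inline NcML string are assumptions made for illustration, and this is not the ``xncml`` implementation.

```python
import xml.etree.ElementTree as ET

NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
NS = {"nc": NCML_NS}

# Trimmed stand-in for the example file (assumption, not the real file)
NCML = f"""<netcdf xmlns="{NCML_NS}" location="nc/example1.nc">
  <variable name="Temp" orgName="T" shape="time lat lon" type="double">
    <attribute name="units" type="String" value="C"/>
  </variable>
</netcdf>"""


def mark_attribute_removed(root, variable, key):
    """Hypothetical helper: append a <remove> instruction to a variable."""
    for var in root.findall("nc:variable", NS):
        if var.get("name") == variable:
            # The existing <attribute> element is left in place
            return ET.SubElement(
                var, f"{{{NCML_NS}}}remove", {"name": key, "type": "attribute"}
            )
    raise KeyError(variable)


root = ET.fromstring(NCML)
removal = mark_attribute_removed(root, "Temp", "units")
print(removal.attrib)
```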

### Remove the global `title` attribute from the dataset

In [7]:
nc.remove_dataset_attribute('title')
nc

<?xml version="1.0" encoding="utf-8"?>
<netcdf location="nc/example1.nc" xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
	<dimension name="time" length="2" isUnlimited="true"></dimension>
	<dimension name="lat" length="3"></dimension>
	<dimension name="lon" length="4"></dimension>
	<attribute name="title" type="String" value="Example Data"></attribute>
	<variable name="rh" shape="time lat lon" type="int">
		<attribute name="long_name" type="String" value="relative humidity"></attribute>
		<attribute name="units" type="String" value="percent"></attribute>
	</variable>
	<variable name="Temp" shape="time lat lon" type="double" orgName="T">
		<attribute name="long_name" type="String" value="surface temperature"></attribute>
		<attribute name="units" type="String" value="C"></attribute>
		<remove name="units" type="attribute"></remove>
	</variable>
	<variable name="lat" shape="lat" type="float">
		<attribute name="units" type="String" value="degrees_north"></attribute>
		<va

### Add a global `Conventions` attribute

In [8]:
nc.add_dataset_attribute(key='Conventions', value='CF-2.0')
nc

<?xml version="1.0" encoding="utf-8"?>
<netcdf location="nc/example1.nc" xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
	<dimension name="time" length="2" isUnlimited="true"></dimension>
	<dimension name="lat" length="3"></dimension>
	<dimension name="lon" length="4"></dimension>
	<attribute name="title" type="String" value="Example Data"></attribute>
	<attribute name="Conventions" type="String" value="CF-2.0"></attribute>
	<variable name="rh" shape="time lat lon" type="int">
		<attribute name="long_name" type="String" value="relative humidity"></attribute>
		<attribute name="units" type="String" value="percent"></attribute>
	</variable>
	<variable name="Temp" shape="time lat lon" type="double" orgName="T">
		<attribute name="long_name" type="String" value="surface temperature"></attribute>
		<attribute name="units" type="String" value="C"></attribute>
		<remove name="units" type="attribute"></remove>
	</variable>
	<variable name="lat" shape="lat" type="float">
		<attr

### Rename a global attribute

In [9]:
nc.rename_dataset_attribute(old_name="Source", new_name="source")
nc

<?xml version="1.0" encoding="utf-8"?>
<netcdf location="nc/example1.nc" xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
	<dimension name="time" length="2" isUnlimited="true"></dimension>
	<dimension name="lat" length="3"></dimension>
	<dimension name="lon" length="4"></dimension>
	<attribute name="title" type="String" value="Example Data"></attribute>
	<attribute name="Conventions" type="String" value="CF-2.0"></attribute>
	<attribute name="source">
		<orgName>Source</orgName>
	</attribute>
	<variable name="rh" shape="time lat lon" type="int">
		<attribute name="long_name" type="String" value="relative humidity"></attribute>
		<attribute name="units" type="String" value="percent"></attribute>
	</variable>
	<variable name="Temp" shape="time lat lon" type="double" orgName="T">
		<attribute name="long_name" type="String" value="surface temperature"></attribute>
		<attribute name="units" type="String" value="C"></attribute>
		<remove name="units" type="attribute"></remove>

### Add a variable attribute

In [10]:
nc.add_variable_attribute(variable='Temp', key='units', value='Kelvin')
nc.add_variable_attribute(variable='Temp', key='Fill_value', value=-999999999.)
nc

<?xml version="1.0" encoding="utf-8"?>
<netcdf location="nc/example1.nc" xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
	<dimension name="time" length="2" isUnlimited="true"></dimension>
	<dimension name="lat" length="3"></dimension>
	<dimension name="lon" length="4"></dimension>
	<attribute name="title" type="String" value="Example Data"></attribute>
	<attribute name="Conventions" type="String" value="CF-2.0"></attribute>
	<attribute name="source">
		<orgName>Source</orgName>
	</attribute>
	<variable name="rh" shape="time lat lon" type="int">
		<attribute name="long_name" type="String" value="relative humidity"></attribute>
		<attribute name="units" type="String" value="percent"></attribute>
	</variable>
	<variable name="Temp" shape="time lat lon" type="double" orgName="T">
		<attribute name="long_name" type="String" value="surface temperature"></attribute>
		<attribute name="units" type="String" value="Kelvin"></attribute>
		<attribute name="Fill_value" type="String"

### Write the Dataset back to an NcML file

In [11]:
import tempfile
fn = Path(tempfile.mkdtemp()) / "exercise1_modified.ncml"
nc.to_ncml(fn)
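
A quick way to sanity-check a written file is to parse it back and inspect the root element. The sketch below does this with `xml.etree.ElementTree` only. So that it runs on its own, it writes a one-line stand-in file rather than calling `to_ncml`; the helper name `ncml_location` and the stand-in content are assumptions.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"


def ncml_location(path):
    """Hypothetical helper: parse an NcML file and return its `location`."""
    root = ET.parse(path).getroot()
    # A well-formed NcML document has <netcdf> in the NcML namespace as its root
    if root.tag != f"{{{NCML_NS}}}netcdf":
        raise ValueError(f"unexpected root element: {root.tag}")
    return root.get("location")


# Stand-in for a file produced by `to_ncml` (assumption, not real output)
check_fn = Path(tempfile.mkdtemp()) / "stand_in.ncml"
check_fn.write_text(
    f'<netcdf xmlns="{NCML_NS}" location="nc/example1.nc"/>', encoding="utf-8"
)

print(ncml_location(check_fn))
```

With a real file, passing the path given to `to_ncml` in place of `check_fn` would perform the same check.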

### Export metadata to a dictionary

`Dataset` has a `to_cf_dict` method that returns a dictionary following the [CF-JSON](https://cf-json.org/specification) specification. The output may not be fully compliant, because NcML files used to create virtual datasets do not always include all the information that CF-JSON expects.

In [12]:
nc.to_cf_dict()

OrderedDict([('@location', 'nc/example1.nc'),
             ('@xmlns',
              {'': 'http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2'}),
             ('dimensions',
              OrderedDict([('time', 2), ('lat', 3), ('lon', 4)])),
             ('attributes',
              OrderedDict([('title', 'Example Data'),
                           ('Conventions', 'CF-2.0'),
                           ('source', None)])),
             ('variables',
              OrderedDict([('time',
                            OrderedDict([('shape', ['time']),
                                         ('type', 'int'),
                                         ('attributes',
                                          OrderedDict([('units', 'hours')])),
                                         ('data', [6, 18])])),
                           ('lat',
                            OrderedDict([('shape', ['lat']),
                                         ('type', 'float'),
                                    
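
The layout of the dictionary above (top-level `dimensions`, `attributes` and `variables`, with `shape`, `type`, `attributes` and `data` per variable) can be sketched with a small converter. This is not the `to_cf_dict` implementation, only an illustration of the mapping for a subset of NcML: `ncml_to_dict` is a hypothetical function, the inline NcML is a trimmed stand-in, and converting `<values>` to floats is an assumption.

```python
import xml.etree.ElementTree as ET

NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
NS = {"nc": NCML_NS}

# Trimmed stand-in for the example file (assumption, not the real file)
NCML = f"""<netcdf xmlns="{NCML_NS}" location="nc/example1.nc">
  <dimension name="lat" length="3"/>
  <attribute name="title" type="String" value="Example Data"/>
  <variable name="lat" shape="lat" type="float">
    <attribute name="units" type="String" value="degrees_north"/>
    <values>41.0 40.0 39.0</values>
  </variable>
</netcdf>"""


def _attributes(element):
    # Map each direct <attribute> child to name -> value
    return {a.get("name"): a.get("value") for a in element.findall("nc:attribute", NS)}


def ncml_to_dict(xml_text):
    """Hypothetical converter from a small NcML subset to a CF-JSON-like dict."""
    root = ET.fromstring(xml_text)
    out = {
        "dimensions": {
            d.get("name"): int(d.get("length"))
            for d in root.findall("nc:dimension", NS)
        },
        "attributes": _attributes(root),
        "variables": {},
    }
    for var in root.findall("nc:variable", NS):
        entry = {
            "shape": var.get("shape", "").split(),
            "type": var.get("type"),
            "attributes": _attributes(var),
        }
        values = var.find("nc:values", NS)
        # Compare against None: a childless Element is falsy
        if values is not None and values.text:
            entry["data"] = [float(v) for v in values.text.split()]
        out["variables"][var.get("name")] = entry
    return out


result = ncml_to_dict(NCML)
print(result["variables"]["lat"])
```

Variables declared without `<values>` simply have no `data` key here, which is one reason a dictionary derived from NcML alone can fall short of what CF-JSON expects.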

## Open an NcML document as an ``xarray.Dataset``

``xncml`` can parse NcML instructions to create an ``xarray.Dataset``. Note that not all NcML instructions are implemented.

In [13]:
xncml.open_ncml(p / "exercise1.ncml")

In [14]:
%load_ext watermark
%watermark --iversion -g -m -v -u -d


Last updated: 2023-02-08

Python implementation: CPython
Python version       : 3.10.4
IPython version      : 8.4.0

Compiler    : GCC 10.3.0
OS          : Linux
Release     : 5.4.0-137-generic
Machine     : x86_64
Processor   : x86_64
CPU cores   : 8
Architecture: 64bit

Git hash: 

xncml: 0.1.dev49+gc713ea3.d20230131



