ncompare



Compare the structure of two NetCDF files at the command line. ncompare generates a view of the matching and non-matching groups and variables between two NetCDF datasets.
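To make "matching and non-matching" concrete, the sketch below splits two hand-written lists of variable paths into shared and one-sided names using plain Python sets. It is only a conceptual illustration and not how ncompare is implemented; the function name `compare_names` and the example paths are hypothetical.

```python
def compare_names(names_a, names_b):
    """Split two collections of names into shared and one-sided groups."""
    set_a, set_b = set(names_a), set(names_b)
    return {
        "both": sorted(set_a & set_b),      # present in both files
        "only_a": sorted(set_a - set_b),    # present only in the first file
        "only_b": sorted(set_b - set_a),    # present only in the second file
    }


# Hypothetical variable paths (group/variable) standing in for two NetCDF files.
file_a_vars = ["/geolocation/latitude", "/geolocation/longitude", "/product/ozone"]
file_b_vars = ["/geolocation/latitude", "/product/ozone", "/product/quality_flag"]

report = compare_names(file_a_vars, file_b_vars)
for category, names in report.items():
    print(f"{category}: {names}")
```

ncompare itself reads the names from the files and also reports types, shapes, and attributes, which this sketch leaves out.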

Installing

The latest release of ncompare can be installed with mamba, conda or pip.

Using mamba

mamba install -c conda-forge ncompare

Using conda

conda install -c conda-forge ncompare

Using pip

pip install ncompare

Usage

To compare two NetCDF files, pass the path of each file directly to ncompare:

ncompare <netcdf file #1> <netcdf file #2>

A common invocation adds the --file-text argument to also write the report to a text file:

ncompare S001G01.nc S001G01_SUBSET.nc --file-text subset_comparison.txt

A more complete usage demonstration with example output is shown in this example notebook.
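When ncompare needs to be invoked from a Python script rather than typed at a shell, the same command can be assembled as an argument list and handed to `subprocess.run`. This is a minimal sketch, assuming ncompare is installed and available on the PATH; the helper names `build_ncompare_command` and `run_ncompare` are hypothetical and not part of ncompare.

```python
import subprocess


def build_ncompare_command(file_a, file_b, text_report=None):
    """Return the ncompare command line as a list of arguments."""
    command = ["ncompare", str(file_a), str(file_b)]
    if text_report is not None:
        # --file-text names the text file that receives the report.
        command += ["--file-text", str(text_report)]
    return command


def run_ncompare(file_a, file_b, text_report=None):
    """Run ncompare; raises CalledProcessError on a nonzero exit status."""
    command = build_ncompare_command(file_a, file_b, text_report)
    return subprocess.run(command, check=True)
```

Calling `run_ncompare("S001G01.nc", "S001G01_SUBSET.nc", "subset_comparison.txt")` would issue the same command as the --file-text example in this section.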

Options

  • -h, --help : Show this help message and exit.
  • --file-text [FILE_PATH]: Text file to write output to.
  • --file-csv [FILE_PATH]: Comma-separated values (CSV) file to write output to.
  • --file-xlsx [FILE_PATH]: Excel file to write output to.
  • --only-diffs : Only display variables and attributes that are different.
  • --no-color : Turn off all colorized output.
  • --show-attributes : Include variable attributes in the table that compares variables.
  • --show-chunks : Include chunk sizes in the table that compares variables.
  • -v (--comparison_var_name) [VAR_NAME]: Compare specific values for this variable.
  • -g (--comparison_var_group) [VAR_GROUP]: Group that contains the comparison_var_name.
  • --column-widths [WIDTH, WIDTH, WIDTH]: Width, in number of characters, of the three columns in the comparison report.
  • --version : Show the current version and then exit.
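
Several of these options can be combined in one call. The sketch below maps keyword arguments onto a subset of the flags listed above and prints the resulting command line; it only assembles the command and does not run ncompare. The function `ncompare_args`, its keyword names, and the two lookup tables are hypothetical conveniences, not part of ncompare.

```python
import shlex

# Boolean switches and value-taking options, copied from the list above.
SWITCHES = {
    "only_diffs": "--only-diffs",
    "no_color": "--no-color",
    "show_attributes": "--show-attributes",
    "show_chunks": "--show-chunks",
}
VALUE_OPTIONS = {
    "file_text": "--file-text",
    "file_csv": "--file-csv",
    "file_xlsx": "--file-xlsx",
    "var_name": "-v",
    "var_group": "-g",
}


def ncompare_args(file_a, file_b, **options):
    """Translate keyword options into an ncompare argument list."""
    args = ["ncompare", str(file_a), str(file_b)]
    for key, value in options.items():
        if key in SWITCHES:
            if value:  # add the switch only when it is requested
                args.append(SWITCHES[key])
        elif key in VALUE_OPTIONS:
            args += [VALUE_OPTIONS[key], str(value)]
        else:
            raise ValueError(f"unknown option: {key}")
    return args


args = ncompare_args(
    "S001G01.nc",
    "S001G01_SUBSET.nc",
    only_diffs=True,
    show_attributes=True,
    file_csv="subset_comparison.csv",
)
print(shlex.join(args))
```

The --column-widths option is left out because the list above does not spell out how its three widths are passed on the command line.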

Contributing

Contributions are welcome! For more information, see CONTRIBUTING.md. ncompare is licensed under the Apache License 2.0, which is included in the LICENSE file.

Developing

Development within this repository should occur on a feature branch. Pull Requests (PRs) should target the develop branch, where they are reviewed and merged.

Installing locally

For local development, one can clone the repository and then use poetry or pip from the local directory:

git clone https://github.com/nasa/ncompare.git

(Option A) using poetry:

i) Follow the instructions for installing poetry here.

ii) Run poetry install from the repository directory.

(Option B) using pip:

i) Run pip install . from the repository directory.

Testing locally

If installed using a poetry environment, the tests can be run with:

poetry run pytest tests

Or from another virtual environment, one can use:

pytest tests

To run ncompare from the local poetry environment:

poetry run ncompare <netcdf file #1> <netcdf file #2>

Why ncompare?

The cdo (Climate Data Operators) tool does not support NetCDF4 groups. Moreover, the ncdiff function of the nco operators computes value differences but, as far as the developers of this tool are aware, nco does not have a simple function to show structural differences between NetCDF4 datasets. Note that h5diff, provided in the HDF5 software, can also be used to find differences. In comparison to h5diff, ncompare is written and runnable in Python; provides an aligned and colorized difference report for quicker assessment of groups, variable names, types, shapes, and attributes; and can generate report files formatted for other applications. However, h5diff compares some otherwise "hidden" HDF5 properties, such as _Netcdf4Dimid or _Netcdf4Coordinates, which are not currently assessed by ncompare.

Known limitations

  • ncompare uses xarray to access the root-level dimensions. In some cases, xarray will miss dimensions whose names do not also exist as variable names in the dataset (also known as non-coordinate dimensions).
  • Some underlying HDF5 properties, such as _Netcdf4Dimid or _Netcdf4Coordinates, are not currently assessed by ncompare.

Notices:

Copyright 2023 United States Government as represented by the Administrator of the National Aeronautics and Space Administration. All Rights Reserved.

This software calls the following third-party software, which is subject to the terms and conditions of its licensor, as applicable at the time of licensing. The third-party software is not bundled with this software but may be available from the licensor.

License hyperlinks are provided here for information purposes only.

Title                    License                                                 Link
colorama                 BSD-3-Clause                                            https://opensource.org/licenses/BSD-3-Clause
netCDF4                  MIT License                                             https://opensource.org/licenses/MIT
numpy                    BSD-3-Clause                                            https://opensource.org/licenses/BSD-3-Clause
openpyxl                 MIT License                                             https://opensource.org/licenses/MIT
xarray                   Apache License, version 2.0                             https://www.apache.org/licenses/LICENSE-2.0
Python Standard Library  Python Software Foundation (PSF) License Agreement      https://docs.python.org/3/license.html#psf-license

Disclaimers

The ncompare: NetCDF structural comparison tool framework is licensed under the Apache License, Version 2.0 (the "License"); you may not use this application except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.


This package is NASA Software Release Authorization (SRA) # LAR-20274-1