# Brief demonstration of `ncompare`: comparing the structure, groups, variables, and attributes of two netCDF files

Installation instructions for `ncompare` can be found in either of these locations:

- [GitHub repository](https://github.com/nasa/ncompare)
- [PyPI entry](https://pypi.org/project/ncompare/)

## `ncompare`'s command line arguments, provided by the `--help` description

***✍️ Syntax Note:*** Commands preceded by an exclamation point ("!"), 
which is needed to [run shell commands in a Jupyter notebook](https://stackoverflow.com/a/48529220), can also be run from a terminal.  
In a shell/terminal, omit the exclamation point.

In [1]:
! ncompare --help

usage: ncompare [-h] [-v COMPARISON_VAR_NAME] [-g COMPARISON_VAR_GROUP]
                [--only-diffs] [--file-text FILE_TEXT] [--file-csv FILE_CSV]
                [--file-xlsx FILE_XLSX] [--no-color] [--show-attributes]
                [--show-chunks]
                [--column-widths COLUMN_WIDTHS COLUMN_WIDTHS COLUMN_WIDTHS]
                [--version]
                nc_a nc_b

Compare the variables contained within two different NetCDF datasets

positional arguments:
  nc_a                  First NetCDF file
  nc_b                  First NetCDF file

options:
  -h, --help            show this help message and exit
  -v COMPARISON_VAR_NAME, --comparison_var_name COMPARISON_VAR_NAME
                        Comparison variable name
  -g COMPARISON_VAR_GROUP, --comparison_var_group COMPARISON_VAR_GROUP
                        Comparison variable group
  --only-diffs          Only display variables and attributes that are
                        different
  --file-text FILE_TEXT
      

## Example 1: Two netCDF files with the same groups, variables, and attributes
----

First, the data files are defined. The examples here rely on two versions of data from NASA's Integrated Multi-satellitE Retrievals for GPM (IMERG), where GPM stands for Global Precipitation Measurement. The data provide high-resolution estimates of global surface precipitation rates.

- The first, earlier version is the Late Precipitation L3 1 day 0.1 degree x 0.1 degree V06 (GPM_3IMERGDL; Collection ID = [C1598621098-GES_DISC](https://cmr.earthdata.nasa.gov/search/concepts/C1598621098-GES_DISC.html)).
- The second, later version is the Final Precipitation L3 1 day 0.1 degree x 0.1 degree V07 (GPM_3IMERGDF; Collection ID = [C2723754864-GES_DISC](https://cmr.earthdata.nasa.gov/search/concepts/C2723754864-GES_DISC.html)).

The three data files used below can be downloaded from these URLs:
- https://gpm1.gesdisc.eosdis.nasa.gov/data/GPM_L3/GPM_3IMERGDL.06/2024/01/3B-DAY-L.MS.MRG.3IMERG.20240109-S000000-E235959.V06.nc4
- https://gpm1.gesdisc.eosdis.nasa.gov/data/GPM_L3/GPM_3IMERGDL.06/2024/01/3B-DAY-L.MS.MRG.3IMERG.20240110-S000000-E235959.V06.nc4
- https://data.gesdisc.earthdata.nasa.gov/data/GPM_L3/GPM_3IMERGDF.07/2022/12/3B-DAY.MS.MRG.3IMERG.20221231-S000000-E235959.V07B.nc4
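The local file names used in this notebook are the last path component of each URL. The following is a minimal sketch, assuming the files are saved to the current working directory under their original names. The helper name `local_name` is hypothetical, and the download step itself (which typically requires NASA Earthdata Login credentials) is not shown.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse

urls = [
    "https://gpm1.gesdisc.eosdis.nasa.gov/data/GPM_L3/GPM_3IMERGDL.06/2024/01/"
    "3B-DAY-L.MS.MRG.3IMERG.20240109-S000000-E235959.V06.nc4",
    "https://gpm1.gesdisc.eosdis.nasa.gov/data/GPM_L3/GPM_3IMERGDL.06/2024/01/"
    "3B-DAY-L.MS.MRG.3IMERG.20240110-S000000-E235959.V06.nc4",
    "https://data.gesdisc.earthdata.nasa.gov/data/GPM_L3/GPM_3IMERGDF.07/2022/12/"
    "3B-DAY.MS.MRG.3IMERG.20221231-S000000-E235959.V07B.nc4",
]


def local_name(url: str) -> str:
    """Return the last path component of a URL (hypothetical helper)."""
    return PurePosixPath(urlparse(url).path).name


filenames = [local_name(u) for u in urls]

# Report which files are already present in the current directory
# (assumption: files are stored next to the notebook).
for name in filenames:
    status = "found" if Path(name).exists() else "missing"
    print(f"{name}: {status}")
```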

In [2]:
filepath_1 = "3B-DAY-L.MS.MRG.3IMERG.20240109-S000000-E235959.V06.nc4"
filepath_2 = "3B-DAY-L.MS.MRG.3IMERG.20240110-S000000-E235959.V06.nc4"
filepath_3 = "3B-DAY.MS.MRG.3IMERG.20221231-S000000-E235959.V07B.nc4"

Next, we pass the first two file paths to `ncompare`. Any differences are printed in red. In this case there are no differences, so all of the variables are printed in black.
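The coloring is produced with ANSI escape sequences, which can show up as stray codes such as `[0m` when the output is saved as plain text. The `--no-color` option listed in the help text avoids them at the source. Alternatively, here is a minimal sketch for stripping them from text that has already been captured. The function name `strip_ansi` and the sample string are illustrative and not part of `ncompare`.

```python
import re

# Matches ANSI "Select Graphic Rendition" sequences such as ESC[37m or ESC[0m.
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text: str) -> str:
    """Remove ANSI color codes from captured terminal output."""
    return ANSI_SGR.sub("", text)


# Made-up sample line in the style of the output header.
sample = "\x1b[37m\x1b[0mFile A: example.nc4\x1b[0m"
clean = strip_ansi(sample)
print(clean)
```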

***✍️ Syntax Note:*** the curly brackets, "{" and "}", that follow are simply a way to [substitute Python variables into a shell command](https://stackoverflow.com/a/35497161). 
In a shell/terminal, one can just write out the full arguments, separated by spaces.
For example, the following command would be run at the terminal as `ncompare --column-widths 33 26 26 3B-DAY-L.MS.MRG.3IMERG.20240109-S000000-E235959.V06.nc4 3B-DAY-L.MS.MRG.3IMERG.20240110-S000000-E235959.V06.nc4`.

***✍️ `ncompare` Options Note:*** the `--column-widths 33 26 26` arguments are optional; they are used here to narrow the columns from their default widths so that the output fits better in the GitHub notebook renderer.
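To make the variable substitution explicit, the same command can also be assembled in plain Python. The following is a minimal sketch and not part of `ncompare` itself: `build_ncompare_command` is a hypothetical helper that uses only flags listed in the help text above, and the comparison is run only if `ncompare` is on the `PATH` and both files exist locally.

```python
import shlex
import shutil
import subprocess
from pathlib import Path


def build_ncompare_command(file_a, file_b, column_widths=None,
                           show_attributes=False, show_chunks=False):
    """Assemble an ncompare argument list (hypothetical helper)."""
    cmd = ["ncompare"]
    if show_attributes:
        cmd.append("--show-attributes")
    if show_chunks:
        cmd.append("--show-chunks")
    if column_widths is not None:
        # The help text shows that --column-widths takes exactly three values.
        cmd.append("--column-widths")
        cmd.extend(str(width) for width in column_widths)
    cmd.extend([file_a, file_b])
    return cmd


filepath_1 = "3B-DAY-L.MS.MRG.3IMERG.20240109-S000000-E235959.V06.nc4"
filepath_2 = "3B-DAY-L.MS.MRG.3IMERG.20240110-S000000-E235959.V06.nc4"

cmd = build_ncompare_command(filepath_1, filepath_2, column_widths=(33, 26, 26))
print(shlex.join(cmd))  # the equivalent terminal command

# Run the comparison only when the tool and both files are available.
if shutil.which("ncompare") and all(Path(p).exists() for p in (filepath_1, filepath_2)):
    subprocess.run(cmd, check=False)
```

Passing the arguments as a list avoids shell quoting issues if a file path ever contains spaces.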

In [3]:
! ncompare --column-widths 33 26 26 {filepath_1} {filepath_2}

File A: 3B-DAY-L.MS.MRG.3IMERG.20240109-S000000-E235959.V06.nc4
File B: 3B-DAY-L.MS.MRG.3IMERG.20240110-S000000-E235959.V06.nc4

Root-level Dimensions:
	Are all items the same? ---> True.
	[('lat', 1800), ('lon', 3600), ('nv', 2), ('time', 1)]

Root-level Groups:
	Are all items the same? ---> True.  (No items exist.)

No variable group selected for comparison. Skipping..

All variables:
                                                       File A                     File B
                     All Variables
                                 - -------------------------- --------------------------

                         GROUP #00

## Example 2: Two netCDF files with different groups, variables, and attributes
----

In [4]:
! ncompare --column-widths 33 30 30 {filepath_1} {filepath_3}

File A: 3B-DAY-L.MS.MRG.3IMERG.20240109-S000000-E235959.V06.nc4
File B: 3B-DAY.MS.MRG.3IMERG.20221231-S000000-E235959.V07B.nc4

Root-level Dimensions:
	Are all items the same? ---> True.
	[('lat', 1800), ('lon', 3600), ('nv', 2), ('time', 1)]

Root-level Groups:
	Are all items the same? ---> True.  (No items exist.)

No variable group selected for comparison. Skipping..

All variables:
                                                           File A                         File B
                     All Variables
                                 - ------------------------------ ------------------------------

#### More file details can be examined by using the `--show-attributes` and `--show-chunks` options

In [5]:
! ncompare --show-attributes --show-chunks --column-widths 33 30 30 {filepath_1} {filepath_3}

File A: 3B-DAY-L.MS.MRG.3IMERG.20240109-S000000-E235959.V06.nc4
File B: 3B-DAY.MS.MRG.3IMERG.20221231-S000000-E235959.V07B.nc4

Root-level Dimensions:
	Are all items the same? ---> True.
	[('lat', 1800), ('lon', 3600), ('nv', 2), ('time', 1)]

Root-level Groups:
	Are all items the same? ---> True.  (No items exist.)

No variable group selected for comparison. Skipping..

All variables:
                                                           File A                         File B
                     All Variables
                                 - ------------------------------ ------------------------------

END of Notebook.