yaml diff

William W. Kimball, Jr., MBA, MSIS edited this page Oct 29, 2020 · 10 revisions
  1. Introduction
  2. Self-Help Documentation

Introduction

The yaml-diff command-line tool is akin to the GNU diff tool except it compares two YAML/JSON/Compatible documents one node at a time rather than one line at a time. Hash keys may appear in a different order in each document; unless the values of those keys differ, the documents still match. Arrays and Arrays-of-Hashes can be treated in several optional ways to either consider or ignore the ordinal position of their elements or records. One of the two documents can be read from STDIN. EYAML (encrypted) values can be compared, provided the appropriate keys are available; the decrypted data is never revealed.
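
The node-at-a-time idea described above can be illustrated with a short Python sketch. This is not yaml-diff's implementation: the `diff_nodes` function, the `ABSENT` marker, and the path notation are hypothetical names chosen for illustration, and JSON is used only because Python's standard library has no YAML parser. Arrays are compared by position here, matching the documented default.

```python
import json

ABSENT = object()  # marks a node that exists in only one document


def diff_nodes(lhs, rhs, path=""):
    """Yield (path, lhs_value, rhs_value) for every node that differs."""
    if isinstance(lhs, dict) and isinstance(rhs, dict):
        # Hash keys are matched by name, so their order is irrelevant.
        for key in sorted(set(lhs) | set(rhs)):
            child = f"{path}.{key}" if path else key
            if key not in lhs:
                yield child, ABSENT, rhs[key]
            elif key not in rhs:
                yield child, lhs[key], ABSENT
            else:
                yield from diff_nodes(lhs[key], rhs[key], child)
    elif isinstance(lhs, list) and isinstance(rhs, list):
        # Array elements are matched by ordinal position in this sketch.
        for index in range(max(len(lhs), len(rhs))):
            child = f"{path}[{index}]"
            if index >= len(lhs):
                yield child, ABSENT, rhs[index]
            elif index >= len(rhs):
                yield child, lhs[index], ABSENT
            else:
                yield from diff_nodes(lhs[index], rhs[index], child)
    elif lhs != rhs:
        # Scalars, or nodes of mismatched types, differ as a whole.
        yield path, lhs, rhs


left = json.loads('{"name": "web", "ports": [80, 443], "tls": true}')
right = json.loads('{"tls": true, "ports": [80, 8443], "name": "web"}')
differences = list(diff_nodes(left, right))
print(differences)  # [('ports[1]', 443, 8443)]
```

The reordered keys produce no output; only the changed array element is reported.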

Like GNU diff, this command-line tool exits with a 0 state when the compared documents are functionally identical (contain precisely the same data, disregarding immaterial differences). When there are any differences -- or there is any issue with your command-line arguments -- it instead exits with a 1.
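
A script can act on that exit state directly. The following is a minimal sketch, assuming yaml-diff is installed and on the PATH; the `documents_match` helper and its `command` parameter are hypothetical names introduced here, not part of the tool.

```python
import subprocess


def documents_match(lhs, rhs, command=("yaml-diff",)):
    """Return True when the tool exits 0 and False for any other exit state.

    An exit state of 1 covers both real differences and bad command-line
    arguments, so False alone does not say which of the two occurred.
    """
    result = subprocess.run([*command, lhs, rhs], capture_output=True, text=True)
    return result.returncode == 0
```

A call such as `documents_match("old.yaml", "new.yaml")` (hypothetical file names) then returns a plain boolean. The `command` parameter exists only so a different executable can be substituted, for example during testing.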

When you need to know whether two documents have immaterial differences in comments, white-space, or value demarcation (like " versus '), use GNU diff or any of its clones. For all practical purposes however, you really only need to use yaml-diff. This is because to YAML and JSON parsers -- which are the ultimate consumers of your data files -- these differences don't matter unless they corrupt the nodes or the data they contain. The yaml-diff command-line tool will pick up issues caused by improper white-space, demarcation, or even interfering comments. When such differences don't harm any of the nodes or their data and the functional result is thus identical between the two documents, they will not be reported as differences.
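
The contrast between a line-oriented comparison and a data-oriented one can be sketched in Python. This is an illustration only: it uses `difflib` as a stand-in for GNU diff and JSON as a stand-in for YAML (Python's standard library has no YAML parser), so it shows white-space and key-order differences but not comments or quoting styles.

```python
import difflib
import json

# Same data, different layout and key order.
left_text = '{"name": "web", "port": 80}\n'
right_text = '{\n  "port": 80,\n  "name": "web"\n}\n'

# A line-by-line comparison sees every line as changed.
line_changes = list(
    difflib.unified_diff(
        left_text.splitlines(), right_text.splitlines(), lineterm=""
    )
)

# A parser, the ultimate consumer of the file, sees identical data.
same_data = json.loads(left_text) == json.loads(right_text)

print(len(line_changes) > 0, same_data)  # True True
```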

This page explores the various command-line arguments understood by yaml-diff. For real-world examples of using it, please check yaml-diff Examples. Advanced options are discussed in detail on the pages named alongside each option in the deeper dive at the end of this page.

Self-Help Documentation

When the --help (-h) flag is passed into yaml-diff, it produces this output:

usage: yaml-diff [-h] [-V] [-c CONFIG] [-A {position,value}]
                 [-O {deep,dpos,key,position,value}] [-s | -o]
                 [-t ['.', '/', 'auto', 'dot', 'fslash']] [-x EYAML]
                 [-r PRIVATEKEY] [-u PUBLICKEY] [-E] [-d | -v | -q]
                 YAML_FILE YAML_FILE

Compare YAML/JSON/Compatible documents node by node.  EYAML can be employed to
compare encrypted values.

positional arguments:
  YAML_FILE             exactly two YAML/JSON/compatible files to compare; use
                        - to read one document from STDIN

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit
  -c CONFIG, --config CONFIG
                        INI style configuration file for YAML Path specified
                        comparison control options
  -A {position,value}, --arrays {position,value}
                        default means by which Arrays are compared (overrides
                        [defaults]arrays but is overridden on a YAML Path
                        basis via --config|-c); default=position
  -O {deep,dpos,key,position,value}, --aoh {deep,dpos,key,position,value}
                        default means by which Arrays-of-Hashes are compared
                        (overrides [defaults]aoh but is overridden on a YAML
                        Path basis in [rules] set via --config|-c);
                        default=position
  -s, --same            Show all nodes which are the same in addition to
                        differences
  -o, --onlysame        Show only nodes which are the same, still reporting
                        that differences exist -- when they do -- with an
                        exit-state of 1
  -t ['.', '/', 'auto', 'dot', 'fslash'], --pathsep ['.', '/', 'auto', 'dot', 'fslash']
                        indicate which YAML Path separator to use when
                        rendering results; default=dot
  -d, --debug           output debugging details
  -v, --verbose         increase output verbosity
  -q, --quiet           suppress all output except system errors

EYAML options:
  Left unset, the EYAML keys will default to your system or user defaults.
  Both keys must be set either here or in your system or user EYAML
  configuration file when using EYAML.

  -x EYAML, --eyaml EYAML
                        the eyaml binary to use when it isn't on the PATH
  -r PRIVATEKEY, --privatekey PRIVATEKEY
                        EYAML private key
  -u PUBLICKEY, --publickey PUBLICKEY
                        EYAML public key
  -E, --ignore-eyaml-values
                        Do not use EYAML to compare encrypted data; rather,
                        treat ENC[...] values as regular strings

configuration file:
  The CONFIG file is an INI file with up to three sections:
  [defaults] Sets equivalents of --arrays|-A and --aoh|-O.
  [rules]    Each entry is a YAML Path assigning --arrays|-A or --aoh|-O for
             precise nodes.
  [keys]     Wherever --aoh=key (or -O key) or --aoh=deep (or -O deep), each
             entry is treated as a record with an identity key.  In order to
             match RHS records to LHS records, a key must be known and is
             identified on a YAML Path basis via this section.  Where not
             specified, the first attribute of the first record in the
             Array-of-Hashes is presumed the identity key for all records in
             the set.

input files:
  Only one input file may be the - pseudo-file (read from STDIN).  Because the
  relative position of the two input files is important, this will not be
  inferred; you must use - to indicate which document is read from STDIN.

  It doesn't make any sense to compare multi-document files, so only single-
  document files are supported.

For more information about YAML Paths, please visit
https://github.com/wwkimball/yamlpath/wiki.

To report issues with this tool or to request enhancements, please visit
https://github.com/wwkimball/yamlpath/issues.
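
The three-section configuration file described in the help text above can be sketched as follows. This is an assumption-laden illustration, not a verified sample: the `path = value` entry syntax, the `servers` YAML Path, and the `hostname` key are hypothetical, and the authoritative syntax is on the yaml-diff Configuration File page. The `identity_key` helper is likewise a hypothetical name that mirrors the documented fallback to the first attribute of the first record.

```python
import configparser

# Hypothetical CONFIG content; entry names and values are illustrative only.
CONFIG_TEXT = """
[defaults]
arrays = value
aoh = position

[rules]
servers = deep

[keys]
servers = hostname
"""

parser = configparser.ConfigParser()
# Keep option names case-sensitive; configparser lowercases them by default,
# which would not suit case-sensitive YAML keys.
parser.optionxform = str
parser.read_string(CONFIG_TEXT)


def identity_key(records, configured=None):
    """Return the configured identity key, else the first attribute of the
    first record, as the help text describes for unspecified keys."""
    if configured is not None:
        return configured
    return next(iter(records[0]))


print(parser["defaults"]["arrays"])  # value
print(dict(parser["rules"]))         # {'servers': 'deep'}
print(identity_key([{"hostname": "a", "ip": "10.0.0.1"}]))  # hostname
```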

For a deeper dive into these options:

  • There are two mandatory YAML_FILE positional arguments. Each is a YAML/JSON/EYAML/Compatible file to compare, with the first being the basis. At most one of them may be the - pseudo-file, causing the document in that position to be read from STDIN.
  • --config (-c) is discussed at yaml-diff Configuration File.
  • --arrays (-A) is discussed at yaml-diff Array Options.
  • --aoh (-O) is discussed at yaml-diff Array-of-Hashes Options.
  • --same (-s) causes same values to be reported along with all differences.
  • --onlysame (-o) causes only identical values to be reported, discarding all differences. Note that -- despite differences not being reported to STDOUT -- this command-line tool will still report 1 as its exit-state when differences exist (rather than 0).
  • --pathsep (-t) controls which separator is used when reporting the YAML Paths for each difference. It is a dot (.) by default and can be changed to a forward-slash (/).
  • --eyaml (-x) specifies the fully-qualified path to the external eyaml command. This is useful whenever you need to employ a custom version of eyaml or the command is not on the system PATH.
  • --privatekey (-r) specifies the EYAML private key to use with the external eyaml command when querying encrypted data. This value is necessary only when your user or system EYAML configuration does not already supply this key or you need to override it.
  • --publickey (-u) specifies the EYAML public key to use with the external eyaml command. This value is necessary only when your user or system EYAML configuration does not already supply this key or you need to override it and your version of the eyaml command requires it to decrypt data.
  • --ignore-eyaml-values (-E) disables EYAML value processing. This is useful only when your document(s) contain EYAML values and you specifically need to compare the "ENC[...]" strings as-is, without decrypting them. Because the decrypted values are then never compared, two different encryptions of exactly the same value will register as a difference.
  • --debug (-d) generates a vast amount of detailed information as the documents are evaluated. It is particularly helpful when troubleshooting other options.
  • --verbose (-v) generates slightly more status messages as the documents are compared.
  • --quiet (-q) suppresses normal status and processing messages, including the usual report.
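
Several of the options above can be combined when calling yaml-diff from a script. The sketch below only assembles an argument list; the `build_command` helper is a hypothetical name, while the flags and their permitted values are copied from the help output on this page.

```python
ARRAY_MODES = ("position", "value")
AOH_MODES = ("deep", "dpos", "key", "position", "value")
PATH_SEPARATORS = (".", "/", "auto", "dot", "fslash")


def build_command(lhs, rhs, *, arrays=None, aoh=None, pathsep=None, config=None):
    """Assemble a yaml-diff argument list, rejecting values the help text
    does not list."""
    if lhs == "-" and rhs == "-":
        raise ValueError("only one input file may be read from STDIN")
    command = ["yaml-diff"]
    for flag, value, allowed in (
        ("--arrays", arrays, ARRAY_MODES),
        ("--aoh", aoh, AOH_MODES),
        ("--pathsep", pathsep, PATH_SEPARATORS),
    ):
        if value is None:
            continue  # leave the tool's own default in effect
        if value not in allowed:
            raise ValueError(f"{flag} must be one of {allowed}, not {value!r}")
        command += [flag, value]
    if config is not None:
        command += ["--config", config]
    # The first file is the basis; "-" marks the document read from STDIN.
    return command + [lhs, rhs]


print(build_command("-", "new.yaml", aoh="key", pathsep="/"))
# ['yaml-diff', '--aoh', 'key', '--pathsep', '/', '-', 'new.yaml']
```

The resulting list can be handed to `subprocess.run`, passing the STDIN document through its `input` argument when one of the files is `-`.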