yaml diff Array Options
This document is part of the body of knowledge about yaml-diff, one of the reference command-line tools provided by the YAML Path project.
The yaml-diff command-line tool enables users to control how Arrays (AKA Lists or Sequences) are compared. This is different from merging Arrays-of-Hashes, discussed elsewhere. By default, the elements from both documents are compared based on their ordinal position in each Array. While this is ideal for many use-cases, it is not so for every use-case. As such, yaml-diff offers some options for how it compares Array elements. These options include:
- position (the default) tests the equality of each element in the document pair by its ordinal position. Differences are reported as changes. When the left-hand document (LHS) has more elements than the right-hand document (RHS), the additional LHS elements are reported as deletions. When the LHS has fewer elements than the RHS, the additional RHS elements are reported as additions.
- value synchronizes the two Arrays by the values of their elements and then compares the result. This is especially helpful when you are more interested in the elements unique to either Array, regardless of their relative ordinal positions. Changes are possible in this mode but only when two elements at exactly the same position are both different and otherwise unmatched across both Arrays. Otherwise, only additions and deletions are possible because all other elements will have been matched up.
Each of these scenarios will be explored through comparisons of different arrangements of Array elements.
File: LHS1.yaml
---
same_elements:
- alpha
- bravo
one_change:
- alpha
- bravo
one_addition:
- alpha
one_deletion:
- alpha
- bravo
File: RHS1.yaml
same_elements:
- alpha
- bravo
one_change:
- alpha
- charlie
one_addition:
- alpha
- bravo
one_deletion:
- alpha
By default, or when using position against each of these Arrays, the difference becomes:
c one_change[1]
< bravo
---
> charlie
a one_addition[1]
> bravo
d one_deletion[1]
< bravo
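The position rules described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not yaml-diff's actual implementation; the function name diff_by_position and the tuple layout (action, index, LHS value, RHS value) are assumptions made for this example.

```python
def diff_by_position(lhs, rhs):
    """Compare two lists element-by-element at the same index.

    Returns tuples of (action, index, lhs_value, rhs_value), where the
    action is "c" (change), "a" (addition), or "d" (deletion).
    """
    report = []
    for i in range(max(len(lhs), len(rhs))):
        if i >= len(rhs):
            # LHS is longer: the extra LHS elements are deletions.
            report.append(("d", i, lhs[i], None))
        elif i >= len(lhs):
            # RHS is longer: the extra RHS elements are additions.
            report.append(("a", i, None, rhs[i]))
        elif lhs[i] != rhs[i]:
            # Same position, different values: a change.
            report.append(("c", i, lhs[i], rhs[i]))
    return report
```

Applied to the one_change, one_addition, and one_deletion Arrays from LHS1.yaml and RHS1.yaml, this sketch yields one change, one addition, and one deletion, each at index 1, which matches the report shown above.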
File: LHS2.yaml
---
rearranged_array:
- alpha
- bravo
- charlie
with_duplicates:
- alpha
- bravo
- alpha
with_additions:
- alpha
- bravo
with_deletions:
- alpha
- bravo
- charlie
with_change:
- alpha
- bravo
- delta
File: RHS2.yaml
---
rearranged_array:
- charlie
- alpha
- bravo
with_duplicates:
- bravo
- alpha
- alpha
with_additions:
- bravo
- charlie
- alpha
- delta
with_deletions:
- bravo
with_change:
- alpha
- charlie
- delta
When using the value option against these arrays, the differences are revealed as:
a with_additions[1]
> charlie
a with_additions[3]
> delta
d with_deletions[0]
< alpha
d with_deletions[2]
< charlie
c with_change[1]
< bravo
---
> charlie
For contrast, a position comparison would produce a very different report:
c rearranged_array[0]
< alpha
---
> charlie
c rearranged_array[1]
< bravo
---
> alpha
c rearranged_array[2]
< charlie
---
> bravo
c with_duplicates[0]
< alpha
---
> bravo
c with_duplicates[1]
< bravo
---
> alpha
c with_additions[0]
< alpha
---
> bravo
c with_additions[1]
< bravo
---
> charlie
a with_additions[2]
> alpha
a with_additions[3]
> delta
c with_deletions[0]
< alpha
---
> bravo
d with_deletions[1]
< bravo
d with_deletions[2]
< charlie
c with_change[1]
< bravo
---
> charlie
As you can see, the report from a comparison by value is far shorter than the report by position for these documents. Once the elements are synchronized, there are far fewer differences to report. When it is more informative to compare Arrays by the distinctiveness of their elements rather than by their order, use the value option.
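One way to picture the value mode is as a two-step process: pair up equal elements regardless of position, then classify whatever is left over. The Python sketch below follows that reading of the description. It is an assumption-laden illustration rather than yaml-diff's real algorithm: the name diff_by_value is hypothetical, elements are assumed to be hashable scalars, and duplicates are matched one-for-one.

```python
from collections import Counter


def diff_by_value(lhs, rhs):
    """Match equal elements across both lists, then report the leftovers.

    Returns tuples of (action, index, lhs_value, rhs_value).
    """
    # How many copies of each value the opposite side can still absorb.
    rhs_pool = Counter(rhs)
    lhs_pool = Counter(lhs)

    # LHS elements with no remaining partner in RHS, keyed by LHS index.
    unmatched_lhs = {}
    for i, value in enumerate(lhs):
        if rhs_pool[value] > 0:
            rhs_pool[value] -= 1
        else:
            unmatched_lhs[i] = value

    # RHS elements with no remaining partner in LHS, keyed by RHS index.
    unmatched_rhs = {}
    for i, value in enumerate(rhs):
        if lhs_pool[value] > 0:
            lhs_pool[value] -= 1
        else:
            unmatched_rhs[i] = value

    report = []
    for i in sorted(set(unmatched_lhs) | set(unmatched_rhs)):
        if i in unmatched_lhs and i in unmatched_rhs:
            # Both sides are unmatched at the same position: a change.
            report.append(("c", i, unmatched_lhs[i], unmatched_rhs[i]))
        elif i in unmatched_lhs:
            report.append(("d", i, unmatched_lhs[i], None))
        else:
            report.append(("a", i, None, unmatched_rhs[i]))
    return report
```

Run against the five Arrays in LHS2.yaml and RHS2.yaml, this sketch reports nothing for rearranged_array and with_duplicates, two additions for with_additions, two deletions for with_deletions, and a single change for with_change, in line with the value report above.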
The yaml-diff tool can read per-YAML Path comparison options from an INI-style configuration file via its --config (-c) argument. Whereas the --arrays (-A) argument supplies an overarching mode for comparing Arrays, a configuration file permits far more precise control whenever you need a different mode for specific parts of the comparison documents.
The [defaults] section permits a key named arrays, which behaves identically to the --arrays (-A) command-line argument to the yaml-diff tool. The [defaults] arrays setting is overridden by the same-named command-line argument, when supplied. In practice, this file may look like:
File: diff-options.ini
[defaults]
arrays = position
Note the spaces around the = sign are optional but only an = sign may be used to separate each key from its value.
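The precedence described for the [defaults] section can be modeled with Python's standard configparser module. This is a minimal sketch under stated assumptions, not how yaml-diff itself loads its configuration: the function name default_array_mode and the cli_arrays parameter (standing in for a value passed via --arrays) are hypothetical. The parser is restricted to the = delimiter to mirror the note above.

```python
import configparser

# The internal default mode named in this document.
INTERNAL_DEFAULT = "position"


def default_array_mode(ini_text, cli_arrays=None):
    """Resolve the overall Array comparison mode.

    Precedence modeled here: command-line value, then the
    [defaults] arrays setting, then the internal default.
    """
    # Accept only "=" between key and value, not ":".
    parser = configparser.ConfigParser(delimiters=("=",))
    parser.read_string(ini_text)

    if cli_arrays is not None:
        # A supplied command-line argument overrides the file.
        return cli_arrays
    return parser.get("defaults", "arrays", fallback=INTERNAL_DEFAULT)
```

For example, default_array_mode("[defaults]\narrays = value\n") returns "value", while passing cli_arrays="position" alongside the same text returns "position".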
The [rules] section takes any YAML Paths as keys and, as their values, any of the Array comparison modes that are available to the --arrays (-A) command-line argument. This enables extremely fine precision for applying the available modes.
Using the same LHS2.yaml and RHS2.yaml documents as the prior examples, adding a configuration file with these contents:
[defaults]
arrays = value
[rules]
rearranged_array = position
with_change = position
... changes the difference report to:
c rearranged_array[0]
< alpha
---
> charlie
c rearranged_array[1]
< bravo
---
> alpha
c rearranged_array[2]
< charlie
---
> bravo
a with_additions[1]
> charlie
a with_additions[3]
> delta
d with_deletions[0]
< alpha
d with_deletions[2]
< charlie
c with_change[1]
< bravo
---
> charlie
Notice the following:
- The default comparison mode for all Arrays was set to value, which differs from the internal default mode, position.
- The Arrays at "rearranged_array" ("/rearranged_array") and "with_change" ("/with_change") were compared using the position mode.
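The way a [rules] entry takes priority over the [defaults] setting for one Array can be sketched as a simple lookup. The code below is a deliberately simplified illustration: the function name array_mode_for is hypothetical, and it matches rule keys by exact string comparison, whereas the real tool treats the keys as YAML Paths (so "/rearranged_array" and "rearranged_array" would not be considered equal here).

```python
import configparser


def array_mode_for(path, ini_text, internal_default="position"):
    """Pick the comparison mode for the Array at the given path.

    Order modeled here: a matching [rules] entry, then the
    [defaults] arrays setting, then the internal default.
    """
    parser = configparser.ConfigParser(delimiters=("=",))
    # Preserve key case; configparser lowercases keys by default.
    parser.optionxform = str
    parser.read_string(ini_text)

    if parser.has_option("rules", path):
        return parser.get("rules", path)
    return parser.get("defaults", "arrays", fallback=internal_default)


EXAMPLE_CONFIG = """
[defaults]
arrays = value
[rules]
rearranged_array = position
with_change = position
"""
```

With EXAMPLE_CONFIG, array_mode_for("rearranged_array", EXAMPLE_CONFIG) gives "position" while array_mode_for("with_additions", EXAMPLE_CONFIG) falls back to "value", which is consistent with the mixed report shown above.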