
yaml merge

William W. Kimball, Jr., MBA, MSIS edited this page Jul 26, 2021 · 19 revisions
  1. Introduction
  2. Self-Help Documentation

Introduction

The yaml-merge command-line tool supports merging multiple YAML/JSON/Compatible files together. It also enables writing arbitrary simple or complex data to a YAML/JSON/Compatible document at any YAML Path, accepting such input from STDIN (the - pseudo-file), other compatible files, or both. Any one input file may be the - pseudo-file (explicit or implied). Any of the files may contain multiple documents, properly demarcated. Content is merged from the files, one at a time in sequence from top-to-bottom, left-to-right, into the first document's content. This is known as "RHS into LHS" merging (Right-Hand-Side content is merged into the Left-Hand-Side document). No files are changed by this process unless you specifically instruct the tool to overwrite a pre-existing file; merging is performed entirely in memory.
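The left-to-right, "RHS into LHS" sequence described above can be pictured as a fold over the input documents. The following is only a conceptual sketch in Python, not yaml-merge's actual code: the function names are hypothetical, documents are modeled as plain dicts and lists, Hashes are merged deeply, Arrays are concatenated, and a Scalar on the right replaces a Scalar on the left.

```python
import copy
from functools import reduce


def merge_rhs_into_lhs(lhs, rhs):
    """Merge rhs into lhs and return the result (conceptual model only)."""
    if isinstance(lhs, dict) and isinstance(rhs, dict):
        for key, value in rhs.items():
            if key in lhs:
                lhs[key] = merge_rhs_into_lhs(lhs[key], value)
            else:
                lhs[key] = copy.deepcopy(value)
        return lhs
    if isinstance(lhs, list) and isinstance(rhs, list):
        return lhs + copy.deepcopy(rhs)
    if isinstance(lhs, (dict, list)) or isinstance(rhs, (dict, list)):
        # Mixed container types are deliberately not modeled in this sketch.
        raise TypeError("mixed node types are not modeled here")
    return rhs  # Scalar on the right replaces Scalar on the left


def merge_all(documents):
    """Fold every later document into a copy of the first, in order."""
    first, *rest = documents
    return reduce(merge_rhs_into_lhs, rest, copy.deepcopy(first))


docs = [
    {"name": "app", "ports": [80], "db": {"host": "localhost"}},
    {"ports": [443], "db": {"port": 5432}},
    {"name": "app-v2"},
]
merged = merge_all(docs)
print(merged)
```

Because the fold starts from a copy of the first document, the inputs themselves are left untouched, which mirrors the in-memory behavior described above.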

The yaml-merge tool defaults to writing its merged results to STDOUT, suppressing all non-error messaging. To direct the results to a file instead, set the --output (-o) argument (which will not overwrite an existing file) or the --overwrite (-w) argument (which will), or redirect output using the operators specific to your operating system, like >. When using >> to append to a YAML document in order to create a multi-document YAML file, be sure to inject an appropriate document stop marker (...) first; yaml-merge will add the following document start marker (---). Whether the merged document is written to a file or STDOUT, it will have been stripped of all comments and empty lines unless --preserve-lhs-comments (-l) is set. This is necessary because there isn't a simple, sensible way to merge the arbitrary text of comments while retaining their relative position to other nodes. Future versions of ruamel.yaml, and therefore yamlpath, may be better about preserving comments during merge operations by default.
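The difference between the safe --output (-o) behavior and the replacing --overwrite (-w) behavior can be illustrated with Python's exclusive-creation file mode. This is a sketch of the idea only; the helper name write_result is hypothetical and nothing here is taken from yaml-merge's own implementation.

```python
import os
import tempfile


def write_result(text, path, overwrite=False):
    """Write text to path, refusing to replace an existing file unless asked."""
    # Mode "x" raises FileExistsError when the path already exists.
    mode = "w" if overwrite else "x"
    with open(path, mode, encoding="utf-8") as handle:
        handle.write(text)


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "merged.yaml")

    write_result("a: 1\n", target)  # like --output on a new file
    try:
        write_result("a: 2\n", target)  # like --output on an existing file
        refused = False
    except FileExistsError:
        refused = True

    write_result("a: 3\n", target, overwrite=True)  # like --overwrite
    with open(target, encoding="utf-8") as handle:
        final_text = handle.read()

print(refused, repr(final_text))
```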

Not all merges are possible. There is no logical way to merge a Scalar or Array into a Hash without specifying a key to receive the new data (say, by using --mergeat/-m). Two documents with conflicting Anchors cannot be merged unless you select one of the --anchors (-a) options other than stop (the default). When an impossible merge condition is met, the tool will stop and emit an informative error message. Other conditions are coalesced; for example, merging a Hash or Scalar into an Array merely adds the new data as a new element of the Array.
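The type rules above can be sketched as a small dispatch on the left-hand node. This is a simplified model, not the tool's code: MergeError and merge_node are hypothetical names, the Hash-into-Hash case is reduced to a shallow update, Array-into-Array is shown as simple concatenation, and the treatment of a Scalar on the left (replaced by the right-hand value) is an assumption.

```python
class MergeError(Exception):
    """Raised when two nodes cannot be merged (hypothetical name)."""


def merge_node(lhs, rhs):
    """Return the result of merging rhs into lhs under the rules above."""
    if isinstance(lhs, dict):
        if not isinstance(rhs, dict):
            # A Scalar or Array has no key under which to land in a Hash.
            raise MergeError(
                "cannot merge a Scalar or Array into a Hash without a key"
            )
        return {**lhs, **rhs}  # shallow stand-in for a real Hash merge
    if isinstance(lhs, list):
        if isinstance(rhs, list):
            return lhs + rhs
        # A Hash or Scalar simply becomes one new element of the Array.
        return lhs + [rhs]
    return rhs  # assumption: a Scalar on the left is replaced


hash_into_array = merge_node([1, 2], {"a": 1})
scalar_into_array = merge_node([1, 2], 3)
try:
    merge_node({"a": 1}, [1, 2])
    error_message = None
except MergeError as error:
    error_message = str(error)

print(hash_into_array, scalar_into_array, error_message)
```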

Users have complete control over how the merge is performed. Default merging behaviors can be adjusted via CLI arguments to yaml-merge. Where greater precision is required, an INI-style configuration file can be provided via the --config (-c) argument. These advanced merge control behaviors are explored on the dedicated pages listed at the end of this page.

This page explores the various command-line arguments understood by yaml-merge. For real-world examples of using it, please check yaml-merge Examples.
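To give a rough feel for the INI-style configuration file mentioned above, the sketch below parses an example with Python's standard configparser and resolves which merge option applies to a node. The section names come from the tool's help text; the YAML Paths (/servers/allowed_ports, /users), the chosen values, and the resolve helper are illustrative assumptions. The precedence shown ([rules], then the CLI argument, then [defaults], then the built-in default) is this sketch's reading of the help text, not a statement of how yaml-merge is implemented.

```python
import configparser

EXAMPLE_CONFIG = """
[defaults]
hashes = deep
arrays = all
aoh = deep

[rules]
/servers/allowed_ports = unique

[keys]
/users = name
"""

# Built-in defaults as listed in the help text.
BUILTIN_DEFAULTS = {"arrays": "all", "hashes": "deep", "aoh": "all"}

parser = configparser.ConfigParser()
parser.read_string(EXAMPLE_CONFIG)


def resolve(kind, yaml_path, cli_value=None):
    """Pick the merge option for a node: rule, then CLI, then defaults."""
    if parser.has_option("rules", yaml_path):
        return parser.get("rules", yaml_path)
    if cli_value is not None:
        return cli_value
    return parser.get("defaults", kind, fallback=BUILTIN_DEFAULTS[kind])


print(resolve("arrays", "/servers/allowed_ports", cli_value="left"))
print(resolve("arrays", "/elsewhere", cli_value="left"))
print(resolve("aoh", "/elsewhere"))
print(parser.get("keys", "/users"))
```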

Self-Help Documentation

When the --help (-h) flag is passed to yaml-merge, it generates the following documentation:

usage: yaml-merge [-h] [-V] [-c CONFIG] [-a {stop,left,right,rename}]
                  [-A {all,left,right,unique}] [-E {left,right,unique}]
                  [-H {deep,left,right}] [-O {all,deep,left,right,unique}]
                  [-m YAML_PATH] [-o OUTPUT | -w OVERWRITE] [-b]
                  [-D {auto,json,yaml}]
                  [-M {condense_all,merge_across,matrix_merge}] [-l] [-S]
                  [-d | -v | -q]
                  [YAML_FILE [YAML_FILE ...]]

Merges two or more single- or multi-document YAML/JSON/Compatible documents
together, including complex data provided via STDIN.

positional arguments:
  YAML_FILE             one or more YAML files to merge, order-significant;
                        omit or use - to read from STDIN

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit
  -c CONFIG, --config CONFIG
                        INI syle configuration file for YAML Path specified
                        merge control options
  -a {stop,left,right,rename}, --anchors {stop,left,right,rename}
                        means by which Anchor name conflicts are resolved
                        (overrides [defaults]anchors set via --config|-c and
                        cannot be overridden by [rules] because Anchors apply
                        to the whole file); default=stop
  -A {all,left,right,unique}, --arrays {all,left,right,unique}
                        default means by which Arrays are merged together
                        (overrides [defaults]arrays but is overridden on a
                        YAML Path basis via --config|-c); default=all
  -E {left,right,unique}, --sets {left,right,unique}
                        default means by which Sets are merged together
                        (overrides [defaults]sets but is overridden on a
                        YAML Path basis via --config|-c); default=unique
  -H {deep,left,right}, --hashes {deep,left,right}
                        default means by which Hashes are merged together
                        (overrides [defaults]hashes but is overridden on a
                        YAML Path basis in [rules] set via --config|-c);
                        default=deep
  -O {all,deep,left,right,unique}, --aoh {all,deep,left,right,unique}
                        default means by which Arrays-of-Hashes are merged
                        together (overrides [defaults]aoh but is overridden on
                        a YAML Path basis in [rules] set via --config|-c);
                        default=all
  -m YAML_PATH, --mergeat YAML_PATH
                        YAML Path indicating where in left YAML_FILE the right
                        YAML_FILE content is to be merged; default=/
  -o OUTPUT, --output OUTPUT
                        write the merged result to the indicated nonexistent
                        file
  -w OVERWRITE, --overwrite OVERWRITE
                        write the merged result to the indicated file; will
                        replace the file when it already exists
  -b, --backup          save a backup OVERWRITE file with an extra .bak
                        file-extension; applies only to OVERWRITE
  -D {auto,json,yaml}, --document-format {auto,json,yaml}
                        force the merged result to be presented in one of the
                        supported formats or let it automatically match the
                        known file-name extension of OUTPUT|OVERWRITE (when
                        provided), or match the type of the first document;
                        default=auto
  -M {condense_all,merge_across,matrix_merge}, --multi-doc-mode {condense_all,merge_across,matrix_merge}
                        control how multi-document files and streams are
                        merged together, with or without condensing them as
                        part of the merge
  -l, --preserve-lhs-comments
                        while all comments are normally dicarded during a
                        merge, this option will attempt to preserve
                        comments in the left-most YAML_FILE; may produce
                        unexpected comment-to-data associations or
                        spurious new-lines and all other document comments
                        are still discarded
  -S, --nostdin         do not implicitly read from STDIN, even when there are
                        no - pseudo-files in YAML_FILEs with a non-TTY session
  -d, --debug           output debugging details
  -v, --verbose         increase output verbosity
  -q, --quiet           suppress all output except errors (implied when
                        -o|--output is not set)

configuration file:
  The CONFIG file is an INI file with up to three sections:
  [defaults] Sets equivalents of --anchors|-a, --arrays|-A, --hashes|-H, and
             --aoh|-O.
  [rules]    Each entry is a YAML Path assigning --arrays|-A, --hashes|-H,
             or --aoh|-O for precise nodes.
  [keys]     Wherever --aoh=DEEP (or -O deep), each entry is treated as a
             record with an identity key.  In order to match RHS records to
             LHS records, a key must be known and is identified on a YAML
             Path basis via this section.  Where not specified, the first
             attribute of the first record in the Array-of-Hashes is presumed
             the identity key for all records in the set.

input files:
  The left-to-right order of YAML_FILEs is significant.  Except when this
  behavior is deliberately altered by your options, data from files on the
  right overrides data in files to their left.

  Only one input file may be the - pseudo-file (read from STDIN).  When no
  YAML_FILEs are provided, - will be inferred as long as you are running
  this program without a TTY (unless you set --nostdin|-S).

  Any file, including input from STDIN, may be a multi-document YAML, JSON,
  or compatible file.

For more information about YAML Paths, please visit
https://github.com/wwkimball/yamlpath/wiki.

To report issues with this tool or to request enhancements, please visit
https://github.com/wwkimball/yamlpath/issues.
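The identity-key matching that the help text describes for deep Array-of-Hashes merges can be sketched as follows. This is a conceptual model under stated assumptions: merge_records is a hypothetical name, records are plain dicts, matched records are updated shallowly, and unmatched RHS records are appended. When no key is given, the first attribute of the first LHS record is used, as the help text describes.

```python
def merge_records(lhs_records, rhs_records, identity_key=None):
    """Merge RHS records into LHS records, matching on an identity key."""
    if identity_key is None and lhs_records:
        # Fall back to the first attribute of the first record.
        identity_key = next(iter(lhs_records[0]))

    merged = [dict(record) for record in lhs_records]
    by_identity = {
        record[identity_key]: record
        for record in merged
        if identity_key in record
    }

    for rhs in rhs_records:
        match = by_identity.get(rhs.get(identity_key))
        if match is not None:
            match.update(rhs)  # same identity: RHS fields override
        else:
            merged.append(dict(rhs))  # new identity: add the record
    return merged


lhs_users = [
    {"name": "alice", "role": "dev"},
    {"name": "bob", "role": "ops"},
]
rhs_users = [
    {"name": "bob", "role": "admin"},
    {"name": "carol", "role": "qa"},
]
merged_users = merge_records(lhs_users, rhs_users)
print(merged_users)
```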

For a deep dive into each of these options:

  • --config (-c) is discussed at yaml-merge Configuration File.
  • --anchors (-a) is discussed at yaml-merge Anchor Options.
  • --arrays (-A) is discussed at yaml-merge Array Options.
  • --sets (-E) is discussed at yaml-merge Set Options.
  • --hashes (-H) is discussed at yaml-merge Hash Options.
  • --aoh (-O) is discussed at yaml-merge Array-of-Hash Options.
  • --mergeat (-m) enables directing all RHS content to one or more LHS destinations indicated via YAML Path. This enables users to merge data fragments or even arbitrary data structure from the RHS rather than premade, otherwise complete documents. Specifying a YAML Path which matches zero nodes will result in the missing structure being created on-the-fly, if possible.
  • --output (-o) is a safe document creation option; it will refuse to overwrite any pre-existing file. Setting this option enables seeing all status messages as the merge is performed.
  • --overwrite (-w) is an unsafe document creation option; it will overwrite the target file if it already exists. This is particularly useful when you wish to deliberately change one of the files which you are also using as an input file. Setting this option enables seeing all status messages as the merge is performed. When the target file does not already exist, it will be created.
  • --backup (-b) is useful only when also setting --overwrite (-w) and the target file already exists, causing the original file to be renamed before it is overwritten, giving it a .bak file extension.
  • --document-format (-D) enables overriding any inference of the output document type, whether it is written to STDOUT or any target file specified via --output (-o) or --overwrite (-w). This is particularly useful when you wish to convert between document formats, like YAML to JSON or JSON to YAML.
  • --nostdin (-S) blocks this tool from reading STDIN content, even when it is available.
  • --debug (-d) is effective only when --output (-o) or --overwrite (-w) is also set. This option generates a vast amount of detailed information about the documents as they are read and merged. It is particularly helpful when tracing YAML Path behavior or troubleshooting other merge options.
  • --verbose (-v) is effective only when --output (-o) or --overwrite (-w) is also set. This option may generate slightly more status messages as documents are read and merged. This can be helpful when tracing the sequence of document merges.
  • --quiet (-q) is the default messaging level when the merge result is written to STDOUT. This keeps all other messaging out of the document. It can be manually set when also setting --output (-o) or --overwrite (-w) to mute normal messages.
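
The fallback order that the help text gives for --document-format (-D) can be sketched as a small function: an explicit choice wins, then a known file-name extension of the target file, then the type of the first document. The function name, its parameters, and the table of known extensions are assumptions made for illustration; the extensions yaml-merge actually recognizes are not listed on this page.

```python
import os

# Assumed mapping; the tool's real list of known extensions may differ.
KNOWN_EXTENSIONS = {".json": "json", ".yaml": "yaml", ".yml": "yaml"}


def infer_format(requested="auto", target_path=None, first_document="yaml"):
    """Choose the output format following the order in the help text."""
    if requested != "auto":
        return requested  # explicit json or yaml always wins
    if target_path:
        extension = os.path.splitext(target_path)[1].lower()
        if extension in KNOWN_EXTENSIONS:
            return KNOWN_EXTENSIONS[extension]
    return first_document  # otherwise match the first document's type


print(infer_format("json", "out.yaml"))
print(infer_format("auto", "out.json"))
print(infer_format("auto", None, first_document="json"))
print(infer_format("auto", "out.txt", first_document="yaml"))
```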