
Release 0.5.0

@github-actions github-actions released this 18 Dec 16:07

Breaking changes

  • Make the node API const-correct (PR#267): added ConstNodeRef to hold a constant reference to a node. As the name implies, a ConstNodeRef object cannot be used in any tree-mutating operation. It is also smaller than the existing NodeRef, and faster because it does not need to check its own validity on every access. As a result of this change, there are now some constraints when obtaining a ref from a tree, and existing code is likely to break in this type of situation:
    const Tree const_tree = ...;
    NodeRef nr = const_tree.rootref(); // ERROR (was ok): cannot obtain a mutating NodeRef from a const Tree
    ConstNodeRef cnr = const_tree.rootref(); // ok
    
    Tree tree = ...;
    NodeRef nr = tree.rootref(); // ok
    ConstNodeRef cnr = tree.rootref(); // ok (implicit conversion from NodeRef to ConstNodeRef)
    // to obtain a ConstNodeRef from a mutable Tree
    // while avoiding implicit conversion, use the `c`
    // prefix:
    ConstNodeRef cnr = tree.crootref();
    // likewise for tree.ref() and tree.cref().
    
    nr = cnr; // ERROR: cannot obtain NodeRef from ConstNodeRef
    cnr = nr; // ok
    The use of ConstNodeRef also needs to be propagated through client code. One such place is when deserializing types:
    // needs to be changed from:
    template<class T> bool read(ryml::NodeRef const& n, T *var);
    // ... to:
    template<class T> bool read(ryml::ConstNodeRef const& n, T *var);
    • The initial version of ConstNodeRef/NodeRef had the problem that const methods in the CRTP base did not participate in overload resolution (#294), preventing calls from const NodeRef objects. This was fixed by moving non-const methods to the CRTP base and disabling them with SFINAE (PR#295).
    • Also added disambiguation iteration methods: .cbegin(), .cend(), .cchildren(), .csiblings() (PR#295).
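The CRTP/SFINAE arrangement described above can be sketched with toy types. This is only an illustration of the technique, not rapidyaml's code: `ToyTree`, `ToyRef`, `ToyConstRef`, `RefBase` and `has_set_val` are hypothetical names invented for this example. The mutating method lives in the shared base, and is removed from overload resolution when the base is instantiated for the const ref type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Toy stand-in for a tree; NOT rapidyaml's Tree.
struct ToyTree { std::vector<std::string> vals; };

// CRTP base shared by both ref types. The read accessor is const, so it
// can be called on const ref objects. The mutator is a member template
// that SFINAE removes from overload resolution when Mutable is false.
template<class Impl, bool Mutable>
struct RefBase
{
    std::string const& val() const
    {
        Impl const& self = static_cast<Impl const&>(*this);
        return self.tree_->vals[self.id_];
    }

    template<bool M = Mutable>
    typename std::enable_if<M, void>::type set_val(std::string v)
    {
        Impl &self = static_cast<Impl&>(*this);
        self.tree_->vals[self.id_] = std::move(v);
    }
};

struct ToyConstRef : RefBase<ToyConstRef, false>
{
    ToyTree const* tree_;
    std::size_t id_;
    ToyConstRef(ToyTree const* t, std::size_t id) : tree_(t), id_(id) {}
};

struct ToyRef : RefBase<ToyRef, true>
{
    ToyTree *tree_;
    std::size_t id_;
    ToyRef(ToyTree *t, std::size_t id) : tree_(t), id_(id) {}
    // implicit conversion mutable -> const, but not the other way around
    operator ToyConstRef() const { return ToyConstRef(tree_, id_); }
};

// Detection helper: true when T has a callable set_val(std::string).
template<class T, class = void>
struct has_set_val : std::false_type {};
template<class T>
struct has_set_val<T, decltype(std::declval<T&>().set_val(std::string()))>
    : std::true_type {};
```

With this layout, `val()` is callable on a `const ToyRef`, while `set_val()` simply does not exist for `ToyConstRef`, so an attempted mutation is rejected at compile time.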
  • Deprecate emit() and emitrs() (#120, PR#303): use emit_yaml() and emitrs_yaml() instead. This was done to improve compatibility with Qt, which leaks a macro named emit. For more information, see #120.
    • In the Python API:
      • Deprecate emit(), add emit_yaml() and emit_json().
      • Deprecate compute_emit_length(), add compute_emit_yaml_length() and compute_emit_json_length().
      • Deprecate emit_in_place(), add emit_yaml_in_place() and emit_json_in_place().
      • Calling the deprecated functions will now trigger a warning.
  • Location querying is no longer done lazily (#260, PR#307). It now requires explicit opt-in when instantiating the parser. With this change, the accelerator structure for location querying is now built when parsing:
    Parser parser(ParserOptions().locations(true));
    // now parsing also builds location lookup:
    Tree t = parser.parse_in_arena("myfile.yml", "foo: bar");
    assert(parser.location(t["foo"]).line == 0u);
    • Locations are disabled by default:
    Parser parser;
    assert(parser.options().locations() == false);
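To illustrate what an eagerly built accelerator structure for location queries might look like, here is a minimal sketch that records line-start offsets in one pass and answers queries by binary search. `LineIndex` and its members are hypothetical names; the structure rapidyaml actually builds is not described in these notes and may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical accelerator: byte offsets at which each line starts.
struct LineIndex
{
    std::vector<std::size_t> line_starts;

    // Built once, in a single pass over the source buffer.
    explicit LineIndex(std::string const& src)
    {
        line_starts.push_back(0);
        for(std::size_t i = 0; i < src.size(); ++i)
            if(src[i] == '\n')
                line_starts.push_back(i + 1);
    }

    // Zero-based line containing the byte at `offset` (binary search).
    std::size_t line_of(std::size_t offset) const
    {
        auto it = std::upper_bound(line_starts.begin(), line_starts.end(), offset);
        return static_cast<std::size_t>(it - line_starts.begin()) - 1;
    }

    // Zero-based column of the byte at `offset` within its line.
    std::size_t col_of(std::size_t offset) const
    {
        return offset - line_starts[line_of(offset)];
    }
};
```

Building the index during the parse costs one linear pass, after which each query is logarithmic in the number of lines.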
  • Deprecate Tree::arena_pos(): use Tree::arena_size() instead (PR#290).
  • Deprecate pointless has_siblings(): use Tree::has_other_siblings() instead (PR#330).

Performance improvements

  • Improve performance of integer serialization and deserialization (in c4core). E.g., on Linux/g++11.2, with integral types:

    • c4::to_chars() can be expected to be roughly...
      • ~40% to 2x faster than std::to_chars()
      • ~10x-30x faster than sprintf()
      • ~50x-100x faster than a naive stringstream::operator<<() followed by stringstream::str()
    • c4::from_chars() can be expected to be roughly...
      • ~10%-30% faster than std::from_chars()
      • ~10x faster than scanf()
      • ~30x-50x faster than a naive stringstream::str() followed by stringstream::operator>>()
    For more details, see the changelog for c4core 0.1.10.
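For reference, the standard-library baseline named in the comparison above can be exercised as follows. This sketch uses only std::to_chars() and std::from_chars() from C++17 `<charconv>`; it does not show the c4core functions, and the helper names `int_to_string` and `string_to_int` are invented for this example.

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Writes `value` in base 10 into a stack buffer: no allocation while
// formatting, no locale, no format-string parsing.
inline std::string int_to_string(int value)
{
    char buf[16]; // enough for a 32-bit int with sign
    std::to_chars_result r = std::to_chars(buf, buf + sizeof(buf), value);
    assert(r.ec == std::errc());
    return std::string(buf, r.ptr);
}

// Returns true only when the whole string was consumed as a base-10 int.
inline bool string_to_int(std::string const& s, int *out)
{
    char const* first = s.data();
    char const* last = s.data() + s.size();
    std::from_chars_result r = std::from_chars(first, last, *out);
    return r.ec == std::errc() && r.ptr == last;
}
```

Both calls report failure through the returned `ec` member instead of throwing, which is part of what makes them suitable as a benchmark baseline.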
  • Fix #289 and #331 - parsing of single-line flow-style sequences had quadratic complexity, causing long parse times on very long lines (PR#293/PR#332).

    • This was due to scanning for the token : before scanning for , or ], which caused line-length scans on every scalar scan. Changing the order of the checks was enough to address the quadratic complexity, and the parse times for flow-style are now in line with block-style.
    • As part of this changeset, a significant number of runtime branches was eliminated by separating Parser::_scan_scalar() into several different {seq,map}x{block,flow} functions specific for each context. Expect some improvement in parse times.
    • Also, on Debug builds (or assertion-enabled builds) there was a paranoid assertion calling Tree::has_child() in Tree::insert_child() that caused quadratic behavior because the assertion had linear complexity. It was replaced with a somewhat equivalent O(1) assertion.
    • Byte throughput is now independent of line size across styles and containers. The table below shows parse throughputs, in MB/s, for 1000 containers of different styles and sizes (flow containers are in a single line):
    Container   Style   10elms     100elms    1000elms
    1000 Maps   block   50.8MB/s   57.8MB/s   63.9MB/s
    1000 Maps   flow    58.2MB/s   65.9MB/s   74.5MB/s
    1000 Seqs   block   55.7MB/s   59.2MB/s   60.0MB/s
    1000 Seqs   flow    52.8MB/s   55.6MB/s   54.5MB/s
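The effect of the check order on a long single-line flow sequence can be reproduced with a toy scanner that only counts the characters it examines. This is a simplified model under stated assumptions, not the parser's code: `chars_examined` and `make_flow_seq` are hypothetical helpers, and the real scanner handles many more cases.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts characters examined while splitting a single-line flow sequence
// such as "[a,b,c]" into scalars. `colon_first` mimics the old check order.
inline std::size_t chars_examined(std::string const& line, bool colon_first)
{
    std::size_t examined = 0;
    std::size_t pos = 1; // skip the opening '['
    while(pos < line.size())
    {
        if(colon_first)
        {
            // old order: look for ':' across the whole rest of the line
            std::size_t colon = line.find(':', pos);
            examined += (colon == std::string::npos ? line.size() : colon) - pos;
        }
        // find the delimiter that ends the current scalar
        std::size_t end = line.find_first_of(",]", pos);
        if(end == std::string::npos)
            end = line.size();
        examined += end - pos;
        pos = end + 1;
    }
    return examined;
}

// Builds "[0,0,...,0]" with n single-character elements.
inline std::string make_flow_seq(std::size_t n)
{
    std::string s = "[";
    for(std::size_t i = 0; i < n; ++i)
    {
        if(i)
            s += ',';
        s += '0';
    }
    s += ']';
    return s;
}
```

In this model a 100-element sequence costs 100 examined characters with the delimiter-first order and 10200 with the colon-first order; doubling the element count doubles the former and roughly quadruples the latter.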
  • Fix #329: complexity of has_sibling() and has_child() is now O(1); it was previously linear (PR#330).
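One way a child-membership query can go from linear to constant time is to consult the child's parent link instead of walking the parent's child list. The sketch below assumes an index-based tree with parent links; `IndexTree`, `IndexNode` and the method names are hypothetical, and the actual change in PR#330 may be implemented differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t NONE = static_cast<std::size_t>(-1);

// Hypothetical index-based node with parent/child/sibling links.
struct IndexNode
{
    std::size_t parent = NONE;
    std::size_t first_child = NONE;
    std::size_t next_sibling = NONE;
};

struct IndexTree
{
    std::vector<IndexNode> nodes;

    std::size_t add_node(std::size_t parent)
    {
        std::size_t id = nodes.size();
        nodes.push_back(IndexNode());
        nodes[id].parent = parent;
        if(parent != NONE)
        {
            // prepend to the parent's child list
            nodes[id].next_sibling = nodes[parent].first_child;
            nodes[parent].first_child = id;
        }
        return id;
    }

    // Linear: walks the sibling chain of `node`'s children.
    bool has_child_linear(std::size_t node, std::size_t child) const
    {
        for(std::size_t c = nodes[node].first_child; c != NONE; c = nodes[c].next_sibling)
            if(c == child)
                return true;
        return false;
    }

    // Constant time: a single lookup of the child's parent link.
    bool has_child_const(std::size_t node, std::size_t child) const
    {
        return child < nodes.size() && nodes[child].parent == node;
    }
};
```

Both queries agree as long as the parent links are kept consistent with the child lists; the constant-time form just avoids the walk.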

Fixes

  • Fix #233 - accept leading colon in the first key of a flow map (UNK node) PR#234:
    :foo:           # parse error on the leading colon
      :bar: a       # parse error on the leading colon
      :barbar: b    # was ok
      :barbarbar: c # was ok
    foo:            # was ok
      bar: a        # was ok
      :barbar: b    # was ok
      :barbarbar: c # was ok
  • Fix #253: double-quoted emitter should encode carriage-return \r to preserve roundtrip equivalence:
    Tree tree;
    NodeRef root = tree.rootref();
    root |= MAP;
    root["s"] = "t\rt";
    root["s"] |= _WIP_VAL_DQUO;
    std::string s = emitrs<std::string>(tree);
    EXPECT_EQ(s, "s: \"t\\rt\"\n");
    Tree tree2 = parse_in_arena(to_csubstr(s));
    EXPECT_EQ(tree2["s"].val(), tree["s"].val());
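The roundtrip requirement in the test above comes down to writing an escape sequence in place of the raw character. A minimal sketch of such an escaping step follows; `emit_dquo` is a hypothetical helper, and the set of characters it escapes is an assumption chosen for illustration rather than the emitter's actual rule set.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: writes `val` as a double-quoted scalar, replacing
// characters that would not survive a parse roundtrip with escapes.
inline std::string emit_dquo(std::string const& val)
{
    std::string out = "\"";
    for(char c : val)
    {
        switch(c)
        {
        case '\\': out += "\\\\"; break;
        case '"':  out += "\\\""; break;
        case '\n': out += "\\n";  break;
        case '\t': out += "\\t";  break;
        case '\r': out += "\\r";  break; // the case addressed by #253
        default:   out += c;      break;
        }
    }
    out += '"';
    return out;
}
```

Emitting the two characters `\r` rather than a raw carriage return keeps the value intact when the output is parsed again.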
  • Fix parsing of empty block folded+literal scalars when they are the last child of a container (part of PR#264):
    seq:
      - ""
      - ''
      - >
      - |  # error, the resulting val included all the YAML from the next node
    seq2:
      - ""
      - ''
      - |
      - >  # error, the resulting val included all the YAML from the next node
    map:
      a: ""
      b: ''
      c: >
      d: |  # error, the resulting val included all the YAML from the next node
    map2:
      a: ""
      b: ''
      c: |
      d: >  # error, the resulting val included all the YAML from the next node
    lastly: the last
  • Fix #274 (PR#296): Lists with unindented items and trailing empty values parse incorrectly:
    foo:
    - bar
    -
    baz: qux
    was wrongly parsed as
    foo:
    - bar
    - baz: qux
  • Fix #277 (PR#340): merge fails with duplicate keys.
  • Fix #337 (PR#338): empty lines in block scalars shall not have tab characters \t.
  • Fix #268 (PR#339): don't override key type_bits when copying val. This was causing problematic resolution of anchors/references.
  • Fix #309 (PR#310): emitted scalars containing @ or ` should be quoted.
    • The quotes should be added only when they lead the scalar. See #320 and PR#334.
  • Fix #297 (PR#298): JSON emitter should escape control characters.
  • Fix #292 (PR#299): JSON emitter should quote version string scalars like 0.1.2.
  • Fix #291 (PR#299): JSON emitter should quote scalars with leading zero, eg 048.
  • Fix #280 (PR#281): deserialization of std::vector<bool> failed because its operator[] returns a proxy object (std::vector<bool>::reference) instead of a reference to value_type.
  • Fix #288 (PR#290): segfault on successive calls to Tree::_grow_arena(), caused by using the arena position instead of its length as starting point for the new arena capacity.
  • Fix #324 (PR#328): eager assertion prevented moving nodes to the first position in a parent.
  • Fix Tree::_clear_val(): was clearing key instead (PR#335).
  • YAML test suite events emitter: fix emission of inheriting nodes. The events for {<<: *anchor, foo: bar} are now correctly emitted as:
    =VAL :<<  # previously was =ALI <<
    =ALI *anchor
    =VAL :foo
    =VAL :bar
  • Fix #246: add missing #define for the include guard of the amalgamated header.
  • Fix #326: honor runtime settings for calling debugbreak, add option to disable any calls to debugbreak.
  • Fix cmake#8: SOVERSION missing from shared libraries.
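The std::vector<bool> fix listed above (#280) concerns a well-known pitfall: generic code that takes the address of `vec[i]` does not compile for `bool`, because the element access yields a proxy object. A common workaround is to read into a temporary and assign. The sketch below shows that pattern with hypothetical helpers `read_elm` and `read_seq`; it is not the code from PR#281.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical element reader, standing in for a per-element read().
inline bool read_elm(std::string const& s, bool *var)
{
    if(s == "true")  { *var = true;  return true; }
    if(s == "false") { *var = false; return true; }
    return false;
}

// Reads every element of `src` into `out`; returns false on a bad element.
template<class T>
bool read_seq(std::vector<std::string> const& src, std::vector<T> *out)
{
    out->resize(src.size());
    for(std::size_t i = 0; i < src.size(); ++i)
    {
        // `&(*out)[i]` would not compile for T=bool: operator[] of
        // std::vector<bool> returns a proxy object, not a bool&.
        T tmp{};
        if(!read_elm(src[i], &tmp))
            return false;
        (*out)[i] = tmp; // assignment works for proxies and references alike
    }
    return true;
}
```

Reading into a temporary costs one extra copy per element, but keeps a single code path for every element type.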

Python

  • The Python packages for Windows and MacOSX are causing problems in the CI, and were mostly disabled. The problematic packages are successfully made, but then fail to be imported. This was impossible to reproduce outside of the CI, and they were disabled since they were delaying the release. As a consequence, the Python release will have very limited compiled packages for Windows (only Python 3.6 and 3.7) or MacOSX. Help would be appreciated from those interested in these packages.

Thanks