New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Maintaining order of Nodes on saving #169

Open
GoogleCodeExporter opened this Issue Mar 30, 2015 · 22 comments

Comments

Projects
None yet
5 participants
@GoogleCodeExporter
Copy link

GoogleCodeExporter commented Mar 30, 2015

What steps will reproduce the problem?
AML::Node config = YAML::LoadFile("config.yaml");
config["lastLogin"] = getCurrentDateTime();
std::ofstream fout("config.yaml");
fout << config;

What is the expected output? What do you see instead?
It would be nice if we could maintain the order of Nodes in 
saving/output-streaming.

What version of the product are you using? On what operating system?
0.3.0 on Ubuntu 11.x

Please provide any additional information below.
While we save/output-stream the order of Nodes in original files were not 
maintained. It would be really nice if we could maintained original order 
because most of the time we also do manual edit on yaml files and grouping some 
variables together makes more sense.

Original issue reported on code.google.com by rudrapou...@gmail.com on 26 Jul 2012 at 11:33

@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Mar 30, 2015

Issue 187 has been merged into this issue.

Original comment by jbe...@gmail.com on 29 Jan 2013 at 2:22

  • Added labels: ****
  • Removed labels: ****
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Mar 30, 2015

This is sorta intentional. YAML doesn't specify the ordering of key/value pairs 
in a map, so you're not supposed to rely on it.

However, I do see some value to it, so I will consider it. I'm just not 
convinced yet :)

Original comment by jbe...@gmail.com on 29 Jan 2013 at 2:23

  • Changed state: Accepted
  • Added labels: Priority-Low, Type-Enhancement
  • Removed labels: Priority-Medium, Type-Defect
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Mar 30, 2015

The way I see it, yaml-cpp is a tool for parsing files. A file is a (ordered) 
sequence of information. This is especially true with YAML as it is meant to be 
streamable. Therefore a YAML parser should somehow keep this information.

Of course there is the issue of creating a non-standard behavior people might 
start relying on, and the issue of implementation overhead to keep the ordering 
while still allowing fast search for keys.

Your suggestion to turn it into a sequence of single-element maps is not a bad 
one. I'll consider it.

Original comment by oster.t...@gmail.com on 29 Jan 2013 at 7:04

  • Added labels: ****
  • Removed labels: ****
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Mar 30, 2015

I appreciate that it's a potentially useful thing to do; but yaml-cpp is a tool 
to parse YAML, which does not specify the ordering of key/value pairs in a map. 
(A related situation: yaml-cpp also does not preserve stray whitespace. It is 
*not* a goal of the library to be able to write a file exactly as it read it.)

The fact that YAML is streamable is a red herring - the YAML spec *explicitly* 
says that key order is an implementation detail 
(http://www.yaml.org/spec/1.2/spec.html#id2765608).

Again, though, this is one of those things that I do see some value in, so I'll 
think about it. One problem is that some people want to maintain order and 
others want it to be ordered alphabetically, and I'm not a huge fan of 
proliferation of configuration details.

Original comment by jbe...@gmail.com on 29 Jan 2013 at 9:08

  • Added labels: ****
  • Removed labels: ****
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Mar 30, 2015

It would be great to have order preserved, especially in configuration files. 
So one could provide some common options to be changed from UI and other leave 
to edit manually in configuration file.

Original comment by perch...@gmail.com on 24 Jan 2014 at 6:18

  • Added labels: ****
  • Removed labels: ****
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Mar 30, 2015

Currently you can:
1. Save whole doc using Node::Clone
2. During save, check Node::GetMark().line

Original comment by end...@gmail.com on 24 Apr 2014 at 1:47

  • Added labels: ****
  • Removed labels: ****
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Mar 30, 2015

Maybe you could offer a non-standard extension of some sort? That would make it 
clear that it's not part of the standard, yet allow you to provide useful 
behavior for those who want it.

Original comment by jwbdec...@gmail.com on 9 Jul 2014 at 7:45

  • Added labels: ****
  • Removed labels: ****
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Mar 30, 2015

> Currently you can:
> 1. Save whole doc using Node::Clone
> 2. During save, check Node::GetMark().line

Could you please explain this solution in detail?

Original comment by gim6...@gmail.com on 19 Jul 2014 at 11:19

  • Added labels: ****
  • Removed labels: ****
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Mar 30, 2015

There's already a struct Mark that tells you a location in a file.

You can change 
NodeBuilder::OnScalar/NodeBuilder::OnSequenceStart/NodeBuilder::OnMapStart by 
uncommenting "mark" from the parameters and telling "node" to save the "mark".

To tell the node to save the mark, you will need to add a new member variable 
"mark" to node_data and implement getters/setters up the chain (node_ref, 
detail::node, Node).

If you want Clone() to work, you will also need to change NodeEvents::Emit to 
pass in the actual mark instead of constructing an empty mark.

Original comment by henear...@gmail.com on 23 Jul 2014 at 5:19

  • Added labels: ****
  • Removed labels: ****
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Mar 30, 2015

Does anyone know how it reorders the nodes? I can't seem to figure out a 
pattern.

Original comment by justin.j...@gmail.com on 19 Feb 2015 at 9:47

  • Added labels: ****
  • Removed labels: ****
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Mar 30, 2015

For save nodes in determining order I use Yaml::Emitter. For example:

void Author::customToYaml(YAML::Emitter &out) const
{
    out << YAML::Key << "companyName" << YAML::Value << companyName();
    out << YAML::Key << "companySite" << YAML::Value << companySite();
}

Original comment by gil.il...@gmail.com on 19 Feb 2015 at 9:55

  • Added labels: ****
  • Removed labels: ****
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Mar 30, 2015

@justin, #10: it's arbitrary-ish, based on pointer keys.

Original comment by jbe...@gmail.com on 19 Feb 2015 at 3:38

  • Added labels: ****
  • Removed labels: ****
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Mar 30, 2015

I also would like more ordered output. I fully get that YAML is using unordered 
maps, but it's very weird when hand-editting YAML files, and each object is in 
a different order.

Example:

YAML::Node root;
for(const auto &pair : this->Details.MaterialDetails)
{
    Engine::MaterialID materialID = pair.first;
    const Engine::MaterialDetails &materialDetails = pair.second;

    YAML::Node material;
    material["MaterialID"] = materialID;
    material["DisplayName"] = materialDetails.displayName;
    material["SoundID"] = materialDetails.soundID;

    root["Materials"].push_back(material);
}

I find it really odd that iterating over elements in my map, each element of 
the map will have its members outputted arbitrarily from one another.

Materials:
  - DisplayName: Fallthrough
    SoundID: 0
    MaterialID: 2
  - SoundID: 0
    DisplayName: Default
    MaterialID: 1
  - MaterialID: 0
    DisplayName: None
    SoundID: 0

Each element is in a different order. I don't expect the elements in any 
particular order, but a consistent order would be nice. Maybe you can output 
mapped values by alphabetical (or numerical) order of the key, or some other 
arbitrary-but-consistent order?

Original comment by JaminThe...@gmail.com on 26 Feb 2015 at 9:35

  • Added labels: ****
  • Removed labels: ****
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Mar 30, 2015

It's on the table to output a consistent ordering. It's not on the table to 
iterate in a consistent ordering.

Original comment by jbe...@gmail.com on 26 Feb 2015 at 9:49

  • Added labels: ****
  • Removed labels: ****
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Mar 30, 2015

Yes, consistent output is what I'm asking for, not consistent iteration. The 
iteration of 'MaterialDetails' in my example above is an std::unordered_map not 
yaml-cpp, sorry for the confusion.

I was trying to demonstrate that outputting while in any kind of loop still 
produces unordered (but correct) output, despite using the exact same functions 
in the exact same order. Which you're aware of.

Original comment by JaminThe...@gmail.com on 27 Feb 2015 at 6:38

  • Added labels: ****
  • Removed labels: ****
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Mar 30, 2015

Hmm, so I tried using the emitter to emit the key and value one by one to get 
the correct order. My code looks like this:

for (YAML::const_iterator it = baseNode.begin(); it != baseNode.end(); ++it) {
        YAML::Node key = it->first;
        YAML::Node value = it->second;

        if (value.IsScalar()) {

            emitter << YAML::Key << key.as< std::string >() << YAML::Value
                    << value.as< std::string >();

        } else {

            // Otherwise a map
            emitter << YAML::Key << key.as< std::string >();
            emitter << YAML::Value << YAML::BeginMap;
            SaveBaseNode(value, emitter);
            emitter << YAML::EndMap;

        }

    }

The output is still not ordered. I'm guessing there is no solution to this...?

Original comment by justin.j...@gmail.com on 2 Mar 2015 at 9:05

  • Added labels: ****
  • Removed labels: ****
@GoogleCodeExporter

This comment has been minimized.

Copy link

GoogleCodeExporter commented Mar 30, 2015

I would be interested in this as well, since it has forced me to use the "old" 
Emitter API. In my mind, YAML should be easily readable, and arbitrary or 
inconsistent ordering hinders that.

That said, the old Emitter API does the trick.

Original comment by Skimm...@gmail.com on 11 Mar 2015 at 5:45

  • Added labels: ****
  • Removed labels: ****
@hersh

This comment has been minimized.

Copy link

hersh commented May 16, 2016

This is a high priority for me. I will have to figure out a workaround before I can proceed.

The config files read and written by yaml-cpp are committed to a git repo, and our company does code reviews of all changes. Because yaml-cpp writes maps in an unpredictable order, every time the config file is written, then entire file appears to change even if no content is different. This will make reviews unfeasible.

For my use case, I just need repeatable ordering, not controllable ordering. I'm imagining a new Emitter manipulator called SortedMaps or something.

@hersh

This comment has been minimized.

Copy link

hersh commented May 17, 2016

Here is my workaround, in case anyone is interested. I call writeYamlOrderedMaps(out, node) instead of out << node.

#include <algorithm>
#include <string>
#include <vector>
#include <yaml-cpp/yaml.h>

// Recursive helper function that does all the work
void writeNode(const YAML::Node& node, YAML::Emitter& emitter)
{
  switch (node.Type())
  {
    case YAML::NodeType::Sequence:
    {
      emitter << YAML::BeginSeq;
      for (size_t i = 0; i < node.size(); i++)
      {
        writeNode(node[i], emitter);
      }
      emitter << YAML::EndSeq;
      break;
    }
    case YAML::NodeType::Map:
    {
      emitter << YAML::BeginMap;

      // First collect all the keys
      std::vector<std::string> keys(node.size());
      int key_it = 0;
      for (YAML::const_iterator it = node.begin(); it != node.end(); ++it)
      {
        keys[key_it++] = it->first.as<std::string>();
      }

      // Then sort them
      std::sort(keys.begin(), keys.end());

      // Then emit all the entries in sorted order.
      for(size_t i = 0; i < keys.size(); i++)
      {
        emitter << YAML::Key;
        emitter << keys[i];
        emitter << YAML::Value;
        writeNode(node[keys[i]], emitter);
      }
      emitter << YAML::EndMap;
      break;
    }
    default:
      emitter << node;
      break;
  }
}

// Main function that emits a yaml node to an output stream.
void writeYamlOrderedMaps(std::ostream& out, const YAML::Node& node)
{
  YAML::Emitter emitter;
  writeNode(node, emitter);
  out << emitter.c_str() << std::endl;
}
@ivannaz

This comment has been minimized.

Copy link

ivannaz commented Mar 7, 2017

Keeping the order in the same sensible way we originally write them would be nice.

For instance, from this nice example in http://www.yaml.org/spec/1.2/spec.html

--- !tag:clarkevans.com,2002:invoice
invoice: 34843
date : 2001-01-23
bill-to: &id001
given : Chris
family : Dumars
address:
lines: |
458 Walkman Dr.
Suite #292
city : Royal Oak
state : MI
postal : 48046
ship-to: *id001
product:
- sku : BL394D
quantity : 4
description : Basketball
price : 450.00
- sku : BL4438H
quantity : 1
description : Super Hoop
price : 2392.00
tax : 251.42
total: 4443.52
comments:
Late afternoon is best.
Backup contact is Nancy
Billsmer @ 338-4338.

We know that after out << node, what we will really get is something like:

tax : 251.42
date : 2001-01-23
ship-to: *id001
product:
- description : Basketball
sku : BL394D
price : 450.00
quantity : 4
- price : 2392.00
description : Super Hoop
quantity : 1
sku : BL4438H
total: 4443.52
comments:
Late afternoon is best.
Backup contact is Nancy
Billsmer @ 338-4338.
invoice: 34843
bill-to: &id001
family : Dumars
address:
state : MI
postal : 48046
lines: |
458 Walkman Dr.
Suite #292
city : Royal Oak
given : Chris

From the YAML page: "YAML is a human friendly data serialization".

This unnatural order removes that "human friendly" part of the data serialization process. In my config file I have some 20 main map entries, each one being itself a map of 5 to 20 entries. The resulting file could be manually edited much faster if order is preserved, just because one knows where to look for the items and the mapping order makes human sense.

Yes, "YAML doesn't specify the ordering of key/value pairs in a map, so you're not supposed to rely on it". Neither C++ ISO/IEC 14882:2011 specify some form of code indentation, yet we rely heavely on it for code readability.

@ivannaz

This comment has been minimized.

Copy link

ivannaz commented Mar 7, 2017

ops. Seems the spaces in my last comment were eaten away somehow. Sorry for that.

@ibaned

This comment has been minimized.

Copy link
Contributor

ibaned commented Apr 26, 2017

does #386 and in particular f0b15cd have any effect on this ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment