Parsing multiple files #92

Closed
gsauthof opened this issue Mar 28, 2021 · 5 comments

@gsauthof

gsauthof commented Mar 28, 2021

Is your feature request related to a problem? Please describe.

I would like to use tomlplusplus to parse multiple .toml files into a single parse_result.

Describe the solution you'd like

Basically I want to use it like this (or similar):

vector<string> toml_files;
// push filename_1.toml ... filename_n.toml to toml_files.
auto result = toml::parse_files(toml_files);

The result should be equivalent to concatenating the files first and then parsing it, e.g.:

// cat filename_1.toml ... filename_n.toml > res.toml
auto result = toml::parse_file("res.toml");

Additional context

So why would that be useful?

Because splitting larger configurations into multiple toml files might simplify maintenance.

Example:

/etc/mydaemon/config.toml
/etc/mydaemon/config.d/01-users.toml
/etc/mydaemon/config.d/02-groups.toml
/etc/mydaemon/config.d/10-separately-installed-addon.toml
@gsauthof gsauthof added the feature New feature or enhancement request label Mar 28, 2021
@marzer
Owner

marzer commented Mar 28, 2021

While this is a reasonable idea, there are a few barriers to overcome:

  1. If it were equivalent to a concatenation as you suggest, then any open [table] or [[table_array]] in filename_1.toml would become the parent of whatever root-level keys are in filename_2.toml, and so on. That suggests an as-if concatenation approach isn't actually correct, and that some recursive tree-merge solution would be necessary. Which leads me to...
  2. Conflict resolution. If multiple TOML documents contain the same key but different values, which one wins? The first one? The last one? Some conflict resolution API? There's no single answer that will work for all cases without also being a pain to work with.
  3. Finally, the TOML spec does not currently specify that TOML files may be consumed in this manner, but who knows, it may eventually (not impossible to imagine), in which case my premature implementation of it is likely to not work according to spec (since in this situation the spec is likely to indicate how conflicts should be handled, etc.).

Given the above I don't think this feature belongs in the library. You're better off implementing this in your own code so you can handle conflicts exactly as you need them handled in your specific case.
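To make the conflict-resolution point concrete, here is a minimal sketch of a recursive tree merge with a caller-chosen policy. It deliberately does not use toml++; `Node`, `Table`, `OnConflict`, `make_leaf`, `make_table` and `merge_into` are all hypothetical names invented for this illustration, and the tree only models integer leaves and nested tables.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Toy stand-in for a parsed document tree (NOT a toml++ type):
// a node is either an integer leaf or a table of child nodes.
struct Node;
using Table = std::map<std::string, std::unique_ptr<Node>>;

struct Node {
    bool is_table = false;
    int value = 0;   // meaningful only when !is_table
    Table children;  // meaningful only when is_table
};

// Hypothetical policies for a key that is present in both documents.
enum class OnConflict { KeepFirst, KeepLast, Fail };

std::unique_ptr<Node> make_leaf(int v) {
    auto n = std::make_unique<Node>();
    n->value = v;
    return n;
}

std::unique_ptr<Node> make_table() {
    auto n = std::make_unique<Node>();
    n->is_table = true;
    return n;
}

// Merges src into dst. Tables present on both sides are merged recursively;
// any other clash is resolved by `policy`. Nodes are moved out of src,
// so src must not be used afterwards.
void merge_into(Table& dst, Table&& src, OnConflict policy,
                const std::string& path = "") {
    for (auto& entry : src) {
        const std::string full =
            path.empty() ? entry.first : path + "." + entry.first;
        auto it = dst.find(entry.first);
        if (it == dst.end()) {
            dst.emplace(entry.first, std::move(entry.second));
        } else if (it->second->is_table && entry.second->is_table) {
            merge_into(it->second->children,
                       std::move(entry.second->children), policy, full);
        } else {
            switch (policy) {
                case OnConflict::KeepFirst:
                    break;  // keep what dst already has
                case OnConflict::KeepLast:
                    it->second = std::move(entry.second);
                    break;
                case OnConflict::Fail:
                    throw std::runtime_error("conflicting key: " + full);
            }
        }
    }
}
```

Each of the three policies is reasonable for some application and wrong for another, which is the difficulty described in point 2.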

@marzer marzer closed this as completed Mar 28, 2021
@gsauthof
Author

Hm, doesn't look like the specification people want to add this: toml-lang/toml#397

I don't get your first point. For me, it would be totally fine if some file creates a table array and another adds to it. As if they were in a single file. Having the same [table] in multiple files would yield an error, as it's also invalid in a single file.

So, no, I don't need a recursive tree-merge solution.

I mean with the simple concatenation one could have tables like [plugin.nginx] and [plugin.apache] in different files. Or e.g. [[addons]]\nname=foo... and [[addons]]\nname=bar....

Regarding the 2nd point: without a complicated recursive tree-merge solution there is no need for special conflict resolution.

So in one aspect the proposed feature isn't equivalent to concatenating the files: element location reporting, i.e. line numbers and filenames need to be reset for each file.

What would also work for me is some slightly lower-level parse-file method, such as:

vector<string> toml_files;
// fill the vector ...
toml::parse_result res;
for (auto &filename : toml_files)
    toml::parse_file(filename, res);

That means each call would just reuse what's already there.
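As a rough illustration of the semantics requested here, the sketch below accumulates several inputs into one result, resets the file name and line counter for each input, and rejects a key that an earlier input already defined. It is a toy, not toml++: `Origin`, `Entry`, `Accumulated`, `strip` and `parse_into` are hypothetical names, and the parser only understands comments, `[table]` headers and `key = value` lines (no table arrays, strings with `=`, or inline tables).

```cpp
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Where a key was defined, so diagnostics can name the file and line.
struct Origin { std::string file; int line; };
struct Entry  { std::string value; Origin origin; };

// Flattened result: "table.key" -> value plus origin (toy model only).
using Accumulated = std::map<std::string, Entry>;

static std::string strip(const std::string& s) {
    const auto b = s.find_first_not_of(" \t\r");
    if (b == std::string::npos) return "";
    const auto e = s.find_last_not_of(" \t\r");
    return s.substr(b, e - b + 1);
}

// Hypothetical helper: parses `text` and adds its keys to `acc`.
// Throws if a key was already defined by this or an earlier input.
void parse_into(Accumulated& acc, const std::string& file,
                const std::string& text) {
    std::string prefix;  // reset per call: every file starts at the root
    int line_no = 0;     // reset per call: line numbers are per file
    std::istringstream in(text);
    std::string raw;
    while (std::getline(in, raw)) {
        ++line_no;
        const std::string line = strip(raw);
        if (line.empty() || line[0] == '#') continue;
        if (line.front() == '[' && line.back() == ']') {
            prefix = strip(line.substr(1, line.size() - 2)) + ".";
            continue;
        }
        const auto eq = line.find('=');
        if (eq == std::string::npos) continue;  // outside the toy subset
        const std::string key = prefix + strip(line.substr(0, eq));
        const auto it = acc.find(key);
        if (it != acc.end()) {
            throw std::runtime_error(
                file + ":" + std::to_string(line_no) + ": key '" + key +
                "' already defined at " + it->second.origin.file + ":" +
                std::to_string(it->second.origin.line));
        }
        acc.emplace(key, Entry{strip(line.substr(eq + 1)),
                               Origin{file, line_no}});
    }
}
```

Because `prefix` is reset on every call, a top-level key in a later file stays at the root, so this already differs from a literal concatenation of the files.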

@marzer
Owner

marzer commented Mar 28, 2021

> So, no, I don't need a recursive tree-merge solution.

You don't, no, but if this is added to the library, it's not just for you. For it to work in a sane way it would have to be a tree merge; concatenation simply wouldn't work in the general case. I'm sorry but I'm not going to implement this.

@gsauthof
Author

gsauthof commented May 2, 2021

> I'm sorry but I'm not going to implement this.

Fair enough.

I don't see how anyone would want to have the tree-merge approach you think would be necessary, though.

Such a tree-merge would be anything but sane.

Perhaps your assumption that there is a 'general case' that requires it is flawed.

@marzer
Owner

marzer commented May 2, 2021

I don't just think it would be necessary, it's easy to show that any 'consume multiple TOML files as one virtual one' approach would boil down to that out of necessity. Consider your assertion that "the result should be equivalent to concatenating the files first and then parsing it", and use the following two TOML files as inputs:

# file1.toml

a = 1

[b]
c = 3

# file2.toml

d = 0

[e]
f = 0

If you just do an as-if concatenation, then effectively you have this:

a = 1

[b]
c = 3
d = 0

[e]
f = 0

Note that d is now a member of the table [b], not a top-level element as it was in the original document. That's clearly not what was intended. To get the obvious, desired behaviour, you'd need to parse the documents separately, resetting the root node between each one. Hence: a recursive tree merge. That's exactly what it reduces to.
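The effect can be reproduced with a small self-contained sketch. It uses a toy parser for a tiny TOML subset (comments, `[table]` headers, `key = integer`) rather than toml++; `Doc`, `trim`, `parse_toy` and `parse_each_then_combine` are hypothetical names chosen for this illustration.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Toy model: a document is a map from dotted "table.key" paths to integers.
using Doc = std::map<std::string, int>;

static std::string trim(const std::string& s) {
    const auto b = s.find_first_not_of(" \t\r");
    if (b == std::string::npos) return "";
    const auto e = s.find_last_not_of(" \t\r");
    return s.substr(b, e - b + 1);
}

// Parses comments, [table] headers and `key = integer` lines only.
Doc parse_toy(const std::string& text) {
    Doc doc;
    std::string prefix;  // empty while still in the root table
    std::istringstream in(text);
    std::string raw;
    while (std::getline(in, raw)) {
        const std::string line = trim(raw);
        if (line.empty() || line[0] == '#') continue;
        if (line.front() == '[' && line.back() == ']') {
            prefix = trim(line.substr(1, line.size() - 2)) + ".";
            continue;
        }
        const auto eq = line.find('=');
        if (eq == std::string::npos) continue;  // outside the toy subset
        doc[prefix + trim(line.substr(0, eq))] =
            std::stoi(trim(line.substr(eq + 1)));
    }
    return doc;
}

// Parses every input on its own, so each one starts at the root table,
// then combines the results. On a duplicate path the first value is kept.
Doc parse_each_then_combine(const std::vector<std::string>& texts) {
    Doc combined;
    for (const auto& text : texts) {
        const Doc one = parse_toy(text);
        combined.insert(one.begin(), one.end());
    }
    return combined;
}
```

With the two inputs above, `parse_toy(file1 + file2)` stores `d` under the path `b.d`, while `parse_each_then_combine({file1, file2})` keeps it at the root as `d`.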

> Perhaps your assumption that there is a 'general case' that requires it is flawed.

Nope. Not an assumption at all. See above. Obviously the above example is only a problem for documents with top-level keys; if you could guarantee that you never had top-level keys or conflicting names, a concatenation approach would work. But I, as the library author, can't mandate that! That's a very limiting restriction for the general case. You might be able to guarantee that in your own files as a domain-specific expectation, but then in that scenario you should just do the concatenation yourself in your own application.

Repository owner locked as resolved and limited conversation to collaborators May 2, 2021