Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Copying nodes #49

Closed
whiterabbit963 opened this issue Jul 26, 2020 · 37 comments
Closed

Copying nodes #49

whiterabbit963 opened this issue Jul 26, 2020 · 37 comments
Assignees
Labels
feature New feature or enhancement request

Comments

@whiterabbit963
Copy link
Contributor

whiterabbit963 commented Jul 26, 2020

Is your feature request related to a problem? Please describe.
I am currently in the process of replacing a json library with your toml library, and I have encountered this issue a couple times, where I needed to make a copy of a node. In the first case, I was able to do a move. In other cases, I need a copy.
Specifically, I have a table with one element that is an array. I want to be able to copy that array to a second element in that table.

Existing table:
[table]
foo = [1, 2, 3]

In pseudo-code:
auto *array = table.get("foo");
table["bar"] = *array;

Describe the solution you'd like
I am pretty that I can accomplish this by manually iterating over the elements in the toml::node.

Describe alternatives you've considered
I was surprised to find that copy constructors were deleted.

Additional context
Are there, or will there be any plans to include node copying in the future?

@whiterabbit963 whiterabbit963 added the feature New feature or enhancement request label Jul 26, 2020
@bobfang1992
Copy link

Why std::move is not an option for you?

@whiterabbit963
Copy link
Contributor Author

whiterabbit963 commented Jul 26, 2020

@bobfang1992 In my example, I want the final table to look like:
[table]
foo = [1, 2, 3]
bar = [1, 2, 3]

In this case, std::move would create undefined behavior, unless foo is properly erased from the table.

@Reedbeta
Copy link
Contributor

As it happens, I've just encountered this issue as well. In my case, I wanted to create a shadow copy of a table, so that I can detect changes potentially made by other code (so that I know when to save the file).

I can understand deleting the copy and assignment operators, since it could be expensive to accidentally copy a large table; but maybe there could be a deep_copy method available for when it's needed.

@whiterabbit963
Copy link
Contributor Author

@Reedbeta I thought about something similar via a clone() (same idea as deep_copy()), but I really do think copy ctors and operator=() provide the simplest mechanics to understand and use.
In my previous example, doing the following feels simpler:

auto *array = table.get("foo");
table["bar"] = *array;

rather than

auto *array = table.get("foo");
table.insert_or_assign("bar", std::move(array->clone()));

Having the copy ctors won't prevent possible errors/optimizations like void foo(toml::table t); vs void foo(const toml::table &t);, but without them, one doesn't have the freedom to choose a particular implementation.

Since toml tends to be more focused on human readability, I would not expect files > 10MB to be common place. For that reason, copy performance should be less concerning than ease of use.

@Reedbeta
Copy link
Contributor

@whiterabbit963 I think with a clone method it would just be:

table["bar"] = array->clone();

That = is now a move assign since the return value of clone() is an rvalue. Or so I'm guessing, since I haven't tried it. 🙂

@whiterabbit963
Copy link
Contributor Author

@Reedbeta Yes a good compiler would know. From what I remember of past experiences fighting with deleted copy ctors, that didn't work, but I am open to being wrong on this one.
To further complicate things, I believe I noticed the toml::table::operator[] returns a toml::node_view. Which is confusing because we have toml::node_view (obviously non-copyable), and toml::node (less obviously non-copyable).

@whiterabbit963
Copy link
Contributor Author

I also noticed, that the operator[] does not function the same as the STL's map::operator[].

Remark
std::map::operator[]'s behaviour of default-constructing a value at a key if it didn't exist is a crazy bug factory so I've deliberately chosen not to emulate it. This is not an error.

I have come to expect this behavior, and it is forcing me to rewrite some code where it was simpler to utilize this expectation.

@marzer
Copy link
Owner

marzer commented Jul 27, 2020

@whiterabbit963

Are there, or will there be any plans to include node copying in the future?

The only reason the copy constructors are deleted is because I never needed them, and in general I prefer to explicitly prohibit accidental copies of things where the heap is involved. I have no problem with this functionality being enabled.

It's complicated a bit by the fact that a node is not a concrete type; toml::node is an abstract base class, so a copy constructor on toml::node couldn't work, but I could allow copying in other places in the library. Additionally I could enable copy constructors on the concrete subclasses toml::value, toml::array and toml::table so you could perform copying if you cast toml::node to one of it's specializations (see toml::node::as(), toml::node::visit()).

In a potential future version of the library (when I have the time to do a pretty significant chunk of work on it) I'd like to revisit this design and move from inheritance to composition (as discussed in #31), which would allow me to do away with that particular caveat, but alas, that won't be for a while. Let's tentatively say that would be v3.0.

To further complicate things, I believe I noticed the toml::table::operator[] returns a toml::node_view. Which is confusing because we have toml::node_view (obviously non-copyable), and toml::node (less obviously non-copyable).

See above.

I also noticed, that the operator[] does not function the same as the STL's map::operator[].
I have come to expect this behavior, and it is forcing me to rewrite some code where it was simpler to utilize this expectation.

Well, that's unfortunate, but the current design is here to stay. The library is designed to be optimal for config querying (i.e. just reading a value), with ease of modifying/writing coming second. (also, as you point out, the behavior is clearly documented; anyone being surprised by this has just not looked at the docs).

@marzer
Copy link
Owner

marzer commented Jul 27, 2020

I thought about something similar via a clone() (same idea as deep_copy()), but I really do think copy ctors and operator=() provide the simplest mechanics to understand and use.

I agree with this sentiment, incidentally. If I'm going to provide copying, it'll be via copy construction/assignment, since that's the idiomatic C++ way of doing it. Providing a clone() method or similar just feels very... 'Java'...

@marzer
Copy link
Owner

marzer commented Jul 27, 2020

Thanks for the feedback @whiterabbit963 and @Reedbeta! I'll spend a little bit of time on this today and come up with a basic implementation for you to use, and then iterate on it later in the week if you have further feedback.

marzer added a commit that referenced this issue Jul 27, 2020
@marzer
Copy link
Owner

marzer commented Jul 27, 2020

Alright, this is tentatively implemented, pending any feedback I get here.

  • tables, values and arrays now allow copy-construction and copy-assignment
  • table and array insertion methods (push_back, insert etc.) support insertion-by-copy of tables and arrays, not just by moving
  • table and array insertion methods also now allow insertion of toml::node& and toml::node&& directly, without requiring you to cast (they'll internally do a visit() and 'do the right thing' with a reference to the concrete type)

@whiterabbit963
Copy link
Contributor Author

I tested the following. Most things work as expected. Had some trouble with table::emplace() and node_view.

toml::table t1;
t1.insert_or_assign("bar1", toml::array{1, 2, 3});
t1.insert_or_assign("foo1", *t1.get("bar1"));

t1["foo1"] = t1["bar1"]; // does nothing should this fail to compile?
//*t1["foo1"].node() = *t1["bar1"].node(); // compile failure; copying node_view would be a bad thing
toml::array *array1 = t1["foo1"].node()->as_array();
array1->push_back(4);

//t1.insert_or_assign("foo3", t1["foo1"]); // fails to compile; a little unexpected
t1.insert_or_assign("foo2", *t1["foo1"].node());
toml::array *array2 = t1["foo2"].node()->as_array();
array2->push_back("wrench");

toml::table t2 = t1;
//t2.emplace("bar", toml::array{6, 7}); // fails to compile? not sure what I did wrong
t2.insert_or_assign("bar2", toml::array{6, 7});
cout << t1 << endl << t2;

sample output

bar1 = [ 1, 2, 3 ]
foo1 = [ 1, 2, 3, 4 ]
foo2 = [ 1, 2, 3, 4, 'wrench' ]

bar1 = [ 1, 2, 3 ]
bar2 = [ 6, 7 ]
foo1 = [ 1, 2, 3, 4 ]
foo2 = [ 1, 2, 3, 4, 'wrench' ]

@marzer
Copy link
Owner

marzer commented Jul 27, 2020

t1["foo1"] = t1["bar1"]; // does nothing should this fail to compile?

Hmmn. If that compiles, it probably shouldn't! Node views are just views; currently you can't use them on the LHS of an assignment at all (or you're not supposed to be able to).

//t1.insert_or_assign("foo3", t1["foo1"]); // fails to compile; a little unexpected

Hmmn, yeah. This is a bit tricky; node_views represent a node, but they can also represent nothing (i.e. tbl["foo"] where tbl doesn't contain element foo creates a 'null' node_view). This requires a bit of plumbing, but I could figure something out.

//t2.emplace("bar", toml::array{6, 7}); // fails to compile? not sure what I did wrong

You didn't do anything wrong; that should work fine. I've likely fucked up somewhere. See my follow-up below.

Thanks for testing! I'll take your example here and build on it to improve what I have.

marzer added a commit that referenced this issue Jul 27, 2020
marzer added a commit that referenced this issue Jul 27, 2020
@marzer
Copy link
Owner

marzer commented Jul 27, 2020

@whiterabbit963 Pushed implementation pass 2. Things should now work as you expect from your example, with the exception of this line:

//t2.emplace("bar", toml::array{6, 7}); // fails to compile? not sure what I did wrong

I didn't realize when first replying, but that is actually the wrong way to call emplace(); because TOML data is not homogeneous, you need to specify the emplacement type:

t2.emplace<toml::array>("bar", 6, 7);

Which is effectively the same as:

t2.insert("bar", toml::array{ 6, 7 });

Also, a fun FYI: I've enshrined your example in a unit test.

@whiterabbit963
Copy link
Contributor Author

whiterabbit963 commented Jul 27, 2020

Ok, that makes sense. The following also compiles.

t2.emplace<toml::array>("bar", toml::array{6, 7});

I am guessing the fact that it is both a node and an array, makes the type deduction is ambiguous.

@marzer
Copy link
Owner

marzer commented Jul 27, 2020

Yah, that'd be invoking the array's move constructor internally.

@whiterabbit963
Copy link
Contributor Author

Oh fun with templates. These also work.

t2.emplace<toml::array>("bar", toml::array{6, 7}, toml::array{1, 2});
t2.emplace<toml::node>("bar", toml::array{6, 7}, toml::table{});

Also, as you pointed out, insert() takes two arguments a key and a value. emplace() takes a variable number of arguments to insert multiple items at once. The issue comes down to deducing the ValueType.

@marzer
Copy link
Owner

marzer commented Jul 27, 2020

Oh well it's not an issue, it's by design. The function expects you to be explicit.

@marzer
Copy link
Owner

marzer commented Jul 27, 2020

(It's noted as such in the docs)

@marzer
Copy link
Owner

marzer commented Jul 27, 2020

Having said that, line 2 in that toy example shouldn't work... Guess that gives me something to dig into.

@whiterabbit963
Copy link
Contributor Author

This should not work?

t2.emplace<toml::node>("bar", toml::array{6, 7}, toml::table{});

@whiterabbit963
Copy link
Contributor Author

whiterabbit963 commented Jul 27, 2020

I pulled the latest, and that last example no longer compiles. I guess it never compiled. Moved the HEAD back to where I was testing last, and it was broken there as well. It appears that MSVC 2019 won't compile but Clang doesn't flag it as an error. As an aside, I am using qtcreator on windows to compile, and it uses Clang 6 or 7 (it will be v10 soon) to do type analysis checks.

@marzer
Copy link
Owner

marzer commented Jul 28, 2020

This should not work?

No that shouldn't work; nominally speaking it's saying "construct an instance of toml::node with these constructor arguments", but constructing a concrete instance of toml::node is impossible because it's abstract, so if it works there's some type deduction happening that I didn't intend.

As an aside, I am using qtcreator on windows to compile, and it uses Clang 6 or 7 (it will be v10 soon) to do type analysis checks

I test on Clang 7 - 11, GCC 7 - 10 and VS2019 before pushing the code to the remote, so I'll try that example myself and hopefully it'll make sense..

(I haven't been able to test with clang 6 because I can't get Catch2 to compile on it)

@whiterabbit963
Copy link
Contributor Author

That makes sense. I wouldn't worry too much about it. That example fails to compile on my system (as it should).

@marzer
Copy link
Owner

marzer commented Jul 28, 2020

Alright, well fair enough. I've still got a few more test cases to add for the new functionality, but let me know if you encounter some other unexpected behavior in the mean time.

@whiterabbit963
Copy link
Contributor Author

Will do. I should have time to be able to investigate more later today.

@marzer
Copy link
Owner

marzer commented Jul 28, 2020

Hmmn, I've just realized that the TOML source information contained by tables, arrays and values that are created by parsing will be copied into newly-constructed instances generated by the new functionality discussed in this thread:

// TOML:
foo = 1

// C++:
auto tbl = toml::parse_file("config.toml");
std::cout << tbl["foo"].source() << "\n";

tbl.insert("bar", tbl["foo"]);
std::cout << tbl["bar"].source() << "\n";

// output:
line 1, column 6 of config.toml
line 1, column 6 of config.toml

I can't decide if this is 'correct' to be honest. It's reasonable to argue either way; it's not strictly correct for the new copy-constructed value (in that the new instance wasn't derived by parsing), but it was copied directly and exactly from one that was... what do you think @whiterabbit963 (and/or anyone else who cares to chime in)?

I'm leaning towards not copying the source information.

@bobfang1992
Copy link

I am also leaning towards not copying these information as they are strictly linked to the original document but when we are using the copying constructor/assignment operators we are mostly likely manipulating the TOML objects to create a new document. Copying these does not make much sense to me.

@marzer
Copy link
Owner

marzer commented Jul 28, 2020

I am also leaning towards not copying these information as they are strictly linked to the original document [...] Copying these does not make much sense to me.

Yeah, that's pretty much my intuitive sense of it, though I figured I better sanity-check my reasoning.

@whiterabbit963
Copy link
Contributor Author

I would argue for not copying as well.

@whiterabbit963
Copy link
Contributor Author

Is there a source clear method? Would it make sense to nuke all source locations if a copy into a table occurs?

@marzer
Copy link
Owner

marzer commented Jul 29, 2020

There's not, but it's easy to disable to copying without needing to add one.

@marzer marzer closed this as completed in b024ee6 Aug 2, 2020
@marzer
Copy link
Owner

marzer commented Aug 2, 2020

Source information copying: disabled
Tests: added

Feel free to re-open this if you identify anything I've missed ^_^

@Reedbeta
Copy link
Contributor

Reedbeta commented Aug 2, 2020

Neat, thanks for jumping on this so quickly Mark!

@marzer
Copy link
Owner

marzer commented Aug 2, 2020

Neat, thanks for jumping on this so quickly Mark!

Heh. Gotta do something while stuck inside on lockdown, right?

Reedbeta added a commit to Reedbeta/tomlplusplus that referenced this issue Aug 8, 2020
- Creates a node_view pointing to that node. This is similar to how std::string has a conversion operator creating a string_view of the string.
- Also, enabled the move assignment operator for node_view, but made both copy and move assignment operators lvalue-only to avoid the "misleading assignment" issue (see marzer#49 (comment))
marzer pushed a commit that referenced this issue Aug 8, 2020
- Creates a node_view pointing to that node. This is similar to how std::string has a conversion operator creating a string_view of the string.
- Also, enabled the move assignment operator for node_view, but made both copy and move assignment operators lvalue-only to avoid the "misleading assignment" issue (see #49 (comment))
@ned14
Copy link

ned14 commented Aug 11, 2020

Just wanted to thank you for implementing table copying, I literally came here to report the need, and it was already done. Thanks!

@marzer
Copy link
Owner

marzer commented Aug 11, 2020

No worries @ned14!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or enhancement request
Projects
None yet
Development

No branches or pull requests

5 participants