New Go implementation #25
Comments
Please let us know when it reaches maturity. |
I have recently seen the issue on the new release 2.0. It's much too late to state my opinion now, but I disagree with the direction of NestedText. Its primary value was its extreme simplicity, and I see two of the three stated changes as massive regressions, namely:
The other change I do welcome as a much needed improvement:
As such, I feel at an impasse. I am not willing to implement the new official release of "NestedText"; I would rather implement a fork of it that includes only the quoteless keys feature. I mention this to inform you and others, in the interest of spreading NestedText or its derivatives, such as the derivative I described. Perhaps it might have its own name to distinguish itself. I don't know. Anyway, thank you for creating this format and for your new release. I'll be sure to comment when go-nestedtext has developments. |
Hi @torresjrjr, just thought I'd say that my instinct was the same with respect to these new features deviating from the simplicity of NestedText. However, having implemented a Zig parser, I consider this to be a simplification in some sense, in that it is now possible to represent any JSON data structure in NestedText format, whereas before empty lists/objects could not be represented and neither could certain object keys. I don't feel great about the syntax of multiline keys, but on balance I think allowing them in some form is better than not. There are other practicality arguments for allowing 'flow-style' lists and objects mentioned on some of the other recent issues, and on the whole I think they're a positive addition that doesn't break the fundamentals of NestedText (such as each line having a clear type, independent of context on surrounding lines). Anyway, that's just my perspective. There are certainly some elements of the recent changes I'm not 100% happy with (primarily that Good luck with your implementation. |
There seems to be a tension in NestedText. Some people are drawn to it by the simplicity of the format and its implementation. Others are drawn to it because it can hold any text without the need for quoting or escaping. For me, I was looking for a format that could hold code snippets in a form that was simple for the end user, so the simplicity of format and the lack of quoting and escaping are what drove me. I was never concerned about the simplicity of the implementation. Hence, NestedText 2.0. It makes the language complete, in that it can hold any hierarchical collection of dictionaries, lists, and strings using a simple and consistent set of rules (with the proviso that dictionary keys must be strings) at the expense of a more complicated format and implementation. I was okay making the format more complicated because most users would never need or see the new features. Only the power users would ever use, or even know about, the multiline keys and inline lists and dictionaries. Having said that, I still understand the appeal of NestedText in its simplest form. It is tempting to define a subset that would be very easy to implement and would be sufficient for almost all applications. It seems to me that subset would be NestedText 2.0 with the multiline keys and inline dictionaries and lists removed. Then the question becomes: would this bifurcation of NestedText be good or bad for NestedText adoption? I kind of think it would be good for adoption as long as it remained a pure subset. And it would certainly be better than having people creating variants on an ad hoc basis. |
We have decided not to explicitly declare a NestedText subset at this time. If you do choose to implement a subset, please do not allow keys to start with [ or {, which could cause upward compatibility issues for those wishing to use both your subset and the full version of NestedText. |
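The upward-compatibility request above can be sketched as a small check that a subset implementation might run on each dictionary key. This is only an illustration: the function name `is_upward_compatible_key` is hypothetical and is not taken from any NestedText implementation.

```python
def is_upward_compatible_key(key: str) -> bool:
    """Return True if a subset parser can accept this key safely.

    Assumption for illustration: in the full format, a line that begins
    with "[" or "{" starts an inline list or dictionary, so a subset that
    read such a line as an ordinary key would disagree with the full format.
    """
    return not key.startswith(("[", "{"))


if __name__ == "__main__":
    for key in ["name", "[draft] title", "{x}", "a[0]"]:
        verdict = "ok" if is_upward_compatible_key(key) else "reject"
        print(f"{key!r}: {verdict}")
```

A bracket or brace later in the key (as in `a[0]`) is not affected; only the first character matters for this check.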
OK, understood. The current go-nestedtext repo is set at version 1 of NestedText, so there should be no foreseeable changes. I'm not extremely invested in the immediate development of go-nestedtext, nor do I wish to "hijack" the spec, so please don't hesitate to let me know what you'd like to see from go-nestedtext if anything. |
This is an interesting idea, but I'm not sure how well it would tackle the problem. I personally think flow-style containers improve the format quite significantly for reading/writing (although do make the implementation noticeably more complex), and it's other parts of the spec I'm uneasy about. I think a simplified version of the spec would have to go further, also removing multiline keys. The problem is that this removes the ability to specify any valid JSON - certain characters not allowed in object keys, and empty lists/objects not expressible. I find multiline object keys the strangest part of the format, but I have come around to realising their value, and while I think the use of colons can be confusing I can see some merit in using a familiar character for it. The one thing in the spec I'm very reluctant to implement (and have no plans to - this is the only deviation from the 2.0 spec in |
I probably should've weighed in on this in the previous thread, but for what it's worth I think that forbidding empty values in flow-style containers is a mistake. Perhaps this is subjective, but my feeling is that users would be more surprised to find that empty values can't be specified than to find that empty values following trailing commas don't count:
My perspective on this is also influenced by the fact that I use nestedtext extensively for writing unit tests, where empty values are very common. So I don't see empty values as an edge-case that can be excluded in the name of avoiding a minor clarity issue. I do see where you're coming from with the arguments that (i) flow-style containers are a shorthand syntax and (ii) it's easier to add features than to remove them. But I think the empty-value syntax is well-justified. |
I can't imagine why users would be surprised about empty values not being allowed - I don't know of any other languages/formats that allow empty values inside bracketed structures without the use of quotes (e.g. this appears to be disallowed in yaml). I simply disagree that the syntax is intuitive, it's a shame that we don't see eye to eye on this. Maybe I have more of a mind to users who aren't super familiar with programming. E.g. I know of a user who was confused by the need for a space after the '-' for a list in yaml, hence my previously-voiced concern about allowing a leading hyphen in object keys. The precedent for trailing commas comes exclusively from cases where empty values are not allowed (without quotes), as far as I can tell, which is the case I'm saying is confusing. Interesting to hear that empty values are common for you, could you link to an example NestedText file? I'd be interested to compare the flow-style to non-flow-style equivalent. With enough persuasive examples I could start to be more convinced. Without, I have no intention of adding this to the Zig implementation, to discourage people from using it. |
Well, it is not a programming language, but Verilog is the primary language used for representing electronic hardware designs, and it allows empty values in bracketed structures. Specifically, in argument lists you can pass values by name or by order. If you are passing by order you may simply skip a value to indicate that it should take its default value. For example:
Other than NestedText, I can think of no other language that does not support quoting or escaping, and this is a primary feature of NestedText. Certainly you are not suggesting that quoting be added back in? And popular languages such as Python ignore trailing commas. Once you accept the idea that trailing commas are ignored and that quoting is not available, you end up with the current behavior of NestedText. As for whether it is intuitive, that is a personal judgement. The current approach is simple and self-consistent and largely familiar. I would also argue that it is obvious once you remember that trailing commas are ignored. Consider a few cases: So this whole debate comes down to whether we ignore trailing commas. As you pointed out earlier, the case for trailing commas is not strong in NestedText as our commas only act as separators on lists and dictionaries that are contained on a single line. However, ignoring trailing commas allows us to distinguish between empty lists and lists that contain an empty value. If you use NestedText to hold code snippets, something that NestedText is particularly good at, those two cases are very important. You need to be able to support both, and you need to distinguish them. But say we simply disallow trailing commas. Then consider three cases:
Now, again, this time allowing trailing commas:
To me, the second case is simply better. The first has a big hole in it. The only other alternative I can think of is to disallow empty values altogether in inline lists, but that seems like too big a restriction. I am somewhat surprised that this particular aspect of NestedText is causing consternation. The vast majority of users will never encounter inline dictionaries and lists, and those that do will be the more sophisticated users who will not be troubled by this aspect of NestedText. Even among those, few will encounter lists of empty values. But those that do will likely really appreciate the fact that they are available. I am much more concerned with fragmentation of the spec. NestedText is still very young, and its adoption will be harmed if we bicker over the definition and put out versions that are mutually incompatible. I am okay with minimal implementations of NestedText, like the Go implementation by Byron, because they are substantially simpler and are sufficient to handle the primary use case for NestedText, that of being a simple and clean configuration or data file format for casual users. I only ask that such implementations make it clear that they are subsets, that they implement the base set of features, and that they remain upward compatible. But I think it is damaging to put out implementations that add, remove, or modify language features based on personal preference. If you do not like this aspect of the language, I encourage you to simply implement your dump function so that it does not generate it while implementing your load function to support it so your version is compatible with all legal NestedText files. |
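The trailing-comma argument in the comment above can be sketched in a few lines of Python. This is a simplified illustration, not the reference implementation: the helper `split_flow_list` is hypothetical, it handles only a flat one-line list with no nesting, and the "no trailing comma" mode assumes that a final comma is read as a separator followed by an empty value.

```python
def split_flow_list(text: str, ignore_trailing_comma: bool) -> list:
    """Split a flat one-line list such as "[a, b]" into stripped string values."""
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("expected a bracketed list")
    inner = text[1:-1]
    if inner.strip() == "":
        return []  # nothing between the brackets: the empty list
    values = [v.strip() for v in inner.split(",")]
    if ignore_trailing_comma and values[-1] == "":
        values.pop()  # the final comma is a terminator, not a separator
    return values


if __name__ == "__main__":
    for source in ["[]", "[,]", "[,,]"]:
        ignored = split_flow_list(source, ignore_trailing_comma=True)
        kept = split_flow_list(source, ignore_trailing_comma=False)
        print(f"{source:5} ignoring: {ignored!r:12} not ignoring: {kept!r}")
```

Under these assumptions, when the trailing comma is ignored `[,]` yields a list holding one empty string, while without that rule no input made only of brackets and commas yields such a list, which is the gap the comment describes.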
Here's one of my tests that uses a lot of empty values: This file doesn't quite have any good examples of where empty values would be used with flow style. The early tests (
With flow-style:
I actually think this is a good example of how flow-style can improve a document (the original file was written pre-flow-style). |
Note: maybe we should move this discussion to a separate issue if it's going to drag out?
Out of interest, does Verilog allow trailing commas? Does it treat this as an empty value after the comma? I had a quick search and found this, which seems to suggest it treats it as an empty value after the comma (which seems sensible to me), but it also doesn't surprise me that this has caused confusion with the error message!
That's exactly my point - there is no precedent we're leaning on when making the decision here.
I assume it's clear I'm not :)
This is a reach as far as I'm concerned. I accept that there's no quoting in NestedText. I accept that languages with quoting accept trailing commas in general. I reject the idea that languages without quoting that accept empty values should allow trailing commas (especially when the usual benefit of allowing them - diffs when spanning multiple lines - does not apply!).
I can see the logic here - I understood it the first time you explained it in the other issue. Do you understand that I'm seeing it from the other direction - from the user's perspective? The current spec says that
I then think about how to tackle this, and as per your examples, it seems the only consistent option would be to remove the ability to specify empty values in flow-style. I'm yet to be convinced that empty values in flow-style provides sufficient value to outweigh this entirely non-obvious aspect of the language (and this is a pretty high bar as far as I'm concerned).
I'm not sure how this snippet was relevant, perhaps you could elaborate?
I don't see this as a reasonable assumption. When a language feature exists, any user may encounter it (e.g. reading someone else's data file), and there will be users that get confused (even by the most carefully designed features - that's just life 😃). We should seek to minimise this confusion.
I completely agree with this - this is why I'm continuing to try so hard to put my case forward! If I didn't care about fragmentation I'd just live with the difference in the zig implementation and move on :)
Yes, I've been meaning to make this clear, and have created LewisGaul/zig-nestedtext#16.
I hope you understand that this is not driven by personal preference. This is driven by what I believe to be the best choice for the language in the long run. I am always open to being persuaded otherwise, but I haven't been so far on this point.
As I stated in my previous message, I have no plans to implement support for empty values in flow-style. Ideally we would be able to come to an agreement on whether or not there should be language support for them. Otherwise, I would certainly consider adding support if users of my library came to me asking for the feature with convincing use-cases. Currently, I feel there aren't enough people involved in the discussion for any of us to really make a fully informed decision. Some research into what users want and what they find confusing could be very helpful in resolving this. |
Sorry, that was poorly worded. What I meant was 'personal beliefs' rather than 'personal preference'. In particular, our personal beliefs as to what is best for users, developers, and the long term success of NestedText. I think we are all considering each of these things, and nobody knows for sure what is best in each case. What is clear is that we disagree. I did misunderstand your proposal. I thought you wanted to eliminate empty lists rather than eliminating empty values in lists. I'm sure that was due to me not reading your responses carefully. Sorry about that. So to summarize, you feel that allowing empty values in inline lists is confusing and think they should not be allowed, at least in the short term until a larger consensus forms, whereas I think that disallowing empty values in inline lists is too big a restriction. A good example supporting your view is At this point I still believe the current approach as documented in NestedText 2.0 is the best choice, and I assume that you remain unconvinced. So, I'm not sure how to proceed. |
Yes, I have more against the trailing comma than empty values in flow-style, but we've established we can't come up with a way to allow all combinations of empty values without them, hence I'm against empty values. Or perhaps I would prefer the wart to be that it's impossible to represent a list containing a single empty string in flow-style than for this confusing syntax of using trailing commas - a list of a single empty value can be represented on a single line even if not using flow-style, and as you say this is expected to be an edge case anyway. The example of nested flow-style is admittedly fairly convincing (and also isn't solved by my alternative suggestion above). I'm not sure why 0 would be represented by an empty string though. My general hypothesis is that wanting to represent empty strings should be quite rare, and that it may make more sense to use an alternative representation for some of these cases anyway. If this is the case then having slightly ugly syntax for it doesn't seem like the end of the world. But this is currently no more than a hypothesis, and one that you seem to disagree with. As for how to proceed, how about this:
|
Sounds like a good plan. |
The zigforum post is at https://zigforum.org/t/zig-nestedtext-release-0-1-0/383/5 |
Full disclosure: I'm a colleague of Lewis, and mainly have the context he has provided on this issue. Having said that, being his colleague would hopefully make it easier for me to disagree with him if anything! I'd like to offer another voice in support of not allowing lists containing a single empty value (i.e.
In my view, this means that where there are easy analogies to draw with natural languages, they should be used. Natural language uses commas as separators and a trailing comma looks incorrect/incomplete. That means I imagine that non-programmers (and programmers unfamiliar with trailing commas) would infer that a single comma separated two empty values. |
I think everyone is in agreement that using trailing commas to signal empty strings in an inline list is not desirable. But the alternative of simply not supporting empty strings is not desirable either. The disagreement is which is more undesirable. There are now three alternatives, perhaps you can weigh in on what you think is the best approach:
The new proposal matches what people expect in all cases except perhaps in the case of a list containing a single empty string. In this case, it is distinguished from an empty list by the fact that there is one or more spaces between the brackets. These spaces are taken as a value, and then, as with all values, the leading and trailing spaces are removed, leaving an empty string. |
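The rule described in the comment above can be sketched as follows. This is a minimal illustration of the proposal as worded there, not the parser from any NestedText release: the helper `parse_flow_list` is hypothetical and handles only a flat one-line list with no nesting.

```python
def parse_flow_list(text: str) -> list:
    """Parse a flat one-line list; a comma is always a separator, never a terminator."""
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("expected a bracketed list")
    inner = text[1:-1]
    if inner == "":
        return []  # "[]" with nothing at all between the brackets is the empty list
    # Anything else, including bare spaces, is split into values and then stripped,
    # so "[ ]" becomes one value that strips down to the empty string.
    return [value.strip() for value in inner.split(",")]


if __name__ == "__main__":
    for source in ["[]", "[ ]", "[,]", "[a, b]", "[a, ]"]:
        print(f"{source:8} -> {parse_flow_list(source)!r}")
```

The only special case in this sketch is the exact text `[]`; every other input follows the same split-then-strip path, which is what lets `[ ]` stand for a list holding a single empty string.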
Just wanted to explicitly add my other suggestion - trailing commas are disallowed but empty values are allowed (i.e. a comma at the end of a flow-style list indicates that an empty value follows), and we just accept that there's no way to represent a list containing a single empty string in flow-style (noting it can be expressed with a single line anyway when not embedded in another flow-style structure). |
So, it is like #3, except that Which is your preference? |
Yes, that's correct. I'm actually not sure which I prefer, but personally feel any of the alternatives are better than status quo. |
I think 3 offers a good compromise between ease-of-understanding and what can be represented. I think that a special case of |
I don't like 3. It's very counter-intuitive for
I don't like 4 either. Whether or not a value can be represented shouldn't depend on what other values are in the container. I still think that 1 is clearly the best syntax. It allows empty values to be represented and uses standard, well-known syntax rules. It allows unambiguous cases like |
I have decided to deprecate trailing commas and instead go with the idea of having I have updated the documentation, the tests, and the Python implementation. I will give some time to let things settle before releasing version 3.0. Hopefully this will be the last enhancement that is not backward compatible. |
The changes have been incorporated into version 3.0, which is now available. |
I have created an implementation of NestedText in Go.
https://git.sr.ht/~torresjrjr/go-nestedtext
It's in early development, and provides only an executable which converts NestedText to JSON for now. I wish to create a library too.
Comments and critiques welcome.