Can dotted keys define sub-tables within tables indirectly defined in header form? #771

Closed
Validark opened this issue Sep 21, 2020 · 26 comments

@Validark

From: #769

So we have this example in the spec:

"The [table] form can, however, be used to define sub-tables within tables defined via dotted keys."

[fruit]
apple.color = "red"
apple.taste.sweet = true

[fruit.apple.texture]  # you can add sub-tables
smooth = true

Is the following also a valid TOML document (just changing the order)?

[fruit.apple.texture]  # you can add sub-tables
smooth = true

[fruit]
apple.color = "red"
apple.taste.sweet = true

@pradyunsg
Member

Yes.

@ChristianSi
Contributor

To explain a bit more thoroughly: the order of table blocks in a document doesn't matter (except where arrays of tables are concerned), just like the order of key/value pairs in a table block doesn't matter. Hence it's obvious that, since the original document is known to be valid, the reordered document must be valid as well.

Regarding the question of which tables are DEFINED in this document, it must be remembered that each table block DEFINES the table given as its name as well as any subtables mentioned to the left of dots in dotted keys (if any are used). So, to spell this out in comments:

[fruit.apple.texture]  # you can add sub-tables
smooth = true
# The table block ending here DEFINES fruit.apple.texture

[fruit]
apple.color = "red"
apple.taste.sweet = true
# The table block ending here DEFINES fruit, fruit.apple, fruit.apple.taste

Reading the DEFINES comments, you can easily see that no table is defined more than once, hence the rule forbidding the redefinition of tables is fulfilled.
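
The bookkeeping above can be sketched in a few lines of Python. This is only an illustrative model, not how any particular TOML parser works: `tables_defined_by_block` is a hypothetical helper, each block is reduced to its header plus the keys used inside it, and only bare (unquoted) keys are assumed.

```python
def tables_defined_by_block(header, dotted_keys):
    """Return the tables a single table block DEFINES (hypothetical helper).

    header      -- the table name from the [header], e.g. "fruit"
    dotted_keys -- the keys used inside the block, e.g. ["apple.color"]
    Assumes bare keys only (no quoted keys containing dots).
    """
    defined = [header]
    for key in dotted_keys:
        parts = key.split(".")
        # Every prefix to the left of a dot names a subtable of the header.
        for i in range(1, len(parts)):
            name = header + "." + ".".join(parts[:i])
            if name not in defined:
                defined.append(name)
    return defined


blocks = [
    ("fruit.apple.texture", ["smooth"]),
    ("fruit", ["apple.color", "apple.taste.sweet"]),
]

all_defined = []
for header, keys in blocks:
    all_defined.extend(tables_defined_by_block(header, keys))

# No table appears twice, so the "no redefinition" rule holds.
no_redefinition = len(all_defined) == len(set(all_defined))
```

Swapping the two entries in `blocks` changes the order of `all_defined` but not its contents, which matches the point that block order does not matter here.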

@genericptr

I think as far as the parser is concerned [fruit.apple.texture] defined 3 tables: "fruit", "fruit.apple" and "fruit.apple.texture". It's my understanding of the dot syntax that it defines all tables to the left of the dot also (if they don't already exist), or is that only if it's outside of []?

@marzer
Contributor

marzer commented Sep 25, 2020

It's my understanding of the dot syntax that it defines all tables to the left of the dot also (if they don't already exist)

It does, but in an implicit manner; once you directly name the table it becomes explicit. You can only explicitly name a table in a [header] once, but you can reference it implicitly as often as is necessary.

[fruit.apple.texture]  # 'fruit' and 'fruit.apple' are defined implicitly since they didn't already exist

[fruit.pear]  # 'fruit.pear' is defined explicitly, 'fruit' is referenced implicitly

[fruit] # 'fruit' is now defined explicitly

[fruit.watermelon]  # still ok; 'fruit' is referenced implicitly

[fruit] # NOPE, we've already defined 'fruit'!

dotted.keys work in the same fashion, though do be mindful that you can't mix-and-match [names.used.in.table.headers] and dotted.keys; if the fully-qualified 'path' of any key equals that of a table (implicit or explicit, doesn't matter), then you've mixed concepts and it's a name collision violation.

This makes sense to me when I think about it from a programming perspective (in terms of variable scoping etc), but I can see how it would be confusing. I'm not gonna even begin to pretend to know how to make this part of the spec clearer... My gut feeling is that it's best explained by example, though the existing examples already demonstrate it pretty clearly.

@ChristianSi
Contributor

ChristianSi commented Sep 25, 2020

@genericptr No, a table name (in brackets) is NOT a dotted key, and, while a dotted key might DEFINE one or more tables, each and every table name DEFINES exactly one. DEFINES, that is, in the sense of the spec — which I write in capital letters, and what @marzer calls "explicitly defines".

It is true that additional tables might be CREATED in the parsed data structure in order to place the new table there, but make no mistake: they were not yet DEFINED! Thus I would rather use a different term such as CREATED for what @marzer calls (rather confusingly) "implicitly defined". Each and every table must be DEFINED at most once, and it will be CREATED just once, but it may be CREATED before it is DEFINED. To demonstrate by example.

[fruit.apple.texture]
# This table block DEFINES and CREATES fruit.apple.texture, and it CREATES
# fruit and fruit.apple (which are needed to contain the newly DEFINED table)

[fruit.pear]
# This table block DEFINES and CREATES fruit.pear

[fruit]
# This table block DEFINES fruit (and CREATES nothing)

[fruit.watermelon] 
# This table block DEFINES and CREATES fruit.watermelon

[fruit.watermelon.texture.details]
# This table block DEFINES and CREATES fruit.watermelon.texture.details, and
# it CREATES fruit.watermelon.texture

At the end of this fragment, five tables have been DEFINED, but seven have been CREATED. If the TOML document is now finished and the parsed document is returned to a user, they won't notice any difference whatsoever between a table that has been DEFINED and one that has merely been CREATED (the difference is important for the parser, but not for the user). But if the document is NOT yet finished, any not-yet-DEFINED table might still be DEFINED, regardless of whether or not it has already been CREATED.

[fruit.apple]

[fruit.watermelon.texture]

Now every CREATED table has also been DEFINED. But this will by no means be the case in every TOML document.
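
To double-check the counts, here is a rough Python sketch of the CREATED/DEFINED bookkeeping. It is a simplified model under stated assumptions: `track_headers` is a hypothetical name, it looks at headers only (no key/value pairs, no arrays of tables, no quoted names), and it assumes the input is valid, so it does no error checking.

```python
def track_headers(headers):
    """Model which tables a sequence of [headers] CREATES and DEFINES.

    Simplified sketch: headers only, bare names only, and the document is
    assumed to be valid (no redefinition check is performed here).
    """
    created, defined = [], []
    for header in headers:
        parts = header.split(".")
        # Every prefix of the name must exist; create the missing ones.
        for i in range(1, len(parts) + 1):
            name = ".".join(parts[:i])
            if name not in created:
                created.append(name)
        # Only the full name given in the header is DEFINED.
        defined.append(header)
    return created, defined


fragment = [
    "fruit.apple.texture",
    "fruit.pear",
    "fruit",
    "fruit.watermelon",
    "fruit.watermelon.texture.details",
]
created, defined = track_headers(fragment)
print(len(defined), len(created))  # 5 7
```

Appending `"fruit.apple"` and `"fruit.watermelon.texture"` to `fragment` leaves `created` unchanged and brings `defined` up to the same seven names.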

@marzer
Contributor

marzer commented Sep 25, 2020

rather confusingly

It's not a confusing distinction to me, if you want to be snide about it @ChristianSi. "Implicit" and "Explicit" are simple and ubiquitous concepts. I wasn't suggesting that terminology be added to the spec.

@ChristianSi
Contributor

ChristianSi commented Sep 25, 2020

No snideness intended, just hoping to help clear up the confusion!

But talking about "implicit" AND "explicit" definitions is a really bad idea here, since it implies that every table can be defined twice – once implicit and another time explicit. But they can't, the spec is very clear on that point. Hence another terminology is needed.

@eksortso
Contributor

The words "create" and "define" as described by @ChristianSi are a great addition to TOML's lexicon. They would work very well in the spec. We can say that a header "creates and defines" a subtable, that dotted keys create subtables that haven't been introduced yet but don't define them per se, and so forth. (Sorry I don't have the spec in front of me.)

I know that @Validark wanted to write up some language to clear up these concepts, and @marzer already seems to have the concepts down pat. I'd contribute but I don't have time right now. So I hope that "create" and "define" get used and used well.

@genericptr

So, using the new term "create", how do we explain this snippet? If this were dotted keys it would be legal, but it's not for table headers.

It's just inherently confusing that headers and dotted keys have different rules at all. I would think the rules should be the same but headers open a new table scope, while dotted keys do not.

It would be natural for the logic of parsing to be the same for all dotted keys, whether they are within a header or not.

# DO NOT DO THIS EITHER

[fruit]
apple = "red"

[fruit.apple]
texture = "smooth"

@genericptr

Just some thoughts on what a more technical description could look like, intended for programmers making parsers or advanced users etc... This along with more JSON examples would have been helpful for my styling of thinking.

  1. Tables are not nested; to nest tables, use dotted key names.
  2. Dotted key names are used for nesting and are evaluated from left to right.
    2a) if the key name exists, advance to the next name.
    2b) if the key name does not exist, then create a new table as a child of the last table.
    2c) once the final (rightmost) key name is reached:
    3a) if the table name already exists, then give an error "already defined table"
    3b) if the table name doesn't exist, create a new table (same as 2b).
    3c) finally, the current table is now "open" and all subsequent key/value pairs are relative to this table (until the table is closed)
  3. Open tables are closed when the next table [...] or array of tables [[...]] is encountered.

@eksortso
Contributor

So using new the new term "create" how do we explain this snippet. If this were dotted keys it would be legal, but it's not for table headers.

No, it would not be legal, actually. Try parsing this, and you'll get a duplicate key error.

fruit.apple = "red"
fruit.apple.texture = "smooth"  #INVALID

In this example, fruit.apple gets defined as a string, and then when fruit.apple.texture is parsed, it identifies fruit.apple as a table but cannot create it because fruit.apple is already defined as a string. It can't be both a string and a table, and the parser throws an error.

In your example, fruit.apple gets defined as a string, but a table section [fruit.apple] cannot be created. fruit.apple already exists! It's the same error: keys can never be duplicated.
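
A small Python sketch of why the dotted-key version fails. This is an illustrative model, not a real TOML parser: `set_dotted` is a hypothetical helper, tables are modeled as plain dicts, and only bare keys are handled.

```python
def set_dotted(table, dotted_key, value):
    """Assign `value` at `dotted_key` inside `table` (simplified sketch).

    Assumes bare keys only. Raises ValueError where a TOML parser would
    report a duplicate key.
    """
    *parents, last = dotted_key.split(".")
    for name in parents:
        child = table.setdefault(name, {})
        if not isinstance(child, dict):
            # The name is already bound to a non-table value.
            raise ValueError(f"key {name!r} is already defined as a value")
        table = child
    if last in table:
        raise ValueError(f"key {last!r} is already defined")
    table[last] = value


doc = {}
set_dotted(doc, "fruit.apple", "red")
try:
    set_dotted(doc, "fruit.apple.texture", "smooth")
except ValueError as err:
    print(err)  # key 'apple' is already defined as a value
```

The same check is what rejects a later `[fruit.apple]` header: the name `apple` inside `fruit` is already taken by a string.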

@genericptr

In your example, fruit.apple gets defined as a string, but a table section [fruit.apple] cannot be created. fruit.apple already exists! It's the same error: keys can never be duplicated.

Of course, I was brain dead on a long haul flight when I wrote that. :)

@eksortso
Contributor

It's just inherently confusing that headers and dotted keys have different rules at all. I would think the rules should be the same but headers open a new table scope, while dotted keys do not.

You ought to expect that different syntaxes have different rules, otherwise, why bother having different syntaxes? Inline tables exist, too, so why should they? The three ways of creating tables have different degrees of flexibility, convenience, and conciseness. You could write a TOML document using only section headers to create and define tables. But it would not necessarily make an easily readable document.

I don't understand what you're saying about "nested tables." You can nest tables with all three syntaxes. But you must adhere to the rules for each syntax. And every table, except the root (more about that later) must have a name, i.e. the key on which it's created and defined.

Dotted keys aren't headers; you could define two subtables at once in the same section with dotted keys. It's not recommended, but it may serve a good purpose.

[fruit]
# Colors
apple.color = "red"
strawberry.color = "red"
# Textures
apple.texture = "smooth"
strawberry.texture = "bumpy"

A table defined by a dotted key must have all of its non-subtable values defined within the same section. That excuses subtables, though. This is essentially your third rule in practice: fruit.apple and fruit.strawberry are defined here, BUT they may have subtables created in other sections.

And inline tables are entirely self-contained. Outside of the curly braces, no part of the inline table can be created or defined. Opened and closed completely, per your lingo.

The rules do enforce an implicit principle: every single table has its key/value pairs defined in exactly one place in the document. Dotted keys are more flexible than inline tables, but the tables that dotted keys create are fully defined within the scope of the section that they appear in.

It would be natural for the logic of parsing to be the same for all dotted keys, whether they are within a header or not.

They are. I've been saying "section" instead of header, because there is one section of the document that has no header: the top-level section before all headers, which itself defines a table. That's the root table, the table from which all subtables, however they're created, ultimately descend. It follows all the same rules as tables defined under the headers. There's only one difference: the root table has no name.

@genericptr

I don't understand what you're saying about "nested tables." You can nest tables with all three syntaxes. But you must adhere to the rules for each syntax. And every table, except the root (more about that later) must have a name, i.e. the key on which it's created and defined.

How do you nest tables using [] syntax? I don't see how that's possible because there is no closing terminator for the table.

@eksortso
Contributor

How do you nest tables using [] syntax? I don't see how that's possible because there is no closing terminator for the table.

With the bracket notation, sections begin with a header (or start of file) and end on the line before the next header (or end of file). You don't have to arrange the text like JSON does to allow for nesting. You list out your tables section by section. This encourages users to keep similar things together in one subtable to make them easier to find.

depth_here = 0
[one_deep]
depth_here = 1
[one_deep.two_deep]
depth_here = 2
[one_deep.two_deep."how deep?"]
depth_here = 3

In the rare cases where you want a subtable to be defined right in the middle of a table, you can use dotted keys or inline tables.

[one_deep]
depth_up_here = 1
two_deep.what = 2
two_deep.i_said_what = 2
depth_right_here = 1
two_again = {depth = 2, depth_again = 2}
depth_down_here = 1

Remember that TOML is primarily a section-based configuration format. Nesting is possible, as you see above. But deep nesting is not encouraged, and that's intentional.

This may be lifted over time if TOML gets used for general data more often, but that is a completely different issue. It depends on where we want TOML to go in the future.

@genericptr

Yes, the nesting occurs by using dotted keys.

@eksortso
Contributor

eksortso commented Sep 28, 2020

You're still not being clear about what you mean by "nesting." Why do you consider the use of dotted keys "nesting" and the use of an inline table not, when both create subtables with syntax inside a section? And why don't you consider a subtable header a way to provide "nesting" when it clearly adds a subtable?

@genericptr

genericptr commented Sep 28, 2020

sorry, nesting means:

[aaa]
   [bbb]
      [ccc]

creates a structure like:

{
  "aaa": {
    "bbb": {
      "ccc": {
      }
    }
  }
}

but table headers don't nest in TOML, so we need to use inline tables or dotted keys.

@eksortso
Contributor

Thank you, yes. Headers follow the INI notion of delineating sections of the document, and the referencing is always absolute. The indentation doesn't matter.

[aaa]
[aaa.bbb]
[aaa.bbb.ccc]

However, there is a one-liner, [aaa.bbb.ccc], that will get you exactly that structure, because subtables that are implied but not defined are left empty.

@ChristianSi
Contributor

ChristianSi commented Sep 28, 2020

@genericptr Note that [aaa.bbb.ccc] is NOT a dotted key, it's a table name. The names of tables that aren't direct children of the (nameless) root table will always contain one or more dots.

You could call such table names dotted table names if you prefer, but I don't think that would be particularly helpful. Just keep in mind that they aren't keys.

Table names are always absolute, while keys are always relative (to the table they are in).

@genericptr

Just wanted to double confirm this. So the "dotted keys" inside tables [] are literal names and not children of each other? Initially my parser did this:

# [x] you
# [x.y] don't
# [x.y.z] need these
[x.y.z.w] # for this to work

[x] # defining a super-table afterwards is ok

which makes this structure according to the dotted-key syntax:

{
  "x" : {
    "y" : {
      "z" : {
        "w" : {}
      }
    }
  }
}

You're saying that's not correct, right? Otherwise the comment "defining a super-table afterwards is ok" should be false because you'd be redefining the table x. I feel like if the spec had JSON examples accompanying ALL TOML that would really help us poor folks making parsers. :)

@eksortso
Contributor

eksortso commented Oct 1, 2020

@genericptr No, your TOML example would produce the structure exactly as you have it written out in JSON. The header [x.y.z.w] would create the table x (and x.y, and x.y.z, etc.) but x would not be defined yet. The subsequent header [x] starts the section where x is defined.

This was exactly what I said before.

@genericptr

genericptr commented Oct 1, 2020

Is this correct? The second [x] is now invalid? This rule of "created" vs "defined" is very confusing. From the parser's perspective you've added the key "x" into the object, so when I encounter "x" again I give an "already exists" error. What you're saying, I think, is that I should mark them as "undefined" in addition to adding them to the object (talking in JSON terms btw) and then NOT give the error if the key exists but is merely undefined.

[x.y.z.w]

[x]

# this is invalid now because x has both been created AND defined
[x]

@eksortso
Contributor

eksortso commented Oct 1, 2020

@genericptr Yes, your example is correct.

I'll see if I can describe table creation and table definition enough to explain what's going on. When a header is encountered, a TOML parser will create any table or subtable in the table's full name. In the example, the created tables are x, x.y, x.y.z, and x.y.z.w. But the only table that gets defined is x.y.z.w, at the end of the section. The three super-tables are as yet undefined. So when [x] is encountered, the parser knows that x is already created, and it starts the section that will define x. A table can only be defined in one place, so when the second [x] is encountered, the parser throws an error.
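
For parser writers, a minimal Python sketch of that behavior might look like the following. It is only a model under stated assumptions: `parse_headers` is a hypothetical function, it handles headers with bare names only (no key/value pairs, dotted keys, or arrays of tables), and it keeps the "defined" marker in a separate set rather than inside the tables.

```python
import json


def parse_headers(headers):
    """Build the nested structure for a list of [headers] (hypothetical sketch).

    Tables are plain dicts; `defined` records which paths have had their own
    header, which is the extra "flag" that a key-exists check alone lacks.
    """
    root, defined = {}, set()
    for header in headers:
        path = tuple(header.split("."))  # assumes bare, unquoted names
        table = root
        for name in path:
            # Create the table if it is missing; otherwise just descend.
            table = table.setdefault(name, {})
        if path in defined:
            raise ValueError(f"table [{header}] is already defined")
        defined.add(path)
    return root


print(json.dumps(parse_headers(["x.y.z.w", "x"])))
# {"x": {"y": {"z": {"w": {}}}}}

try:
    parse_headers(["x.y.z.w", "x", "x"])
except ValueError as err:
    print(err)  # table [x] is already defined
```

Keeping the marker outside the tables means the returned structure carries no trace of it, which fits the earlier point that users never see the difference between created and defined tables.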

@genericptr

I think that makes sense. Let me try to implement it now. This definition of "define" vs "create" would be great to add to the spec. I also suggest (if I may) to add JSON examples to all TOML examples (maybe a little popup or something would be nice and out of the way).

@eksortso
Contributor

eksortso commented Oct 1, 2020

I may describe the concepts in depth in toml.md, once I find some time. I'm not sure if anyone else is doing this at the moment.

It may be more instructive to review the tests in the compliance project, if you want to see TOML and JSON comparisons, although JSON has no date/time types. You may want to look at what folks are doing on toml.io as well.
