Mixed-type arrays still legal, even though the intent appears to have been to forbid them #553

Closed
rmunn opened this issue Aug 2, 2018 · 30 comments

Comments

@rmunn

rmunn commented Aug 2, 2018

Creating a new issue for this because the history of this discussion is spread among multiple old (and now-closed) issues, and it's not at all clear which one should be used to re-open the discussion.

The TOML spec currently (2018-08-02) says that data types may not be mixed inside an array, but if the array contains nested sub-arrays then mixing data types is allowed. The example given is:

arr5 = [ [ 1, 2 ], ["a", "b", "c"] ]

In #28 (comment), the comment was made that strongly-typed languages like Haskell are going to have a very difficult time mapping arr5 to a native data structure, because of the mixing of types in the inner arrays. That is, arr5 is an array that contains two different types: "array of ints" and "array of strings", and in Haskell those two data types are completely different and can't be combined in the same collection: arr5 is neither an "array of arrays of ints" nor an "array of arrays of strings" and is thus invalid.

After some discussion, @mojombo decided to forbid mixing data types in arrays, and moved the [ [ 1, 2 ], ["a", "b", "c"] ] example to the "things that are not allowed" section of the spec. He created #154 for this, which also included adding a syntax for tuples. The discussion on #154 went on for some time, with a lot of back-and-forth between @dahu (who wanted to allow heterogenous arrays) and @BurntSushi (who was in favor of keeping arrays homogenous and forbidding arrays like [ [ 1, 2 ], ["a", "b", "c"] ]).

In the end, #154 ended up being closed in favor of #235 (motivation found in #219). But what was lost in the move from #154 to #219 was the decision about homogenous arrays. #154 updated the spec to forbid [ [ 1, 2 ], ["a", "b", "c"] ], but #219 did not.

So despite the fact that both @mojombo and @BurntSushi appear to have been in favor of forbidding [ [ 1, 2 ], ["a", "b", "c"] ], the spec still allows that mixing today. This has caused some confusion (see #28 (comment) for example), which is why I think the discussion needs to be re-opened.

Is [ [ 1, 2 ], ["a", "b", "c"] ] legal syntax or not?

The consensus of the TOML maintainers in those comments appears to have been to forbid that mixed array, but that part of the spec change got lost in the move from one PR to another. Was that intentional, and a reversal of the previous decision? Or was the intention always to forbid mixed arrays, but the spec change just fell through the cracks?

@dahu

dahu commented Aug 2, 2018

FWIW, the current README (https://github.com/toml-lang/toml#array) says:

Data types may not be mixed (different ways to define strings should be considered the same type, and so should arrays with different element types).

Explicitly stating that arrays of mixed-type sub-arrays should be considered the same type (array of array, I presume). I don't know what Haskell will make of that.

I don't have any skin in this game, so I'll leave it to you guys to hammer out.

@rmunn
Author

rmunn commented Aug 2, 2018

Yes, that's precisely the part of the README that I meant to quote in my issue description, but I failed to actually quote it. Thanks!

@bitwalker

I had actually implemented array typing in my parser before I re-read that section and realized they were actually allowed. That said, I'm not sure the typing argument holds much water, as tables themselves already allow a mixture of types for values - presumably parsers which can't support heterogeneous arrays would need to handle them with the same care that they take to handle tables which don't map cleanly to some type they've defined (whether that's a struct, or just a typed map). I doubt a language which does not permit arrays like that would permit heterogeneous maps either. Either way, this problem already exists with JSON, for example. As far as I'm aware, all of those languages have mechanisms to cope with the dynamic nature of the format - why would TOML be any different?

@lmna

lmna commented Aug 2, 2018

Is [ [ 1, 2 ], ["a", "b", "c"] ] legal syntax or not?

[1, "a"] is forbidden, but [[1],["a"]] has been explicitly allowed since TOML version 0.1.0. This feels inconsistent, but the suggested fix is backwards-incompatible.

What is the use-case for things like [ [ 1, 2 ], ["a", "b", "c"] ] in configuration files?

@bitwalker

bitwalker commented Aug 2, 2018

How about something simpler, which demonstrates why the definition of "type" is pretty loosely used in the spec:

deps = [ 
  { name = "foo", path = "../foo" },
  { name = "bar", url = "http://....", port = 8080 },
]

According to the spec, these are the same type, table; but from a strongly typed perspective, these are different types, similar for sure, but different. I would guess this type of config is quite common, or at least common enough. I don't really have a problem with the inconsistency, but it's hard to argue that this isn't at least a little confusing to people.

@eksortso
Contributor

eksortso commented Aug 3, 2018

Although I missed all of the past arguments leading to homogenous arrays, from the spec alone, the typing seems simple enough to understand.

TOML typing is "strong but shallow," I'd call it. There's a fixed list of types (String, Integer, Float, Boolean, Datetime, Array, Inline Table) and every value has one of those types. As for arrays, all elements of a given array are the same type (call it the array's "subtype" if you wish), that is, one of the following: String, Integer, Float, Boolean, Datetime, Array, or Inline Table.

So an Array of Arrays is properly homogenous in TOML v0.5.0, even if the subtypes of the inner arrays differ. The subtypes don't matter outside of the inner arrays.
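As a rough illustration of this "strong but shallow" reading, here is a minimal Python sketch. It assumes the TOML has already been parsed into plain Python values (dict, list, str, int, float, bool, and datetime objects); the function names toml_type and is_shallowly_homogeneous are made up for this example and do not come from the spec or from any parser.

```python
from datetime import date, datetime, time


def toml_type(value):
    """Return the shallow TOML type name of an already-parsed Python value.

    Hypothetical helper for illustration; not part of any TOML library.
    """
    # bool must be tested before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        return "Integer"
    if isinstance(value, float):
        return "Float"
    if isinstance(value, str):
        return "String"
    if isinstance(value, (datetime, date, time)):
        return "Datetime"
    if isinstance(value, list):
        return "Array"
    if isinstance(value, dict):
        return "Inline Table"
    raise TypeError(f"not a TOML value: {value!r}")


def is_shallowly_homogeneous(array):
    """True if every element has the same shallow TOML type.

    The element types of nested arrays are deliberately ignored.
    """
    return len({toml_type(item) for item in array}) <= 1
```

Under this check, [[1, 2], ["a", "b", "c"]] passes because both elements are of type Array, while [1, 2.0] and [1, "a"] do not.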

Not knowing Haskell, I don't know the burden that the TOML types bring upon the language, nor how they can be properly handled. But surely a specific config format demands specific types be used for a config file to be considered valid.

@bitwalker

If the purpose of homogeneous arrays in TOML is to prevent issues with strongly typed languages decoding TOML data structures, then allowing an array of arrays with different subtypes defeats that goal, as you either have to treat the type of the outer array as the common supertype of all the inner arrays, sacrificing the utility of types, or rely on whatever reflection mechanism the language provides, which usually incurs expensive overhead and is basically an escape hatch anyway.

I don't think anyone is debating whether types in the spec are valuable, they definitely are (in my opinion), and are a strength of TOML over things like JSON. That said, there doesn't appear to be a good justification, from a type system perspective, why heterogeneous arrays are disallowed, but arrays of arrays with different subtypes are allowed - it's unsound.

Let's take it at face value that arrays have a type of Array; then an array of Array is itself Array, and an array of [1, "foo", 3.14] is also of type Array. But the spec claims that arrays effectively have a subtype, Array[T], in which case an array of arrays should have type Array[Array[T]], which logically prohibits arrays of arrays with different subtypes.
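To make the Array[T] reading concrete, here is a minimal Python sketch of a deep check, assuming values already parsed into plain Python objects. The name deep_type is hypothetical, datetimes are left out for brevity, and the handling of empty arrays is a simplification chosen for this example rather than anything the spec defines.

```python
def deep_type(value):
    """Return a structural type such as 'Array[Array[Integer]]'.

    Raises TypeError when the elements of any array disagree on their
    deep type. Hypothetical helper for illustration only.
    """
    # bool is a subclass of int in Python, so it has to be checked first.
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        return "Integer"
    if isinstance(value, float):
        return "Float"
    if isinstance(value, str):
        return "String"
    if isinstance(value, dict):
        # Tables stay untyped here, as in the spec.
        return "Table"
    if isinstance(value, list):
        element_types = {deep_type(item) for item in value}
        if len(element_types) > 1:
            raise TypeError(f"mixed element types: {sorted(element_types)}")
        # An empty array has no element type; "?" marks it as unknown.
        # Simplification: Array[?] will not unify with Array[Integer].
        inner = element_types.pop() if element_types else "?"
        return f"Array[{inner}]"
    raise TypeError(f"unsupported value: {value!r}")
```

With this reading, [[1, 2], [3]] has type Array[Array[Integer]], whereas [[1, 2], ["a", "b", "c"]] is rejected because its elements are Array[Integer] and Array[String].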

I think TOML should make a choice, post-1.0, to either commit to subtyping Array, or abandon any subtyping of arrays - I would favor the latter, since I don't believe homogeneous arrays actually solve the problem it appears they were designed to solve - but I could be convinced either way. What I don't think we can do is leave this kind of weird hole in the typing scheme. The statement "subtypes don't matter outside of the inner arrays" doesn't make sense in the context of type systems, because they absolutely do matter in a strongly typed language, as the type of the outer structure is dependent on the type of its members.

@rmunn
Author

rmunn commented Aug 3, 2018

@bitwalker's example from #553 (comment) is a good starting point to explain why I'd like "array of mixed array types" to be forbidden. As I implement a TOML parser in F# (which, like C#, is strongly-typed), I have to choose how to represent arrays in the values I return to the user. Arrays of integers or strings will simply be int[] or string[]. Arrays of tables... well, tables must be mixed and there are very good use cases for it. So an array like @bitwalker's deps example, I will have to represent as object[].

Now, if the only thing I represent as object[] in my library is "array of tables", then that's easy for the library-consuming code to handle. If you see an object[], then treat each object as a table and cast it to the appropriate table type. BUT if arrays of mixed array types are allowed, I have a problem. Because F# (and C# as well) won't let me create an array that contains both int[] and string[]. The only way I can do so is by casting both the int array and the string array to object, and making an array of type object[]. And suddenly the type object[] has two meanings in my library, and the user consuming the library has to resort to runtime type-checking to distinguish between them. No longer can object[] be treated safely as "array of tables"; now the end user who sees an object[] has to check whether it's an array of tables, or an array of arrays (and he further has to tell what kind of arrays are contained in that array of arrays, which is quite difficult).

Permitting heterogenous tables does present a few difficulties too, of course, but there are very good use cases for it. But there are no good use cases for heterogenous arrays given that tables exist; for any valid use for heterogenous arrays (such as ["example.com", 80]), that purpose can be better served by a named table: { host="example.com", port=80 }.

Therefore, allowing arrays of arrays to have mixed types is not only a weird inconsistency in the spec, it also makes things more difficult for strongly-typed languages without a good reason for creating that difficulty. And so I'm arguing against having it in there.

Also, both @mojombo and @BurntSushi appear to have intended to remove the "arrays of arrays of mixed types" example from the spec, but it seems to have fallen through the cracks, which is another reason I want to revive this discussion before 1.0 hits.

@ChristianSi
Contributor

@rmunn:

it seems to have fallen through the cracks, which is another reason I want to revive this discussion before 1.0 hits.

I think that ship has sailed. The current spec says:

As of version 0.5.0, TOML should be considered extremely stable. The goal is for version 1.0.0 to be backwards compatible (as much as humanly possible) with version 0.5.0.

It's certainly humanly possible to leave Array of (mixed) Array in the spec, so unless the spec writers want to declare themselves liars, we are talking about a possible change to TOML 2.0.

Personally, I think the current decision is quite reasonable. As @eksortso said, TOML's typing is "strong but shallow." It would be madness to force a type upon tables, so arrays of (untyped) inline tables must remain allowed, but allowing arrays of (arbitrary) tables while forbidding arrays of (arbitrary) arrays would feel weird and inconsistent to me.

Also, as a Haskell programmer I can confirm that Haskell can deal with JSON's extremely weak typing just fine, so TOML's much stronger (though not perfectly strong) typing certainly won't present insurmountable obstacles.

@rmunn
Author

rmunn commented Aug 3, 2018

Except that since there's no use case that I can think of, and therefore nobody using it in practice, there's no downside to removing it. The goal was stability before 1.0, but a goal can be changed with a good enough reason: and "Oops, we meant to do this but forgot" is, IMHO, a good enough reason when there's zero downside and a significant benefit (ease of use in strongly-typed languages).

Is anyone using this in practice? Please speak up if you are; that would be a VERY strong argument against removing this weird corner case, because if it's actually useful to someone then it's not a weird corner case.

But if nobody is using it in practice, I'd argue that removing it comes at basically no cost, since in the languages where this matters it's actually harder to allow mixed arrays than it is to forbid them.

@TheElectronWill
Contributor

@rmunn

I'd argue that removing it comes at basically no cost, since in the languages where this matters it's actually harder to allow mixed arrays than it is to forbid them.

In other languages such as Java it's more tedious to forbid them than to allow them. You'll have to check the class of the elements, which on a shallow level is fine but not on a deep level because lists all have the same type at run-time.

Regarding your library: shouldn't an array of tables be something like dictionary[] instead of object[]? (sorry if there's an obvious reason, I've never used F# ^^)

@demurgos

demurgos commented Aug 10, 2018

Hi,
Sorry for jumping this late in the discussion, but I consider the current requirement for array homogeneity to be surprising and wrong.
This requirement is pushing the need to embed a type checker in the parser, and the lack of deep type matching causes soundness issues as described in #553 (comment).

For statically/strongly typed languages, the representation of arrays of mixed values is a solved issue: it should be represented as a sum type. So you'd have a TomlValue[] or map of TomlValue. You need to have this kind of type anyway for table values; I don't see why it wouldn't be the type of array items. In languages such as C#, you can for example represent sum types with inheritance and add a safe API on the parent class: the instances are already tagged with the name of the concrete class. Same for Java. Haskell, OCaml, Rust, etc. support sum types out of the box. In Javascript/Typescript, you can use native types or explicitly tagged unions.

TomlValue is the lowest common denominator and you have to unwrap it to your correct type. But this should be the responsibility of the schema validation step/object mapping. This can be intertwined with parsing, but it is conceptually a different step.

You can treat tables as a Map<string, TomlValue> but most likely you'll want to map it to a strongly typed struct or your language equivalent. For this you need a schema: either something declarative or procedural code validation. You need to ensure that the required fields are there, properly typed etc. But this cannot be enforced at the TOML level.
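A minimal Python sketch of this two-step idea follows: wrap every parsed value in a tagged TomlValue, then unwrap it in a separate schema step. All class and function names here (TomlInteger, TomlString, TomlArray, TomlTable, wrap, unwrap_int_array) are hypothetical, and only four of the TOML types are modelled to keep the example short.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class TomlInteger:
    value: int


@dataclass(frozen=True)
class TomlString:
    value: str


@dataclass(frozen=True)
class TomlArray:
    items: tuple  # tuple of TomlValue, element types may differ


@dataclass(frozen=True)
class TomlTable:
    entries: dict  # maps str to TomlValue


# The sum type: a value is exactly one of these tagged variants.
TomlValue = Union[TomlInteger, TomlString, TomlArray, TomlTable]


def wrap(raw) -> TomlValue:
    """Parsing step: tag a plain Python value, with no uniformity check."""
    if isinstance(raw, int) and not isinstance(raw, bool):
        return TomlInteger(raw)
    if isinstance(raw, str):
        return TomlString(raw)
    if isinstance(raw, list):
        return TomlArray(tuple(wrap(item) for item in raw))
    if isinstance(raw, dict):
        return TomlTable({key: wrap(val) for key, val in raw.items()})
    raise TypeError(f"type not modelled in this sketch: {raw!r}")


def unwrap_int_array(value: TomlValue) -> List[int]:
    """Schema step: demand an array of integers, or raise TypeError."""
    if not isinstance(value, TomlArray):
        raise TypeError("expected an array")
    result = []
    for item in value.items:
        if not isinstance(item, TomlInteger):
            raise TypeError(f"expected an integer, got {item!r}")
        result.append(item.value)
    return result
```

Here wrap([123, "user1"]) succeeds at the parsing level, and it is only unwrap_int_array, standing in for the application's schema, that would reject it.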

There was an example above (#553 (comment)):

deps = [ 
  { name = "foo", path = "../foo" },
  { name = "bar", url = "http://....", port = 8080 },
]

From a parser point of view, this is an array of tables. From a consumer point of view this is an array of strongly typed resource references, a sum type between a local file and HTTP document: it makes no sense for TOML to try to guess the type of these items and enforce uniformity, so it defaults to a lower common denominator: table. Why shouldn't it go further and use a type common to table/string/number?

But how different is this case from (for example) a list of tuples represented as arrays? Sure, some languages like Java are against tuples because they are not explicit, but on the other hand they allow packing a lot of information without noise. Preventing mixed-type arrays prevents me from using TOML to represent a group of entities as a list of tuples:

[
  [123, "user1"],
  [456, "user2"],
  [789, "user3"],
]

It's good that TOML has a few intrinsic types, but enforcing schema validation (even locally) should be out of scope of a data representation language. TOML should remain language agnostic and avoid this kind of homogeneity constraint.

@ChristianSi
Contributor

ChristianSi commented Aug 11, 2018

@demurgos Arrays are logically different from tuples:

  • Array: variable-size list of values of (usually) the same type
  • Tuple: fixed-size list of values of different types in a specific order

Also, there are named tuples where each of the values expected by a tuple has an explicit name (in regular tuples, each value only has an implicit name: someone must know what it represents, but that information is not part of the tuple). Every tuple can be converted to a named tuple by making the implicit information explicit. TOML doesn't have named tuples, but (as you know) it has tables – and, in particular, inline tables – which serve exactly the same purpose. Your example can also be represented nicely as an array of inline tables:

[
  {id=123, name="user1"},
  {id=456, name="user2"},
  {id=789, name="user3"},
]

Inline tables have the advantage of being self-documenting, which also prevents the risk of undetected errors. #154 was a proposal to add (unnamed) tuples to TOML, but it was rejected in favor of inline tables – quite reasonably, I'd say.

@demurgos

demurgos commented Aug 11, 2018

@ChristianSi
I definitely agree that arrays are logically different from tuples, the same way tables/hash maps are logically different from structs/named tuples. (Side note: I acknowledged that tuples are less explicit but also have less noise, the point of my example was specifically about avoiding tables.)
TOML does not have named tuples, but it can represent them using tables. Similarly, it does not have anonymous tuples but it can represent them as arrays.

To decode them from this TOML representation and retrieve their high-level type, you need context (what I call the schema).

If we go on, TOML does not have sets, queues, graphs, etc. But I can use it to represent them. TOML provides some primitive types that I can use as an intermediate abstraction. The primitive "array of any legal TOML value" is more useful than "array of TOML values with the same TOML type".

Another way to look at it is that mixed type arrays allow you to have uniform type arrays, but not the other way around. It allows users to choose the behaviour they want. For example in Python, checking for uniformity is a one-liner:

assert all(type(item) == type(arr[0]) for item in arr)

My main point is that TOML does not have enough information to decide to enforce this check or not. Forcing this check at the spec level artificially restricts the possible usages.

@haven1433

I'd like to point out that tables can be used to represent arrays of non-uniform arrays.

arr5 = { numbers=[ 1, 2 ], letters=["a", "b", "c"] }

This is exactly the reasoning used to include inline tables instead of tuples. If you have non-homogeneous data, each element of the data likely has a reasonable name that could be associated with it. Likewise, if you have an array of arrays, where each inner array needs a different data-type, I'd claim that each inner array has a reasonable name that could be associated with it.

The shallow nature of TOML's type system is confusing. My (unimportant) opinion would be to either go all out (arrays are strongly typed based on their subtype) or avoid it all together (arrays don't care about subtype). My preference would be for the first, because type validation can be very useful.

As pointed out, this would make it incompatible with v0.5.0. But in the last 4 months, no one has piped up claiming to actually use arrays of non-uniform arrays.

@demurgos

demurgos commented Dec 5, 2018

I'd like to point out that tables can be used to represent arrays of non-uniform arrays.

arr5 = { numbers=[ 1, 2 ], letters=["a", "b", "c"] }

You lose ordering with this scheme (and it hurts semantics). If you don't need ordering it's fine, but it prevents you from differentiating the following situations: letters followed by numbers (["a", "b", "c", 1, 2]), numbers followed by letters ([1, 2, "a", "b", "c"]), interleaved letters and numbers (["a", 1, "b", 2, "c"]).

The shallow nature of TOML's type system is confusing. My (unimportant) opinion would be to either go all out (arrays are strongly typed based on their subtype) or avoid it all together (arrays don't care about subtype).

I agree with this observation. I am still not convinced that TOML is well suited for validation. Its type system is oriented toward ensuring that data is well-formed. Asking TOML to handle validation would require adding abstractions such as interfaces and would complicate implementations. I'd prefer TOML to focus only on data representation and leave validation to higher levels (such as IDLs).

@LongTengDao
Contributor

LongTengDao commented Dec 5, 2018

There will always be TOML-legal configs that are not allowed by a program, like:

lastName = "LongTeng Dao"
Error: lastName field can't include space!

So, if a legal value is not supported by some program, the program can just specify a rule of its own... I can't see any difference between the illegal case and the error case.

@haven1433

You lose ordering with this scheme (and it hurts semantics). If you don't need ordering it's fine, but it prevents you from differentiating the following situations: letters followed by numbers (["a", "b", "c", 1, 2]), numbers followed by letters ([1, 2, "a", "b", "c"]), interleaved letters and numbers (["a", 1, "b", 2, "c"]).

Do you have an example of where the ordering of non-uniform unnamed items matter? The closest example coming to my mind is in specifying a data-format, at which point a table is probably a better candidate anyway:

[[class]]
fields = [
   { type = "char[]", length = 12 },
   { type = "int" },
   { type = "date", includeTime = true },
]

When do you find yourself wanting a clear ordering of unnamed items of different types?

@demurgos

demurgos commented Dec 6, 2018

The most common situation I have seen is config files that allow mixing strings and tables, where the table corresponds to the parsed version of the string (or adds extra parameters).

contributors fields in JS packages:

[package]
contributors = [
  "Foo bar <foo@bar.com>",
  { name = "Baz Qux", email = "baz@qux.com", url = "baz.qux.com"},
]

Uplink servers in private package managers:

[uplink]
servers = [
  "https://srv1.uplink.com/",
  { url = "https://private.uplink.com/", auth_token = "..." },
]

In these cases, ordering matters for core logic (in which order to print the contributors, which uplink server has priority in case of inconsistency), but ordering is very useful for seemingly unordered values when you want determinism (for example to run your test suite and compare the results).
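As a sketch of how an application could consume such a mixed list while keeping its order, the Python snippet below normalizes each contributors entry into a table. It assumes the "Name <email>" shorthand shown above and already-parsed Python values; normalize_contributor and the regular expression are hypothetical and not taken from any package manager.

```python
import re

# Assumed shorthand: free-form name followed by an email in angle brackets.
_NAME_EMAIL = re.compile(r"^\s*(?P<name>[^<]*?)\s*<(?P<email>[^>]+)>\s*$")


def normalize_contributor(entry):
    """Turn a string or table entry into a table (dict) with name/email keys."""
    if isinstance(entry, dict):
        # Already in the verbose form; copy it so the input is not mutated.
        return dict(entry)
    if isinstance(entry, str):
        match = _NAME_EMAIL.match(entry)
        if match is None:
            raise ValueError(f"expected 'Name <email>', got {entry!r}")
        return {"name": match.group("name"), "email": match.group("email")}
    raise TypeError(f"contributor must be a string or a table: {entry!r}")


def normalize_contributors(entries):
    """Normalize every entry, preserving the order given in the file."""
    return [normalize_contributor(entry) for entry in entries]
```

This kind of normalization lives in the application, which is where the knowledge that a string and a table are interchangeable forms actually resides.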

Regarding the example you posted with the tables, this is a common case. It currently works because of the shallow typing of TOML. This is also the kind of situation that I fear may break if you ask TOML to do some deep checks: how is the decoder supposed to know that these values are compatible?

I just want to reiterate that your method is useful and may solve many issues, but not all. I just think that a more general solution is possible.

@rmunn
Author

rmunn commented Dec 6, 2018

AFAICT, @demurgos's examples (of arrays which mix strings and tables) violate the TOML spec, and if they currently work, it's because some TOML parsers aren't enforcing the spec. The spec says "Data types may not be mixed", and gives as an example of mixing data types:

arr6 = [ 1, 2.0 ] # INVALID

Since TOML considers ints and floats to be different data types, which may not be mixed in a single array, it therefore follows that strings and tables (which are far more different than ints and floats) are also different data types, and the contributors and servers examples in @demurgos's examples are in violation of the TOML spec.

The obvious way to rewrite those examples to bring them into spec compliance would be:

[package]
contributors = [
  { name = "Foo bar", email = "foo@bar.com" },
  { name = "Baz Qux", email = "baz@qux.com", url = "baz.qux.com"},
]
[uplink]
servers = [
  { url = "https://srv1.uplink.com/" },
  { url = "https://private.uplink.com/", auth_token = "..." },
]

TOML does not (and should not) require that tables in an array all have exactly the same "shape", so these contributors and servers arrays are legal according to the TOML spec as it currently stands.

@demurgos

demurgos commented Dec 6, 2018

@rmunn
You are right, my examples are currently violating the spec and should be rewritten as in your comment to be compliant.

I brought these up because I am advocating that the spec should be relaxed to make these examples legal. I think that the current spec sits in the middle with regard to validation: it's more than simple syntax checks but not enough to provide full validation.

@haven1433

haven1433 commented Dec 6, 2018

@demurgos what you're advocating for is ringing similar to the various tuples proposals, such as in #154. I claim that, thanks to inline tables, TOML has two very useful "group" datatypes right now: homogeneous arrays, where elements are unnamed but strongly ordered, and non-homogeneous tables, where elements are named but not strongly ordered. Combining these two constructs together can give you any combination of homogeneous, non-homogeneous, ordered, and unordered.

A problem I see is that, if arrays are not strongly typed, you can do something like this:

parameterList = [[15], ["john doe"], [2003-03-20]]

Which looks like someone trying to hack around a problem. You claim that the best solution is to let them do what they wanted in the first place:

parameterList = [15, "john doe", 2003-03-20]

While I claim that the strictness of TOML should guide them to a less succinct, but more descriptive format:

parameterList = { age=15, name="john doe", birthdate=2003-03-20 }

Or, if the order matters:

parameterList = [
   { age = 15  },
   { name = "john doe" },
   { birthdate = 2003-03-20 },
]

A more strict TOML might lead to slightly more verbose data, but I claim that the names of elements are important when the data is non-homogeneous.

@ChristianSi
Contributor

ChristianSi commented Dec 7, 2018

I consider @demurgos 's examples useful use cases indicating that the "array elements must have the same type" restriction should indeed be dropped from the spec. The TOML spec should not force applications to use a specific modelling style, prohibiting anything else. If app writers want to allow their contributors (or whatever else a list is used for) to specify their credentials either compactly as "Name <email>" or more verbosely as { name = "...", email = "..." } (with the second syntax also allowing other, optional fields which are not supported by the first syntax), why shouldn't they? Note that they don't have to make that choice – if they want to standardize on a single syntax, fine. But if they want to allow alternative syntaxes, I'd consider that a legitimate choice which should not be prohibited by an arbitrary decision of the TOML spec authors. So, let's drop that restriction.

An additional oddity of the current situation is that not even ints and floats may be mixed in arrays. For example, consider the 1–2–5 series of preferred numbers, written on Wikipedia (after slight TOML-ification) as follows:

values = [0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50]  # etc.

Currently, TOML parsers have to reject that array, though there is no good reason for this. Every mathematically inclined person knows that 1 and 1.0 are the same number, and (while this might not be true for their digital representations) nearly all programming languages will autoconvert ints to floats as needed. TOML probably shouldn't autoconvert, but it nevertheless should allow arbitrary lists of numbers. By allowing arbitrary values in arrays, we get that for free.
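If such a list were allowed, the conversion could stay on the application side. The Python sketch below assumes the array has already been parsed into a list of int and float values; as_floats is a hypothetical helper name.

```python
def as_floats(values):
    """Convert a parsed array of TOML Integers and Floats to a list of floats.

    Anything that is not a number is rejected. bool is excluded explicitly
    because Python treats it as a subclass of int.
    """
    result = []
    for value in values:
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"expected a number, got {value!r}")
        result.append(float(value))
    return result
```

The parser would hand over 1 as an Integer and 0.5 as a Float unchanged, and only the application decides that both are acceptable here.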

@eksortso
Contributor

eksortso commented Dec 7, 2018

From the perspective of the spec, changing arrays from single-type to heterogeneous is a backwards-compatible change. I'm personally okay with this change. But whatever basic validation was provided by existing parsers to prevent heterogeneous arrays will likely be going away.

At the very least, older parsers that properly implemented that validation could make it an optional feature instead. For small applications, users could make a simple fix to parser calls to ensure that arrays stay single-typed.
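One way such an opt-in check might look is sketched below in Python: a walk over an already-parsed document that reports every array whose elements do not share one type, using the same shallow comparison as the one-liner quoted earlier in the thread. The function name find_mixed_arrays and the path format are assumptions made for this example, not an existing parser option.

```python
def find_mixed_arrays(value, path="<root>"):
    """Yield the path of every array whose elements differ in Python type.

    The comparison is shallow: two nested lists count as the same type
    regardless of what they contain.
    """
    if isinstance(value, dict):
        for key, child in value.items():
            yield from find_mixed_arrays(child, f"{path}.{key}")
    elif isinstance(value, list):
        if len({type(item) for item in value}) > 1:
            yield path
        for index, child in enumerate(value):
            yield from find_mixed_arrays(child, f"{path}[{index}]")


def require_single_typed_arrays(document):
    """Raise ValueError if any array in the parsed document is mixed."""
    mixed = list(find_mixed_arrays(document))
    if mixed:
        raise ValueError("mixed-type arrays at: " + ", ".join(mixed))
```

An application that wants the old behaviour could call require_single_typed_arrays on the parser's result, while others simply skip the call.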

For bigger applications, schema validators could take up the slack and provide a way to specify only specific types within arrays. Heads up to #116, for instance.

@workingjubilee
Contributor

workingjubilee commented Sep 3, 2019

fwiw, since some "but what will Haskell do" questions came up about this and this is still being referenced, I feel I should add a data point: https://typeclasses.com/phrasebook/dynamic

We might use this, for example, to construct a heterogeneous list – that is, a list containing elements of different types.

Though I don't think this changes anything in particular inside the conversation.

@eksortso
Contributor

eksortso commented Sep 3, 2019

PR #663 intends to enforce homogeneous arrays in v1.0.0. If accepted, then mixed-type arrays will remain invalid until at least v1.0.1.

Just so you know, I'm doing this only to get v1.0.0 done.

@pradyunsg
Member

pradyunsg commented Nov 6, 2019

I just bit the bullet and went the other way entirely -- allowing arrays to contain heterogeneous values as argued for in #665.

That change makes 1.0.0 a superset of 0.5.0, so every 0.5.0 file is still a valid 1.0.0 file. I imagine that some implementations might need updating for this.

I'm already planning to do a 1.0.0-rc.1 release, to have a period for implementations to catch up and provide inputs, prior to releasing 1.0.0.

@pradyunsg
Member

Closing since there's nothing actionable here now. :)

@ViaFerrata

ViaFerrata commented Feb 3, 2021

@demurgos [...] Your example can also be represented nicely as an array of inline tables:

[
  {id=123, name="user1"},
  {id=456, name="user2"},
  {id=789, name="user3"},
]

Inline tables have the advantage of being self-documenting, which also prevents the risk of undetected errors. #154 was a proposal to add (unnamed) tuples to TOML, but it was rejected in favor of inline tables – quite reasonably, I'd say.

@ChristianSi Just a thought about this solution.
Yes, inline tables work well to mix types, but only if the user has profound knowledge about toml.

Imagine you have a use case where you allow users, having little or no TOML knowledge, to make their own TOML file based on a reference file. I.e. in the example from above, users would change the id and the name in the inline tables, and add/delete inline tables if necessary.
But what happens if the user splits an inline table across two lines, just because it looks better to him that way?

[
  {id=123,
   name="user1"},
]

He will get an error, since:
"Inline tables are intended to appear on a single line. No newlines are allowed between the curly braces unless they are valid within a value." (from toml spec).
So in this case, using inline tables is not really possible. I can expect (my) users to adhere to the general syntax based on an exemplary TOML file, but I cannot expect them to avoid line breaks everywhere.

@eksortso
Contributor

eksortso commented Feb 3, 2021

@ViaFerrata It's true that config editors without a lot of experience may be confused by this. So, specifying a good format to copy from would be useful for them. For a simple list of users, like this one:

users = [
  {id=123, name="user1"},
  {id=456, name="user2"},
  {id=789, name="user3"},
]

... you can instead use the double-bracket [[array-of-tables]] syntax to allow for more flexibility. The above is equal to this:

[[users]]
id = 123
name = "user1"
[[users]]
id = 456
name = "user2"
[[users]]
id = 789
name = "user3"

... and those editors would be able to add a new entry to the list with little fuss.

If an array of inline tables were used, they may try to split a table up and fail at it. Inline tables are for small, self-contained tables, and that's that. It's up to whoever sets the example for future editors to pick the smartest default choice for their purposes.

xiaokangwang added a commit to v2fly/v2ray-core that referenced this issue Oct 21, 2021
Whether to allow mixed type data is contested. Since V2Ray does not use this kind of mix content in array by design, relaxing this test to avoid test break.
toml-lang/toml#553