.INI compatibility is a worthy goal #411

Closed
hgolden opened this issue May 5, 2016 · 6 comments

@hgolden

hgolden commented May 5, 2016

I believe that it is a worthy goal to make TOML upward compatible with a large class of existing .INI files (despite the fact that the format isn't fully specified). If necessary, this could be an alternative syntax. The idea behind this is to immediately acquire a large set of valid TOML so adoption of the parser will be encouraged.

I don't know enough about .INI in non-ASCII uses, but it would be preferable to read data from existing .INI files encoded in existing non-Unicode encodings. The reason is that there are so many of them. To enable this, it should be possible to specify a default non-Unicode encoding when calling a TOML parser. I assume that ICU would provide all the necessary code conversion.

While UTF-8 is a lingua franca, IMO there is no reason that it has to be the only encoding accepted.

@BurntSushi
Member

BurntSushi commented May 6, 2016

I don't think it's wrong to say that one of the motivations for TOML was to embrace an "ini-like" format, but with something that is well specified. Trying to make TOML work with various alternative unspecified ini formats seems like backtracking on that goal, and also seems like it would be quite complex. Trying to make TOML compatible with some number of the various flavors of ini out in the wild feels like a fool's errand.

> While UTF-8 is a lingua franca, IMO there is no reason that it has to be the only encoding accepted.

Of course there are reasons. One major reason is simplification.

It seems to me like most of your ideas could be implementation specific. For example:

  1. If you want to support multiple ini-like formats, then write a converter for the formats you care about.
  2. If you want to support multiple text encodings, then transcode to UTF-8 first.
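The two suggestions above can be sketched in Python using only the standard library. This is a minimal illustration, not part of the TOML spec or of any TOML parser: the function name `ini_bytes_to_toml` and the default `cp1252` encoding are assumptions made for the example, and "INI" here means only the dialect that Python's `configparser` happens to accept.

```python
import configparser
import json


def ini_bytes_to_toml(raw: bytes, encoding: str = "cp1252") -> str:
    """Decode legacy INI bytes and emit TOML text with every value quoted.

    Hypothetical helper for illustration only.
    """
    # Suggestion 2: transcode first, so everything downstream sees Unicode.
    text = raw.decode(encoding)

    # Suggestion 1: handle one specific INI flavor (whatever configparser accepts).
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case instead of lowercasing
    parser.read_string(text)

    lines = []
    for section in parser.sections():
        # json.dumps is used as an approximation of TOML basic-string quoting;
        # rare edge cases (such as the DEL character) are not handled here.
        lines.append(f"[{json.dumps(section, ensure_ascii=False)}]")
        for key, value in parser.items(section):
            quoted_key = json.dumps(key, ensure_ascii=False)
            quoted_value = json.dumps(value, ensure_ascii=False)
            lines.append(f"{quoted_key} = {quoted_value}")
        lines.append("")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    legacy = "[owner]\nname = José\n".encode("cp1252")
    print(ini_bytes_to_toml(legacy))
```

Because the converter quotes every value, the result carries no type information beyond "string"; deciding which values should become integers, booleans, or dates is left to whoever knows the particular INI files being converted.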

@hgolden
Author

hgolden commented May 6, 2016

> Trying to make TOML work with various alternative unspecified ini formats seems like backtracking on that goal, and also seems like it would be quite complex.

Let me clarify my point about this: I said to make TOML compatible with a large class of existing INI files. By this I meant to specify completely what it can read. This can be determined empirically. As an example, run all Windows INI files through it and make sure they can be parsed. If there are any weird cases, decide whether they should be supported. The Wikipedia article on INI files has good references. As a minimum, the Apache Commons specification for INI files should be supported.

@BurntSushi
Member

@hgolden Right. And that sounds like a giant hairball to me. It would make the spec significantly more complicated. For example, many ini files support unquoted strings. We don't.

@hgolden
Author

hgolden commented May 6, 2016

> For example, many ini files support unquoted strings. We don't.

The TOML spec doesn't belong to me, but your statement begs the impertinent question: Why? Would interpreting unquoted strings as strings create a giant hairball? (On the other hand, would accepting unquoted strings mean that you already had millions of TOML-conforming files? But who cares anyway? We'd rather have a parser that fewer people will use.)

Here's my point, in case you're still reading: A language designed for human writing should be liberal rather than conservative (Postel's Law).

@mojombo
Member

mojombo commented May 14, 2016

It would indeed be very nice to have a boatload of valid TOML out there from the get-go, but I agree with @BurntSushi that it's impractical. And this is a big reason why:

> On the other hand, would accepting unquoted strings mean that you already had millions of TOML-conforming files? But who cares anyway? We'd rather have a parser that fewer people will use.

The answer, for me, is yes, I'd rather have fewer users. TOML came about precisely because I can't stand unquoted strings in things like YAML and INI. They are anything BUT unambiguous, which is one of TOML's most sacred founding goals. I'm ok if not everyone in the world uses TOML, but I'm not ok making compromises that undermine the very things that make TOML great.

I get your perspective, I really do. I simply happen to be going for something different and better, rather than the most effective short-term way to bolster adoption. I think adoption will happen just fine if we hold true to TOML's core values. That approach has always worked for me in the past.
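As a small illustration of the ambiguity of unquoted values discussed above, the following sketch uses Python's standard `configparser`. The section and key names are made up for the example; the point is only that the file itself does not say which type a value has, so the answer depends on which accessor the reader calls.

```python
import configparser

# Hypothetical INI content with unquoted values.
INI_TEXT = """
[build]
version = 1.10
enabled = no
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)
section = parser["build"]

# The same unquoted token yields different values depending on the accessor.
version_as_text = section.get("version")          # the string "1.10"
version_as_float = section.getfloat("version")    # the float 1.1 (trailing zero lost)
enabled_as_text = section.get("enabled")          # the string "no"
enabled_as_bool = section.getboolean("enabled")   # the boolean False

if __name__ == "__main__":
    print(repr(version_as_text), repr(version_as_float))
    print(repr(enabled_as_text), repr(enabled_as_bool))
```

In TOML the writer has to commit to one reading in the file itself, for example `version = "1.10"` for a string or `enabled = false` for a boolean.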

@mojombo mojombo closed this as completed May 14, 2016
@hgolden
Author

hgolden commented May 14, 2016

> I get your perspective, I really do. I simply happen to be going for something different and better, rather than the most effective short-term way to bolster adoption. I think adoption will happen just fine if we hold true to TOML's core values. That approach has always worked for me in the past.

I understand your perspective also. Postel's Law is great for getting adoption going, but there are significant long-term impacts it causes. At some point it gets in the way. Thanks for your reply.
