In favor of NULL #146

Closed
rcarver opened this Issue Feb 28, 2013 · 10 comments


rcarver commented Feb 28, 2013

Here's an argument in favor of NULL values, as previously discussed and rejected in #30.

I believe that in configuring a system, the most important things are:

  1. The set of available keys (and key groups).
  2. An understanding of what those keys do, and as necessary, the type of value expected.

Therefore, I think it's important to be able to define, and comment, keys for which you don't yet have a value. A TOML document should be able to act as a specification for the possible configuration. It may be preferable not to define a value in the TOML config - say, in order to set a reasonable default at runtime. But, it is important to specify that such a value can be set. This is typically done by commenting out the key, and that seems ugly.

Put another way, it's the difference between hash[key].nil? and hash.key?(key) in Ruby, or between hash[key] === null and hash[key] === undefined in JavaScript. I think the distinction is important because it aids downstream validation and use of the data provided by a TOML document.

Disclosure: my own take on this whole situation is levels, which defines a way to merge multiple inputs into a final configuration. When adding TOML support in rcarver/levels#3, I realized that we have a fundamental disagreement here. In all other ways, TOML is the ideal format for levels configuration.

As for the syntax, I don't have a strong opinion. I'm leaning toward simply omitting the value, because it doesn't introduce a new keyword and it resembles what you'd do in bash.

this_is_null = 

BurntSushi Feb 28, 2013

Member

Note that this necessarily adds a new NULL type to the spec (that is, a type containing precisely one value, otherwise known as the unit type). A new type isn't such a big deal in itself, but it bullies its way into every other type in TOML. Namely, an integer is no longer just an integer. It's an integer or NULL.

I think the added type complicates things. It means that every valid TOML parser has to differentiate between non-existence and NULL. This complicates types in static languages.

@rcarver - Could you maybe elaborate on why it is important for a TOML file to have knowledge of the set of definable keys? (As opposed to this information being in the application, or perhaps defined in a TOML array somewhere.)

rcarver Feb 28, 2013

@BurntSushi the "integer or NULL" case and static languages are good points. I'm still pondering the implications myself, thanks.

In practice, I find that something needs to define the set of possible keys. Again, in practice, they tend to accumulate over time and it's difficult to track. I'm thinking about both traditional applications and also provisioning tools (Chef) that use lots and lots of configuration variables. I like that the config file can act as the one place that defines the possible keys. I like that the app can enforce that the key is defined in the config file. If it's defined as NULL, the app can provide a default value if appropriate.

To put this all into perspective, I think we should look at the use of TOML data. TOML parses to a hash, which I understand to return NULL when an undefined key is read (coming from Ruby). Here are some examples to consider how an application might want to treat various cases.

When NULL is not allowed

[user]
username = "rcarver"
# name = "example name"

Obviously, this works:

config["user"].key?("username") # => true
config["user"]["username"] # => "rcarver"

Generally, reading an undefined key returns NULL:

config["user"].key?("name") # => false
config["user"]["name"] # => nil

Alternatively, an application could choose to raise an error:

config["user"].key?("name") # => false
config["user"]["name"] # raises exception
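
For reference, plain Ruby hashes already offer both behaviours: Hash#[] returns nil for a missing key, while Hash#fetch raises KeyError. A minimal sketch, assuming config is an ordinary nested Hash; the literal below is hand-written for illustration and is not real parser output:

```ruby
# Hand-written stand-in for a parsed TOML document (an assumption, not parser output).
config = { "user" => { "username" => "rcarver" } }

# Lenient read: Hash#[] yields nil when the key is absent.
lenient = config["user"]["name"]

# Strict read: Hash#fetch raises KeyError when the key is absent.
strict_error =
  begin
    config["user"].fetch("name")
    nil
  rescue KeyError => e
    e
  end

# Hash#fetch also accepts a fallback value for keys that may be absent.
with_default = config["user"].fetch("name", "anonymous")
```

Which of the three reads an application uses is a policy choice made in the application, independent of whether the file format has a NULL.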

When NULL is allowed

[user]
username = "rcarver"
name = # "example name"

We can safely read the key, and still decide between the options above for both undefined and null keys.

config["user"].key?("name") # => true
config["user"]["name"] # => nil

So, if we agree that a hash returns NULL for an undefined key, an application already has to deal with the "value or NULL" case. Adding NULL support to TOML lets an application differentiate between "no value" and "undefined" if it chooses to do so.
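
The three states described above can be sketched roughly as follows. This assumes a hypothetical NULL-aware parser that stores nil for a key written without a value; TOML does not define such a value, and the helper names key_state and read_setting are made up for this illustration:

```ruby
# Classify a key as :undefined (absent), :null (present but nil), or :value.
def key_state(table, key)
  return :undefined unless table.key?(key)
  table[key].nil? ? :null : :value
end

# Hypothetical policy: undeclared keys are an error, null keys fall back to a default.
def read_setting(table, key, default)
  case key_state(table, key)
  when :undefined then raise KeyError, "#{key} is not declared in the config"
  when :null      then default
  else                 table[key]
  end
end

# Hand-written stand-in for what a NULL-aware parser might return.
user = { "username" => "rcarver", "name" => nil }

display_name = read_setting(user, "name", "Anonymous")
```

Other policies are equally possible, for example treating :null and :undefined the same way; the point is only that the application gets to choose.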

All that said, I do agree that it complicates TOML. I'll think on this some more. Happy to hear more perspectives here.

BurntSushi Feb 28, 2013

Member

@rcarver

I'm not sure how NULL gives an application the ability to enforce that a key is defined. Doesn't that ability exist anyway? If the key isn't defined, then a default value can be given.

I think I just have a fundamentally different opinion about where the Truth of which keys are available should be known. I don't believe it belongs in a configuration file (controlled by users). I'll leave this point to be debated by others.

With that said, I still want to make the typing implications of NULL values clear for anyone else that wants to weigh in.

> So, if we agree that a hash returns NULL for an undefined key, an application already has to deal with the "value or NULL" case.

Almost all implementations of a hash table provide a way to distinguish between keys that are defined and keys that map to a NULL value. (The lone exception that I know of is Lua.) Namely, the possibility of non-existence is handled by the type of the hash rather than the values stored in the hash. In this way, non-existence does not creep into the type of any value, as it is handled implicitly in the type of a hash.

With NULL values, every parser has to distinguish between non-existence and NULL for every value.

In dynamic languages, this isn't an unreasonable burden. Indeed, the distinction is difficult even to notice, mostly because dynamic languages allow any type to contain NULL values (they've allowed it to be a big bully). In static languages, not all types can have NULL values.

Most static languages have facilities to handle such things, but it becomes a burden when they must be anticipated for all types.
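
A rough illustration of that burden, written in Ruby to match the examples earlier in the thread. Ruby is dynamic, so this only approximates the concern: in a statically typed language the extra case would show up in the value's type (an optional integer rather than a plain integer) instead of in a runtime check. The accessor names are hypothetical:

```ruby
# Without NULL: a key is either absent or holds a value of the expected type.
def integer_without_null(table, key, default)
  return default unless table.key?(key)
  value = table[key]
  raise TypeError, "#{key} must be an Integer" unless value.is_a?(Integer)
  value
end

# With NULL: every typed accessor gains a third case it has to decide on.
def integer_with_null(table, key, default)
  return default unless table.key?(key)
  value = table[key]
  return default if value.nil? # the extra "integer or NULL" branch
  raise TypeError, "#{key} must be an Integer" unless value.is_a?(Integer)
  value
end
```

The same extra branch would be needed for strings, floats, booleans, dates, and arrays.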

rcarver Mar 1, 2013

@BurntSushi I completely agree with the typing implications of NULL. In fact, most of the time I would take your position. Two things continue to make me question it in this context:

  • Experience tells me there's something inherently messy about config files and configuration, so I'm not sure what adding this level of purity/cleanliness will provide (other than simpler parsers, but maybe that's enough).
  • I'm not sure how opinionated TOML intends to be about the workflows implied by not supporting NULL. I hope this thread at least shows what that means.

I'm glad to have had this discussion. At this point I could go either way; I'll defer to whatever @mojombo thinks aligns with the overall goals of TOML.

tnm Mar 4, 2013

Contributor

My initial intuition tells me that NULL should be avoided for the general historical reasons most of us are familiar with. I'd be curious whether there is evidence of widespread usage of a NULL value in existing application/database/system config files (not in the format specs themselves, but simply in in-the-wild config file content), but to my knowledge it's pretty rare.

rossipedia Mar 5, 2013

Contributor

Null has some value as a representation of the idea of an unknown, especially in an RDBMS. However, I've rarely found it useful in actual application code, as most usages of it are better served by patterns such as Null Object.

tnm Mar 8, 2013

Contributor

Yeah, I don't really see a strong enough argument to justify the complexity and burden of NULL. I remain in favor of keeping it out.

ambv Apr 24, 2013

Contributor

Definitely keep it out. As @BurntSushi correctly points out, if NULL is in the file format, you have to special-case it while using any other type. In the real world, this is already the case because a key might not be set at all. So while I think having NULL as a type is bad, the "explicitly unset a key" syntax looks useful:

[integration]
api_key= 

88Alex Jun 27, 2013

You can just do this:

[toml]
null_integer = 0
null_string = ""

This is much easier to parse than null values.
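
On the application side, that convention might be consumed roughly like this. It is a sketch only: the rule that 0 and "" mean "not set" is an assumption the application would have to document, and the helper names are hypothetical:

```ruby
# Hypothetical convention: 0 and "" are sentinels meaning "not set".
def unset?(value)
  value == 0 || value == ""
end

# Return the default when the key is absent or holds a sentinel value.
def setting(table, key, default)
  value = table[key]
  (value.nil? || unset?(value)) ? default : value
end

# Hand-written stand-in for the parsed [toml] table above.
toml = { "null_integer" => 0, "null_string" => "" }

retries = setting(toml, "null_integer", 3)
label   = setting(toml, "null_string", "untitled")
```

The trade-off is that 0 and "" can no longer be used as real values for keys read this way.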

mojombo Sep 24, 2013

Member

I think an application should be in charge of knowing what the valid keys are and making sure sane defaults are set. It's too risky to leave that to a user-editable config file. If you want to document all the available keys, but leave them "null" until they're set, then I think commenting those lines out is the best solution. Thanks for all the thoughts on this, everyone!
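
One way an application could own the key list and defaults, sketched in Ruby to match the earlier examples. The DEFAULTS table, its contents, and load_user_settings are hypothetical names for illustration, and parsed stands in for the table a TOML parser would return:

```ruby
# Application-owned list of valid keys and their defaults (hypothetical contents).
DEFAULTS = { "username" => "guest", "name" => "Anonymous" }.freeze

# Reject keys the application does not know about, then fill in the defaults.
def load_user_settings(parsed)
  unknown = parsed.keys - DEFAULTS.keys
  raise ArgumentError, "unknown keys: #{unknown.join(', ')}" unless unknown.empty?
  DEFAULTS.merge(parsed)
end

# A config file that sets only one key; the other stays commented out.
settings = load_user_settings({ "username" => "rcarver" })
```

With this approach the config file only needs to contain the keys a user actually wants to override.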
