Guarantee that JSON in data-* attributes will be loaded by `.data` (#7579) #658

Closed
wants to merge 1 commit into
from

Conversation

Projects
None yet
4 participants
@wolever

wolever commented Jan 19, 2012

This patch should guarantee that any valid JSON data in data-* attributes will be correctly converted when read with the .data(...) method.

Currently .data(…) will correctly convert most JSON values (singletons, numbers, arrays and objects), but it does not convert strings.

This change will make it safe to use data- attributes with arbitrary data, so long as it is first JSON encoded.

Backwards compatibility will only be broken in situations where the value of a data-* attribute matches /^".*"$/.

See also: http://bugs.jquery.com/ticket/7579

I'll make the appropriate documentation updates if this patch would otherwise be accepted.

Convert JSON strings in data-* attributes.
This patch should guarantee that any valid JSON data in `data-*`
attributes will be correctly converted when read with the `.data(...)`
method.
@rwaldron

This comment has been minimized.

Show comment
Hide comment
@rwaldron

rwaldron Jan 19, 2012

Member

Check this....

var div = jQuery("<div>");

div.attr({ "data-test": "'hello apostrophe\'s'" });

div.data("test");
> "'hello apostrophe's'"


var div = jQuery("<div>");

div.attr({ "data-test": '"hello apostrophe\'s"' });

div.data("test");
> "hello apostrophe's"
Member

rwaldron commented Jan 19, 2012

Check this....

var div = jQuery("<div>");

div.attr({ "data-test": "'hello apostrophe\'s'" });

div.data("test");
> "'hello apostrophe's'"


var div = jQuery("<div>");

div.attr({ "data-test": '"hello apostrophe\'s"' });

div.data("test");
> "hello apostrophe's"
@wolever

This comment has been minimized.

Show comment
Hide comment
@wolever

wolever Jan 19, 2012

Yes, as noted — this does break backwards compatibility when using .data(…) with data-* attributes matching /^".*"$/.

(warning: claims without data ahead) I would bet that that the number of people who would be inconvenienced by this backwards incompatibility is smaller than the number of people who would find it useful.

Unless you're suggesting that it's inconsistent because only double-quoted strings are treated differently? If that's the case, I proposed only treating double-quoted strings differently because single-quoted strings aren't valid JSON, and the point of this patch is to guarantee that JSON values in data-* attributes will always be correctly decoded by .data(…).

wolever commented Jan 19, 2012

Yes, as noted — this does break backwards compatibility when using .data(…) with data-* attributes matching /^".*"$/.

(warning: claims without data ahead) I would bet that that the number of people who would be inconvenienced by this backwards incompatibility is smaller than the number of people who would find it useful.

Unless you're suggesting that it's inconsistent because only double-quoted strings are treated differently? If that's the case, I proposed only treating double-quoted strings differently because single-quoted strings aren't valid JSON, and the point of this patch is to guarantee that JSON values in data-* attributes will always be correctly decoded by .data(…).

@rwaldron

This comment has been minimized.

Show comment
Hide comment
@rwaldron

rwaldron Jan 19, 2012

Member

We can't making backwards compatibility breaking changes without thorough discussion that includes the entire core dev team. The reach is tremendous: http://trends.builtwith.com/javascript/jQuery

Member

rwaldron commented Jan 19, 2012

We can't making backwards compatibility breaking changes without thorough discussion that includes the entire core dev team. The reach is tremendous: http://trends.builtwith.com/javascript/jQuery

@wolever

This comment has been minimized.

Show comment
Hide comment
@wolever

wolever Jan 19, 2012

Fair enough. What is the process for doing that?

I'll try and gather some data on which existing projects would be affected by this change (on the same day that Google code search is shut down… facepalm).

wolever commented Jan 19, 2012

Fair enough. What is the process for doing that?

I'll try and gather some data on which existing projects would be affected by this change (on the same day that Google code search is shut down… facepalm).

@dmethvin

This comment has been minimized.

Show comment
Hide comment
@dmethvin

dmethvin Jan 20, 2012

Member

I don't see us changing this. If you always want a string, the solution is to use .attr("data-x") instead of .data("x") which seems pretty straightforward. Are there reasons this won't work for you?

Member

dmethvin commented Jan 20, 2012

I don't see us changing this. If you always want a string, the solution is to use .attr("data-x") instead of .data("x") which seems pretty straightforward. Are there reasons this won't work for you?

@wolever

This comment has been minimized.

Show comment
Hide comment
@wolever

wolever Jan 20, 2012

Sorry, I guess the message and title doesn't make the purpose entirely clear. I've updated it to clarify.

The purpose of this patch is to guarantee that JSON in data-* attributes will be correctly decoded. This means that it will always be safe to set data-* attributes to JSON.stringify(some_arbitrary_data).

My specific use-case for this is that I have data associated with DOM nodes which is seeded in the template (ie, the template has ``data-foo="{{ foo }}"), then used and updated in the code. If I could trust.data(…)` to correctly load JSON data, I could use `data-foo="{{ JSON.stringify(foo) }}` in my templates, then access that data with `$(…).data("foo")` from the code.

However, since I can't trust .data(…), the logic in my code needs to be (roughly): var foo = $(…).data("foo"); if (foo === undefined) foo = JSON.parse($(…).attr("data-foo"));

Also, note: there is a typo (now fixed) in my initial comment: the regular expression used is /^".*"$/ (with the trailing quote).

wolever commented Jan 20, 2012

Sorry, I guess the message and title doesn't make the purpose entirely clear. I've updated it to clarify.

The purpose of this patch is to guarantee that JSON in data-* attributes will be correctly decoded. This means that it will always be safe to set data-* attributes to JSON.stringify(some_arbitrary_data).

My specific use-case for this is that I have data associated with DOM nodes which is seeded in the template (ie, the template has ``data-foo="{{ foo }}"), then used and updated in the code. If I could trust.data(…)` to correctly load JSON data, I could use `data-foo="{{ JSON.stringify(foo) }}` in my templates, then access that data with `$(…).data("foo")` from the code.

However, since I can't trust .data(…), the logic in my code needs to be (roughly): var foo = $(…).data("foo"); if (foo === undefined) foo = JSON.parse($(…).attr("data-foo"));

Also, note: there is a typo (now fixed) in my initial comment: the regular expression used is /^".*"$/ (with the trailing quote).

@timmywil

This comment has been minimized.

Show comment
Hide comment
@timmywil

timmywil Jan 20, 2012

Member

I am not convinced. '"quoted"' should not come through as "quoted", but '"quoted"'. Valid JSON works with .data. I would not use JSON.stringify(foo) unless you are sure that foo is actually JSON.

Member

timmywil commented Jan 20, 2012

I am not convinced. '"quoted"' should not come through as "quoted", but '"quoted"'. Valid JSON works with .data. I would not use JSON.stringify(foo) unless you are sure that foo is actually JSON.

@wolever

This comment has been minimized.

Show comment
Hide comment
@wolever

wolever Jan 20, 2012

'"quoted"' should not come through as "quoted", but '"quoted"'

Under my propposed patch, it would. Single-quoted strings are not valid JSON.

A quick re-cap of semantics:

Proposed semantics:
data-foo="bar" --> "bar" (not a JSON string, treated normally)
data-foo="'bar'" --> "'bar'" (not a JSON string, treated normally)
data-foo="\"bar" --> "\"bar" (not a JSON string, treated normally)
data-foo='"bar"' --> "bar" (a JSON string, decoded as such)

Current semantics:
data-foo="bar" --> "bar"
data-foo="'bar'" --> "'bar'"
data-foo="\"bar" --> "\"bar"
data-foo='"bar"' --> "\"bar\""

Valid JSON works with .data.

I'm not sure I understand. "foo" (ie, a double-quoted string) is valid JSON, but will not be loaded by .data(…) at the moment (ie, given data-foo='"quoted"', .data("foo") will return "\"quoted\"").

However, all JSON apart from strings will be correctly loaded (ie, null, false and true singletons, numbers, arrays and objects all work correctly).

I would not use JSON.stringify(foo) unless you are sure that foo is actually JSON.

Again, forgive me, but I don't understand why this is relevant…

wolever commented Jan 20, 2012

'"quoted"' should not come through as "quoted", but '"quoted"'

Under my propposed patch, it would. Single-quoted strings are not valid JSON.

A quick re-cap of semantics:

Proposed semantics:
data-foo="bar" --> "bar" (not a JSON string, treated normally)
data-foo="'bar'" --> "'bar'" (not a JSON string, treated normally)
data-foo="\"bar" --> "\"bar" (not a JSON string, treated normally)
data-foo='"bar"' --> "bar" (a JSON string, decoded as such)

Current semantics:
data-foo="bar" --> "bar"
data-foo="'bar'" --> "'bar'"
data-foo="\"bar" --> "\"bar"
data-foo='"bar"' --> "\"bar\""

Valid JSON works with .data.

I'm not sure I understand. "foo" (ie, a double-quoted string) is valid JSON, but will not be loaded by .data(…) at the moment (ie, given data-foo='"quoted"', .data("foo") will return "\"quoted\"").

However, all JSON apart from strings will be correctly loaded (ie, null, false and true singletons, numbers, arrays and objects all work correctly).

I would not use JSON.stringify(foo) unless you are sure that foo is actually JSON.

Again, forgive me, but I don't understand why this is relevant…

@timmywil

This comment has been minimized.

Show comment
Hide comment
@timmywil

timmywil Jan 20, 2012

Member

You have a test where '"quoted"' comes through as "quoted". How is "quoted" valid JSON? It is simply a javascript string and should stay that way. JSON means Javascript object notation. Valid JSON has curly braces or brackets. http://jsonlint.com

Member

timmywil commented Jan 20, 2012

You have a test where '"quoted"' comes through as "quoted". How is "quoted" valid JSON? It is simply a javascript string and should stay that way. JSON means Javascript object notation. Valid JSON has curly braces or brackets. http://jsonlint.com

@rwaldron

This comment has been minimized.

Show comment
Hide comment
@rwaldron

rwaldron Jan 20, 2012

Member

I don't understand why this is relevant…

Sorry, but I'm feeling the same way about your proposal. I brought up the single quotes issue because it's very likely that a user will try that and report it as a bug.

Member

rwaldron commented Jan 20, 2012

I don't understand why this is relevant…

Sorry, but I'm feeling the same way about your proposal. I brought up the single quotes issue because it's very likely that a user will try that and report it as a bug.

@rwaldron

This comment has been minimized.

Show comment
Hide comment
@rwaldron

rwaldron Jan 20, 2012

Member
> JSON.parse("foo")
SyntaxError: Unexpected token o

>JSON.parse('"foo"')
"foo"

I'm not sure I see any broad use case benefit to this

Member

rwaldron commented Jan 20, 2012

> JSON.parse("foo")
SyntaxError: Unexpected token o

>JSON.parse('"foo"')
"foo"

I'm not sure I see any broad use case benefit to this

@wolever

This comment has been minimized.

Show comment
Hide comment
@wolever

wolever Jan 20, 2012

timmywill: I see that the validator doesn't approve of either 42 or "foo", but it also doesn't approve of null, true or false. However, by my understanding of the specification all of the above are valid. Additionally, every JSON parser and emitter I've used has no problem with them:

> JSON.parse("42")
42
> JSON.parse('"foo"')
"foo"

rwldrn: so, to clarify: do you believe that .data's current implementation, which makes it impossible to use on arbitrary data, is acceptable? Or would you be open to a patch which makes it safe to use on arbitrary data and you just disapprove of the specific implementation I'm proposing?

wolever commented Jan 20, 2012

timmywill: I see that the validator doesn't approve of either 42 or "foo", but it also doesn't approve of null, true or false. However, by my understanding of the specification all of the above are valid. Additionally, every JSON parser and emitter I've used has no problem with them:

> JSON.parse("42")
42
> JSON.parse('"foo"')
"foo"

rwldrn: so, to clarify: do you believe that .data's current implementation, which makes it impossible to use on arbitrary data, is acceptable? Or would you be open to a patch which makes it safe to use on arbitrary data and you just disapprove of the specific implementation I'm proposing?

@rwaldron

This comment has been minimized.

Show comment
Hide comment
@rwaldron

rwaldron Jan 20, 2012

Member

I'm open to leave data() exactly as it is and avoiding..

  • the backlash that any kind of change in semantics will cause.
  • the additional bytes that seemingly benefit one person
Member

rwaldron commented Jan 20, 2012

I'm open to leave data() exactly as it is and avoiding..

  • the backlash that any kind of change in semantics will cause.
  • the additional bytes that seemingly benefit one person
@wolever

This comment has been minimized.

Show comment
Hide comment
@wolever

wolever Jan 20, 2012

rwldrn: As I noted in a previous comment the use-case is: I want to attach some arbitrary data to a DOM element when it leaves my templating engine. Those data will be accessed with .data(…) and updated over the lifecycle of that element.

Additionally, it will make the behaviour of .data with respect to data-* elements very straightforward, simplifying the documentation and (I hypothesize, again without evidence) future discussions when people are surprised. The rule will simply be: “if the content of a data-* attribute is JSON it will be loaded as such, otherwise it will be loaded as the literal string”.

However, if you don't see the value in allowing .data() to correctly load all JSON values (instead of most JSON values, as it does now), I won't fight for it any further.

wolever commented Jan 20, 2012

rwldrn: As I noted in a previous comment the use-case is: I want to attach some arbitrary data to a DOM element when it leaves my templating engine. Those data will be accessed with .data(…) and updated over the lifecycle of that element.

Additionally, it will make the behaviour of .data with respect to data-* elements very straightforward, simplifying the documentation and (I hypothesize, again without evidence) future discussions when people are surprised. The rule will simply be: “if the content of a data-* attribute is JSON it will be loaded as such, otherwise it will be loaded as the literal string”.

However, if you don't see the value in allowing .data() to correctly load all JSON values (instead of most JSON values, as it does now), I won't fight for it any further.

@rwaldron

This comment has been minimized.

Show comment
Hide comment
@rwaldron

rwaldron Jan 20, 2012

Member

Looks like jQuery already matches the behaviour... http://jsfiddle.net/rwaldron/4G6bS/

Member

rwaldron commented Jan 20, 2012

Looks like jQuery already matches the behaviour... http://jsfiddle.net/rwaldron/4G6bS/

@timmywil

This comment has been minimized.

Show comment
Hide comment
@timmywil

timmywil Jan 20, 2012

Member

@wolever: That would be a misinterpretation of the spec. Perhaps the RFC is more clear: http://tools.ietf.org/html/rfc4627. Everything you have noted as valid JSON are valid JSON values, but JSON itself is always either an object or list(array). The purpose of .data() is not only to parse JSON. null, true, and false, are not passed through as JSON, they are passed through as the value null and the booleans true and false. The reason why you might see JSON parsers work is that they are not validating. The simplest JSON parser uses a function constructor to convert the string to an object automatically. This would also convert numbers and strings, but that doesn't mean numbers and strings are JSON. Admittedly, the data conversion is there for convenience. It is not a perfect catch-all do-what-I-mean function (hence why #7579 was closed wontfix). When in doubt, use .attr(), but I think data is doing the right thing here. We can continue discussion, but closing for now.

Member

timmywil commented Jan 20, 2012

@wolever: That would be a misinterpretation of the spec. Perhaps the RFC is more clear: http://tools.ietf.org/html/rfc4627. Everything you have noted as valid JSON are valid JSON values, but JSON itself is always either an object or list(array). The purpose of .data() is not only to parse JSON. null, true, and false, are not passed through as JSON, they are passed through as the value null and the booleans true and false. The reason why you might see JSON parsers work is that they are not validating. The simplest JSON parser uses a function constructor to convert the string to an object automatically. This would also convert numbers and strings, but that doesn't mean numbers and strings are JSON. Admittedly, the data conversion is there for convenience. It is not a perfect catch-all do-what-I-mean function (hence why #7579 was closed wontfix). When in doubt, use .attr(), but I think data is doing the right thing here. We can continue discussion, but closing for now.

@timmywil timmywil closed this Jan 20, 2012

@wolever

This comment has been minimized.

Show comment
Hide comment
@wolever

wolever Jan 20, 2012

rwldrn: no, it does not. The value returned is "\"foo\"", not "foo". See: http://jsfiddle.net/4G6bS/2/

wolever commented Jan 20, 2012

rwldrn: no, it does not. The value returned is "\"foo\"", not "foo". See: http://jsfiddle.net/4G6bS/2/

@rwaldron

This comment has been minimized.

Show comment
Hide comment
Member

rwaldron commented Jan 20, 2012

@wolever

This comment has been minimized.

Show comment
Hide comment
@wolever

wolever Jan 20, 2012

timmywill: ah, yes, you are correct:

A JSON text is a serialized object or array.

 JSON-text = object / array

In that case, I am requesting that .data(…) correctly load all JSON values instead of “all JSON values except strings”.

The purpose of .data() is not only to parse JSON

I 100% agree. And I entirely understand that the link to between data-* attributes and .data(…) seems to be regretted.

However, I don't believe the current behaviour (ie, that it's impossible to use .data to load an arbitrary string value from a data-* attribute) is optimal, and it seems that slightly modifying the definition of .data(…) to be “if the value of a data- attributes is a JSON value it will be loaded as such, otherwise the string value will be returned” would make it more sensible.

null, true, and false, are not passed through as JSON, they are passed through as the value null and the booleans true and false.

Agreed. But it would appear that the current .data(…) is essential a broken JSON value parser (ie, broken because it doesn't handle strings).

rwldrn: I believe we are misunderstanding each other. Are you meaning to imply that jQuery currently exhibits the behaviour I'm proposing? If so, I believe that you are incorrect. Currently, if the value of a data-* attribute is enclosed in double quotes (ie, it matches /^".*"$/), the literal value (ie, including quotes) is returned. I'm proposing that if the value of a data-* attribute is enclosed in double quotes, .data() will load it as JSON string (in the same way that, if the value is is enclosed in moustaches, .data() will load it as a JSON object).

wolever commented Jan 20, 2012

timmywill: ah, yes, you are correct:

A JSON text is a serialized object or array.

 JSON-text = object / array

In that case, I am requesting that .data(…) correctly load all JSON values instead of “all JSON values except strings”.

The purpose of .data() is not only to parse JSON

I 100% agree. And I entirely understand that the link to between data-* attributes and .data(…) seems to be regretted.

However, I don't believe the current behaviour (ie, that it's impossible to use .data to load an arbitrary string value from a data-* attribute) is optimal, and it seems that slightly modifying the definition of .data(…) to be “if the value of a data- attributes is a JSON value it will be loaded as such, otherwise the string value will be returned” would make it more sensible.

null, true, and false, are not passed through as JSON, they are passed through as the value null and the booleans true and false.

Agreed. But it would appear that the current .data(…) is essential a broken JSON value parser (ie, broken because it doesn't handle strings).

rwldrn: I believe we are misunderstanding each other. Are you meaning to imply that jQuery currently exhibits the behaviour I'm proposing? If so, I believe that you are incorrect. Currently, if the value of a data-* attribute is enclosed in double quotes (ie, it matches /^".*"$/), the literal value (ie, including quotes) is returned. I'm proposing that if the value of a data-* attribute is enclosed in double quotes, .data() will load it as JSON string (in the same way that, if the value is is enclosed in moustaches, .data() will load it as a JSON object).

@dmethvin

This comment has been minimized.

Show comment
Hide comment
@dmethvin

dmethvin Jan 21, 2012

Member

@wolever, if you sense frustration around here it is because all of these issues have been discussed ad nauseam already. Whether we made the right choice in your mind about two years ago, it is the choice we made and the proposed patch breaks existing behavior for a case that has only bothered a handful of people over those two years.

So please stop arguing your point. We are not going to make it work the way you want because it is a breaking change. If the current semantics don't work for edge cases, use .attr("data-x") to get the argument as a string and process it in your preferred manner. You can turn it into a plugin. It can become a cult classic and so popular that we have no choice but to incorporate it into jQuery 2.0 as a breaking change. But we're not interested in doing that right now.

Member

dmethvin commented Jan 21, 2012

@wolever, if you sense frustration around here it is because all of these issues have been discussed ad nauseam already. Whether we made the right choice in your mind about two years ago, it is the choice we made and the proposed patch breaks existing behavior for a case that has only bothered a handful of people over those two years.

So please stop arguing your point. We are not going to make it work the way you want because it is a breaking change. If the current semantics don't work for edge cases, use .attr("data-x") to get the argument as a string and process it in your preferred manner. You can turn it into a plugin. It can become a cult classic and so popular that we have no choice but to incorporate it into jQuery 2.0 as a breaking change. But we're not interested in doing that right now.

@wolever

This comment has been minimized.

Show comment
Hide comment
@wolever

wolever Jan 21, 2012

@dmethvin: acknowledged. Thanks for the straightforward response.

wolever commented Jan 21, 2012

@dmethvin: acknowledged. Thanks for the straightforward response.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment