New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How Much XSS Vulnerability Protection is React Responsible For? #3473

Closed
sebmarkbage opened this Issue Mar 20, 2015 · 42 comments

Comments

Projects
None yet
@sebmarkbage
Member

sebmarkbage commented Mar 20, 2015

There was a security hack that mentions React here: http://danlec.com/blog/xss-via-a-spoofed-react-element

Ultimately this is a server-side bug and NOT a bug in React itself. This issue is about figuring out if there is something we can do to mitigate issues when you have a JSON parsing bug or some server-side issue.

isValidElement

React is designed to work with plain objects as input, and in fact, we're even getting rid of the _isReactElement as a way to verify this. We'll allow any JSON object. IMO, there is no problem with the verification here.

All string values are sanitized before inserted into the DOM (except for CSS styles which is a known wontfix issue for IE8).

In earlier versions we used instanceof checks but that didn't work well with multiple Reacts, it makes it difficult to optimize (inline objects are much faster) and couples JSX permanently to React, which we would like to avoid.

dangerouslySetInnerHTML

One possible solution is to disable this feature and require it to be used imperatively (React.findDOMNode(ref).innerHTML = '') which makes for worse performance at insertion time.

However, I don't believe this is the only bad thing once you can insert arbitrary HTML tags. It is certainly the easiest way to gain access to XSS though. You can also insert arbitrary Web Components which could expose data. You can render form elements that can potentially pass data.

What else can we do?

Ultimately this is an issue where <div>{userData}</div> seems like a valid use case, but if your userData is compromised, it becomes dangerous.

Should React be responsible for protecting itself against arbitrary JSON as children?

@yukinying

This comment has been minimized.

Show comment
Hide comment
@yukinying

yukinying Mar 20, 2015

Hi, Albert from Yahoo Security team here. I was working with @mridgway on studying this issue. Just wanna share more thoughts.

Re: disabling dangeriouslySetInnerHTML, it is a good move, but it would not be sufficient. React allow inject "script" and "style" tag, or "style" attributes. All of those would allow script execution. Also, in general, we worried about UI redressing attack that attackers can create arbitrary overlay on a page that could steal all the input submitted.

yukinying commented Mar 20, 2015

Hi, Albert from Yahoo Security team here. I was working with @mridgway on studying this issue. Just wanna share more thoughts.

Re: disabling dangeriouslySetInnerHTML, it is a good move, but it would not be sufficient. React allow inject "script" and "style" tag, or "style" attributes. All of those would allow script execution. Also, in general, we worried about UI redressing attack that attackers can create arbitrary overlay on a page that could steal all the input submitted.

@sebmarkbage

This comment has been minimized.

Show comment
Hide comment
@sebmarkbage

sebmarkbage Mar 20, 2015

Member

I don't think "script" tags will execute because we use innerHTML to create them. Do the "style" tag and "style" attributes allow you to execute (in context) scripts in modern browsers? I think it is only older versions of IE.

Regardless, the UI redressing attack is still an equally valid concern.

Member

sebmarkbage commented Mar 20, 2015

I don't think "script" tags will execute because we use innerHTML to create them. Do the "style" tag and "style" attributes allow you to execute (in context) scripts in modern browsers? I think it is only older versions of IE.

Regardless, the UI redressing attack is still an equally valid concern.

@yukinying

This comment has been minimized.

Show comment
Hide comment
@yukinying

yukinying Mar 20, 2015

I was also having the same doubt on script tag when I saw that it was supported in facebook.github.io/react/docs/tags-and-attributes.html, and then I figured out that script tag is used in server side react and it would be executable. And indeed most style attacks only works in IE8 or minor. But it still contributes to quite some amount of traffic.

yukinying commented Mar 20, 2015

I was also having the same doubt on script tag when I saw that it was supported in facebook.github.io/react/docs/tags-and-attributes.html, and then I figured out that script tag is used in server side react and it would be executable. And indeed most style attacks only works in IE8 or minor. But it still contributes to quite some amount of traffic.

@yukinying

This comment has been minimized.

Show comment
Hide comment
@yukinying

yukinying Mar 20, 2015

I am more inclined to see _isReactElement be used as a validation mechanism. The issue here is that we can't easily distinguish if an JSON object would have execution context or data context. While it is true that server side validation could help, it only works if every time we get an JSON we have a schema to validate it. The nature of JSON, as you suggested, is plain and free form. Otherwise it will be falling back as XML validation that would discourage developers to use it.

In a nutshell, we really need a way to distinguish if a JSON object has executable context (i.e. _isReactElement). Thanks!

yukinying commented Mar 20, 2015

I am more inclined to see _isReactElement be used as a validation mechanism. The issue here is that we can't easily distinguish if an JSON object would have execution context or data context. While it is true that server side validation could help, it only works if every time we get an JSON we have a schema to validate it. The nature of JSON, as you suggested, is plain and free form. Otherwise it will be falling back as XML validation that would discourage developers to use it.

In a nutshell, we really need a way to distinguish if a JSON object has executable context (i.e. _isReactElement). Thanks!

@mattapperson

This comment has been minimized.

Show comment
Hide comment
@mattapperson

mattapperson Mar 20, 2015

None... I think React should stay as light and lean as it can be. Unix philosophy fan here.

mattapperson commented Mar 20, 2015

None... I think React should stay as light and lean as it can be. Unix philosophy fan here.

@sebmarkbage

This comment has been minimized.

Show comment
Hide comment
@sebmarkbage

sebmarkbage Mar 20, 2015

Member

Note that JSON rendering of a virtual DOM tree is a perfectly valid use case. E.g. you can prerender a high level component into a smaller level virtual DOM tree on the server and then render the result on the client. This is also one of the ideas we had for web worker rendering where it would construct a virtual DOM that can be postMessaged and rendered on the other side.

Therefore, we also have to consider what options we have if we chose not to protect ourselves against JSON data. How much can we mitigate that scenario?

Member

sebmarkbage commented Mar 20, 2015

Note that JSON rendering of a virtual DOM tree is a perfectly valid use case. E.g. you can prerender a high level component into a smaller level virtual DOM tree on the server and then render the result on the client. This is also one of the ideas we had for web worker rendering where it would construct a virtual DOM that can be postMessaged and rendered on the other side.

Therefore, we also have to consider what options we have if we chose not to protect ourselves against JSON data. How much can we mitigate that scenario?

@graue

This comment has been minimized.

Show comment
Hide comment
@graue

graue Mar 20, 2015

Contributor

Possible solution, use different syntax for inserting strings as children:

  • <span>{=this.props.username}</span>: Coerces username to a string.
  • <a href={this.props.url}> .. </a>: Coerces url to a string (as now).
  • <div>{this.renderFooter()}</div>: If this.renderFooter() returns a string or number, warns in 0.14, error in 0.15. Explicitly wrap in a span or use {=str}.

Feels ugly but would solve the problem, wouldn't it?

Contributor

graue commented Mar 20, 2015

Possible solution, use different syntax for inserting strings as children:

  • <span>{=this.props.username}</span>: Coerces username to a string.
  • <a href={this.props.url}> .. </a>: Coerces url to a string (as now).
  • <div>{this.renderFooter()}</div>: If this.renderFooter() returns a string or number, warns in 0.14, error in 0.15. Explicitly wrap in a span or use {=str}.

Feels ugly but would solve the problem, wouldn't it?

@sebmarkbage

This comment has been minimized.

Show comment
Hide comment
@sebmarkbage

sebmarkbage Mar 20, 2015

Member

Coercion doesn't catch mistakes where you miss it. We would need to make that a requirement, e.g. by wrapping strings in some placeholder that is distinct from elements: { __thisIsSupposeToBeAString: str }

Member

sebmarkbage commented Mar 20, 2015

Coercion doesn't catch mistakes where you miss it. We would need to make that a requirement, e.g. by wrapping strings in some placeholder that is distinct from elements: { __thisIsSupposeToBeAString: str }

@graue

This comment has been minimized.

Show comment
Hide comment
@graue

graue Mar 21, 2015

Contributor

That's why I suggested making it an error to pass a string to the non-coercing syntax (plain {}).

Contributor

graue commented Mar 21, 2015

That's why I suggested making it an error to pass a string to the non-coercing syntax (plain {}).

@sebmarkbage

This comment has been minimized.

Show comment
Hide comment
@sebmarkbage

sebmarkbage Mar 21, 2015

Member

I see. That would be difficult to enforce since they can come nested arrays or non-JSX sources.

Member

sebmarkbage commented Mar 21, 2015

I see. That would be difficult to enforce since they can come nested arrays or non-JSX sources.

@monsanto

This comment has been minimized.

Show comment
Hide comment
@monsanto

monsanto Mar 21, 2015

Contributor

dangerouslySetInnerHTML is really useful when interfacing with existing Markdown libraries. (I know of one that generates React elements, but that isn't the point I'm trying to make.) Removing this feature over a bug that isn't even React's fault seems pretty scorched earth to me.

Contributor

monsanto commented Mar 21, 2015

dangerouslySetInnerHTML is really useful when interfacing with existing Markdown libraries. (I know of one that generates React elements, but that isn't the point I'm trying to make.) Removing this feature over a bug that isn't even React's fault seems pretty scorched earth to me.

@syranide

This comment has been minimized.

Show comment
Hide comment
@syranide

syranide Mar 21, 2015

Contributor

@sebmarkbage As schemaless storages are becoming somewhat popular, this is going to become more prevalent. At some level it is a repeat of inserting foreign HTML markup as a result of lack of escaping (React made it a goal to make it a thing of the past), but instead the vulnerability moved to React components where wrongly typed values can lead to insertion of foreign hierarchies.

I'd argue that even if we could somehow guarantee that insertion of foreign hierarchies wouldn't be dangerous, it would always be very bad. Malicious data will always wreak havoc on a system and React is just one part of it. The danger I see with React being targeted (apart from being easily exploited due to being debuggable client-side) is malicious data being rendered for other users of the same system, it's hard to guarantee that the malcious data won't be able to act freely under the assumed identity of those users.


I think the way forward is to drop implicit wrapping of primitive values. Printable values should be elements like everything else from React's perspective, they should not be implicitly wrapped like they are today. A printable value should look something like React.createValue('foobar') => {type: 'value', props: 'foobar'}, it's still serializable but it's not exploitable. JSX could have some simple syntax sugar for it too, say {=foobar}, and inline text would be automatically wrapped.

While it may seem rather draconian in a sense, I think this is the sensible way forward, HTML introduced a lot of conviences that turned out to be really inconvenient (for user interfaces), this is another one. I think the practical implication of this is not as bad as it seems; any sensible user interface backend would expose labels as <label value={...} />, buttons as <button label={...} />, etc. React.createValue would be largely exclusive to rich-text components where it actually makes sense.

Contributor

syranide commented Mar 21, 2015

@sebmarkbage As schemaless storages are becoming somewhat popular, this is going to become more prevalent. At some level it is a repeat of inserting foreign HTML markup as a result of lack of escaping (React made it a goal to make it a thing of the past), but instead the vulnerability moved to React components where wrongly typed values can lead to insertion of foreign hierarchies.

I'd argue that even if we could somehow guarantee that insertion of foreign hierarchies wouldn't be dangerous, it would always be very bad. Malicious data will always wreak havoc on a system and React is just one part of it. The danger I see with React being targeted (apart from being easily exploited due to being debuggable client-side) is malicious data being rendered for other users of the same system, it's hard to guarantee that the malcious data won't be able to act freely under the assumed identity of those users.


I think the way forward is to drop implicit wrapping of primitive values. Printable values should be elements like everything else from React's perspective, they should not be implicitly wrapped like they are today. A printable value should look something like React.createValue('foobar') => {type: 'value', props: 'foobar'}, it's still serializable but it's not exploitable. JSX could have some simple syntax sugar for it too, say {=foobar}, and inline text would be automatically wrapped.

While it may seem rather draconian in a sense, I think this is the sensible way forward, HTML introduced a lot of conviences that turned out to be really inconvenient (for user interfaces), this is another one. I think the practical implication of this is not as bad as it seems; any sensible user interface backend would expose labels as <label value={...} />, buttons as <button label={...} />, etc. React.createValue would be largely exclusive to rich-text components where it actually makes sense.

@glenjamin

This comment has been minimized.

Show comment
Hide comment
@glenjamin

glenjamin Mar 24, 2015

Contributor

I think the way forward is to drop implicit wrapping of primitive values.

I think to provide safety it would actually have to be the reverse - it should be easier to render a primitive value than to render a component. This is the approach that other front-end frameworks have converged on over time.

For instance <p>{user.name}</p> should always be a primitive render, which is known to be safe. The potentially unsafe version should look different, which causes the reader/author to look more closely, eg. <p>{!:user.name:!}</p> or something similarly wacky. This is possibly too much of a departure from the current way of doing things to be viable?

Note that JSON rendering of a virtual DOM tree is a perfectly valid use case.
In a nutshell, we really need a way to distinguish if a JSON object has executable context (i.e. _isReactElement). Thanks!
React is designed to work with plain objects as input, and in fact, we're even getting rid of the _isReactElement as a way to verify this.

These are conflicting goals, so I think React would have to make a clear choice here, the options I can think of are:

  1. Decide that all data is renderable, and increase documentation warning about this
  2. Decide that all data is primitive by default, and introduce new JSX syntax to render data intended to be non-primitive
  3. Find a way to mark data as "safe".

For point 3, to support the JSON cases, as well as createElement, there might need to be some sort of hydrateElement which can take a plain object and mark it as safe to render eg. <p>{React.hydrateElement(someJSONobj)}</p>.

As proved by the bug which prompted the discussion, a boolean property is not enough to mark some data as executable, as I understand it, the only "safe" option would be an object reference, perhaps something like {type: "div", React: React}, which can be checked using React === React at render-time. This re-introduces the multiple-react problem discussed above, and while that could be alleviated slightly by sharing some arbitrary flag object via global (window || global).___reactMarker = {}, this doesn't really help for iframes, web workers, or multiple windows.

A bit of a brain dump, (1) seems simplest but unsafe. (2) is safest but a large change, (3) seems fiddly and error prone.

Hopefully someone else has a better idea!

Contributor

glenjamin commented Mar 24, 2015

I think the way forward is to drop implicit wrapping of primitive values.

I think to provide safety it would actually have to be the reverse - it should be easier to render a primitive value than to render a component. This is the approach that other front-end frameworks have converged on over time.

For instance <p>{user.name}</p> should always be a primitive render, which is known to be safe. The potentially unsafe version should look different, which causes the reader/author to look more closely, eg. <p>{!:user.name:!}</p> or something similarly wacky. This is possibly too much of a departure from the current way of doing things to be viable?

Note that JSON rendering of a virtual DOM tree is a perfectly valid use case.
In a nutshell, we really need a way to distinguish if a JSON object has executable context (i.e. _isReactElement). Thanks!
React is designed to work with plain objects as input, and in fact, we're even getting rid of the _isReactElement as a way to verify this.

These are conflicting goals, so I think React would have to make a clear choice here, the options I can think of are:

  1. Decide that all data is renderable, and increase documentation warning about this
  2. Decide that all data is primitive by default, and introduce new JSX syntax to render data intended to be non-primitive
  3. Find a way to mark data as "safe".

For point 3, to support the JSON cases, as well as createElement, there might need to be some sort of hydrateElement which can take a plain object and mark it as safe to render eg. <p>{React.hydrateElement(someJSONobj)}</p>.

As proved by the bug which prompted the discussion, a boolean property is not enough to mark some data as executable, as I understand it, the only "safe" option would be an object reference, perhaps something like {type: "div", React: React}, which can be checked using React === React at render-time. This re-introduces the multiple-react problem discussed above, and while that could be alleviated slightly by sharing some arbitrary flag object via global (window || global).___reactMarker = {}, this doesn't really help for iframes, web workers, or multiple windows.

A bit of a brain dump, (1) seems simplest but unsafe. (2) is safest but a large change, (3) seems fiddly and error prone.

Hopefully someone else has a better idea!

@sebmarkbage

This comment has been minimized.

Show comment
Hide comment
@sebmarkbage

sebmarkbage Mar 24, 2015

Member

Having discussed this with several people now, I'm leaning towards the "mark as trusted" solution. It doesn't have to be a reference identity. It can be a unique enough token string or number. This token has to be shared between environments in a similar way as CSRF. By default we may share it with the realm through the global.

It could also be completely disabled if you chose to.

On Mar 24, 2015, at 10:59 PM, Glen Mailer notifications@github.com wrote:

I think the way forward is to drop implicit wrapping of primitive values.

I think to provide safety it would actually have to be the reverse - it should be easier to render a primitive value than to render a component. This is the approach that other front-end frameworks have converged on over time.

For instance

{user.name}

should always be a primitive render, which is known to be safe. The potentially unsafe version should look different, which causes the reader/author to look more closely, eg.

{!:user.name:!}

or something similarly wacky. This is possibly too much of a departure from the current way of doing things to be viable?

Note that JSON rendering of a virtual DOM tree is a perfectly valid use case.
In a nutshell, we really need a way to distinguish if a JSON object has executable context (i.e. _isReactElement). Thanks!
React is designed to work with plain objects as input, and in fact, we're even getting rid of the _isReactElement as a way to verify this.

These are conflicting goals, so I think React would have to make a clear choice here, the options I can think of are:

  1. Decide that all data is renderable, and increase documentation warning about this
  2. Decide that all data is primitive by default, and introduce new JSX syntax to render data intended to be non-primitive
  3. Find a way to mark data as "safe".

For point 3, to support the JSON cases, as well as createElement, there might need to be some sort of hydrateElement which can take a plain object and mark it as safe to render eg.

{React.hydrateElement(someJSONobj)}

.

As proved by the bug which prompted the discussion, a boolean property is not enough to mark some data as executable, as I understand it, the only "safe" option would be an object reference, perhaps something like {type: "div", React: React}, which can be checked using React === React at render-time. This re-introduces the multiple-react problem discussed above, and while that could be alleviated slightly by sharing some arbitrary flag object via global (window || global).___reactMarker = {}, this doesn't really help for iframes, web workers, or multiple windows.

A bit of a brain dump, (1) seems simplest but unsafe. (2) is safest but a large change, (3) seems fiddly and error prone.

Hopefully someone else has a better idea!


Reply to this email directly or view it on GitHub.

Member

sebmarkbage commented Mar 24, 2015

Having discussed this with several people now, I'm leaning towards the "mark as trusted" solution. It doesn't have to be a reference identity. It can be a unique enough token string or number. This token has to be shared between environments in a similar way as CSRF. By default we may share it with the realm through the global.

It could also be completely disabled if you chose to.

On Mar 24, 2015, at 10:59 PM, Glen Mailer notifications@github.com wrote:

I think the way forward is to drop implicit wrapping of primitive values.

I think to provide safety it would actually have to be the reverse - it should be easier to render a primitive value than to render a component. This is the approach that other front-end frameworks have converged on over time.

For instance

{user.name}

should always be a primitive render, which is known to be safe. The potentially unsafe version should look different, which causes the reader/author to look more closely, eg.

{!:user.name:!}

or something similarly wacky. This is possibly too much of a departure from the current way of doing things to be viable?

Note that JSON rendering of a virtual DOM tree is a perfectly valid use case.
In a nutshell, we really need a way to distinguish if a JSON object has executable context (i.e. _isReactElement). Thanks!
React is designed to work with plain objects as input, and in fact, we're even getting rid of the _isReactElement as a way to verify this.

These are conflicting goals, so I think React would have to make a clear choice here, the options I can think of are:

  1. Decide that all data is renderable, and increase documentation warning about this
  2. Decide that all data is primitive by default, and introduce new JSX syntax to render data intended to be non-primitive
  3. Find a way to mark data as "safe".

For point 3, to support the JSON cases, as well as createElement, there might need to be some sort of hydrateElement which can take a plain object and mark it as safe to render eg.

{React.hydrateElement(someJSONobj)}

.

As proved by the bug which prompted the discussion, a boolean property is not enough to mark some data as executable, as I understand it, the only "safe" option would be an object reference, perhaps something like {type: "div", React: React}, which can be checked using React === React at render-time. This re-introduces the multiple-react problem discussed above, and while that could be alleviated slightly by sharing some arbitrary flag object via global (window || global).___reactMarker = {}, this doesn't really help for iframes, web workers, or multiple windows.

A bit of a brain dump, (1) seems simplest but unsafe. (2) is safest but a large change, (3) seems fiddly and error prone.

Hopefully someone else has a better idea!


Reply to this email directly or view it on GitHub.

@syranide

This comment has been minimized.

Show comment
Hide comment
@syranide

syranide Mar 24, 2015

Contributor

@glenjamin

I think to provide safety it would actually have to be the reverse - it should be easier to render a primitive value than to render a component. This is the approach that other front-end frameworks have converged on over time.

By explicitly having to put all primitive values in a value element it becomes inherently safe, primitive values cannot render outside of it and non-primitive values are stringified inside of it (or w/e).

Personally this makes a lot of sense to me, we already implicitly wrap primitive values internally, this is also how any non-markup/style-inheriting backend would work. There's no such thing as rendering inline text in iOS and most other user interface libraries, it only really makes sense for rendering rich-text content.

This is also how you'd do it if it was implemented in say C/C++ where you would want a single type for all children.

Contributor

syranide commented Mar 24, 2015

@glenjamin

I think to provide safety it would actually have to be the reverse - it should be easier to render a primitive value than to render a component. This is the approach that other front-end frameworks have converged on over time.

By explicitly having to put all primitive values in a value element it becomes inherently safe, primitive values cannot render outside of it and non-primitive values are stringified inside of it (or w/e).

Personally this makes a lot of sense to me, we already implicitly wrap primitive values internally, this is also how any non-markup/style-inheriting backend would work. There's no such thing as rendering inline text in iOS and most other user interface libraries, it only really makes sense for rendering rich-text content.

This is also how you'd do it if it was implemented in say C/C++ where you would want a single type for all children.

@glenjamin

This comment has been minimized.

Show comment
Hide comment
@glenjamin

glenjamin Mar 24, 2015

Contributor

By explicitly having to put all primitive values in a value element it becomes inherently safe, primitive values cannot render outside of it and non-primitive values are stringified inside of it (or w/e).

I see what you mean now, as long as the primitive and non-primitive syntaxes are separate then this is safe. Which is subtly different from HTML escaping of strings, where most strings work correctly when escaping is forgotten.

Contributor

glenjamin commented Mar 24, 2015

By explicitly having to put all primitive values in a value element it becomes inherently safe, primitive values cannot render outside of it and non-primitive values are stringified inside of it (or w/e).

I see what you mean now, as long as the primitive and non-primitive syntaxes are separate then this is safe. Which is subtly different from HTML escaping of strings, where most strings work correctly when escaping is forgotten.

@syranide

This comment has been minimized.

Show comment
Hide comment
@syranide

syranide Mar 25, 2015

Contributor

@glenjamin

Which is subtly different from HTML escaping of strings, where most strings work correctly when escaping is forgotten.

Yeah, but then again, React.createValue() would be the equivalent of document.createTextNode() so it's already there today, it's just that HTML markup implicitly creates text nodes, just like React currently does. But that's also a source of XSS in HTML and now in React too, but React hierarchies are object-based rather string, so React is vulnerable to malicous objects instead.

@sebmarkbage So it seems weird to me to add another "feature" to work around this when it isn't an inherent issue, it's introduced by implicit wrapping of primitive values (which we borrow from HTML, it still doesn't make sense for me for the purpose of user interfaces). I'm quite sure that performance implications of this should be minimal, but it obviously does significantly affect the code in one way or another. That is no doubt a very important consideration, but React has challenged many of the wrong-doings and inconvenient conveniences of HTML so far, I think this is another one that should be.

Contributor

syranide commented Mar 25, 2015

@glenjamin

Which is subtly different from HTML escaping of strings, where most strings work correctly when escaping is forgotten.

Yeah, but then again, React.createValue() would be the equivalent of document.createTextNode() so it's already there today, it's just that HTML markup implicitly creates text nodes, just like React currently does. But that's also a source of XSS in HTML and now in React too, but React hierarchies are object-based rather string, so React is vulnerable to malicous objects instead.

@sebmarkbage So it seems weird to me to add another "feature" to work around this when it isn't an inherent issue, it's introduced by implicit wrapping of primitive values (which we borrow from HTML, it still doesn't make sense for me for the purpose of user interfaces). I'm quite sure that performance implications of this should be minimal, but it obviously does significantly affect the code in one way or another. That is no doubt a very important consideration, but React has challenged many of the wrong-doings and inconvenient conveniences of HTML so far, I think this is another one that should be.

@glenjamin

This comment has been minimized.

Show comment
Hide comment
@glenjamin

glenjamin Mar 25, 2015

Contributor

It doesn't have to be a reference identity. It can be a unique enough token string or number. This token has to be shared between environments in a similar way as CSRF. By default we may share it with the realm through the global.

This sounds like it should work - is the idea that by default it would be randomly generated on the client. but there'd be a top-level API to read/write it?

Contributor

glenjamin commented Mar 25, 2015

It doesn't have to be a reference identity. It can be a unique enough token string or number. This token has to be shared between environments in a similar way as CSRF. By default we may share it with the realm through the global.

This sounds like it should work - is the idea that by default it would be randomly generated on the client. but there'd be a top-level API to read/write it?

@brigand

This comment has been minimized.

Show comment
Hide comment
@brigand

brigand Mar 25, 2015

Contributor

I like the idea of trust, and React.createElement being the mechanism to declare trust in the current environment.

const trust = element => typeof element === "object" && element
  ? React.createElement(element.type, element.props, ...[].concat(element.children))
  : !element ? false
  : String(element);
// throw if not a string or trusted element
<div>{stringOrDie}</div> 

// this span is trusted because React.createElement
<div><span>hey</span></div>

// explicitly React.createElement the POJO
<div>{trust(stringOrElement)}</div> 

The simplest way to do this is to have each element have a common property on the prototype. This can be exposed globally and/or allowed to be set externally for multiple React instances and integration with other libraries. Alternatively it could be an own non-enumerable property, or R.cE could set a toJSON that excludes this property. It'll cause confusion if you end up sending this to the server, and get an error about an incorrect 'token', rather than a missing one.

Adding extra syntax for this would be unfortunate. A simple helper function exposed on React would be good, because this is an explicit opt-out to a (future) react security feature, like angular's $sce.trustAsHtml.

The case where you have a serialized element is cool, but I don't think asking for a little more explicitness will harm anyone... users, or the person maintaining the code. "Oh, hey this is rendering something we get from the server, I need to be careful with this code."

Also... please throw, don't warn or String(it).

Contributor

brigand commented Mar 25, 2015

I like the idea of trust, and React.createElement being the mechanism to declare trust in the current environment.

const trust = element => typeof element === "object" && element
  ? React.createElement(element.type, element.props, ...[].concat(element.children))
  : !element ? false
  : String(element);
// throw if not a string or trusted element
<div>{stringOrDie}</div> 

// this span is trusted because React.createElement
<div><span>hey</span></div>

// explicitly React.createElement the POJO
<div>{trust(stringOrElement)}</div> 

The simplest way to do this is to have each element have a common property on the prototype. This can be exposed globally and/or allowed to be set externally for multiple React instances and integration with other libraries. Alternatively it could be an own non-enumerable property, or R.cE could set a toJSON that excludes this property. It'll cause confusion if you end up sending this to the server, and get an error about an incorrect 'token', rather than a missing one.

Adding extra syntax for this would be unfortunate. A simple helper function exposed on React would be good, because this is an explicit opt-out to a (future) react security feature, like angular's $sce.trustAsHtml.

The case where you have a serialized element is cool, but I don't think asking for a little more explicitness will harm anyone... users, or the person maintaining the code. "Oh, hey this is rendering something we get from the server, I need to be careful with this code."

Also... please throw, don't warn or String(it).

@RReverser

This comment has been minimized.

Show comment
Hide comment
@RReverser

RReverser Mar 29, 2015

Contributor

I might be missing something, but why does React need to complicate inserting stuff like conditional JSX elements by forcing users to wrap them, when it's not issue of React nor of 99% of use-cases when server is written properly.

As far as we understood from original report, the "issue" was that server was accepting JSON object of any schema (wat?) and saving/sending it as-is (wat??), while most use-cases pick needed fields and, of course, validate them against type/range expectations. That's the golden rule of code security - never trust data from client as-is and perform server validation. If your server accepts any random data objects and sends them over to clients, you have much bigger problems than you think, whether you're using React or not.

/cc @alexeyraspopov @zerkms

Contributor

RReverser commented Mar 29, 2015

I might be missing something, but why does React need to complicate inserting stuff like conditional JSX elements by forcing users to wrap them, when it's not issue of React nor of 99% of use-cases when server is written properly.

As far as we understood from original report, the "issue" was that server was accepting JSON object of any schema (wat?) and saving/sending it as-is (wat??), while most use-cases pick needed fields and, of course, validate them against type/range expectations. That's the golden rule of code security - never trust data from client as-is and perform server validation. If your server accepts any random data objects and sends them over to clients, you have much bigger problems than you think, whether you're using React or not.

/cc @alexeyraspopov @zerkms

@alexeyraspopov

This comment has been minimized.

Show comment
Hide comment
@alexeyraspopov

alexeyraspopov Mar 29, 2015

Contributor

@RReverser 👍

Also, @sebmarkbage are there any links with demo code to reproduce the issue?

Contributor

alexeyraspopov commented Mar 29, 2015

@RReverser 👍

Also, @sebmarkbage are there any links with demo code to reproduce the issue?

@alexeyraspopov

This comment has been minimized.

Show comment
Hide comment
@alexeyraspopov

alexeyraspopov Mar 29, 2015

Contributor

There is my attempt to reproduce this dangerous issue. As you can see, HTML code is added to body, but script doesn't work. Client-side rendering is protected.

Ignore that difficult string concatenation, JSFiddle uses client-side version of JSXTransformer.

Contributor

alexeyraspopov commented Mar 29, 2015

There is my attempt to reproduce this dangerous issue. As you can see, HTML code is added to body, but script doesn't work. Client-side rendering is protected.

Ignore that difficult string concatenation, JSFiddle uses client-side version of JSXTransformer.

@syranide

This comment has been minimized.

Show comment
Hide comment
@syranide

syranide Mar 30, 2015

Contributor

I might be missing something, but why does React need to complicate inserting stuff like conditional JSX elements by forcing users to wrap them, when it's not issue of React nor of 99% of use-cases when server is written properly.

@RReverser That applies to HTML as well and look at how that turned out :) React is obviously less susceptible as it relies on objects, but one missing/insufficient check somewhere (and a susceptible backend, say JSON schema) and an attacker can potentially act on behalf of all users visiting the site, that's the only reason I see why React should make an effort to prevent it (like it did with HTML markup).

Contributor

syranide commented Mar 30, 2015

I might be missing something, but why does React need to complicate inserting stuff like conditional JSX elements by forcing users to wrap them, when it's not issue of React nor of 99% of use-cases when server is written properly.

@RReverser That applies to HTML as well and look at how that turned out :) React is obviously less susceptible as it relies on objects, but one missing/insufficient check somewhere (and a susceptible backend, say JSON schema) and an attacker can potentially act on behalf of all users visiting the site, that's the only reason I see why React should make an effort to prevent it (like it did with HTML markup).

@brigand

This comment has been minimized.

Show comment
Hide comment
@brigand

brigand Mar 30, 2015

Contributor

@alexeyraspopov __html: "<img onload='alert()' src='/favicon.ico' />".

Contributor

brigand commented Mar 30, 2015

@alexeyraspopov __html: "<img onload='alert()' src='/favicon.ico' />".

@alexeyraspopov

This comment has been minimized.

Show comment
Hide comment
@alexeyraspopov

alexeyraspopov Mar 30, 2015

Contributor

@brigand okay :) And what can we tell about backend developer which allows server to receive JSON without fixed schema?

UPD: this type of XSS works in every JS framework/lib which allows you to render HTML.

Contributor

alexeyraspopov commented Mar 30, 2015

@brigand okay :) And what can we tell about backend developer which allows server to receive JSON without fixed schema?

UPD: this type of XSS works in every JS framework/lib which allows you to render HTML.

@glenjamin

This comment has been minimized.

Show comment
Hide comment
@glenjamin

glenjamin Mar 30, 2015

Contributor

@alexeyraspopov

Consider mustache, it provides two flavours of template iterpolation: {{}} and {{{}}}.

<p>{{name}}</p> <!-- no XSS if name contains HTML -->
<p>{{{name}}}</p> <!-- XSS if name contains HTML -->

The default is safe, and template authors can opt-in to unsafe behaviour, with the understanding that the variable used in the unsafe portion will be checked more stringently.

React also has a feature, where the unsafe version is even more awkward - so it cannot be missed:

<p>{name}</p> {// safe}
<p dangerouslySetInnerHTML={{__html: name}} /> {// unsafe}

However, because React is now translating createElement calls into literals, the "safe" version is not safe at all.

This creates a false sense of security. Good security is like an onion, it should have many layers.

So yes, developers shouldn't allow arbitrary data - but there's a reason the top entries on the OWASP top 10 are related to untrusted data passing through the system - it happens in practice.

Any reasonable steps popular libraries can take which makes it hard to makes these mistakes has a very high net gain.

Contributor

glenjamin commented Mar 30, 2015

@alexeyraspopov

Consider mustache, it provides two flavours of template iterpolation: {{}} and {{{}}}.

<p>{{name}}</p> <!-- no XSS if name contains HTML -->
<p>{{{name}}}</p> <!-- XSS if name contains HTML -->

The default is safe, and template authors can opt-in to unsafe behaviour, with the understanding that the variable used in the unsafe portion will be checked more stringently.

React also has a feature, where the unsafe version is even more awkward - so it cannot be missed:

<p>{name}</p> {// safe}
<p dangerouslySetInnerHTML={{__html: name}} /> {// unsafe}

However, because React is now translating createElement calls into literals, the "safe" version is not safe at all.

This creates a false sense of security. Good security is like an onion, it should have many layers.

So yes, developers shouldn't allow arbitrary data - but there's a reason the top entries on the OWASP top 10 are related to untrusted data passing through the system - it happens in practice.

Any reasonable steps popular libraries can take which makes it hard to makes these mistakes has a very high net gain.

@sebmarkbage

This comment has been minimized.

Show comment
Hide comment
@sebmarkbage

sebmarkbage Apr 7, 2015

Member

@syranide I see your point. You might be right. The usability and performance issues might be deal breakers - especially when other solutions are less invasive.

I'm still trying to wrap my head around where the line can be drawn between various data types. What is is that makes a string safe when its wrapped and when it is not. What makes an object not safe and when is it safe?

I am concerned that it is still too easy to do the wrong thing even with an explicit syntax:

The difference between something like this is very subtle:

<div>{=this.props.user}</div>
<div>{this.props.user}</div>

And both are legit use cases. You would have to actively test these code branches to notice that you did something wrong when you expected a string. Everyone has these kind of edge case code branches that go into production without ever being properly tested. Would this be a false sense of security as well?

Ultimately, I don't think that validating data structures is something that React needs to necessarily concern itself with. (We have type systems for that.)

The reason I think we need to address this particular issue is because the cost (XSS) is too high. I think it is fine if this mistake causes your site to break. I even think it would be ok if we had something that allows for UI redressing in this scenario. There are plenty of other ways to get to that point.

However, it is not ok that a simple mistake like this causes full cross-site script execution. This is too high of a cost/probability ratio. The risk is too high even when the probability is low.

I think the argument for the sourceID solution is that it specifically addresses the tag creation scenario which is causes these high risk scenarios.

There are still other issues that are not solved by simple data types alone:

<a href={this.props.userProvidedURL} />
<div style={{ border: this.props.userProvidedBorder }} />
Member

sebmarkbage commented Apr 7, 2015

@syranide I see your point. You might be right. The usability and performance issues might be deal breakers - especially when other solutions are less invasive.

I'm still trying to wrap my head around where the line can be drawn between various data types. What is is that makes a string safe when its wrapped and when it is not. What makes an object not safe and when is it safe?

I am concerned that it is still too easy to do the wrong thing even with an explicit syntax:

The difference between something like this is very subtle:

<div>{=this.props.user}</div>
<div>{this.props.user}</div>

And both are legit use cases. You would have to actively test these code branches to notice that you did something wrong when you expected a string. Everyone has these kind of edge case code branches that go into production without ever being properly tested. Would this be a false sense of security as well?

Ultimately, I don't think that validating data structures is something that React needs to necessarily concern itself with. (We have type systems for that.)

The reason I think we need to address this particular issue is because the cost (XSS) is too high. I think it is fine if this mistake causes your site to break. I even think it would be ok if we had something that allows for UI redressing in this scenario. There are plenty of other ways to get to that point.

However, it is not ok that a simple mistake like this causes full cross-site script execution. This is too high of a cost/probability ratio. The risk is too high even when the probability is low.

I think the argument for the sourceID solution is that it specifically addresses the tag creation scenario which is causes these high risk scenarios.

There are still other issues that are not solved by simple data types alone:

<a href={this.props.userProvidedURL} />
<div style={{ border: this.props.userProvidedBorder }} />
@zerkms

This comment has been minimized.

Show comment
Hide comment
@zerkms

zerkms Apr 7, 2015

With React.dangerouslyTrustAllSources(); available there is a high chance that people will blindly allow it since they "found it on stackoverflow" or "read in an article written by that guy" which would mitigate all the "protection" brought.

As a result - people who have no idea what they do will make them "vulnerable" themselves. And people with security in mind would continue writing as protected code as before (but now using more complex tool that requires more attention).

The question: what have the product (reactjs) and the community gained from it then?

zerkms commented Apr 7, 2015

With React.dangerouslyTrustAllSources(); available there is a high chance that people will blindly allow it since they "found it on stackoverflow" or "read in an article written by that guy" which would mitigate all the "protection" brought.

As a result - people who have no idea what they do will make them "vulnerable" themselves. And people with security in mind would continue writing as protected code as before (but now using more complex tool that requires more attention).

The question: what have the product (reactjs) and the community gained from it then?

@brigand

This comment has been minimized.

Show comment
Hide comment
@brigand

brigand Apr 7, 2015

Contributor

I didn't know react allowed style={{border: '1px solid black; background: red'}} or href={'javascript:alert();void 0;'}... these seem like a simple fixes that should be done.

The href is bad because it's common; the style is very bad because it's very unexpected. <div style={{foo: 'bar'}}/> => div.style.foo = 'bar' is expected, and safe escaping or error for the innerHTML/renderToString case.

If someone really needs them, they can set it manually or user __html; I doubt any real react code would be broken by these.

Contributor

brigand commented Apr 7, 2015

I didn't know react allowed style={{border: '1px solid black; background: red'}} or href={'javascript:alert();void 0;'}... these seem like a simple fixes that should be done.

The href is bad because it's common; the style is very bad because it's very unexpected. <div style={{foo: 'bar'}}/> => div.style.foo = 'bar' is expected, and safe escaping or error for the innerHTML/renderToString case.

If someone really needs them, they can set it manually or user __html; I doubt any real react code would be broken by these.

@syranide

This comment has been minimized.

Show comment
Hide comment
@syranide

syranide Apr 7, 2015

Contributor

I'm still trying to wrap my head around where the line can be drawn between various data types. What is is that makes a string safe when its wrapped and when it is not. What makes an object not safe and when is it safe?

@sebmarkbage I think to me it kind of comes down to "Foobar", it's currently both an element and a primitive. If you instead imagine something like <MyButton label={createValue('Foobar')} />, whatever you put inside createValue becomes opaque, toString-ed and safe to render. Supplying a primitive value label would throw.

In this example, I would argue that MyButton.label should sensibly only accept primitive values, <MyButton label="Foobar" />. MyButton would internally call createValue(this.props.label). So whatever you value you pass label will be safe to render. If MyButton would instead accept elements only, then passing a non-element would throw.

I am concerned that it is still too easy to do the wrong thing even with an explicit syntax:
The difference between something like this is very subtle:
...

Yeah (regardless of syntax really). But createElement('div', {}, 'Foobar') will throw, and createValue(createElement('div')) will helpfully throw, warn or toString it to something harmless. So yes, you can still mess it up, but if you run it and it looks correct then it is correct, a rendered value can no longer turn into arbitrary elements if given malicious data which it previously could.

Ultimately, I don't think that validating data structures is something that React needs to necessarily concern itself with. (We have type systems for that.)

With static type checking you would catch the above mistake without running the code, implicit wrapping prevents that as far as I see.

The usability and performance issues might be deal breakers - especially when other solutions are less invasive.

I suspect that performance won't be an issue in the end, static values can be trivially reused and pooling should make dynamic values virtually free. But, yes there are some obstacles.

Usability is a very valid point. Any frontend that doesn't have style inheritance wouldn't understand createValue and would be unaffected, a string wouldn't be renderable by itself, to render text you explicitly render <label value="Foobar" ... /> (or <label ...>Foobar</label>). So it's only frontends with style inheritance (like HTML) or rich-text components that would be affected by the introduction of createValue.

I don't have it all laid out in my head yet, but implicit behavior can be fragile/dangerous and we've seen that implicit wrapping is, workarounds will only reduce the problem, not fix it. So I think it's a matter of realizing that the implicit model is flawed in some way, can we take the explicit model and somehow make it almost as friendly as the implicit for such frontends? I suspect the answer is yes.

PS. I realize this really should be a separate discussion, it's not really about solving this XSS issue in any immediate sense. It's about whether or not the implicit model is too flawed. Less invasive solutions are far more realistic at this point.

Contributor

syranide commented Apr 7, 2015

I'm still trying to wrap my head around where the line can be drawn between various data types. What is is that makes a string safe when its wrapped and when it is not. What makes an object not safe and when is it safe?

@sebmarkbage I think to me it kind of comes down to "Foobar", it's currently both an element and a primitive. If you instead imagine something like <MyButton label={createValue('Foobar')} />, whatever you put inside createValue becomes opaque, toString-ed and safe to render. Supplying a primitive value label would throw.

In this example, I would argue that MyButton.label should sensibly only accept primitive values, <MyButton label="Foobar" />. MyButton would internally call createValue(this.props.label). So whatever you value you pass label will be safe to render. If MyButton would instead accept elements only, then passing a non-element would throw.

I am concerned that it is still too easy to do the wrong thing even with an explicit syntax:
The difference between something like this is very subtle:
...

Yeah (regardless of syntax really). But createElement('div', {}, 'Foobar') will throw, and createValue(createElement('div')) will helpfully throw, warn or toString it to something harmless. So yes, you can still mess it up, but if you run it and it looks correct then it is correct, a rendered value can no longer turn into arbitrary elements if given malicious data which it previously could.

Ultimately, I don't think that validating data structures is something that React needs to necessarily concern itself with. (We have type systems for that.)

With static type checking you would catch the above mistake without running the code, implicit wrapping prevents that as far as I see.

The usability and performance issues might be deal breakers - especially when other solutions are less invasive.

I suspect that performance won't be an issue in the end, static values can be trivially reused and pooling should make dynamic values virtually free. But, yes there are some obstacles.

Usability is a very valid point. Any frontend that doesn't have style inheritance wouldn't understand createValue and would be unaffected, a string wouldn't be renderable by itself, to render text you explicitly render <label value="Foobar" ... /> (or <label ...>Foobar</label>). So it's only frontends with style inheritance (like HTML) or rich-text components that would be affected by the introduction of createValue.

I don't have it all laid out in my head yet, but implicit behavior can be fragile/dangerous and we've seen that implicit wrapping is, workarounds will only reduce the problem, not fix it. So I think it's a matter of realizing that the implicit model is flawed in some way, can we take the explicit model and somehow make it almost as friendly as the implicit for such frontends? I suspect the answer is yes.

PS. I realize this really should be a separate discussion, it's not really about solving this XSS issue in any immediate sense. It's about whether or not the implicit model is too flawed. Less invasive solutions are far more realistic at this point.

@sebmarkbage

This comment has been minimized.

Show comment
Hide comment
@sebmarkbage

sebmarkbage Apr 7, 2015

Member

@brigand We've already evaluated and decided against protecting against invalid styles. It is not a simple fix. When we last evaluated it, we found that:

  1. It was prohibitively expensive to do this automatically for you for any given value.
  2. A whitelist of safe patterns would be massively huge and we actively want to move away from whitelists in React and fallback to the underlying DOM.
  3. Any blacklist we could come up with would likely be incomplete anyway.

IE's CSS expressions is another one. Various url and paint expressions. Semicolons was not the only concern, but also various complex parsing rules.

I agree that this is bad because it relies on an implementation detail of React (HTML serialization instead of element.style.foo =). Hopefully one day, the slowest browsers will be fast enough to use that technique instead.

As for javascript: in href. Perhaps that should be protected against but it is also arguable that would be React overreaching. I do agree that the cost of accidentally doing this is very high which is why we've added additional protection against innerHTML.

At the end of the day, we can't protect against everything crazy in the DOM. We target it to interop with the environment that is there. We can put higher demands on new abstractions such as React Native.

It is also highly recommended that you build your own set of high level components instead of targeting HTML directly. This is what we do at Facebook because "semantic HTML" is not enough.

Member

sebmarkbage commented Apr 7, 2015

@brigand We've already evaluated and decided against protecting against invalid styles. It is not a simple fix. When we last evaluated it, we found that:

  1. It was prohibitively expensive to do this automatically for you for any given value.
  2. A whitelist of safe patterns would be massively huge and we actively want to move away from whitelists in React and fallback to the underlying DOM.
  3. Any blacklist we could come up with would likely be incomplete anyway.

IE's CSS expressions is another one. Various url and paint expressions. Semicolons was not the only concern, but also various complex parsing rules.

I agree that this is bad because it relies on an implementation detail of React (HTML serialization instead of element.style.foo =). Hopefully one day, the slowest browsers will be fast enough to use that technique instead.

As for javascript: in href. Perhaps that should be protected against but it is also arguable that would be React overreaching. I do agree that the cost of accidentally doing this is very high which is why we've added additional protection against innerHTML.

At the end of the day, we can't protect against everything crazy in the DOM. We target it to interop with the environment that is there. We can put higher demands on new abstractions such as React Native.

It is also highly recommended that you build your own set of high level components instead of targeting HTML directly. This is what we do at Facebook because "semantic HTML" is not enough.

@sebmarkbage

This comment has been minimized.

Show comment
Hide comment
@sebmarkbage

sebmarkbage Apr 7, 2015

Member

@syranide I think this is a valuable conversation to have in this thread since this is a generic catch-all issue and not necessarily only related to one issue alone (see the title).

I don't really think that there is anything special about primitive values. A string is only one kind of data structure and we'd want to support many different ones. For higher level components you might have objects that represent the same kind of values. Even strings could be reimplemented with a different data structure in something like immutable.js. That may also be JSON serializable.

The key issue here is the union type. You have a set that accepts one of multiple types and you want to make sure that the user explicitly decides which one they put there. I think that ultimately that become prohibitively verbose. That is why we have union types (or overloads for that matter).

FWIW, we have a lot of components at FB that accept both strings or rich text. We need a convenient way to pass rich text through out the system too. It should also be possible to build it up in one place, componentize and use it in another. Since these rich text elements could include animation etc. it might make sense to allow them to be stateful too.

In that scenario, you would want the underlying component rendering of a label to accept a subset of components for its rendering. That's what we have a type system for (except propTypes is a dev-only system which doesn't protect against this kind of attack).

Member

sebmarkbage commented Apr 7, 2015

@syranide I think this is a valuable conversation to have in this thread since this is a generic catch-all issue and not necessarily only related to one issue alone (see the title).

I don't really think that there is anything special about primitive values. A string is only one kind of data structure and we'd want to support many different ones. For higher level components you might have objects that represent the same kind of values. Even strings could be reimplemented with a different data structure in something like immutable.js. That may also be JSON serializable.

The key issue here is the union type. You have a set that accepts one of multiple types and you want to make sure that the user explicitly decides which one they put there. I think that ultimately that become prohibitively verbose. That is why we have union types (or overloads for that matter).

FWIW, we have a lot of components at FB that accept both strings or rich text. We need a convenient way to pass rich text through out the system too. It should also be possible to build it up in one place, componentize and use it in another. Since these rich text elements could include animation etc. it might make sense to allow them to be stateful too.

In that scenario, you would want the underlying component rendering of a label to accept a subset of components for its rendering. That's what we have a type system for (except propTypes is a dev-only system which doesn't protect against this kind of attack).

@syranide

This comment has been minimized.

Show comment
Hide comment
@syranide

syranide Apr 7, 2015

Contributor

@sebmarkbage Perhaps I wasn't entirely clear (I think). It seems to me that "everything is rich-text" model of HTML is flawed; it's great for documents but it's immensely fragile for user interfaces. Documents can be rendered within a user interface, but you don't render a user interface within a document, so it doesn't make sense to me why we should be using the document model of HTML as the fundamental design language if it weren't for it being the current render target. I would argue that any mature React user interface should do its best to avoid the HTML style inheritance/document model. I.e.

<div>
  <span style={...}>Click that</span>
  <button style={...}>Click me</button>
</div>

...and not...

<div style={...}>
  Click that
  <button style={...}>Click me</button>
</div>

In a more simpler more traditional setup I would write that as:

<panel>
  <label value="Click that" style={...} />
  <button label="Click me" style={...} />
</panel>

...or if you want a more generic button implementation (whatever is in children is the label)...

<panel>
  <label value="Click that" style={...} />
  <button>
    <label value="Click me" style={...} />
  </button>
</panel>

There is no longer a universal understanding of how to render a string. Components are inherently isolated and reusable from a style perspective, primitives cannot be rendered so the this XSS issue goes out the window and even white-space rules become irrelevant. There's an inherent technical beauty in that if you will that's as unambigious as it gets (for whatever it's worth), it could be a very formal no nonsense syntax that would be hard to argue with... it could even be marketed as a new general purpose primitive type that's not limited to user interfaces (ahem, we're basically talking about an XML subset now...).

That should be our starting point in my opinion. The only thing that's stands out is rich-text components which usually finds themselves useful in most user interfaces, so they should definitely not be forgotten. But it doesn't necessarily mean that rich-text should have a dedicated syntax or perhaps not the current syntax.

React is currently targeting HTML which makes this a daunting aspect, but that's only because HTML provides all these conveniences, if we would have targeted anything else I'm quite sure React/JSX wouldn't have walked down this path without significant consideration. I'm not sure what the solution is, but HTML sets itself squarely apart from all other user interface targets and doesn't really prove itself exceptionally competent so I think it's only sane to question whether HTML is really an ideal role model for React/JSX user interfaces.

Contributor

syranide commented Apr 7, 2015

@sebmarkbage Perhaps I wasn't entirely clear (I think). It seems to me that "everything is rich-text" model of HTML is flawed; it's great for documents but it's immensely fragile for user interfaces. Documents can be rendered within a user interface, but you don't render a user interface within a document, so it doesn't make sense to me why we should be using the document model of HTML as the fundamental design language if it weren't for it being the current render target. I would argue that any mature React user interface should do its best to avoid the HTML style inheritance/document model. I.e.

<div>
  <span style={...}>Click that</span>
  <button style={...}>Click me</button>
</div>

...and not...

<div style={...}>
  Click that
  <button style={...}>Click me</button>
</div>

In a more simpler more traditional setup I would write that as:

<panel>
  <label value="Click that" style={...} />
  <button label="Click me" style={...} />
</panel>

...or if you want a more generic button implementation (whatever is in children is the label)...

<panel>
  <label value="Click that" style={...} />
  <button>
    <label value="Click me" style={...} />
  </button>
</panel>

There is no longer a universal understanding of how to render a string. Components are inherently isolated and reusable from a style perspective, primitives cannot be rendered so the this XSS issue goes out the window and even white-space rules become irrelevant. There's an inherent technical beauty in that if you will that's as unambigious as it gets (for whatever it's worth), it could be a very formal no nonsense syntax that would be hard to argue with... it could even be marketed as a new general purpose primitive type that's not limited to user interfaces (ahem, we're basically talking about an XML subset now...).

That should be our starting point in my opinion. The only thing that's stands out is rich-text components which usually finds themselves useful in most user interfaces, so they should definitely not be forgotten. But it doesn't necessarily mean that rich-text should have a dedicated syntax or perhaps not the current syntax.

React is currently targeting HTML which makes this a daunting aspect, but that's only because HTML provides all these conveniences, if we would have targeted anything else I'm quite sure React/JSX wouldn't have walked down this path without significant consideration. I'm not sure what the solution is, but HTML sets itself squarely apart from all other user interface targets and doesn't really prove itself exceptionally competent so I think it's only sane to question whether HTML is really an ideal role model for React/JSX user interfaces.

@sophiebits

This comment has been minimized.

Show comment
Hide comment
@sophiebits

sophiebits Apr 7, 2015

Member

React Native's <Text> almost works like that, though you can have both strings and other Text elements as children of a Text. Other components don't accept text children like that.

Member

sophiebits commented Apr 7, 2015

React Native's <Text> almost works like that, though you can have both strings and other Text elements as children of a Text. Other components don't accept text children like that.

@sebmarkbage

This comment has been minimized.

Show comment
Hide comment
@sebmarkbage

sebmarkbage Apr 9, 2015

Member

@syranide Mid/Long term I think you're right that this is the kind of architecture that React and its users should be moving towards. However it is not an easy shift to do immediately.

I don't see that the trusted source solution needs to block any important features that we would've gotten, by shifting to the newer architecture. So it seems like a safe short term solution that also doesn't block anything we were planning to do anyway.

Member

sebmarkbage commented Apr 9, 2015

@syranide Mid/Long term I think you're right that this is the kind of architecture that React and its users should be moving towards. However it is not an easy shift to do immediately.

I don't see that the trusted source solution needs to block any important features that we would've gotten, by shifting to the newer architecture. So it seems like a safe short term solution that also doesn't block anything we were planning to do anyway.

@syranide

This comment has been minimized.

Show comment
Hide comment
@syranide

syranide Apr 9, 2015

Contributor

@sebmarkbage Agree and agree.

Contributor

syranide commented Apr 9, 2015

@sebmarkbage Agree and agree.

@sebmarkbage

This comment has been minimized.

Show comment
Hide comment
@sebmarkbage

sebmarkbage Apr 9, 2015

Member

I've been thinking about the priorities here. As I see it there are three threats:

A) XSS through dangerouslySetInnerHTML, href="javascript:..." or script tags etc.
B) Clickjacking by <a href="..." />, <form /> or similar.
C) Other types of UI redressing scenarios that doesn't involve interactivity.

If we agree that React is the secondary layer to an already broken issue, and the real solution always is to fix the JSON pass-through hole. Then it follows that React is not responsible for providing the permanent solution. React is only responsible for mitigating the impact, IF this hole occurs.

Full XSS (Threat A) is clearly a too high of a cost of this bug and even if you fix it quickly the harm is already done.

Getting an embarrassing redressing message or phishing attempt through a static message (Threat C) might not be as bad. Even if they do happen, there's no guarantee that there is any significant harm done. They can then be fixed by addressing the pass-through hole, before any real harm is done.

If we agree that we only need to protect against A and possibly B, then we have a lot more possibilities to limit the scope of the problem to a few sensitive attributes.

Another thing that I would like to start thinking about is.... What if we ran React in a worker? What if we allowed a cross-origin script to communicate through the worker and render arbitrary ReactElements (which could be targeting higher level abstractions or real DOM nodes)? What kind of things would we need to protect against then?

The point of that scenario is explicitly to allow arbitrary UI elements in a subtree of the page (probably with overflow: hidden) which means explicitly allowing for Threat C and probably even B.

Member

sebmarkbage commented Apr 9, 2015

I've been thinking about the priorities here. As I see it there are three threats:

A) XSS through dangerouslySetInnerHTML, href="javascript:..." or script tags etc.
B) Clickjacking by <a href="..." />, <form /> or similar.
C) Other types of UI redressing scenarios that doesn't involve interactivity.

If we agree that React is the secondary layer to an already broken issue, and the real solution always is to fix the JSON pass-through hole. Then it follows that React is not responsible for providing the permanent solution. React is only responsible for mitigating the impact, IF this hole occurs.

Full XSS (Threat A) is clearly a too high of a cost of this bug and even if you fix it quickly the harm is already done.

Getting an embarrassing redressing message or phishing attempt through a static message (Threat C) might not be as bad. Even if they do happen, there's no guarantee that there is any significant harm done. They can then be fixed by addressing the pass-through hole, before any real harm is done.

If we agree that we only need to protect against A and possibly B, then we have a lot more possibilities to limit the scope of the problem to a few sensitive attributes.

Another thing that I would like to start thinking about is.... What if we ran React in a worker? What if we allowed a cross-origin script to communicate through the worker and render arbitrary ReactElements (which could be targeting higher level abstractions or real DOM nodes)? What kind of things would we need to protect against then?

The point of that scenario is explicitly to allow arbitrary UI elements in a subtree of the page (probably with overflow: hidden) which means explicitly allowing for Threat C and probably even B.

@syranide

This comment has been minimized.

Show comment
Hide comment
@syranide

syranide Apr 9, 2015

Contributor

@sebmarkbage As far as I'm concerned those are all related to HTML and not React (except for this particular issue). So while they should be addressed, doing so without implementing a much safer abstraction of HTML (which it seems we won't) doesn't seem worthwhile and I'm sure there would still be many tricks that ReactDOM might simply not be able to guard against. I think publicly documenting and explaining all the known pitfalls would go a long long way. It seems to me that the biggest issue right now is the general lack of awareness, not mistakes.

E.g. what are all the dangers of a user supplied href? javascript: is one, but it applies to 3rd-party protocols too, including urls without a protocol/domain part, so you should always block everything but say http:// and https://. You should also beware of links that point to the current domain which could possibly perform operations on behalf of that user. React simply cannot guard against this, it needs to be documented and possibly be made available as ready-made helpers.

It's also important to document that style={{backgroundColor: userColor}} is also exploitable and allows injection of arbitrary CSS. It won't stop the problem, but just having it publicly documented is a big step (seeing as it's unlikely that we can ever out-right prevent it when targeting HTML).

Contributor

syranide commented Apr 9, 2015

@sebmarkbage As far as I'm concerned those are all related to HTML and not React (except for this particular issue). So while they should be addressed, doing so without implementing a much safer abstraction of HTML (which it seems we won't) doesn't seem worthwhile and I'm sure there would still be many tricks that ReactDOM might simply not be able to guard against. I think publicly documenting and explaining all the known pitfalls would go a long long way. It seems to me that the biggest issue right now is the general lack of awareness, not mistakes.

E.g. what are all the dangers of a user supplied href? javascript: is one, but it applies to 3rd-party protocols too, including urls without a protocol/domain part, so you should always block everything but say http:// and https://. You should also beware of links that point to the current domain which could possibly perform operations on behalf of that user. React simply cannot guard against this, it needs to be documented and possibly be made available as ready-made helpers.

It's also important to document that style={{backgroundColor: userColor}} is also exploitable and allows injection of arbitrary CSS. It won't stop the problem, but just having it publicly documented is a big step (seeing as it's unlikely that we can ever out-right prevent it when targeting HTML).

sebmarkbage added a commit to sebmarkbage/react that referenced this issue Sep 10, 2015

Use a Symbol to tag every ReactElement
Fixes #3473

I tag each React element with `$$typeof: Symbol.for('react.element')`. We need
this to be able to safely distinguish these from plain objects that might have
come from user provided JSON.

The idiomatic JavaScript way of tagging an object is for it to inherent some
prototype and then use `instanceof` to test for it.

However, this has limitations since it doesn't work with value types which
require `typeof` checks. They also don't work across realms. Which is why there
are alternative tag checks like `Array.isArray` or the `toStringTag`. Another
problem is that different instances of React that might have been created not knowing about eachother. npm tends to make this kind of problem occur a lot.

Additionally, it is our hope that ReactElement will one day be specified in
terms of a "Value Type" style record instead of a plain Object.

This Value Types proposal by @nikomatsakis is currently on hold but does satisfy all these requirements:

https://github.com/nikomatsakis/typed-objects-explainer/blob/master/valuetypes.md#the-typeof-operator

Additionally, there is already a system for coordinating tags across module
systems and even realms in ES6. Namely using `Symbol.for`.

Currently these objects are not able to transfer between Workers but there is
nothing preventing that from being possible in the future. You could imagine
even `Symbol.for` working across Worker boundaries. You could also build a
system that coordinates Symbols and Value Types from server to client or through
serialized forms. That's beyond the scope of React itself, and if it was built
it seems like it would belong with the `Symbol` system. A system could override
the `Symbol.for('react.element')` to return a plain yet
cryptographically random or unique number. That would allow ReactElements to
pass through JSON without risking the XSS issue.

The fallback solution is a plain well-known number. This makes it unsafe with
regard to the XSS issue described in #3473. We could have used a much more
convoluted solution to protect against JSON specifically but that would require
some kind of significant coordination, or change the check to do a
`typeof element.$$typeof === 'function'` check which would not make it unique to
React. It seems cleaner to just use a fixed number since the protection is just
a secondary layer anyway. I'm not sure if this is the right tradeoff.

In short, if you want the XSS protection, use a proper Symbol polyfill.

Finally, the reason for calling it `$$typeof` is to avoid confusion with `.type`
and the use case is to add a tag that the `typeof` operator would refer to.
I would use `@@typeof` but that seems to deopt in JSC. I also don't use
`__typeof` because this is more than a framework private. It should really be
part of the polyfilling layer.

sebmarkbage added a commit to sebmarkbage/react that referenced this issue Sep 10, 2015

Use a Symbol to tag every ReactElement
Fixes #3473

I tag each React element with `$$typeof: Symbol.for('react.element')`. We need
this to be able to safely distinguish these from plain objects that might have
come from user provided JSON.

The idiomatic JavaScript way of tagging an object is for it to inherent some
prototype and then use `instanceof` to test for it.

However, this has limitations since it doesn't work with value types which
require `typeof` checks. They also don't work across realms. Which is why there
are alternative tag checks like `Array.isArray` or the `toStringTag`. Another
problem is that different instances of React that might have been created not knowing about eachother. npm tends to make this kind of problem occur a lot.

Additionally, it is our hope that ReactElement will one day be specified in
terms of a "Value Type" style record instead of a plain Object.

This Value Types proposal by @nikomatsakis is currently on hold but does satisfy all these requirements:

https://github.com/nikomatsakis/typed-objects-explainer/blob/master/valuetypes.md#the-typeof-operator

Additionally, there is already a system for coordinating tags across module
systems and even realms in ES6. Namely using `Symbol.for`.

Currently these objects are not able to transfer between Workers but there is
nothing preventing that from being possible in the future. You could imagine
even `Symbol.for` working across Worker boundaries. You could also build a
system that coordinates Symbols and Value Types from server to client or through
serialized forms. That's beyond the scope of React itself, and if it was built
it seems like it would belong with the `Symbol` system. A system could override
the `Symbol.for('react.element')` to return a plain yet
cryptographically random or unique number. That would allow ReactElements to
pass through JSON without risking the XSS issue.

The fallback solution is a plain well-known number. This makes it unsafe with
regard to the XSS issue described in #3473. We could have used a much more
convoluted solution to protect against JSON specifically but that would require
some kind of significant coordination, or change the check to do a
`typeof element.$$typeof === 'function'` check which would not make it unique to
React. It seems cleaner to just use a fixed number since the protection is just
a secondary layer anyway. I'm not sure if this is the right tradeoff.

In short, if you want the XSS protection, use a proper Symbol polyfill.

Finally, the reason for calling it `$$typeof` is to avoid confusion with `.type`
and the use case is to add a tag that the `typeof` operator would refer to.
I would use `@@typeof` but that seems to deopt in JSC. I also don't use
`__typeof` because this is more than a framework private. It should really be
part of the polyfilling layer.

sebmarkbage added a commit to sebmarkbage/react that referenced this issue Sep 10, 2015

Use a Symbol to tag every ReactElement
Fixes #3473

I tag each React element with `$$typeof: Symbol.for('react.element')`. We need
this to be able to safely distinguish these from plain objects that might have
come from user provided JSON.

The idiomatic JavaScript way of tagging an object is for it to inherent some
prototype and then use `instanceof` to test for it.

However, this has limitations since it doesn't work with value types which
require `typeof` checks. They also don't work across realms. Which is why there
are alternative tag checks like `Array.isArray` or the `toStringTag`. Another
problem is that different instances of React that might have been created not knowing about eachother. npm tends to make this kind of problem occur a lot.

Additionally, it is our hope that ReactElement will one day be specified in
terms of a "Value Type" style record instead of a plain Object.

This Value Types proposal by @nikomatsakis is currently on hold but does satisfy all these requirements:

https://github.com/nikomatsakis/typed-objects-explainer/blob/master/valuetypes.md#the-typeof-operator

Additionally, there is already a system for coordinating tags across module
systems and even realms in ES6. Namely using `Symbol.for`.

Currently these objects are not able to transfer between Workers but there is
nothing preventing that from being possible in the future. You could imagine
even `Symbol.for` working across Worker boundaries. You could also build a
system that coordinates Symbols and Value Types from server to client or through
serialized forms. That's beyond the scope of React itself, and if it was built
it seems like it would belong with the `Symbol` system. A system could override
the `Symbol.for('react.element')` to return a plain yet
cryptographically random or unique number. That would allow ReactElements to
pass through JSON without risking the XSS issue.

The fallback solution is a plain well-known number. This makes it unsafe with
regard to the XSS issue described in #3473. We could have used a much more
convoluted solution to protect against JSON specifically but that would require
some kind of significant coordination, or change the check to do a
`typeof element.$$typeof === 'function'` check which would not make it unique to
React. It seems cleaner to just use a fixed number since the protection is just
a secondary layer anyway. I'm not sure if this is the right tradeoff.

In short, if you want the XSS protection, use a proper Symbol polyfill.

Finally, the reason for calling it `$$typeof` is to avoid confusion with `.type`
and the use case is to add a tag that the `typeof` operator would refer to.
I would use `@@typeof` but that seems to deopt in JSC. I also don't use
`__typeof` because this is more than a framework private. It should really be
part of the polyfilling layer.

@sebmarkbage sebmarkbage closed this in #4832 Sep 10, 2015

@iddan

This comment has been minimized.

Show comment
Hide comment
@iddan

iddan Apr 29, 2017

How about taking another approach here? Instead of wrapping createElement wrap component functions and classes. that way the output of these functions will be plain objects and in big trees of elements it will significantly reduce the function calls.

const MyComponent = (props) => <div />;

to

const MyComponent = component((props) => <div />);

component() will handle adding signatures for the generated objects.

iddan commented Apr 29, 2017

How about taking another approach here? Instead of wrapping createElement wrap component functions and classes. that way the output of these functions will be plain objects and in big trees of elements it will significantly reduce the function calls.

const MyComponent = (props) => <div />;

to

const MyComponent = component((props) => <div />);

component() will handle adding signatures for the generated objects.

@sophiebits

This comment has been minimized.

Show comment
Hide comment
@sophiebits

sophiebits May 1, 2017

Member

The problem is that wrapping after render returns doesn't help you distinguish between <div>{good}</div> and <div>{evil}</div>.

Member

sophiebits commented May 1, 2017

The problem is that wrapping after render returns doesn't help you distinguish between <div>{good}</div> and <div>{evil}</div>.

@iddan

This comment has been minimized.

Show comment
Hide comment
@iddan

iddan May 1, 2017

When is this needed @spicyj ?

iddan commented May 1, 2017

When is this needed @spicyj ?

@sophiebits

This comment has been minimized.

Show comment
Hide comment
@sophiebits

sophiebits May 1, 2017

Member

I think the original post explains the issue in detail. If this is problematic for you for some reason, please open a new issue.

Member

sophiebits commented May 1, 2017

I think the original post explains the issue in detail. If this is problematic for you for some reason, please open a new issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment