Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate textarea newline normalization #6647

Closed
annevk opened this issue May 4, 2021 · 11 comments · Fixed by #6697
Closed

Investigate textarea newline normalization #6647

annevk opened this issue May 4, 2021 · 11 comments · Fixed by #6697
Labels
interop Implementations are not interoperable with each other topic: forms

Comments

@annevk
Copy link
Member

annevk commented May 4, 2021

As discussed in #6624 not all browsers align with the standard here. If Firefox and Safari indeed do LF normalization on all platforms that might be reasonable to go with instead. Otherwise we should just add a test and file bugs.

(wrap=hard not being supported in Firefox is a known bug.)

cc @andreubotella

@annevk annevk added topic: forms interop Implementations are not interoperable with each other labels May 4, 2021
@annevk
Copy link
Member Author

annevk commented May 4, 2021

web-platform-tests/wpt#28798 has some tests for this. @saschanaz perhaps you can run them on Windows? If Firefox doesn't emit CRLF there either I'd be comfortable suggesting a specification change.

@saschanaz
Copy link
Member

The tests all pass on Windows 👍

@annevk
Copy link
Member Author

annevk commented May 4, 2021

@mfreed7 this is another subtle change around newline handling in forms. It seems that Firefox/Safari use LF rather than CRLF for https://html.spec.whatwg.org/#textarea-wrapping-transformation. Would you be okay with aligning Chrome with that as well? If so, I can create a PR for HTML. Andreu already wrote tests.

@mfreed7
Copy link
Contributor

mfreed7 commented May 6, 2021

Thanks for splitting this issue out here. Just so I'm clear, there are three normalization behaviors for <textarea>:

  1. The "raw" value, which isn't normalized, and isn't practically observable.
  2. The "API value" which is what gets read back by textarea.value. The spec says this should normalize to LF. In my quick testing, it appears all three browsers do normalize to LF. I assume we're not talking about changing this normalization here.
  3. The "value" which is used for form submission. The spec says this uses the textarea wrapping transformation which is supposed to normalize to CRLF. Chrome does use CRLF, while WebKit and Gecko use just LF. This is what we're talking about. There is also the "hard" behavior, which sounds like there are potentially bugs but no spec issue.

Please let me know if I'm off anywhere above.

Given that for the entirety (?) of the rest of form serialization, "normalized" means "normalized to CRLF", is there a particular reason (other than 2/3 engines doing it) to make <textarea> a special LF-only snowflake? I think I'd prefer to stick to CRLF here for two reasons:

  1. Consistency with the rest of form serialization.
  2. Compat. Yes, 2/3 browsers do LF now, but in contrast to normalizing filenames, changing the normalization for <textarea> has the potential to really break stuff.

Are there good reasons to use LF for <textarea> serialization?

@annevk
Copy link
Member Author

annevk commented May 7, 2021

When it comes to form submission serialization, textarea would be normalized like any other form control as multipart/form-data and co cannot tell the form control origin of the key-value pairs they get handed.

This is specifically about what should happen for new FormData() or the formdata event. It makes sense to me that those would return the API value. Conveniently, this would also allow for merging of value and API value and simplify the textarea data model.

@andreubotella
Copy link
Member

andreubotella commented May 7, 2021

I actually took a look at the Chromium code that does the <textarea> wrapping transformation, and I'm pretty sure that it's actually normalizing to LF and wrapping with LF line breaks (see lines 964 and 996; kNewlineCharacter == 0x000A). This seems to be code inherited from WebKit, and it was never changed to use CRLF. The reason why new FormData(form) shows CRLF in textarea values is because they're normalized in "construct the entry list".

@annevk
Copy link
Member Author

annevk commented May 7, 2021

(@andreubotella and I further discussed this on IRC and to keep things simple we'd only change CRLF to LF as part of this issue and leave data model cleanup to #6662 as that is a bit more involved due to wrap=hard.)

@annevk
Copy link
Member Author

annevk commented May 19, 2021

@mfreed7 heya, any new thoughts on this given our feedback above? I'd like to land this change together with the other two changes and then start the effort to get all the implementations aligned and point everyone to the latest tests.

@mfreed7
Copy link
Contributor

mfreed7 commented May 19, 2021

Hey, sorry for the delay here. And thanks for the clarifications above. With that new understanding (textarea still serializes as CRLF, and only readback from new FormData() or formdata event gives LF), I'm happy with the proposed changes here. Let's get this landed. 😄 I don't believe we have bugs filed for this yet (?) against the implementations, is that right?

@andreubotella
Copy link
Member

I don't believe we have bugs filed for this yet (?) against the implementations, is that right?

I opened a Chromium bug for #6287 back in the day. Should I mention #6624 and this issue in a comment, or should I open a new bug?

Gecko and WebKit aren't affected by this issue, but I did open bugs on them for #6624, since we kept the USV conversion even though they didn't implement it.

andreubotella pushed a commit to andreubotella/html that referenced this issue May 19, 2021
Complements whatwg#6287 and whatwg#6624.

Fixes whatwg#6647.

See also whatwg#6662 for further cleanup on the textarea data model.
@mfreed7
Copy link
Contributor

mfreed7 commented May 19, 2021

I opened a Chromium bug for #6287 back in the day. Should I mention #6624 and this issue in a comment, or should I open a new bug?

That's perfect! Let's keep all of this together in just the one bug. We'll likely end up tackling this as a single project, so it's easier to have one bug.

Gecko and WebKit aren't affected by this issue, but I did open bugs on them for #6624, since we kept the USV conversion even though they didn't implement it.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interop Implementations are not interoperable with each other topic: forms
Development

Successfully merging a pull request may close this issue.

4 participants