Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP

Loading…

textarea helper swallows leading newline #393

Closed
Fjan opened this Issue · 46 comments
@Fjan

Browsers swallow the first newline after a textarea tag, as per the HTML spec. The text_area helpers in Rails do not emit a newline after the textarea tag and this has the unintended side effect that if the contents of the tag happens to start with a new line it will get eaten by the browser.

It's easy to reproduce by inserting a new line or two in a text area field in any rails app and then updating it (the newline only gets eaten on the second trip to the browser, so updating is necessary). It's not really possible to produce a failing unit test as this requires an actual browser to do the swallowing.

(I looked through the code for a fix but the helper seems to delegate the generating of the actual tag to a generic "content tag" function. It would be ugly to make that content tag function behave differently for just one tag, so it seems that making the text_area helpers emit their own HTML would be the most straightforward option)

@vandrijevik

Can you please point where the HTML spec states that a newline after <textarea> should be silently ignored? Neither http://dev.w3.org/html5/markup/textarea.html nor http://dev.w3.org/html5/markup/syntax.html#replaceable-character-data mention anything on this topic.

@Fjan

Hmm, you are right the documentation is not clear here http://www.w3.org/TR/html4/interact/forms.html#h-17.7 but the example given shows that the newline gets swallowed. I tested this behaviour in all browsers I have here and they do consistently eat the newline after the text area (a quick Google will confirm this as well).

As an aside, the textarea helper in ASP.NET does emit an additional new line, perhaps this is why all browser vendors choose to swallow it. This blog pots has a nice illustration: http://haacked.com/archive/2008/11/18/new-line-quirk-with-html-textarea.aspx

@matiaskorhonen

From the HTML 4.01 specification appendices:

SGML (see [ISO8879], section 7.6.1) specifies that a line break immediately
following a start tag must be ignored, as must a line break immediately
before an end tag. This applies to all HTML elements without exception.

Unfortunately I can't find the same text in the HTML5 spec, but it would seem that this is expected browser behaviour for the foreseeable future. Especially as the HTML5 spec mostly formalizes existing browser behaviour.

@dmathieu
Collaborator

This applies to all HTML elements without exception.

Based on that, it shouldn't only apply to textareas, but all tags.

So instead of <textarea></textarea>, we should have <textarea>\n</textarea> right ?

@Fjan

The textarea tag is the only one that has the problem where a leading \n can be swallowed, the other HTML tags are not sensitive to white space. So it would seem superfluous to add a \n inside every tag.

@dmathieu dmathieu referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
@dmathieu dmathieu referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
@Fjan

Thanks for fixing that. I looked at the fix, and in this case it might actually be a bit more elegant to simply prefix the content with a newline. That way we don't have to put that extra parameter in for the 99% of the cases where it isn't needed. Good for performance too:

content_tag("textarea", "\n#{html_escape(options.delete('value') || value_before_type_cast(object))}", options)
@Fjan

oh, wait, the html_escape should go on the outside, I tested it and it works fine, so:

content_tag("textarea", html_escape("\n#{options.delete('value') || value_before_type_cast(object)}"), options)

@dmathieu
Collaborator

I think the opinion of a core team member would be better on this. @josevalim what do you think ?

@benatkin

Good call, @dmathieu. Perhaps we could also see if and how another popular form plugin handles this. I'm taking a look at https://github.com/justinfrench/formtastic

@parndt

This never actually got closed?

@Fjan

Yep, this bug has been in there since rails 1.0.
The suggested patch is a bit unwieldy though, since it can also be fixed by simply adding a single newline in the right place instead of adding an extra parameter to every content tag, perhaps that's why it didn't get picked up.

@jeremy
Owner

Crazy. @Fjan, @mathieu, got a simpler patch without the extra param? If this is broken everywhere, it should be default.

@codykrieger

@jeremy There's a simpler patch - better?

@Fjan

@codykrieger Thanks, but this adds a hash lookup to every generated tag. Why not use the 1 line patch I suggested above? (excuse me for not doing it myself, I don't have git here)

@codykrieger

I actually get a bunch of failures in the tests when using your patch:

Finished tests in 80.100596s, 42.4341 tests/s, 192.9074 assertions/s.

  1) Failure:
test_text_area_with_escapes(FormHelperTest) [/Users/codykrieger/projects/forks/rails/actionpack/test/template/form_helper_test.rb:445]:
<"<textarea cols=\"40\" id=\"post_body\" name=\"post[body]\" rows=\"20\">\nBack to &lt;i&gt;the&lt;/i&gt; hill and over it again!</textarea>"> expected to be == to
<"<textarea cols=\"40\" id=\"post_body\" name=\"post[body]\" rows=\"20\">\nBack to &amp;lt;i&amp;gt;the&amp;lt;/i&amp;gt; hill and over it again!</textarea>">.

  2) Failure:
test_text_area_with_html_entities(FormHelperTest) [/Users/codykrieger/projects/forks/rails/actionpack/test/template/form_helper_test.rb:460]:
<"<textarea cols=\"40\" id=\"post_body\" name=\"post[body]\" rows=\"20\">\nThe HTML Entity for &amp; is &amp;amp;</textarea>"> expected to be == to
<"<textarea cols=\"40\" id=\"post_body\" name=\"post[body]\" rows=\"20\">\nThe HTML Entity for &amp;amp; is &amp;amp;amp;</textarea>">.

  3) Failure:
test_content_tag_with_newline(TagHelperTest) [/Users/codykrieger/projects/forks/rails/actionpack/test/template/tag_helper_test.rb:92]:
<"<textarea>\nlimelight</textarea>"> expected to be == to
<"<textarea>limelight</textarea>">.

3399 tests, 15452 assertions, 3 failures, 0 errors, 0 skips
rake aborted!

Looks like things are being escaped oddly.

UPDATE: Turning your patch into content_tag("textarea", "\n#{html_escape(options.delete('value') || value_before_type_cast(object))}".html_safe, options) fixed the first two failures.

@Fjan

Yeah, the corrected one was actually what I suggested on July 1 (you probably picked the earlier one from an (untested) comment in response to the initial test).

@rafaelfranca rafaelfranca referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
@rafaelfranca rafaelfranca referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
@rafaelfranca

I opened a new pull request with a simpler solution. Could you check if it is satisfactory

@Fjan

I haven't tried it but I think the html_safe on the first line is not correct (the ERB:Util term is already html_safe and the value_before_type_cast may not be html safe). Just removing it should fix it.

@rafaelfranca

@Fjan no. It is need. The ERB::Util expression is

ERB::Util.html_escape(options.delete('value') || value_before_type_cast(object))

This will output a html_safe string.

So if I call

"\n".html_safe + ERB::Util.html_escape(options.delete('value') || value_before_type_cast(object))

Is the same thing that

"\n#{ERB::Util.html_escape(options.delete('value') || value_before_type_cast(object))}".html_safe

But the result of this expression need to be a html safe string, or the content_tag will escape the html again.

@Fjan

My point is that value_before_type_cast(object) may contain unsafe characters so you create a potential HTML injection exploit by doing that.

@rafaelfranca

But it is escaped by ERB::Util.html_escape

@Fjan

Oh, ok, sorry. That seems kind of pointless though: escaping it and then marking it html safe since the content_tag does the same thing.

@rafaelfranca

I can change this line to:

"\n".html_safe + ERB::Util.html_escape(options.delete('value') || value_before_type_cast(object))

But this is the same thing that the current implementation.

If the result of this concatenation is not HTML safe the content_tag will escape the HTML again. So some tests will be broken.

Like these one:

  1) Failure:
test_text_area_with_escapes(FormHelperTest) [/Users/rafaelmfranca/Projects/github/rails/actionpack/test/template/form_helper_test.rb:485]:
--- expected
+++ actual
@@ -1,2 +1,2 @@
 <textarea cols="40" id="post_body" name="post[body]" rows="20">
-Back to &lt;i&gt;the&lt;/i&gt; hill and over it again!</textarea>
+Back to &amp;lt;i&amp;gt;the&amp;lt;/i&amp;gt; hill and over it again!</textarea>


  2) Failure:
test_text_area_with_html_entities(FormHelperTest) [/Users/rafaelmfranca/Projects/github/rails/actionpack/test/template/form_helper_test.rb:500]:
--- expected
+++ actual
@@ -1,2 +1,2 @@
 <textarea cols="40" id="post_body" name="post[body]" rows="20">
-The HTML Entity for &amp; is &amp;amp;</textarea>
+The HTML Entity for &amp;amp; is &amp;amp;amp;</textarea>
@rafaelfranca rafaelfranca referenced this issue from a commit in rafaelfranca/omg-rails
@rafaelfranca rafaelfranca Add a new line after the textarea opening tag.
Closes #393
a6074c3
@rafaelfranca rafaelfranca referenced this issue from a commit in rafaelfranca/omg-rails
@rafaelfranca rafaelfranca Add a new line after the textarea opening tag.
Closes #393
2b4e7a7
@rafaelfranca

Hey @Fjan. I refactored the code to leave the escape logic only with the content_tag. Now the code is cleaner.

Thanks to review it.

@Fjan

Ok, great. Thanks very much for taking the time to do this. I think this bug is probably the longest standing one in Rails, it has been there since Rails 1. Cheers.

@mhfs

@jeremy @josevalim I guess this one can be closed since @rafaelfranca's #5190/#5191 have been merged.

@rafaelfranca

What?! Github didn't close this?

@josevalim josevalim closed this
@jcoleman

@Fjan I'd like to go back to @codykrieger's original patch. I know it adds a hash lookup, but the "simpler" solution smacks of a hacky premature optimization.

Also, the currently implemented solution breaks textarea tags in all apps using HAML (which obviously has a huge user base in the Rails community.) The real problem here isn't just that HAML does something odd, it could easily break other extensions because the newline isn't part of the tag content (though the current patch treats it as such.) The newline is actually part of the tag not the content.

Anyone have more input on this? @josevalim: if I opened a pull request to go back to @codykrieger's patch would that be acceptable?

@Fjan

Adding a hash look up for EVERY single HTML tag generated by Rails instead of adding a few bytes to the code base... "premature" optimization would be the last qualification I would think off. Hacky I can live with, although you could argue that your preferred solution of "magically" adding a newline to the content is more of a hack than magically adding it to the tag where it belongs.

Would you care to elaborate on the HAML problem? Why would an extension break because of an additional new line? Does HAML output an incorrect text area tag as well?

@jcoleman

The reason I think it's hacky and premature optimization is that technically based on the HTML specs quoted, every tag should have a newline. Now we all know this isn't necessary, but it is true nonetheless. So the newline only being added to the textarea tag is to address browser implementations not being compatible with Rails not generating newlines in each tag.

When you magically add it to the content (rather than to the tag) it breaks extensions like HAML because the content has changed (rather than the tag.) In the specific HAML case, HAML automatically preserves whitespace in textarea tags (so it turns newlines into their respective HTML codes) since indentation obviously shouldn't happen inside textarea content (or rather, the browser eats indentation, but you want the content whitespace to be preserved.) Since you've added a newline to the content, HAML preserves that whitespace and browsers add an additional newline. So every time a form containing one is submitted and an object updated an saved, it gets an extra newline tacked onto the beginning. So the patch is actually causing the reverse problem of the problem it intends to prevent.

Extensions need to be able to access the actual content rather than modified content. And I doubt that a single hash lookup here (even though it's in every tag generated) really causes any measurable performance difference even on high usage apps. Hash lookups are performed in constant time and are in general extremely cheap, and there are lots of other parts of the view generation process (like view contexts for each partial) that cause exponentially higher overhead than a single hash lookup.

If you want to avoid the hash lookup, we could follow the spec and add newlines to every tag, but I don't think that would be appreciated...

@jcoleman

To be fair, just going with the original hash-lookup based patch won't automatically fix HAML, but it will making patching HAML far less dirty (and make it resilient to different versions of Rails with or without this fix.)

@jcoleman

By "far less dirty" I mean that to patch HAML with Rails the way it currently works, then HAML will have to either check Rails versions or magically divine whether the newline beginning the textarea tag content is actually part of the content or is just part of the tag. There's no way of knowing definitively the way it's currently implement.

@Fjan

Well, opinions can differ on what's a worthwhile optimization, but I would venture that an additional lookup on every HTML tag on every page generated by every rails application in the world causes a larger CO2 footprint than you as a person. (That's true of any optimization in a hot part of a frequently used code base, but this is a very hot part of a very large installed base.)

If HAML's behaviour is that closely tied to the behavior of the Rails content_tag it should just be tied to the correct Rails version by its gem spec. After the fix HAML can assume a newline is already added, older Rails users get an older HAML gem, no hacking or version detecting needed.

@jcoleman

I realize that this is a hot piece of code, but I'd suggest that the other parts of this method still dwarf the cost of an additional hash lookup. For example, html escaping and the tag_options calls are both rather expensive.

Unless there are benchmarks to back up the hash lookup being that consequential, I'd still say this is a premature optimization that shouldn't be the sole determinant of how this gets patched.

We shouldn't break correctness (the \n isn't actually part of the content--though the current patch acts like it is) just to micro-optimize.

@Fjan

I agree the rest of the code base is wasting CPU cycles left and right (and that makes me sad). To give you one example of the numbers we are talking about: Twitter 20B page views, 890 tags on the page (I did a quick grep/google). You now already have something with 15 zeroes, let's say a server can do a million a second, then you are still left with 10^9 s = 31 additional servers per year just for Twitter. I am by no means a tree hugger, but you shouldn't think too lightly about your impact on global energy consumption and your responsibilities as a programmer.

Anyway, I agree with you that the current implementation is a hack, we just differ on whether the added cost weighs up against making it slightly less hacky, especially since it's not really broken.

@jcoleman

Assuming 20 billion pages views a year, and 1000 tags on the page, then if a server can process a million per second you're looking at 232 additional server days, so under one extra server a year.

Interestingly enough, in a quick benchmark (in an entirely unoptimized Ruby on my MacBook, I find that Ruby can do more than 20 million hash lookups in a second.

So I'd say again, I don't think this is really an issue from a performance perspective.

Plus, the whole point of Ruby is to optimize for the user rather than for the machine. This case is very much optimizing for the machine rather than for the user.

I realize that in this case the only problem is for plugins, but this creates a maintenance trap in the Rails code too. If you're working in the content_tag method, you expect content to be just that, the content. Not some modified variable with additional non-content information tacked on. The method that logically should know about this (separation of concerns) is content_tag--not the method calling it to create a textarea tag.

@jcoleman jcoleman referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
@jcoleman jcoleman referenced this issue from a commit in jcoleman/rails
@jcoleman jcoleman Don't break Haml with textarea newline fix.
See issue #393, issue #4000, issue #5190, and issue #5191. Adds a newline after the textarea opening tag based on @codykrieger's original patch so that we don't cause regressions in Haml-using apps. The regression caused textarea tags to add newlines to the field unintentionally (each update/save added an extra newline.)
3350ba3
@jcoleman jcoleman referenced this issue from a commit in jcoleman/rails
@jcoleman jcoleman Don't break Haml with textarea newline fix.
See issue #393, issue #4000, issue #5190, and issue #5191. Adds a newline after the textarea opening tag based on @codykrieger's original patch so that we don't cause regressions in Haml-using apps. The regression caused textarea tags to add newlines to the field unintentionally (each update/save added an extra newline.)
10e729a
@jcoleman jcoleman referenced this issue from a commit in jcoleman/rails
@jcoleman jcoleman Don't break Haml with textarea newline fix.
See issue #393, issue #4000, issue #5190, and issue #5191. Adds a newline after the textarea opening tag based on @codykrieger's original patch so that we don't cause regressions in Haml-using apps. The regression caused textarea tags to add newlines to the field unintentionally (each update/save added an extra newline.)
5b7066d
@jcoleman jcoleman referenced this issue from a commit in jcoleman/rails
@jcoleman jcoleman Don't break Haml with textarea newline fix.
See issue #393, issue #4000, issue #5190, and issue #5191. Adds a newline after the textarea opening tag based on @codykrieger's original patch so that we don't cause regressions in Haml-using apps. The regression caused textarea tags to add newlines to the field unintentionally (each update/save added an extra newline.)
5edf553
@jcoleman jcoleman referenced this issue from a commit in jcoleman/rails
@jcoleman jcoleman Don't break Haml with textarea newline fix.
See issue #393, issue #4000, issue #5190, and issue #5191. Adds a newline after the textarea opening tag based on @codykrieger's original patch so that we don't cause regressions in Haml-using apps. The regression caused textarea tags to add newlines to the field unintentionally (each update/save added an extra newline.)
d15fad1
@jcoleman jcoleman referenced this issue from a commit in jcoleman/rails
@jcoleman jcoleman Don't break Haml with textarea newline fix.
See issue #393, issue #4000, issue #5190, and issue #5191. Adds a newline after the textarea opening tag based on @codykrieger's original patch so that we don't cause regressions in Haml-using apps. The regression caused textarea tags to add newlines to the field unintentionally (each update/save added an extra newline.)

Also fix 6 more tests that didn't yet have the newline expectation.
d843c80
@jcoleman jcoleman referenced this issue from a commit in jcoleman/rails
@jcoleman jcoleman Don't break Haml with textarea newline fix.
See issue #393, issue #4000, issue #5190, and issue #5191. Adds a newline after the textarea opening tag based on @codykrieger's original patch so that we don't cause regressions in Haml-using apps. The regression caused textarea tags to add newlines to the field unintentionally (each update/save added an extra newline.)

Also fix 6 more tests that didn't yet have the newline expectation.
a7a422e
@jcoleman jcoleman referenced this issue from a commit in jcoleman/rails
@jcoleman jcoleman Don't break Haml with textarea newline fix.
See issue #393, issue #4000, issue #5190, and issue #5191. Adds a newline after the textarea opening tag based on @codykrieger's original patch so that we don't cause regressions in Haml-using apps. The regression caused textarea tags to add newlines to the field unintentionally (each update/save added an extra newline.)

Also fix 6 more tests that didn't yet have the newline expectation.
1438e0e
@Fjan

I don't feel like doing the math again, but even taking your number and adding that there additional logic and parameters being passed and that there's a lot more apps out there than twitter (and that the 20B number was from 2009) I'm sure I can come up with a large number.

Perhaps this is not the best place to discuss this, but the principle of Ruby is to make it an optimal experience for the end-user. Matz himself writes mostly in Ansi C to give you that. Rails should follow that philosophy and do the same for its end users, not do something that's optimal for the HAML maintainers.

Edit: I just noticed your comment in another thread that points out the original fix may have a bug. That changes matters, of course

@jcoleman

I definitely understand your concerns re: speed. Rails 3 is already significantly slower than Rails 2 in some cases. I'd just much rather squeeze that performance out of things that can be optimized without making things harder to maintain in Rails or in other projects.

Thanks for all the push back though. In the end, all the discussion hopefully pushes us all into the best of both worlds.

@robinroestenburg robinroestenburg referenced this issue from a commit
@jcoleman jcoleman Don't break Haml with textarea newline fix.
See issue #393, issue #4000, issue #5190, and issue #5191. Adds a newline after the textarea opening tag based on @codykrieger's original patch so that we don't cause regressions in Haml-using apps. The regression caused textarea tags to add newlines to the field unintentionally (each update/save added an extra newline.)

Also fix 6 more tests that didn't yet have the newline expectation.
b50b3f4
@angelomichel

Who came up with this idea that some element should have a newline character in a textarea? It makes NO sense at all. Basic HTML always says < pre >< / pre > isn't < pre >[ new line here]< / pre > Why would a textarea be different?

Google Chrome users now have an issue thanks to this (if I google this problem, I see allot of frustrated users by enabling this!). They all get free bonus unwanted characters before text_area's.

PLEASE roll this back to normal state

Or is there any reason why anyone should/would want this character injected?

@jcoleman

@AngeloMichel Actually, the HTML spec came up with the idea. The character needs to be injected because browser (including most versions of Chrome) actually swallow a leading newline, so if you don't have one, you actually lose real information from the textarea (assuming the real content started with a newline.)

If you're using Haml, then the extra leading spaces/newlines actually being in the content is a known bug, and there is a fixed version out as well. Check the Haml Github repo issues tracker for more information.

@codykrieger

@AngeloMichel, see this comment: #393 (comment)

If Chrome isn't ignoring the first newline inside of textarea tags, it's broken.

@jcoleman

@codykrieger @AngeloMichel Chrome doesn't ignore the leading newline in some specific versions when the newline is encoded as an HTML entity. This is caused by Haml, but there is a fixed version of Haml to correct this behavior (the leading newline shouldn't be encoded.)

@Fjan

@codykrieger Actually, the SGML spec that preceded the HTML spec already had it.

@AngeloMichel The reasoning is probably that you can nicely format the HTML / SGML source by adding an extra newline after a tag. All other tags also allow an extra new line there but all other tags are insensitive to white space so that's why textarea needs a different treatment.

@es128

Perhaps this could be changed to only insert the newline if there is content going into the textarea.

Currently, when this ends up being inserted as the html entity &#x000A; in a textarea that would otherwise be empty, it breaks the placeholder attribute on Mobile Safari.

I haven't tested, but I suspect that Mobile Safari handles an actual newline, as opposed to the entity code, correctly.

@jcoleman

@es128 Are you using Haml? The html entity is not inserted by Rails. Rails inserts an actual newline. Haml decides to encode. If you look at the relevant bugs in the Haml repository on Github, you'll find that there are multiple workarounds and fixes available. I believe that latest version of Haml actually completely resolves this (though it's not the nicest of solutions.)

@es128

@jcoleman Yes, I'm using Haml. Thanks for the tip.

@mnrtks mnrtks referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
@thatha thatha referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.