Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

textarea helper swallows leading newline #393

Closed
Fjan opened this issue May 5, 2011 · 46 comments
Closed

textarea helper swallows leading newline #393

Fjan opened this issue May 5, 2011 · 46 comments

Comments

@Fjan
Copy link
Contributor

@Fjan Fjan commented May 5, 2011

Browsers swallow the first newline after a textarea tag, as per the HTML spec. The text_area helpers in Rails do not emit a newline after the textarea tag and this has the unintended side effect that if the contents of the tag happens to start with a new line it will get eaten by the browser.

It's easy to reproduce by inserting a new line or two in a text area field in any rails app and then updating it (the newline only gets eaten on the second trip to the browser, so updating is necessary). It's not really possible to produce a failing unit test as this requires an actual browser to do the swallowing.

(I looked through the code for a fix but the helper seems to delegate the generating of the actual tag to a generic "content tag" function. It would be ugly to make that content tag function behave differently for just one tag, so it seems that making the text_area helpers emit their own HTML would be the most straightforward option)

@vandrijevik
Copy link
Contributor

@vandrijevik vandrijevik commented May 5, 2011

Can you please point where the HTML spec states that a newline after <textarea> should be silently ignored? Neither http://dev.w3.org/html5/markup/textarea.html nor http://dev.w3.org/html5/markup/syntax.html#replaceable-character-data mention anything on this topic.

@Fjan
Copy link
Contributor Author

@Fjan Fjan commented May 5, 2011

Hmm, you are right the documentation is not clear here http://www.w3.org/TR/html4/interact/forms.html#h-17.7 but the example given shows that the newline gets swallowed. I tested this behaviour in all browsers I have here and they do consistently eat the newline after the text area (a quick Google will confirm this as well).

As an aside, the textarea helper in ASP.NET does emit an additional new line, perhaps this is why all browser vendors choose to swallow it. This blog pots has a nice illustration: http://haacked.com/archive/2008/11/18/new-line-quirk-with-html-textarea.aspx

@matiaskorhonen
Copy link
Contributor

@matiaskorhonen matiaskorhonen commented May 7, 2011

From the HTML 4.01 specification appendices:

SGML (see [ISO8879], section 7.6.1) specifies that a line break immediately
following a start tag must be ignored, as must a line break immediately
before an end tag. This applies to all HTML elements without exception.

Unfortunately I can't find the same text in the HTML5 spec, but it would seem that this is expected browser behaviour for the foreseeable future. Especially as the HTML5 spec mostly formalizes existing browser behaviour.

@dmathieu
Copy link
Contributor

@dmathieu dmathieu commented Jun 7, 2011

This applies to all HTML elements without exception.

Based on that, it shouldn't only apply to textareas, but all tags.

So instead of <textarea></textarea>, we should have <textarea>\n</textarea> right ?

@Fjan
Copy link
Contributor Author

@Fjan Fjan commented Jun 7, 2011

The textarea tag is the only one that has the problem where a leading \n can be swallowed, the other HTML tags are not sensitive to white space. So it would seem superfluous to add a \n inside every tag.

@Fjan
Copy link
Contributor Author

@Fjan Fjan commented Jun 30, 2011

Thanks for fixing that. I looked at the fix, and in this case it might actually be a bit more elegant to simply prefix the content with a newline. That way we don't have to put that extra parameter in for the 99% of the cases where it isn't needed. Good for performance too:

content_tag("textarea", "\n#{html_escape(options.delete('value') || value_before_type_cast(object))}", options)
@Fjan
Copy link
Contributor Author

@Fjan Fjan commented Jul 1, 2011

oh, wait, the html_escape should go on the outside, I tested it and it works fine, so:

content_tag("textarea", html_escape("\n#{options.delete('value') || value_before_type_cast(object)}"), options)

@dmathieu
Copy link
Contributor

@dmathieu dmathieu commented Jul 1, 2011

I think the opinion of a core team member would be better on this. @josevalim what do you think ?

@benatkin
Copy link

@benatkin benatkin commented Jul 1, 2011

Good call, @dmathieu. Perhaps we could also see if and how another popular form plugin handles this. I'm taking a look at https://github.com/justinfrench/formtastic

@parndt
Copy link
Contributor

@parndt parndt commented Sep 27, 2011

This never actually got closed?

@Fjan
Copy link
Contributor Author

@Fjan Fjan commented Sep 27, 2011

Yep, this bug has been in there since rails 1.0.
The suggested patch is a bit unwieldy though, since it can also be fixed by simply adding a single newline in the right place instead of adding an extra parameter to every content tag, perhaps that's why it didn't get picked up.

@jeremy
Copy link
Member

@jeremy jeremy commented Oct 9, 2011

Crazy. @Fjan, @mathieu, got a simpler patch without the extra param? If this is broken everywhere, it should be default.

@codykrieger
Copy link

@codykrieger codykrieger commented Dec 15, 2011

@jeremy There's a simpler patch - better?

@Fjan
Copy link
Contributor Author

@Fjan Fjan commented Dec 15, 2011

@codykrieger Thanks, but this adds a hash lookup to every generated tag. Why not use the 1 line patch I suggested above? (excuse me for not doing it myself, I don't have git here)

@codykrieger
Copy link

@codykrieger codykrieger commented Dec 15, 2011

I actually get a bunch of failures in the tests when using your patch:

Finished tests in 80.100596s, 42.4341 tests/s, 192.9074 assertions/s.

  1) Failure:
test_text_area_with_escapes(FormHelperTest) [/Users/codykrieger/projects/forks/rails/actionpack/test/template/form_helper_test.rb:445]:
<"<textarea cols=\"40\" id=\"post_body\" name=\"post[body]\" rows=\"20\">\nBack to &lt;i&gt;the&lt;/i&gt; hill and over it again!</textarea>"> expected to be == to
<"<textarea cols=\"40\" id=\"post_body\" name=\"post[body]\" rows=\"20\">\nBack to &amp;lt;i&amp;gt;the&amp;lt;/i&amp;gt; hill and over it again!</textarea>">.

  2) Failure:
test_text_area_with_html_entities(FormHelperTest) [/Users/codykrieger/projects/forks/rails/actionpack/test/template/form_helper_test.rb:460]:
<"<textarea cols=\"40\" id=\"post_body\" name=\"post[body]\" rows=\"20\">\nThe HTML Entity for &amp; is &amp;amp;</textarea>"> expected to be == to
<"<textarea cols=\"40\" id=\"post_body\" name=\"post[body]\" rows=\"20\">\nThe HTML Entity for &amp;amp; is &amp;amp;amp;</textarea>">.

  3) Failure:
test_content_tag_with_newline(TagHelperTest) [/Users/codykrieger/projects/forks/rails/actionpack/test/template/tag_helper_test.rb:92]:
<"<textarea>\nlimelight</textarea>"> expected to be == to
<"<textarea>limelight</textarea>">.

3399 tests, 15452 assertions, 3 failures, 0 errors, 0 skips
rake aborted!

Looks like things are being escaped oddly.

UPDATE: Turning your patch into content_tag("textarea", "\n#{html_escape(options.delete('value') || value_before_type_cast(object))}".html_safe, options) fixed the first two failures.

@Fjan
Copy link
Contributor Author

@Fjan Fjan commented Dec 15, 2011

Yeah, the corrected one was actually what I suggested on July 1 (you probably picked the earlier one from an (untested) comment in response to the initial test).

@rafaelfranca
Copy link
Member

@rafaelfranca rafaelfranca commented Feb 27, 2012

I opened a new pull request with a simpler solution. Could you check if it is satisfactory

@Fjan
Copy link
Contributor Author

@Fjan Fjan commented Feb 27, 2012

I haven't tried it but I think the html_safe on the first line is not correct (the ERB:Util term is already html_safe and the value_before_type_cast may not be html safe). Just removing it should fix it.

@rafaelfranca
Copy link
Member

@rafaelfranca rafaelfranca commented Feb 27, 2012

@Fjan no. It is need. The ERB::Util expression is

ERB::Util.html_escape(options.delete('value') || value_before_type_cast(object))

This will output a html_safe string.

So if I call

"\n".html_safe + ERB::Util.html_escape(options.delete('value') || value_before_type_cast(object))

Is the same thing that

"\n#{ERB::Util.html_escape(options.delete('value') || value_before_type_cast(object))}".html_safe

But the result of this expression need to be a html safe string, or the content_tag will escape the html again.

@Fjan
Copy link
Contributor Author

@Fjan Fjan commented Feb 27, 2012

My point is that value_before_type_cast(object) may contain unsafe characters so you create a potential HTML injection exploit by doing that.

@rafaelfranca
Copy link
Member

@rafaelfranca rafaelfranca commented Feb 27, 2012

But it is escaped by ERB::Util.html_escape

@Fjan
Copy link
Contributor Author

@Fjan Fjan commented Feb 27, 2012

Oh, ok, sorry. That seems kind of pointless though: escaping it and then marking it html safe since the content_tag does the same thing.

@rafaelfranca
Copy link
Member

@rafaelfranca rafaelfranca commented Feb 27, 2012

I can change this line to:

"\n".html_safe + ERB::Util.html_escape(options.delete('value') || value_before_type_cast(object))

But this is the same thing that the current implementation.

If the result of this concatenation is not HTML safe the content_tag will escape the HTML again. So some tests will be broken.

Like these one:

  1) Failure:
test_text_area_with_escapes(FormHelperTest) [/Users/rafaelmfranca/Projects/github/rails/actionpack/test/template/form_helper_test.rb:485]:
--- expected
+++ actual
@@ -1,2 +1,2 @@
 <textarea cols="40" id="post_body" name="post[body]" rows="20">
-Back to &lt;i&gt;the&lt;/i&gt; hill and over it again!</textarea>
+Back to &amp;lt;i&amp;gt;the&amp;lt;/i&amp;gt; hill and over it again!</textarea>


  2) Failure:
test_text_area_with_html_entities(FormHelperTest) [/Users/rafaelmfranca/Projects/github/rails/actionpack/test/template/form_helper_test.rb:500]:
--- expected
+++ actual
@@ -1,2 +1,2 @@
 <textarea cols="40" id="post_body" name="post[body]" rows="20">
-The HTML Entity for &amp; is &amp;amp;</textarea>
+The HTML Entity for &amp;amp; is &amp;amp;amp;</textarea>
rafaelfranca added a commit to rafaelfranca/omg-rails that referenced this issue Feb 27, 2012
Closes rails#393
rafaelfranca added a commit to rafaelfranca/omg-rails that referenced this issue Feb 27, 2012
Closes rails#393
@jcoleman
Copy link
Contributor

@jcoleman jcoleman commented Mar 27, 2012

To be fair, just going with the original hash-lookup based patch won't automatically fix HAML, but it will making patching HAML far less dirty (and make it resilient to different versions of Rails with or without this fix.)

@jcoleman
Copy link
Contributor

@jcoleman jcoleman commented Mar 27, 2012

By "far less dirty" I mean that to patch HAML with Rails the way it currently works, then HAML will have to either check Rails versions or magically divine whether the newline beginning the textarea tag content is actually part of the content or is just part of the tag. There's no way of knowing definitively the way it's currently implement.

@Fjan
Copy link
Contributor Author

@Fjan Fjan commented Mar 27, 2012

Well, opinions can differ on what's a worthwhile optimization, but I would venture that an additional lookup on every HTML tag on every page generated by every rails application in the world causes a larger CO2 footprint than you as a person. (That's true of any optimization in a hot part of a frequently used code base, but this is a very hot part of a very large installed base.)

If HAML's behaviour is that closely tied to the behavior of the Rails content_tag it should just be tied to the correct Rails version by its gem spec. After the fix HAML can assume a newline is already added, older Rails users get an older HAML gem, no hacking or version detecting needed.

@jcoleman
Copy link
Contributor

@jcoleman jcoleman commented Mar 27, 2012

I realize that this is a hot piece of code, but I'd suggest that the other parts of this method still dwarf the cost of an additional hash lookup. For example, html escaping and the tag_options calls are both rather expensive.

Unless there are benchmarks to back up the hash lookup being that consequential, I'd still say this is a premature optimization that shouldn't be the sole determinant of how this gets patched.

We shouldn't break correctness (the \n isn't actually part of the content--though the current patch acts like it is) just to micro-optimize.

@Fjan
Copy link
Contributor Author

@Fjan Fjan commented Mar 27, 2012

I agree the rest of the code base is wasting CPU cycles left and right (and that makes me sad). To give you one example of the numbers we are talking about: Twitter 20B page views, 890 tags on the page (I did a quick grep/google). You now already have something with 15 zeroes, let's say a server can do a million a second, then you are still left with 10^9 s = 31 additional servers per year just for Twitter. I am by no means a tree hugger, but you shouldn't think too lightly about your impact on global energy consumption and your responsibilities as a programmer.

Anyway, I agree with you that the current implementation is a hack, we just differ on whether the added cost weighs up against making it slightly less hacky, especially since it's not really broken.

@jcoleman
Copy link
Contributor

@jcoleman jcoleman commented Mar 27, 2012

Assuming 20 billion pages views a year, and 1000 tags on the page, then if a server can process a million per second you're looking at 232 additional server days, so under one extra server a year.

Interestingly enough, in a quick benchmark (in an entirely unoptimized Ruby on my MacBook, I find that Ruby can do more than 20 million hash lookups in a second.

So I'd say again, I don't think this is really an issue from a performance perspective.

Plus, the whole point of Ruby is to optimize for the user rather than for the machine. This case is very much optimizing for the machine rather than for the user.

I realize that in this case the only problem is for plugins, but this creates a maintenance trap in the Rails code too. If you're working in the content_tag method, you expect content to be just that, the content. Not some modified variable with additional non-content information tacked on. The method that logically should know about this (separation of concerns) is content_tag--not the method calling it to create a textarea tag.

jcoleman added a commit to jcoleman/rails that referenced this issue Mar 27, 2012
See issue rails#393, issue rails#4000, issue rails#5190, and issue rails#5191. Adds a newline after the textarea opening tag based on @codykrieger's original patch so that we don't cause regressions in Haml-using apps. The regression caused textarea tags to add newlines to the field unintentionally (each update/save added an extra newline.)

Also fix 6 more tests that didn't yet have the newline expectation.
jcoleman added a commit to jcoleman/rails that referenced this issue Mar 27, 2012
See issue rails#393, issue rails#4000, issue rails#5190, and issue rails#5191. Adds a newline after the textarea opening tag based on @codykrieger's original patch so that we don't cause regressions in Haml-using apps. The regression caused textarea tags to add newlines to the field unintentionally (each update/save added an extra newline.)

Also fix 6 more tests that didn't yet have the newline expectation.
@Fjan
Copy link
Contributor Author

@Fjan Fjan commented Mar 27, 2012

I don't feel like doing the math again, but even taking your number and adding that there additional logic and parameters being passed and that there's a lot more apps out there than twitter (and that the 20B number was from 2009) I'm sure I can come up with a large number.

Perhaps this is not the best place to discuss this, but the principle of Ruby is to make it an optimal experience for the end-user. Matz himself writes mostly in Ansi C to give you that. Rails should follow that philosophy and do the same for its end users, not do something that's optimal for the HAML maintainers.

Edit: I just noticed your comment in another thread that points out the original fix may have a bug. That changes matters, of course

@jcoleman
Copy link
Contributor

@jcoleman jcoleman commented Mar 27, 2012

I definitely understand your concerns re: speed. Rails 3 is already significantly slower than Rails 2 in some cases. I'd just much rather squeeze that performance out of things that can be optimized without making things harder to maintain in Rails or in other projects.

Thanks for all the push back though. In the end, all the discussion hopefully pushes us all into the best of both worlds.

@angelomichel
Copy link

@angelomichel angelomichel commented May 1, 2012

Who came up with this idea that some element should have a newline character in a textarea? It makes NO sense at all. Basic HTML always says < pre >< / pre > isn't < pre >[ new line here]< / pre > Why would a textarea be different?

Google Chrome users now have an issue thanks to this (if I google this problem, I see allot of frustrated users by enabling this!). They all get free bonus unwanted characters before text_area's.

PLEASE roll this back to normal state

Or is there any reason why anyone should/would want this character injected?

@jcoleman
Copy link
Contributor

@jcoleman jcoleman commented May 1, 2012

@angelomichel Actually, the HTML spec came up with the idea. The character needs to be injected because browser (including most versions of Chrome) actually swallow a leading newline, so if you don't have one, you actually lose real information from the textarea (assuming the real content started with a newline.)

If you're using Haml, then the extra leading spaces/newlines actually being in the content is a known bug, and there is a fixed version out as well. Check the Haml Github repo issues tracker for more information.

@codykrieger
Copy link

@codykrieger codykrieger commented May 1, 2012

@angelomichel, see this comment: #393 (comment)

If Chrome isn't ignoring the first newline inside of textarea tags, it's broken.

@jcoleman
Copy link
Contributor

@jcoleman jcoleman commented May 1, 2012

@codykrieger @angelomichel Chrome doesn't ignore the leading newline in some specific versions when the newline is encoded as an HTML entity. This is caused by Haml, but there is a fixed version of Haml to correct this behavior (the leading newline shouldn't be encoded.)

@Fjan
Copy link
Contributor Author

@Fjan Fjan commented May 1, 2012

@codykrieger Actually, the SGML spec that preceded the HTML spec already had it.

@angelomichel The reasoning is probably that you can nicely format the HTML / SGML source by adding an extra newline after a tag. All other tags also allow an extra new line there but all other tags are insensitive to white space so that's why textarea needs a different treatment.

@es128
Copy link

@es128 es128 commented May 15, 2012

Perhaps this could be changed to only insert the newline if there is content going into the textarea.

Currently, when this ends up being inserted as the html entity &#x000A; in a textarea that would otherwise be empty, it breaks the placeholder attribute on Mobile Safari.

I haven't tested, but I suspect that Mobile Safari handles an actual newline, as opposed to the entity code, correctly.

@jcoleman
Copy link
Contributor

@jcoleman jcoleman commented May 15, 2012

@es128 Are you using Haml? The html entity is not inserted by Rails. Rails inserts an actual newline. Haml decides to encode. If you look at the relevant bugs in the Haml repository on Github, you'll find that there are multiple workarounds and fixes available. I believe that latest version of Haml actually completely resolves this (though it's not the nicest of solutions.)

@es128
Copy link

@es128 es128 commented May 15, 2012

@jcoleman Yes, I'm using Haml. Thanks for the tip.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.