warning: regexp match /.../n against to UTF-8 string #87
Comments
Oops ... forgot to mention I'm using ruby-1.9.2-preview3. |
sounds like this is coming from inside rack, any evidence this is a Capybara problem? |
I'm also using Capybara and I'm having exactly the same problem, but I can't say whether it is a Capybara problem. I don't know what the cause is... I'm also getting these warnings: Non US-ASCII detected and no charset defined. These are probably related to mail delivery... |
God I hate Ruby 1.9 charset issues :( |
:) |
I find it very unlikely that this is a Capybara issue. Closing this. If someone can prove to me otherwise, please reopen! |
Yes, Rack::Utils.escape uses an ASCII regular expression on Unicode data; notice the '/n'.
But I have no idea where this gets called :| |
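To make the mismatch concrete, here is a minimal sketch of the two conditions that meet in that warning: a regexp compiled with the `/n` flag and a UTF-8 string that contains non-ASCII characters. The pattern below is an illustrative assumption, not the expression from rack/utils.rb, and the variable names are hypothetical.

```ruby
# Illustrative pattern only (an assumption); not copied from rack/utils.rb.
pattern = /[^a-zA-Z0-9_.-]/n

# The /n flag sets the NOENCODING option on the regexp.
binary_regexp = (pattern.options & Regexp::NOENCODING) != 0

# A UTF-8 string with one non-ASCII character (the snowman, U+2603).
text = "snowman \u2603"
utf8_input    = text.encoding == Encoding::UTF_8
has_non_ascii = !text.ascii_only?

# A pure-ASCII string does not have the non-ASCII bytes that cause trouble.
plain_ok = "snowman".ascii_only?
```

When all of `binary_regexp`, `utf8_input` and `has_non_ascii` hold, matching `pattern` against `text` is the situation the warning describes.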
It's because of the _snowman ... |
Here is an extremely bad patch for capybara/driver/rack_test_driver.rb:
    def process(method, path, attributes = {})
      return if path.gsub(/^#{request_path}/, '') =~ /^#/
      if Kernel.const_defined?(:Encoding)
        att = {}
        attributes.each_pair do |k,v|
          key = k.dup.force_encoding(Encoding::ASCII_8BIT)
          att[key] = v.dup.force_encoding(Encoding::ASCII_8BIT)
        end
        path = path.dup.force_encoding(Encoding::ASCII_8BIT)
      else
        att = attributes
      end
      send(method, path, att, env)
      follow_redirects!
    end
The same goes for the submit method. |
I don't know much about encodings, but that seems completely wrong to me. Unless the attributes are actually in ASCII_8BIT, then force encoding them there seems like a bad idea, and if they're not ASCII, then why cast them? Wouldn't that just lead to encoding errors? I think in effect we are removing the encoding information from the Strings for no gain. The problem lies deeper than this. I'm not sure how to fix it, but I'm pretty sure this isn't it. |
ASCII_8BIT is an alias for binary, and 'force_encoding' does exactly what you mentioned: it removes the encoding information. You cannot "encode" to binary, and as for speed, this has no overhead, apart from looking very ugly.
The fact is that frameworks such as cucumber don't guarantee binary strings, and they don't seem to belong to rack.
Why Capybara? Because rack works on binary data and is optimized for real-world applications, where parameters are binary, because you cannot effectively make assumptions about the encoding (what about UTF-16? Shift-JIS? They would crash on the /u parameter). Capybara has a driver that passes UTF-8 strings, which occur only because the parameters are crafted in the cucumber tests and never cast to binary. The warning is there because making things assume UTF-8 is fundamentally wrong and suggests a deeper problem, as you have described.
Summary: In my opinion Capybara should simulate the real world by passing unencoded strings (ASCII_8BIT, as Ruby names it), instead of getting rack to work around acceptance tests that provide non-binary strings and to grow encoding-handling functionality.
Also, note that US_ASCII is the only exception that is compatible with ASCII_8BIT ('binary' in normal people's terms), and Ruby handles automatic conversion as a special case. That's why it works without warning until UTF-8 characters are used (_snowman in this case).
I strongly recommend the following: http://yehudakatz.com/2010/05/17/encodings-unabridged/ You can read through ruby-core (especially Yui Naruse's comments) on why Ruby works this way and there is no "binary" encoding. |
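The points about `force_encoding` and US-ASCII compatibility can be checked with a short sketch. It assumes Ruby 1.9 or later; the variable names are hypothetical and the snowman string stands in for the test parameter.

```ruby
# UTF-8 string with one non-ASCII character (the snowman, U+2603).
utf8 = "snowman \u2603"

# force_encoding only relabels the string; the bytes stay the same.
binary = utf8.dup.force_encoding(Encoding::ASCII_8BIT)
same_bytes   = utf8.bytes == binary.bytes
utf8_chars   = utf8.length    # counted as characters
binary_chars = binary.length  # counted as raw bytes

# ASCII_8BIT and BINARY are two names for the same encoding object.
alias_check = Encoding::ASCII_8BIT.equal?(Encoding::BINARY)

# A 7-bit US-ASCII string can be combined with binary data...
ascii = "snowman".dup.force_encoding(Encoding::US_ASCII)
compatible = Encoding.compatible?(ascii, binary)

# ...but a UTF-8 string with non-ASCII characters cannot.
incompatible = Encoding.compatible?(utf8, binary)
```

Here `compatible` is the binary encoding and `incompatible` is `nil`, which matches the observation that nothing goes wrong until a non-ASCII character shows up.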
That was a case well made, I'm actually convinced. Can you provide a patch with tests? That'd be absolutely awesome! This whole encoding thing is giving me headaches (and yes, I've read through Yehuda's article, and a number of others), so I feel ill equipped to tackle this. |
Rushed patch with tests here: http://github.com/e2/capybara/commit/c8a55c4011c2f91943ee9aeebe3b81a4420c3ecf I don't like the solution and you can completely redo it without crediting me at all - and I will be completely happy. Just let me know once you have patched it so I can nuke my fork... Thanks :) P.S. Sorry about "bonus" indenting - I just didn't want to rebase+squash AGAIN... |
Good job e2! I actually thought that solution was pretty decent. I've merged it in, if someone wants to improve on it, they're welcome, but this looks good for now. |
Humm, I upgraded to this while using (ruby 1.9.2, Rails 3.0.0). While I don't get the warning, things are worse off. My tests fail with:
undefined local variable or method `node' for # (NameError)
./features/step_definitions/web_steps.rb:35:in `block (2 levels) in '
./features/step_definitions/web_steps.rb:14:in `with_scope'
./features/step_definitions/web_steps.rb:34:in `/^(?:|I )follow "([^\"]*)"(?: within "([^\"]*)")?$/'
When I try to use Spork, I get this:
/home/jwilson/.rvm/rubies/ruby-1.9.2-p0/lib/ruby/1.9.1/drb/drb.rb:573:in `load': too large packet 67654656 (DRb::DRbConnError)
from /home/jwilson/.rvm/rubies/ruby-1.9.2-p0/lib/ruby/1.9.1/drb/drb.rb:632:in `recv_reply'
from /home/jwilson/.rvm/rubies/ruby-1.9.2-p0/lib/ruby/1.9.1/drb/drb.rb:918:in `recv_reply'
from /home/jwilson/.rvm/rubies/ruby-1.9.2-p0/lib/ruby/1.9.1/drb/drb.rb:1197:in `send_message'
from /home/jwilson/.rvm/rubies/ruby-1.9.2-p0/lib/ruby/1.9.1/drb/drb.rb:1088:in `block (2 levels) in method_missing'
from /home/jwilson/.rvm/rubies/ruby-1.9.2-p0/lib/ruby/1.9.1/drb/drb.rb:1172:in `open'
from /home/jwilson/.rvm/rubies/ruby-1.9.2-p0/lib/ruby/1.9.1/drb/drb.rb:1087:in `block in method_missing'
from /home/jwilson/.rvm/rubies/ruby-1.9.2-p0/lib/ruby/1.9.1/drb/drb.rb:1105:in `with_friend'
Exception encountered: # backtrace: |
To be fair, I just pulled edge from github, so possible this is something else... Falling back to released 0.3.9 |
You shouldn't be getting that error on the latest master. It's a problem where cucumber monkey-patches a Capybara method which no longer exists. There's a line in env.rb along the lines of:
or something to that effect. Remove it and the error will go away. |
Yep, that did it! Thanks. |
Deleting this makes clicking normal links work, but for me at least, links with onclick handlers (generated from link_to ... :method => :post) then fail. Any ideas? |
Having the same problem. Any resolution? |
whatever happened to this issue? |
OK seems to be continued here: #243 |
tl;dr: upgrade to Rack 1.3.0 |
tl;dr 2: A lot of things don't support Rack 1.3.0 yet (formtastic, rails 3.0.9, sendgrid-rails, etc). For now, you can put this in an initializer or support file:
require 'cgi'
module Rack
  module Utils
    # Delegate escaping to CGI instead of Rack's own implementation
    def escape(s)
      CGI.escape(s.to_s)
    end
    # Delegate unescaping to CGI as well
    def unescape(s)
      CGI.unescape(s)
    end
  end
end |
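As a quick sanity check of what the workaround delegates to, the sketch below calls `CGI.escape` and `CGI.unescape` directly on a UTF-8 string. It only exercises the standard library, not Rack itself, and the variable names are hypothetical.

```ruby
require 'cgi'

# UTF-8 input with a non-ASCII character (the snowman, U+2603).
original = "\u2603 snowman"

# CGI.escape percent-encodes the UTF-8 bytes and turns spaces into "+".
escaped = CGI.escape(original)

# CGI.unescape reverses the encoding.
round_trip = CGI.unescape(escaped)
```

Here `escaped` is `"%E2%98%83+snowman"`, and `round_trip` equals the original string.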
Hey there,
In one of my specs I'm getting several warnings because of this line:
click 'Enviar instruções de redefinição de senha'
gems/rack-1.1.0/lib/rack/utils.rb:15: warning: regexp match /.../n against to UTF-8 string
gems/rack-1.1.0/lib/rack/utils.rb:15: warning: regexp match /.../n against to UTF-8 string
gems/rack-1.1.0/lib/rack/utils.rb:15: warning: regexp match /.../n against to UTF-8 string
gems/rack-1.1.0/lib/rack/utils.rb:15: warning: regexp match /.../n against to UTF-8 string
In case it helps, here is my Gemfile content https://gist.github.com/c0e8f04433cf1affcf84
Thanks,
Marcelo