All Japanese chars turn into ?????? #8

Closed
marocchino opened this Issue Apr 14, 2011 · 21 comments

@marocchino

OS
CentOS release 5.5 (Final)

rpm
qt.i386 1:3.3.6-23.el5 installed
qt4.i386 4.2.1-1 installed
xorg-x11-server-Xvfb.i386 1.1.1-48.76.el5 installed

gem
capybara (0.4.1.2)
capybara-webkit (0.1.5)

The HTML page is encoded in UTF-8.
Any solution?

@eparreno

Same issue with pages including latin chars encoded in utf-8.
You can change the HTTP headers with Capybara, or even in the controller, to force UTF-8 encoding instead of ISO, but it doesn't work.

@jferris
Member
jferris commented Apr 14, 2011

Which version of Ruby are you guys using? When you say that the characters turn into question marks, do you mean in save_and_open_page or somewhere else?

@eparreno

In my case I've tried REE, Ruby 1.8.7, and Ruby 1.9.2 (on Mac OS X), and it doesn't work with any of them.
As you say, save_and_open_page shows a page with wrong characters, but the app itself renders the views properly. The problem is that WebKit renders views using ISO encoding even when you set the headers manually and force the encoding to UTF-8.

@Fonsan
Fonsan commented Apr 14, 2011

I'm having similar issues.

Swedish characters like Å are turning into ??, which means click_on fails, and using save_and_open_page produces the following on 1.9.2 on Mac OS X:

 Failure/Error: save_and_open_page
 Encoding::UndefinedConversionError:
   "\xE5" from ASCII-8BIT to UTF-8
@marocchino

I'm using just REE.

@ngauthier

I believe I have a related issue that may pinpoint this further. When I fill in an input box with a string with a space in it, the space is converted to %20, so:

"Hello World" => "Hello%20World"

Then, when I submit the form, I see the params come through as:

"Hello%25%20World"

Which indicates that it was escaped (by my JS to send along as params).

So, my guess is that you guys are escaping the content to send across the socket, but are not unescaping it. And when you escape the content, it's stripping the non-ascii characters away.

Perhaps if you can send the strings over the socket as binary data somehow? Not really sure ...
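
The double escaping suspected above can be illustrated in plain Ruby. This is only a sketch: `escape_once` is a hypothetical helper built on the standard library's `ERB::Util.url_encode`, and it is not code from capybara-webkit or from the app in question.

```ruby
require "erb"

# Hypothetical helper: percent-encode a string exactly once.
def escape_once(text)
  ERB::Util.url_encode(text)
end

once  = escape_once("Hello World") # the space becomes %20
twice = escape_once(once)          # the % of %20 is itself escaped to %25

puts once   # Hello%20World
puts twice  # Hello%2520World
```

A `%25` in submitted params is the usual sign that an already escaped value was escaped again, although the exact string here differs slightly from the one quoted above.
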

@jferris
Member
jferris commented Apr 14, 2011

We're not doing anything special to escape content going across the socket, and form params don't actually go through our socket layer. They go directly from WebKit to the application.

If extra escaping is happening, it must be when setting or retrieving form values. I'll see if I can reproduce this and get to the bottom of it, but if somebody else is curious, I think the Node commands for set and value are the place to start looking.

@ngauthier

Yeah, I realize webkit goes straight to the app, which is why I assumed it was escaping to cross the socket. I see now you are writing directly.

When I set the value using #fill_in then retrieve it with #value it is correct. Not sure what it is then!

@carlosantoniodasilva

I'm running into the same issue here. All specs are green using WebDriver + Chrome. When running with capybara-webkit, some of them fail on has_content? matchers due to this issue, and inspecting the page shows the ? chars. Using REE, capybara 0.4.1.2, capybara-webkit 0.1.6, on Mac OS X.

@osegrums

+1
Same issue here with Baltic (ā, ē, š, etc) letters.

@pascalh1011

I can replicate this problem anywhere by simply putting & copy; (without the space obviously) anywhere in the page HTML and calling save_and_open_page

@rodrigotassinari

+1. My app's text is in Portuguese, and all accented characters are shown as question marks when using the webkit driver. This is visible with save_and_open_page and also in simple content checks, like:

Failure/Error: find('.field_errors_detail').text.should == 'Este campo é obrigatório.'
  expected: "Este campo é obrigatório."
       got: "Este campo ? obrigat?rio." (using ==)
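
Output like this is what a lossy conversion to a 7-bit encoding produces. The sketch below reproduces the same string in plain Ruby 1.9+. It assumes, without confirmation from this thread, that the text was squeezed through an ASCII-only conversion somewhere between WebKit and the test. `lossy_ascii` is a hypothetical helper, not part of capybara-webkit:

```ruby
# Hypothetical helper: force text into US-ASCII, substituting "?" for every
# character that ASCII cannot represent.
def lossy_ascii(text)
  text.encode("US-ASCII", :undef => :replace, :invalid => :replace, :replace => "?")
end

puts lossy_ascii("Este campo é obrigatório.")
# prints: Este campo ? obrigat?rio.
```

Each accented character collapses to a single "?", which matches the "got" line in the failure above.
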
@dapi
dapi commented Apr 17, 2011

+1. Same thing with Russian Unicode text. Checked with save_and_open_page.

@colszowka

+1 on this. Clicking links or verifying visible content that contains German umlauts (e.g. "Möwenhirn") fails for me too. When I remove the special chars from the operation ("..,follow 'wenhirn'"), it works fine.

Edit: I'm on 1.8.7-p334 here, but from what others wrote this doesn't seem to be important.

@colszowka

I gave it a spin by adding hotchpotch's fork as the git path for the gem in my Gemfile. It didn't help; I'm still getting 'no link with title, id or text '...''.

@colszowka

Just wanted to report that after another fix by hotchpotch on his fork, gem 'capybara-webkit', :git => 'https://github.com/hotchpotch/capybara-webkit.git' now makes links/text with German umlauts work smoothly!

Please pull in those fixes and push a new version to gemcutter.

@eparreno

It works with Latin chars, cool!

@osegrums

With https://github.com/hotchpotch/capybara-webkit.git all my tests with Baltic letters now pass too.
This solution works for me. Tnx

@carlosantoniodasilva

I can confirm that @hotchpotch's branch is working with accented chars; specs that were failing now pass.

@jferris
Member
jferris commented Apr 20, 2011

I pulled in the patches from the above fork and released the changes in 0.2.0.

@jferris jferris closed this Apr 20, 2011