Encoding problem with German characters (e.g. ü, ä) and jQuery selection problem #1

Open
DenisMir opened this issue Mar 4, 2011 · 3 comments

Comments

@DenisMir

DenisMir commented Mar 4, 2011

Thank you for this nice little scraping wrapper.

At the moment I am getting incorrectly encoded German characters such as "ä" and "ü". I don't really know whether this is a jsdom problem. For the rendering I am using UTF-8 encoding.

@stagas
Owner

stagas commented Mar 4, 2011

Where do you get these characters? There's a module out there that converts strings between various charsets, but I couldn't find it right now. Try searching for it and let me know how it goes.
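
For illustration, here is a minimal sketch of what such a charset conversion amounts to in Node, assuming the scraped page is served as ISO-8859-1 while the bytes get decoded as UTF-8. `decodeBody` is a hypothetical helper, not part of this wrapper or of jsdom, and it only handles the two encodings that Node's `Buffer` supports natively.

```javascript
// Sketch only: decode raw response bytes with an explicit charset before parsing.
// `decodeBody` is a hypothetical helper, not an API of this wrapper or of jsdom.
function decodeBody(bytes, charset) {
  // Node's Buffer natively understands 'utf8' and 'latin1' (ISO-8859-1).
  const encoding = /^(iso-8859-1|latin1)$/i.test(charset) ? 'latin1' : 'utf8';
  return Buffer.from(bytes).toString(encoding);
}

// "ä" and "ü" as an ISO-8859-1 server would send them: one byte each.
const latin1Bytes = Buffer.from([0xe4, 0xfc]);

// Decoding with the wrong charset: the bytes are not valid UTF-8.
const wrong = decodeBody(latin1Bytes, 'utf-8');

// Decoding with the charset the bytes were actually written in.
const right = decodeBody(latin1Bytes, 'iso-8859-1');

console.log({ wrong, right });
```

If the decoded string looks right at this stage, the problem more likely sits in the rendering step than in the scraping step.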

@DenisMir
Author

DenisMir commented Mar 5, 2011

Thank you for your answer. At the moment I am just experimenting with the following simple example code.

https://gist.github.com/3f9c45f72d1e95479498 (server part)
https://gist.github.com/baeddbc83ccb490354b8 (simple jqtpl template)

Rendering the results gives the following output:
http://dl.dropbox.com/u/287197/screen_scrape.png

The screenshot also shows a second error, which happens in the jQuery selection process. I ran the exact same function manually against the Google page from the Firebug console, and it returned the correct array of result items. On the server, however, the properties of the first result item hold the values of all the other items as well. I don't really understand why. My guess is that the virtually rendered DOM is not quite correct, which causes jQuery's find() method to match the other elements too (this shouldn't happen, since find() only searches the descendants of the selected element).

This is the jQuery selection function mentioned above; it runs fine on the client side against a jQuerified Google page:
https://gist.github.com/b04320d864f07a1e54d6
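
To make the guess above concrete, here is a small self-contained sketch. It uses a toy tree rather than jsdom or jQuery, and the `li.g` item structure is an assumption made for illustration (only the `a.l` link selector appears in this thread). It shows how a descendant search on the first item would also return the other items' links if the parser had nested later items inside the first one.

```javascript
// Toy DOM node: { tag, cls, text, children }. This is not jsdom's data model.
function node(tag, cls, text, children = []) {
  return { tag, cls, text, children };
}

// Descendant search over the whole subtree, similar in spirit to jQuery's .find().
function findAll(root, predicate) {
  const hits = [];
  for (const child of root.children) {
    if (predicate(child)) hits.push(child);
    hits.push(...findAll(child, predicate));
  }
  return hits;
}

const isResultLink = (n) => n.tag === 'a' && n.cls === 'l';

// Expected structure: the result items are siblings.
const correct = node('ol', '', '', [
  node('li', 'g', '', [node('a', 'l', 'first')]),
  node('li', 'g', '', [node('a', 'l', 'second')]),
]);

// Hypothetical mis-parse: the second item ends up inside the first one.
const misnested = node('ol', '', '', [
  node('li', 'g', '', [
    node('a', 'l', 'first'),
    node('li', 'g', '', [node('a', 'l', 'second')]),
  ]),
]);

// Collect the link texts found below the first result item.
const titlesOfFirstItem = (tree) =>
  findAll(tree.children[0], isResultLink).map((n) => n.text);

const titlesCorrect = titlesOfFirstItem(correct);
const titlesMisnested = titlesOfFirstItem(misnested);

console.log(titlesCorrect, titlesMisnested);
```

Dumping the serialized markup of the first selected item on the server would be one way to check whether this kind of nesting is actually happening.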

@stagas
Owner

stagas commented Mar 5, 2011

Dunno, maybe some weird jsdom behaviour. Maybe try .children('a.l').first() instead of .find('a.l'). Also try changing the meta charset to something else for the umlauts?
