Non breaking space (&nbsp;) in gamelist.xml #233

Open
PhasecoreX opened this issue Aug 23, 2018 · 1 comment

Comments

@PhasecoreX

The scraper does not seem to decode HTML-entity-encoded characters properly, at least in data from screenscraper.fr, and only for some games. For example, I get this in the XML file:

<publisher>Nintendo&amp;nbsp;of&amp;nbsp;America&amp;nbsp;Inc.</publisher>

It looks like the ampersand in &nbsp; is being encoded as &amp;, which gives us &amp;nbsp;. EmulationStation then displays this as:

Nintendo&nbsp;of&nbsp;America&nbsp;Inc.
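The double encoding described above can be reproduced in isolation. The following is a minimal sketch, not the scraper's actual code: the `game` struct and `marshalGame` helper are hypothetical names made up for illustration. It only shows that a standard XML encoder (Go's `encoding/xml` here) turns a literal `&` in the text into `&amp;`, so a publisher string that already contains the text `&nbsp;` ends up as `&amp;nbsp;` in the file.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// game is a hypothetical stand-in for one gamelist.xml entry;
// it is not the struct the scraper really uses.
type game struct {
	XMLName   xml.Name `xml:"game"`
	Publisher string   `xml:"publisher"`
}

// marshalGame serializes g to XML. Any literal '&' in the field text
// is escaped as "&amp;", which is correct behavior for an XML encoder.
func marshalGame(g game) (string, error) {
	b, err := xml.Marshal(g)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// The input already contains the six characters "&nbsp;" as plain text.
	out, err := marshalGame(game{Publisher: "Nintendo&nbsp;of&nbsp;America&nbsp;Inc."})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

An XML reader such as EmulationStation decodes `&amp;` back to `&`, which is why the literal text `&nbsp;` appears on screen.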

Oddly, it seems to happen only for Game Boy games (as far as I have noticed). One example is Kirby's Dream Land for Game Boy. The issue isn't present on other systems (for example, Kirby 64 - The Crystal Shards for N64 works just fine, with regular spaces). I'm not sure whether this is a scraper problem or bad data from screenscraper.fr that the scraper could potentially clean up.

@sselph
Owner

sselph commented Aug 23, 2018

From my side, I think this is probably working as intended. Looking at the data in screenscraper.fr and the way they encode the JSON data (using PHP), I think they are actually sending something like {"publisher": "Nintendo&nbsp;of&nbsp;America&nbsp;Inc."} when it should be UTF-8, something like {"publisher": "Nintendo\u00a0of\u00a0America\u00a0Inc."}. Either the PHP used to generate the JSON is encoding the nbsp for HTML, or the literal &nbsp; is in the database.
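To illustrate the difference between the two payloads, and what a client-side cleanup could look like, here is a hedged sketch. It is not something the scraper does today: the `response` struct and `decodePublisher` function are hypothetical, and the two JSON strings are the examples from this comment, not captured API output. It assumes the field really arrives as HTML-entity-encoded text and uses the standard library's `html.UnescapeString` to turn `&nbsp;` into U+00A0.

```go
package main

import (
	"encoding/json"
	"fmt"
	"html"
)

// response is a hypothetical, trimmed-down shape for the JSON payload.
type response struct {
	Publisher string `json:"publisher"`
}

// decodePublisher parses the JSON payload and then unescapes any HTML
// entities left in the publisher text (for example "&nbsp;" -> U+00A0).
func decodePublisher(payload string) (string, error) {
	var r response
	if err := json.Unmarshal([]byte(payload), &r); err != nil {
		return "", err
	}
	return html.UnescapeString(r.Publisher), nil
}

func main() {
	// What the server appears to send: HTML entities inside a JSON string.
	entityForm := `{"publisher": "Nintendo&nbsp;of&nbsp;America&nbsp;Inc."}`
	// What a plain Unicode encoding would look like: a JSON \u escape.
	unicodeForm := `{"publisher": "Nintendo\u00a0of\u00a0America\u00a0Inc."}`

	a, err := decodePublisher(entityForm)
	if err != nil {
		panic(err)
	}
	b, err := decodePublisher(unicodeForm)
	if err != nil {
		panic(err)
	}
	// After cleanup both forms should decode to the same string.
	fmt.Printf("%q\n%q\nequal: %v\n", a, b, a == b)
}
```

One caveat with this approach: unescaping unconditionally would also alter a name that legitimately contains entity-like text, so whether to apply it is a judgment call.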
