encoding problem with get_eurostat_dic #55
It is probably not UTF-8. These encoding issues are tricky.
Tricky indeed. Not sure if there is a universal solution.
Maybe read it as "Windows-1252" (or whatever it really is) and then convert to UTF-8?
If the encoding is always the same, or can be recognized automatically, then this will work. We can at least try.
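The "read as Windows-1252, then convert to UTF-8" idea can be sketched as follows. This is an illustration in Python, not the package's R code; the helper name `to_utf8` and the sample string are made up for the example, and it assumes that UTF-8 and Windows-1252 are the only two candidate encodings.

```python
def to_utf8(raw: bytes) -> bytes:
    """Return `raw` re-encoded as UTF-8 (hypothetical helper).

    Assumption: the input is either valid UTF-8 or Windows-1252.
    """
    try:
        # Strict decoding raises on byte sequences that are not valid UTF-8.
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Fall back to Windows-1252; bytes that cp1252 leaves undefined
        # are replaced by U+FFFD instead of raising.
        text = raw.decode("cp1252", errors="replace")
    return text.encode("utf-8")


# A label containing a curly apostrophe, stored as Windows-1252 bytes.
sample = "Côte d’Ivoire".encode("cp1252")
print(to_utf8(sample).decode("utf-8"))
```

Trying UTF-8 first is a reasonable guess because accented Windows-1252 text is rarely valid UTF-8 by accident, but it is only a heuristic and will not recognize other legacy encodings.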
Could you @rocian try it now? I changed to fileEncoding = "". Works for me now. However, with
@jhuovari With fileEncoding="Windows-1252" With fileEncoding="UTF-8", or fileEncoding="" The last one is what I see in the tooltip on Data Explorer. However, even if we see different things, we both obtain from
Good to know that you get something with fileEncoding="". For me I think this affects only a few dictionaries, but a better solution would be great.
After changing to read_tsv I get a proper table (on OS X).
I do not have access to a Windows machine, but maybe someone can check whether this error is still present?
@muuankarski ping
On Windows the table is OK, but
From Stack Overflow: "U+0092 is a never-used control character. It is almost always the result of misdecoding a single right quote ’ in a Windows code page 1252 file as ISO-8859-1." I think that's the case here too. Not a big issue, but still...
In dc13905 there is a dirty hack that replaces U+0092 with '.
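The misdecoding described in the Stack Overflow quote, and a replacement in the spirit of the hack, can be reproduced in a few lines. This is a Python sketch for illustration only: it does not show what dc13905 actually contains, and the names `replace_c1_quote` and `repair` are hypothetical.

```python
# A curly apostrophe is the single byte 0x92 in Windows-1252.
raw = "Côte d’Ivoire".encode("cp1252")

wrong = raw.decode("latin-1")  # ISO-8859-1 maps byte 0x92 to U+0092
right = raw.decode("cp1252")   # Windows-1252 maps byte 0x92 to U+2019


def replace_c1_quote(text: str) -> str:
    """Replace the stray control character U+0092 with a plain apostrophe."""
    return text.replace("\u0092", "'")


def repair(text: str) -> str:
    """Undo the misdecoding by round-tripping through the original bytes.

    Assumption: every character in `text` fits in ISO-8859-1.
    """
    return text.encode("latin-1").decode("cp1252")


print(repr(wrong))
print(replace_c1_quote(wrong))
print(repair(wrong))
```

The plain replacement only handles U+0092, while the round trip also recovers other Windows-1252 punctuation that lands in the 0x80 to 0x9F range. The round trip fails on text that already contains characters outside ISO-8859-1.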
The misdecoding probably happens on Eurostat's side, so I think this is solved on our side.
There is an issue in get_eurostat_dic on systems with an encoding different from "Windows-1252", at least on the Linux systems where I tried it. I was able to resolve the issue by overriding get_eurostat_dic and configuring the proper encoding (UTF-8).
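The point of the workaround is to name the encoding explicitly instead of inheriting the platform default. A minimal Python sketch of that idea follows; it is not the package's R code, the function `read_dic` is hypothetical, and the two-column tab-separated layout (code, label) is an assumption made for the example.

```python
import csv
import io


def read_dic(raw: bytes, encoding: str = "utf-8") -> dict:
    """Parse a tab-separated code/label table with an explicit encoding.

    Hypothetical helper; assumes one `code<TAB>label` pair per line.
    """
    # Decoding with a named encoding gives the same result on every
    # platform, whatever the system locale is.
    text = io.TextIOWrapper(io.BytesIO(raw), encoding=encoding, newline="")
    reader = csv.reader(text, delimiter="\t")
    return {row[0]: row[1] for row in reader if len(row) >= 2}


# Sample bytes standing in for a downloaded dictionary file.
raw = "CI\tCôte d’Ivoire\nFI\tFinland\n".encode("utf-8")
dic = read_dic(raw)
print(dic["CI"])
```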