Charsets #17

Closed
jbt opened this Issue Aug 1, 2011 · 5 comments

jbt commented Aug 1, 2011

The latest GeoIP API has native UTF-8 string conversion. Perhaps use the GeoIP_set_charset method with GEOIP_CHARSET_UTF8 to avoid charset issues (Node uses UTF-8, so non-ASCII characters in the ISO-8859-1 data come out broken).

Owner
kuno commented Aug 1, 2011

I wasn't aware that this kind of problem existed.
Could you offer some details about it, or even some samples?
Forgive me if this is a stupid question.
--kuno

jbt commented Aug 1, 2011

The GeoIP data files are stored in the ISO-8859-1 character set, so their strings need converting before Node can treat them as UTF-8. When used directly, every non-ASCII character comes out as the same invalid character.

Examples:
190.168.44.100 : M�rida (should be Mérida)
89.81.133.127 : Orl�ans (Orléans)
187.47.6.0 : S�o Paulo (São Paulo)
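
To illustrate why these bytes break, here is a minimal C++ sketch. It assumes the strings reach Node as raw ISO-8859-1 bytes, for example 0xE9 for "é". The helper name `looks_like_utf8` is hypothetical and is not part of GeoIP or this module. It only checks the lead-byte and continuation-byte structure; it does not reject overlong forms or surrogates.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: simplified structural UTF-8 check.
// Each lead byte must be followed by the right number of
// continuation bytes of the form 10xxxxxx.
bool looks_like_utf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;  // continuation bytes expected after c
        if (c < 0x80) {
            extra = 0;          // plain ASCII
        } else if ((c & 0xE0) == 0xC0) {
            extra = 1;          // 110xxxxx
        } else if ((c & 0xF0) == 0xE0) {
            extra = 2;          // 1110xxxx
        } else if ((c & 0xF8) == 0xF0) {
            extra = 3;          // 11110xxx
        } else {
            return false;       // stray continuation or invalid lead byte
        }
        if (i + extra >= s.size()) {
            return false;       // sequence runs past the end of the string
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) {
                return false;   // not a continuation byte
            }
        }
        i += extra + 1;
    }
    return true;
}
```

With this check, the ISO-8859-1 bytes `"M\xE9rida"` are rejected because 0xE9 looks like the start of a three-byte sequence but is followed by plain ASCII, while the UTF-8 encoding `"M\xC3\xA9rida"` passes.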

Owner
kuno commented Aug 2, 2011

Hey, I remember that someone called "TheDeveloper" told me they had already solved this problem by using the GNU iconv library.
It seems the repository they pointed to is yours?
#11

jbt commented Aug 3, 2011

Ah yes, TheDeveloper's actually a friend of mine - I guess he reported that while I was working on the problem a while back.
I'm not terribly experienced with C++, so I'm not sure if using iconv is the best way to go about converting the charsets. It's probably better to use a native implementation (like copying _GeoIP_iso_8859_1__utf8) rather than depending on iconv - you probably have a better idea about that than me.
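
For reference, a native conversion without iconv can be quite small, because every ISO-8859-1 byte maps directly to the Unicode code point of the same value. The following is a rough C++ sketch of that idea, not a copy of the GeoIP function; the name `latin1_to_utf8` is hypothetical.

```cpp
#include <string>

// Hypothetical helper: convert ISO-8859-1 bytes to UTF-8.
// Bytes below 0x80 are ASCII and are copied unchanged; bytes
// 0x80..0xFF become a two-byte sequence 110000xx 10xxxxxx.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size() * 2);  // worst case: every byte doubles
    for (unsigned char c : in) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));    // top 2 bits
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));  // low 6 bits
        }
    }
    return out;
}
```

For example, `latin1_to_utf8("M\xE9rida")` yields the bytes `"M\xC3\xA9rida"`, which is "Mérida" in UTF-8.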

Owner
kuno commented Aug 3, 2011

OK, I've picked up the _GeoIP_iso_8859_1__utf8 function to handle this issue.

@kuno kuno closed this Mar 13, 2013