Handle non-ascii chars in bbdb #16

Closed
bergey opened this issue Jun 29, 2012 · 3 comments

Comments

@bergey

bergey commented Jun 29, 2012

bbdb / emacs has a rather wacky way of representing non-ASCII characters.

When ASynK encounters them, it throws errors on the command line like:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xed' in position 53: ordinal not in range(128)

The debugger shows that Python is calling str(e), where e is:

BBDBParseError(u'Could not Parse BBDB contact entry: ["D Jared" #("Dom\xednguez" 0 9 (charset iso-8859-1)) nil nil nil nil ("danjared@mit.edu") ((creation-date . "2011-10-31") (timestamp . "2011-10-31")) nil]',)
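The traceback is consistent with Python 2's str() implicitly encoding a unicode message with the ASCII codec. A minimal sketch of the same failure follows, written for Python 3, where the encode step has to be explicit. The shortened message and the variable names are illustrative assumptions, not ASynK code.

```python
# Illustrative message: a shortened stand-in for the BBDBParseError text above.
message = "Could not Parse BBDB contact entry: Dom\xednguez"

# Python 2's str() on a unicode object encodes with the ASCII codec.
# Doing that encode explicitly reproduces the same class of error.
failure = None
try:
    message.encode("ascii")
except UnicodeEncodeError as exc:
    failure = exc

# Encoding with UTF-8 instead succeeds and round-trips cleanly.
safe = message.encode("utf-8")
```

Formatting the error message with an explicit UTF-8 encode would only make the parse error readable; the contact entry would still fail to parse.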

Here are some more bbdb lines that produce the same error:
[#("Héctor" 0 6 (charset iso-8859-1)) "Tarrido-Picart" nil nil nil nil ("http1917@gmail.com") ((creation-date . "2011-12-14") (timestamp . "2011-12-14")) nil]
[#("severiano alberto vélez" 0 23 (charset iso-8859-1)) #("gómez" 0 5 (charset iso-8859-1)) nil nil nil nil ("sevevelez@hotmail.com") ((creation-date . "2011-12-14") (timestamp . "2011-12-14")) nil]
[#("王洋" 0 2 (charset chinese-gb2312)) "" nil nil nil nil ("wanghongyang1767@gmail.com") ((creation-date . "2011-10-06") (timestamp . "2011-10-06")) nil]
[#("박준원" 0 3 (charset korean-ksc5601)) "" nil nil nil nil ("flowrime@naver.com") ((creation-date . "2011-10-06") (timestamp . "2011-10-06")) nil]

If this issue is still open in a month, I'll dig deeper, but I don't have time now.

@skarra
Owner

skarra commented Jun 30, 2012

If I strip out all the funky text encoding and just retain the unicode strings (within quotes) as I see them in your description above, ASynK is able to parse the file properly using utf-8.
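The stripping described here can be sketched as a preprocessing step. This is a rough illustration, not ASynK's actual parser: the regular expression and the function name strip_text_properties are hypothetical, and the pattern only assumes the `#("text" START END (props))` shape seen in the records quoted above.

```python
import re

# Hypothetical pattern for the wrapper seen in the issue:
#   #("text" START END (charset NAME))
PROPERTIZED = re.compile(
    r'#\('                             # opening of the wrapper
    r'("(?:[^"\\]|\\.)*")'             # the quoted string, captured as group 1
    r'(?:\s+\d+\s+\d+\s+\([^()]*\))+'  # one or more "start end (props)" runs
    r'\)'                              # closing of the wrapper
)


def strip_text_properties(line):
    """Replace each wrapped string with just its quoted text."""
    return PROPERTIZED.sub(r'\1', line)


# Shortened versions of two records from the report above.
one = strip_text_properties(
    '[#("Héctor" 0 6 (charset iso-8859-1)) "Tarrido-Picart" nil]')
two = strip_text_properties(
    '[#("vélez" 0 5 (charset iso-8859-1)) #("gómez" 0 5 (charset iso-8859-1)) nil]')
```

This assumes the file has already been decoded to Unicode text correctly. It does not handle property lists that contain nested parentheses.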

Are you using some old version of Emacs? I am not even sure how to get Emacs to dump such encoded strings. I would wager that if you "convert" your bbdb file in Emacs to a more "modern" utf-8 encoded file you should be able to use it with ASynK today.

But then, it is also desirable to get ASynK to be able to detect and deal with such cases. So patches are welcome.

@bergey
Author

bergey commented Jun 30, 2012

I'm using Emacs 23.4 and bbdb 2.36 (current Debian unstable). I'd much rather get bbdb to use UTF-8 than get ASynK to support the horrible old multiple-encodings format. So I'll look into that. Thanks for the quick response!

@skarra
Owner

skarra commented Jul 1, 2012

Great; I'll close this issue now.

@skarra skarra closed this as completed Jul 1, 2012

2 participants