Non-Ascii Characters in Contacts cause encoding error #17

Closed
soulsource opened this issue Feb 9, 2013 · 4 comments

@soulsource

Hi!

First of all, thank you very much for this nice script!
When I imported my previous address book, stored as a plaintext UTF-8 vcf file, into ppl, I noticed that contacts containing non-ASCII characters are not parsed, and ppl complains with:

ppl: incompatible encoding regexp match (ASCII-8BIT regexp with UTF-8 string)

In my case the character ppl could not digest was 'ß', which is very common in German addresses because it is part of the German word for street, "Straße". For now I have worked around this issue by setting:

RUBYOPT="-E ASCII-8BIT"

Obviously this is not the correct way to deal with this issue, so please fix it in the ppl source.
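
For anyone trying to understand the error message itself, here is a minimal sketch of the kind of mismatch it describes. It does not use ppl or vpim; `BINARY_PATTERN` and `match_error` are hypothetical names made up for this illustration. It only shows that Ruby refuses to match a regexp whose encoding is fixed to ASCII-8BIT against a UTF-8 string containing non-ASCII characters, and that the same bytes labelled as binary match without complaint.

```ruby
# Hypothetical pattern, not taken from ppl or vpim. The /n flag combined with
# a non-ASCII byte fixes the regexp's encoding to ASCII-8BIT.
BINARY_PATTERN = /\xC3/n

# "Straße", written with a \u escape so the literal is UTF-8 regardless of
# the source file's own encoding.
utf8_text = "Stra\u00DFe"

# Returns the exception raised by the match, or nil if the match ran cleanly.
def match_error(pattern, text)
  pattern.match(text)
  nil
rescue Encoding::CompatibilityError => e
  e
end

utf8_error = match_error(BINARY_PATTERN, utf8_text)
puts utf8_error.class

# Same bytes, relabelled as binary: the encodings now agree and the match runs.
binary_text = utf8_text.dup.force_encoding(Encoding::ASCII_8BIT)
binary_error = match_error(BINARY_PATTERN, binary_text)
puts binary_error.inspect
```

Running with `-E ASCII-8BIT` changes Ruby's default external encoding, so strings read from files arrive tagged as binary, much like `binary_text` here. That would explain why the workaround silences the error, although it also affects every other string the program reads.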

@henrycatalinismith
Owner

Ouch! This is quite an oversight on my part. I completely forgot to test ppl with non-ASCII text. On the positive side, this is an extremely easy bug to reproduce and I'm eager to fix it. Stay tuned!

@henrycatalinismith
Owner

I'm still struggling to figure this one out. I'm now awaiting a response from Sam Roberts - the developer of vpim - who can hopefully explain to me what nuances are required in the invocation of his gem when UTF-8 is involved. Here's the question I've sent him, in full, just in case anybody else out there can offer any insight:

Hi there,

I'm having a bit of trouble using vpim with vCard files that contain UTF-8 text. Here's an example vCard:

BEGIN:VCARD
VERSION:3.0
N:;Straße;;;
FN:Straße
END:VCARD

And here's a script that attempts to decode that file using Vpim::Vcard.decode:

#!/usr/bin/env ruby
require "vpim"
source_vcard = File.read("/home/h2s/tmp/contacts/de.vcf")
decoded_vcard = Vpim::Vcard.decode(source_vcard).first
puts decoded_vcard

And finally, here's the error message in full:

/var/lib/gems/1.9.1/gems/vpim-0.695/lib/vpim/vcard.rb:676:in `===': incompatible encoding regexp match (ASCII-8BIT regexp with UTF-8 string) (Encoding::CompatibilityError)
        from /var/lib/gems/1.9.1/gems/vpim-0.695/lib/vpim/vcard.rb:676:in `decode'
        from ./test.rb:4:in `<main>'

Am I doing something wrong? The output of source_vcard.encoding.name is "UTF-8".

Thanks in advance!

I know this isn't exactly tangible progress, but I wanted to make it clear that I haven't given up on this issue!
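
In the meantime, one possible call-site workaround is to read the file as raw bytes and relabel the decoded values as UTF-8 afterwards. The sketch below only illustrates that round trip: `extract_fn` is a hypothetical stand-in that matches with a binary regexp, it is not vpim's implementation, and whether `Vpim::Vcard.decode` behaves correctly with binary input is an assumption that has not been verified here.

```ruby
require "tempfile"

# The example vCard, with "Straße" written via \u escapes so the literal is UTF-8.
VCARD_TEXT = "BEGIN:VCARD\nVERSION:3.0\nN:;Stra\u00DFe;;;\nFN:Stra\u00DFe\nEND:VCARD\n"

# Hypothetical stand-in for a decoder that uses binary regexps (NOT vpim's code).
# The \xFF byte together with /n fixes the regexp's encoding to ASCII-8BIT.
def extract_fn(raw)
  raw[/^FN:([\x20-\xFF]*)/n, 1]
end

formatted_name = nil
Tempfile.create("de.vcf") do |file|
  file.binmode
  file.write(VCARD_TEXT)
  file.flush

  # File.binread returns the bytes tagged as ASCII-8BIT, so the binary
  # regexp inside extract_fn is compatible with its input.
  raw = File.binread(file.path)

  # The extracted bytes are still valid UTF-8, so relabel them for display.
  formatted_name = extract_fn(raw).force_encoding(Encoding::UTF_8)
end

puts formatted_name.encoding
puts formatted_name.valid_encoding?
```

Passing `VCARD_TEXT` to `extract_fn` directly, without the binary read, raises the same `Encoding::CompatibilityError` as in the report above. The cost of this approach is that every string coming back from the decoder has to be relabelled by hand.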

@henrycatalinismith
Owner

According to the response from Sam Roberts, the problem may be as simple as vpim not quite supporting the String class in Ruby 1.9.1 yet. He seems optimistic about sorting this out in the not-too-distant future, and I'll be sure to follow any upstream progress with a new release of ppl as quickly as possible.

So stay tuned: it looks like this might be a thing of the past sometime soon.

@henrycatalinismith
Owner

Sorry this took so long to fix. This m17n incompatibility stuff in Ruby is quite a big topic and this particular problem seems to have been a uniquely awkward combination of the possible problems.
