Quoted-printable values ignore charset (always UTF-8) #10

GoogleCodeExporter · 2015-03-21T12:39:38Z

What steps will reproduce the problem?
1. Set a Note-value to something with newlines and special chars
2. Write vCard with VCardVersion.V2_1


What is the expected output?
The complete result is in ISO 8859-1 including the quoted-printable parts after 
they are decoded

What is the actual output?
The result is in ISO 8859-1 except for the quoted-printable parts which after 
decoding turn out to be in UTF-8 

What version of ez-vcard are you using?
0.9.0

What version of Java are you using?
1.6

Please provide any additional information below.
This also happens if I explicitly set the charset of the properties to ISO 
8859-1

Original issue reported on code.google.com by tom_vo...@gmx.de on 27 Nov 2013 at 3:03

GoogleCodeExporter · 2015-03-21T12:39:39Z

Hello,

Thanks for your input.  I'm not sure if this is a bug though.  It makes sense 
that you could get a UTF-8 string after decoding a quoted-printable string.  
The purpose of quoted-printable is to encode characters which cannot be encoded 
in the current character set.

Original comment by mike.angstadt on 4 Dec 2013 at 3:20

GoogleCodeExporter · 2015-03-21T12:39:39Z

Hi, 

thanks for the reply. 
But if I set the charset on the type to iso 8859-1 then I would expect the 
resulting string after decoding to be of charset iso 8859-1 and not utf-8.

Original comment by tom_vo...@gmx.de on 4 Dec 2013 at 3:46

GoogleCodeExporter · 2015-03-21T12:39:39Z

What is the exact string you are using in the CHARSET parameter value?  It 
looks like there must be a hyphen between "ISO" and "8859-1", instead of a 
space.  If there is a space, Java will not recognize the charset, which causes 
ez-vcard to decode it using UTF-8.

Original comment by mike.angstadt on 4 Dec 2013 at 4:07
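
To illustrate the point about the hyphen: the sketch below uses only the standard `java.nio.charset` API to show that Java accepts "ISO-8859-1" but rejects "ISO 8859-1" as an illegal charset name. The `resolve` helper and its UTF-8 fallback are assumptions made for illustration, modeled on the behavior described in the comment above. They are not ez-vcard's actual code.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;

public class CharsetNameCheck {
    // Hypothetical helper: resolve a CHARSET parameter value and fall back
    // to UTF-8 when Java does not recognize the name.
    static Charset resolve(String name) {
        try {
            return Charset.forName(name);
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            // A space is not a legal character in a charset name, so
            // "ISO 8859-1" ends up here.
            return StandardCharsets.UTF_8;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("ISO-8859-1")); // recognized name
        System.out.println(resolve("ISO 8859-1")); // illegal name, falls back
    }
}
```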

GoogleCodeExporter · 2015-03-21T12:39:39Z

"ISO-8859-1"

Original comment by tom_vo...@gmx.de on 4 Dec 2013 at 4:24

GoogleCodeExporter · 2015-03-21T12:39:39Z

Can you check to see if there are any parser warnings?  ez-vcard will add a 
parser warning if there is a problem decoding a quoted-printable value.

To do that with the VCardReader class, call the getWarnings() method.  To do 
that with the Ezvcard class, pass an empty list into the "warnings()" method, 
then print the list after parsing the vCard.

Original comment by mike.angstadt on 4 Dec 2013 at 4:30

GoogleCodeExporter · 2015-03-21T12:39:39Z

No warnings concerning quoted-printable.
Here's what I do:

  Note noteType = new Note(person.getComment());
  noteType.getParameters().setCharset(charset);
  vcard.addNote(noteType);
  ...
  StringWriter writer = new StringWriter();
  VCardWriter vCardWriter = new VCardWriter(writer, VCardVersion.V2_1, null, "\r\n");
  log.debug(vcard.validate(VCardVersion.V2_1));
  Ezvcard.write(vcard).version(VCardVersion.V2_1).go(writer);

Note contains:
"test
äöüß
test"

Result is:
NOTE;CHARSET=ISO-8859-1;ENCODING=quoted-printable:test=0A=C3=A4=C3=B6=C3=BC=
 =C3=9F=0Atest

Decoded:
testÃ¤Ã¶Ã¼ Ã�test

:(

Original comment by tom_vo...@gmx.de on 9 Dec 2013 at 5:24
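
The mismatch reported above can be reproduced with the standard library alone. The following is a minimal sketch and not ez-vcard's implementation: the `encode` and `decode` helpers are hypothetical, skip soft line breaks, and assume well-formed input. They show that the `=C3=A4` sequences in the output are the UTF-8 bytes of "ä", whereas an ISO-8859-1 encoding of the same note would produce `=E4`.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class QuotedPrintableDemo {
    // Encode text as quoted-printable using the bytes of the given charset.
    // Simplified: no soft line breaks; '=' and anything outside printable ASCII is escaped.
    static String encode(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            int v = b & 0xFF;
            if (v >= 33 && v <= 126 && v != '=') {
                sb.append((char) v);
            } else {
                sb.append(String.format("=%02X", v));
            }
        }
        return sb.toString();
    }

    // Decode "=XX" escapes back into raw bytes (assumes well-formed input).
    static byte[] decode(String qp) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < qp.length()) {
            char c = qp.charAt(i);
            if (c == '=' && i + 3 <= qp.length()) {
                out.write(Integer.parseInt(qp.substring(i + 1, i + 3), 16));
                i += 3;
            } else {
                out.write(c);
                i++;
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Same note as in the report, written with Unicode escapes for the umlauts.
        String note = "test\n\u00e4\u00f6\u00fc\u00df\ntest";
        String utf8 = encode(note, StandardCharsets.UTF_8);
        String latin1 = encode(note, StandardCharsets.ISO_8859_1);
        System.out.println(utf8);   // contains =C3=A4, as in the reported output
        System.out.println(latin1); // contains =E4, what CHARSET=ISO-8859-1 implies

        // Reading the UTF-8 bytes as ISO-8859-1 does not give back the note.
        System.out.println(new String(decode(utf8), StandardCharsets.ISO_8859_1).equals(note));
        System.out.println(new String(decode(latin1), StandardCharsets.ISO_8859_1).equals(note));
    }
}
```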

GoogleCodeExporter · 2015-03-21T12:39:40Z

Ah, I see.  Ok, fixed it.  Thanks :D

Original comment by mike.angstadt on 12 Dec 2013 at 3:49

Changed state: Fixed

GoogleCodeExporter · 2015-03-21T12:39:40Z

I thought about this some more.  My first solution didn't solve the root of the 
problem, which is that the character encoding of the ***Writer*** object should 
be used by default when encoding a quoted-printable value.  You shouldn't need 
to manually set the CHARSET parameter.

The fix I've just committed will use the Writer object's character encoding if 
no CHARSET parameter is provided.  If it can't determine the Writer's character 
encoding, it will use your system's default character encoding.  If a CHARSET 
parameter is set, then it will use that character encoding instead of the 
Writer's.

Attached is the patched JAR.

Original comment by mike.angstadt on 13 Dec 2013 at 5:29

Attachments:

ez-vcard-0.9.1-SNAPSHOT.jar
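
The precedence described in the comment above (CHARSET parameter first, then the Writer's encoding, then the system default) could be sketched roughly as follows. This is an illustration under assumptions, not the committed fix: `chooseCharset` is a hypothetical name, and detecting the encoding through `OutputStreamWriter.getEncoding()` is a guess at how a Writer's encoding might be determined.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.StringWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetPrecedence {
    // Hypothetical helper: pick the charset for encoding a quoted-printable value.
    static Charset chooseCharset(String charsetParam, Writer writer) {
        // 1. An explicit CHARSET parameter wins.
        if (charsetParam != null) {
            return Charset.forName(charsetParam);
        }
        // 2. Otherwise use the Writer's encoding, if it exposes one.
        if (writer instanceof OutputStreamWriter) {
            String encoding = ((OutputStreamWriter) writer).getEncoding();
            if (encoding != null) { // null once the stream has been closed
                return Charset.forName(encoding);
            }
        }
        // 3. Otherwise fall back to the system default.
        return Charset.defaultCharset();
    }

    public static void main(String[] args) {
        Writer latin1Writer =
            new OutputStreamWriter(new ByteArrayOutputStream(), StandardCharsets.ISO_8859_1);
        System.out.println(chooseCharset(null, latin1Writer));       // Writer's encoding
        System.out.println(chooseCharset("UTF-8", latin1Writer));    // parameter overrides
        System.out.println(chooseCharset(null, new StringWriter())); // system default
    }
}
```

Note that a `StringWriter`, as used in the reporter's snippet, carries no byte encoding, so in this sketch it always ends up at the system default unless a CHARSET parameter is set.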

GoogleCodeExporter · 2015-03-21T12:39:40Z

Hi,

great, thanks!

Original comment by tom_vo...@gmx.de on 16 Dec 2013 at 9:06

GoogleCodeExporter added Type-Defect Priority-Medium auto-migrated labels Mar 21, 2015

GoogleCodeExporter closed this as completed Mar 21, 2015
