How to convert from utf-8 to ISO-8859-8 ? #43

Closed
mderazon opened this issue Dec 29, 2013 · 5 comments

Comments

@mderazon

Hi, I am trying to convert a string in utf-8 to a string in iso-8859-8. I am doing this:

var str = 'this is a string';
var buf = iconv.encode(str, 'ISO-8859-8');
str = buf.toString();

Is this the right way of using your lib ?

Thanks 👍

@ashtuchkin
Owner

When you work with text data in JavaScript, you usually go through the following steps:

  1. Get some bytes with known encoding from external source (you get Buffer).
  2. Convert the source Buffer to a native JS string (which is itself UTF-16), using iconv.decode or buf.toString('utf-8').
  3. Do something with native JS strings. These are the only strings you can meaningfully work with.
  4. Convert the output native JS strings to an output Buffer encoded with the destination encoding (iconv.encode, or new Buffer(str, 'utf-8'), which newer Node versions spell Buffer.from(str, 'utf-8')).
  5. Send output Buffer (bytes) to external party.
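
The five steps above can be sketched with nothing but Node's built-in Buffer encodings. This is only an illustration under assumptions: the input bytes, the uppercase transformation, and the use of 'latin1' as the destination encoding are made up for the example. With iconv-lite you would swap in iconv.decode and iconv.encode for encodings that Buffer does not know about.

```javascript
// Step 1: bytes arrive from outside; here, "café" encoded as UTF-8 (assumed input).
const inputBytes = Buffer.from([0x63, 0x61, 0x66, 0xc3, 0xa9]);

// Step 2: decode the bytes into a native JS string.
const text = inputBytes.toString('utf8');

// Step 3: do the actual work on the string (uppercasing is just a placeholder).
const shouted = text.toUpperCase();

// Step 4: encode the string for the destination.
// 'latin1' is a built-in stand-in for whatever encoding the receiver expects.
const outputBytes = Buffer.from(shouted, 'latin1');

// Step 5: hand outputBytes to the socket, file, or request unchanged.
console.log(outputBytes);
```

Note that the string only exists between steps 2 and 4; everything that crosses the process boundary is a Buffer.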

So, in your case I assume that the str is given to you as a native js string and so it's utf-16, not utf-8. If not, please ensure it's correctly decoded (just print it to console).

The second line, where you convert it to an ISO-8859-8 Buffer, is fine.

The last line (str = buf.toString()) is meaningless: toString() defaults to UTF-8, so you're reading an ISO-8859-8 encoded buffer as if it were UTF-8, and you'll get garbage.
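
A small sketch of that failure, using hand-written bytes instead of iconv so it runs without any dependency. The byte values assume the ISO-8859-8 layout, where the Hebrew letters sit at 0xE0 to 0xFA; the sample word is made up for the example.

```javascript
// "שלום" as ISO-8859-8 bytes (assumed sample; one byte per Hebrew letter).
const hebrewBytes = Buffer.from([0xf9, 0xec, 0xe5, 0xed]);

// toString() defaults to UTF-8, and these bytes are not valid UTF-8,
// so the decoder substitutes replacement characters.
const garbled = hebrewBytes.toString();
console.log(JSON.stringify(garbled));

// Pure-ASCII bytes are identical in both encodings, which can hide the bug
// when testing with a string like the one in the question.
const asciiBytes = Buffer.from('this is a string', 'ascii');
console.log(asciiBytes.toString() === 'this is a string');
```

So the snippet in the question appears to work only because the test string is plain ASCII.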

After converting to anything other than native JS strings, you should work with Buffers and not try to convert them back. Usually you'll need some concatenation; for that, use Buffer.concat([buf1, buf2]).
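
For instance, joining two already-encoded chunks might look like this. The chunk contents are assumed sample bytes; only Buffer.concat itself comes from the advice above.

```javascript
// Two chunks that are already encoded in the destination encoding (sample bytes).
const part1 = Buffer.from([0xf9, 0xec]);
const part2 = Buffer.from([0xe5, 0xed]);

// Concatenate at the byte level; no string conversion is involved.
const joined = Buffer.concat([part1, part2]);
console.log(joined.length === part1.length + part2.length);
```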

@mderazon
Author

I need to send the ISO-8859-8 encoded string as a request body to a 3rd party server that expects xml input in that encoding with the header

<?xml version='1.0' encoding='ISO-8859-8'?>

I am doing it with the request node module. So are you saying that I need to feed the request body with a Buffer instead of a string? I'm not sure it's possible

Thanks !

@ashtuchkin
Owner

Why not?

var xmlStr = "<?xml version='1.0' encoding='ISO-8859-8'?>" + originalXmlStr;

var buf = iconv.encode(xmlStr, 'iso-8859-8');

request({
  url: "..",
  method: "POST",
  body: buf,
  ...
}, function(err, res, body) {
  // do something.
});

The thing is, in the ISO-8859 family of encodings, ASCII characters map to the same byte values as in ASCII, so the XML declaration comes through the encoding step unchanged.
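
To make that concrete, here is a toy encoder that mimics what iconv.encode(str, 'iso-8859-8') does for a small subset of characters. The function encodeIso88598Sketch is hypothetical and written only for this illustration: it assumes the ISO-8859-8 layout (ASCII unchanged, Hebrew letters U+05D0 to U+05EA at bytes 0xE0 to 0xFA) and rejects everything else. The <greeting> element is likewise made up. Use iconv-lite for real work.

```javascript
// Hypothetical stand-in for iconv.encode(str, 'iso-8859-8').
// Handles only ASCII and the Hebrew letters U+05D0..U+05EA.
function encodeIso88598Sketch(str) {
  const bytes = [];
  for (const ch of str) {
    const cp = ch.codePointAt(0);
    if (cp < 0x80) {
      bytes.push(cp); // ASCII: byte value equals code point
    } else if (cp >= 0x05d0 && cp <= 0x05ea) {
      bytes.push(cp - 0x05d0 + 0xe0); // alef..tav -> 0xE0..0xFA
    } else {
      throw new RangeError('Not covered by this sketch: U+' + cp.toString(16));
    }
  }
  return Buffer.from(bytes);
}

const xmlDeclaration = "<?xml version='1.0' encoding='ISO-8859-8'?>";
const encoded = encodeIso88598Sketch(
  xmlDeclaration + '<greeting>\u05e9\u05dc\u05d5\u05dd</greeting>'
);

// The declaration occupies exactly the bytes it would have in plain ASCII.
const prefix = encoded.subarray(0, xmlDeclaration.length);
console.log(prefix.equals(Buffer.from(xmlDeclaration, 'ascii')));
```

Only the Hebrew text inside the element ends up as bytes above 0x7F; the markup around it stays readable to any ASCII-aware parser.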

@ashtuchkin
Owner

From request documentation:

body - entity body for PATCH, POST and PUT requests. Must be a Buffer or String.

@mderazon
Author

I wasn't aware of this :)

Thanks a lot !


2 participants