Option to prevent double encoding #78

diegomansua · 2022-10-31T17:05:05Z

Hello,

First of all thanks to everyone that has made this lib possible.

This is not a bug report but rather a feature suggestion.

I'm using this lib to import data from a third party into an old database that only supports ISO-8859-1.

I was using it like encode(<text>, {mode: 'nonAscii'}).

But I hit an issue as it turns out that the third party already uses entities for some characters. This means that I ended up with &amp;#39; whenever there was a &#39; entity, for example.

So I thought it'd be nice to have a preventDoubleEncoding option (only with a better name), to prevent encoding the ampersand whenever it's already part of an entity. E.g.:

encode('you & me', {mode: 'nonAscii', preventDoubleEncoding: true}); -> returns you &amp; me
encode('you & me', {mode: 'nonAscii', preventDoubleEncoding: false}); -> returns you &amp; me
encode('you &amp; me', {mode: 'nonAscii', preventDoubleEncoding: true}); -> returns you &amp; me
encode('you &amp; me', {mode: 'nonAscii', preventDoubleEncoding: false}); -> returns you &amp;amp; me
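
To make the requested behaviour concrete, here is a rough sketch of the ampersand handling only. `escapeBareAmpersands` is a hypothetical helper, not part of the library, and it assumes that anything shaped like `&name;`, `&#123;` or `&#x7B;` is an existing entity that should be left alone. It does not check that a name is a real entity, and it does not touch the other characters that `nonAscii` mode would encode.

```javascript
// Hypothetical helper, NOT part of html-entities.
// Matches an ampersand that does not start something shaped like a named
// (&amp;), decimal (&#39;) or hexadecimal (&#x27;) character reference.
const BARE_AMPERSAND = /&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);)/g;

function escapeBareAmpersands(text) {
  // Only bare ampersands are rewritten; entity-shaped sequences are kept.
  return text.replace(BARE_AMPERSAND, '&amp;');
}

console.log(escapeBareAmpersands('you & me'));     // you &amp; me
console.log(escapeBareAmpersands('you &amp; me')); // you &amp; me
console.log(escapeBareAmpersands('it&#39;s & ok')); // it&#39;s &amp; ok
```

A shape-based check like this would also leave text such as `&foo;` untouched even though it is not a real entity, so it is only an approximation of the option described above.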


mdevils · 2023-06-05T19:16:45Z

Hello @diegomansua,

Sorry for the long delay in responding. Could PR #86 solve your problem?

diegomansua · 2023-06-08T08:25:59Z

@mdevils unless I'm doing something wrong it doesn't seem like it would solve my problem; I've checked out the PR branch and built it and tried the following:

console.log(encode('you &amp; me', {mode: 'nonAsciiPrintableOnly'})); // prints 'you &amp; me' ✅
console.log(encode('you & me', {mode: 'nonAsciiPrintableOnly'})); // prints 'you & me' ❌ (expected 'you &amp; me')

I've also tried with level: 'xml'.

Basically what I'd need is an option so that if an entity is already encoded (e.g. &amp;), it shouldn't encode it again (i.e. it should leave it as &amp; instead of doing &amp;amp;).

mdevils · 2023-06-24T22:14:24Z

Hello @diegomansua.

I'm afraid you have a very specific use-case.

I'd suggest using a combination of encode and decode, like so:

console.log(encode(decode('you &amp; me and you & me'), {mode: 'nonAscii'}));

Hope this helps.
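
The reason decoding first avoids double encoding can be shown with a small self-contained sketch. `decodeSketch`, `encodeSketch` and `normalizeEntities` are simplified stand-ins written for illustration, not the library's functions: the decoder knows only five named entities and the encoder always emits numeric references for non-ASCII characters, which is an assumption made to keep the example short.

```javascript
// Simplified stand-ins for decode() and encode(text, {mode: 'nonAscii'}).
// They are NOT the html-entities implementation.
const NAMED = new Map([['amp', '&'], ['lt', '<'], ['gt', '>'], ['quot', '"'], ['apos', "'"]]);
const ESCAPES = new Map([['&', '&amp;'], ['<', '&lt;'], ['>', '&gt;'], ['"', '&quot;'], ["'", '&apos;']]);
const REFERENCE = /&(?:#(\d+)|#[xX]([0-9a-fA-F]+)|([a-zA-Z]+));/g;

function decodeSketch(text) {
  return text.replace(REFERENCE, (match, dec, hex, name) => {
    if (name !== undefined) {
      // Unknown names are left as-is; the real decoder knows the full table.
      return NAMED.has(name) ? NAMED.get(name) : match;
    }
    const codePoint = dec !== undefined ? parseInt(dec, 10) : parseInt(hex, 16);
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
  });
}

function encodeSketch(text) {
  // Special characters plus anything outside the ASCII range.
  return text.replace(/[&<>"']|[^\x00-\x7F]/gu, (ch) =>
    ESCAPES.has(ch) ? ESCAPES.get(ch) : `&#${ch.codePointAt(0)};`
  );
}

// Decode first so encoded and raw input collapse to the same raw text,
// then encode exactly once.
function normalizeEntities(text) {
  return encodeSketch(decodeSketch(text));
}

console.log(normalizeEntities('you &amp; me and you & me'));
// you &amp; me and you &amp; me
```

One consequence of this approach is that literal text which merely looks like an entity is decoded as well, so it only fits input where every entity-shaped sequence is meant as an entity.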

mdevils closed this as completed Jun 24, 2023
