Backwards name/description of `encodeEntites` option #612

mindplay-dk · 2022-01-06T08:58:39Z

Either the name or the description of the encodeEntities option looks wrong for what it does:

  const elem = parseDocument(`<img src="/foo?bar=bat&quote=&quot;" width="1" height="1">`).childNodes[0] as Element;

  console.log(render(elem));                            // <img src="/foo?bar=bat&amp;quote=&quot;" width="1" height="1">

  console.log(render(elem, { decodeEntities: true }));  // <img src="/foo?bar=bat&amp;quote=&quot;" width="1" height="1">

  console.log(render(elem, { decodeEntities: false })); // <img src="/foo?bar=bat&quote=&quot;" width="1" height="1">

Per the inline documentation:

dom-serializer/src/index.ts

Lines 41 to 46 in 45e4123

    
             /** 
        
              * Encode characters that are either reserved in HTML or XML, or are outside of the ASCII range. 
        
              * 
        
              * @default true 
        
              */ 
        
             decodeEntities?: boolean;

And the README:

decodeEntities

• Optional decodeEntities: boolean

Encode characters that are either reserved in HTML or XML, or are outside of the ASCII range.

default true

The description doesn't match the name: it literally says the decodeEntities option will "Encode characters".

This make it sound like the default behavior of the render function is to decode entities, when it fact the opposite is true: by default, it encodes the entities in the rendered HTML.

If you want it to encode the entities, you have to pass decodeEntities: true, which is pretty confusing as well.

This option probably should have been named encodeEntities?

The examples above and documentation would make more sense then:

  const elem = parseDocument(`<img src="/foo?bar=bat&quote=&quot;" width="1" height="1">`).childNodes[0] as Element;

  console.log(render(elem));                            // <img src="/foo?bar=bat&amp;quote=&quot;" width="1" height="1">

  console.log(render(elem, { encodeEntities: true }));  // <img src="/foo?bar=bat&amp;quote=&quot;" width="1" height="1">

  console.log(render(elem, { encodeEntities: false })); // <img src="/foo?bar=bat&quote=&quot;" width="1" height="1">

Also, the description could be more accurate - it won't bypass the encoding of entities entirely, only for characters where this still produces HTML that can be parsed. As per my example, it will encode " no matter what.

I would suggest deprecating this option in favor of a correctly-named encodeEntities with the same behavior. (probably less confusing that changing the default and reversing the behavior, which would be a breaking change.)

The text was updated successfully, but these errors were encountered:

fb55 · 2022-01-06T11:59:33Z

Agreed that the name isn't ideal. It is here to have equivalent options between htmlparser2 and dom-serializer.

My instinct would be to add a encodeEntities option, which is aliased as decodeEntities. @mindplay-dk What do you think about that?

mindplay-dk · 2022-01-11T17:12:16Z

You mean this?

/**
 * Decode entities within the document.
 *
 * @default true
 */
decodeEntities?: boolean;

It is here to have equivalent options between htmlparser2 and dom-serializer.

Ah, but the parser actually does decode entities - the serializer works in the opposite direction and encodes entities.

These options have opposite effects and should have opposite names.

BREAKING: By default, the output will now include non-ASCII characters. Fixes #612

fb55 added a commit that referenced this issue Apr 9, 2022

feat: Escape text by default, add encodeEntities option

a136480

BREAKING: By default, the output will now include non-ASCII characters. Fixes #612

fb55 mentioned this issue Apr 9, 2022

feat: Add encodeEntities option with utf8 output #685

Merged

fb55 closed this as completed in #685 Apr 9, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Backwards name/description of `encodeEntites` option #612

Backwards name/description of `encodeEntites` option #612

mindplay-dk commented Jan 6, 2022

`decodeEntities`

fb55 commented Jan 6, 2022

mindplay-dk commented Jan 11, 2022

Backwards name/description of encodeEntites option #612

Backwards name/description of encodeEntites option #612

Comments

mindplay-dk commented Jan 6, 2022

decodeEntities

fb55 commented Jan 6, 2022

mindplay-dk commented Jan 11, 2022

Backwards name/description of `encodeEntites` option #612

Backwards name/description of `encodeEntites` option #612

`decodeEntities`