HtmlEncoding consistent with rules in handlebars.js #473

tommysor · 2021-11-25T20:46:03Z

The original issue this PR was intended to solve has been fixed in PR #477.
This PR now deals with general rules for encoding in Handlebars.Net vs handlebars.js.

Using this PR since it contains the history of how this change came to be.

~~[WIP] Configuration.NoEscape inconsistent fix~~

~~Regarding issue: HandlebarsDotNet.Handlebars.Configuration.NoEscape Applied Inconsistently #468~~

The change in commit "UnencodedStatementVisitor resets value to previous" (previously titled "Reset SuppressEncoding to configured instead of false") makes both tests for Handlebars_Should_Encode_Chinese pass.

Both tests for Handlebars_Should_Not_Encode_Chinese still fail (output is Chinese characters, rather than expected &#xxxx;).
I don't know if this should be considered correct behavior, or possibly a separate issue.

Looking forward to your comments.

rexm · 2021-11-25T21:04:17Z

What’s the behavior in the JS library for the Chinese characters?

tommysor · 2021-11-26T16:49:04Z

> What’s the behavior in the JS library for the Chinese characters?

I have no idea, and I don't know how to figure it out.
The closest thing I can find is this page: http://tryhandlebarsjs.com/
But I don't know what settings they are using, or whether they match what's relevant for this case.

On http://tryhandlebarsjs.com/ with "expression with raw HTML":
"<" is shown as &lt;
while non-ASCII characters "öä ñ øå 产品" are shown as entered.

Any insight here @zjklee or @tonysneed ?

tommysor · 2021-11-28T09:00:56Z

I had another look, but I still can't test whether I'm correct.
It looks to me like the JS library does not escape anything except this small list of chars.

https://github.com/handlebars-lang/handlebars.js/blob/master/lib/handlebars/utils.js

const escape = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#x27;',
  '`': '&#x60;',
  '=': '&#x3D;'
};

const badChars = /[&<>"'`=]/g,
  possible = /[&<>"'`=]/;

function escapeChar(chr) {
  return escape[chr];
}

[...]

export function escapeExpression(string) {
  if (typeof string !== 'string') {
    // don't escape SafeStrings, since they're already safe
    if (string && string.toHTML) {
      return string.toHTML();
    } else if (string == null) {
      return '';
    } else if (!string) {
      return string + '';
    }

    // Force a string conversion as this will be done by the append regardless and
    // the regex test will do this transparently behind the scenes, causing issues if
    // an object's to string has escaped characters in it.
    string = '' + string;
  }

  if (!possible.test(string)) {
    return string;
  }
  return string.replace(badChars, escapeChar);
}

https://github.com/Handlebars-Net/Handlebars.Net/blob/master/source/Handlebars/IO/HtmlEncoder.cs

[...]

private static void EncodeImpl<T>(T text, TextWriter target) where T: IEnumerator<char>
{
	while (text.MoveNext())
	{
		var value = text.Current;
		switch (value)
		{
			case '"':
				target.Write("&quot;");
				break;
			case '&':
				target.Write("&amp;");
				break;
			case '<':
				target.Write("&lt;");
				break;
			case '>':
				target.Write("&gt;");
				break;

			default:
				if (value > 159)
				{
					target.Write("&#");
					target.Write((int)value);
					target.Write(";");
				}
				else target.Write(value);
				break;
		}
	}
}
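To make the difference between the two snippets above concrete, here is a minimal sketch that runs both rule sets on the same input. The first function uses the escape table quoted from handlebars.js; the second is a hypothetical JavaScript port of the C# EncodeImpl loop, written only for comparison (the function names are mine, not from either library).

```javascript
// Escape table as quoted from handlebars.js lib/handlebars/utils.js.
const escapeMap = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#x27;',
  '`': '&#x60;',
  '=': '&#x3D;'
};

// handlebars.js rules: only the seven characters in the table are replaced.
function escapeLikeHandlebarsJs(text) {
  return text.replace(/[&<>"'`=]/g, (chr) => escapeMap[chr]);
}

// Hypothetical port of the C# EncodeImpl loop: four named entities, plus a
// decimal numeric entity for every UTF-16 code unit above 159.
function escapeLikeLegacyNet(text) {
  let out = '';
  for (let i = 0; i < text.length; i++) {
    const chr = text[i];
    const code = text.charCodeAt(i);
    switch (chr) {
      case '"': out += '&quot;'; break;
      case '&': out += '&amp;'; break;
      case '<': out += '&lt;'; break;
      case '>': out += '&gt;'; break;
      default:
        out += code > 159 ? `&#${code};` : chr;
    }
  }
  return out;
}

console.log(escapeLikeHandlebarsJs('产品 <b>')); // 产品 &lt;b&gt;
console.log(escapeLikeLegacyNet('产品 <b>'));    // &#20135;&#21697; &lt;b&gt;
```

So the two differ in both directions: the .NET rules encode non-ASCII characters that the JS rules leave alone, and the JS rules encode ' ` and = which the .NET rules leave alone.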

tommysor · 2021-11-28T11:03:01Z

Cleaned up branch. The two commits now address 2 separate issues.

Commit: UnencodedStatementVisitor resets value to previous
is a straightforward bugfix, and I believe it should be enough to fix the specific problem in issue #468.
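As a rough illustration of the idea behind that bugfix (not the actual Handlebars.Net code), the sketch below contrasts resetting a suppress-encoding flag to false with restoring the value it had before. The context object and function names are hypothetical stand-ins.

```javascript
// Hypothetical stand-in for the render context; only the flag matters here.
// noEscape plays the role of the configured Configuration.NoEscape value.
function createContext(noEscape) {
  return { suppressEncoding: noEscape };
}

// Old behavior as described in this thread: after the unencoded statement
// the flag is forced to false, which discards the configured value.
function renderUnencodedResetToFalse(context, renderBody) {
  context.suppressEncoding = true;
  try {
    return renderBody();
  } finally {
    context.suppressEncoding = false;
  }
}

// New behavior: remember the previous value and restore it afterwards.
function renderUnencodedRestorePrevious(context, renderBody) {
  const previous = context.suppressEncoding;
  context.suppressEncoding = true;
  try {
    return renderBody();
  } finally {
    context.suppressEncoding = previous;
  }
}

const oldCtx = createContext(true);
renderUnencodedResetToFalse(oldCtx, () => 'body');
console.log(oldCtx.suppressEncoding); // false: configured NoEscape is lost

const newCtx = createContext(true);
renderUnencodedRestorePrevious(newCtx, () => 'body');
console.log(newCtx.suppressEncoding); // true: configured NoEscape survives
```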

Commit: HtmlEncoder escape chars based on https://github.com/handlebars-lang/…
is a danger zone.
Encoding rules copied from JS based on my understanding.
I left out (with a comment) 2 chars ( ' and = ) because encoding them breaks existing tests
and would likely break real-world implementations that rely on current behavior.

tonysneed · 2021-12-02T16:57:12Z

@tommysor Thanks for working to fix #468. I'll pull your branch to see if the tests pass in my repro.

@rexm @zjklee I'm looking forward to your review of this PR.

tonysneed · 2021-12-06T13:53:58Z

@tommysor I pulled your branch and ran my tests. The tests which set Handlebars.Configuration.NoEscape = true all pass.

However, setting Handlebars.Configuration.NoEscape = false has no effect. The Chinese characters are not encoded when they should be. Is this expected behavior?

tommysor · 2021-12-06T20:06:18Z

@tonysneed That, I believe, is the $1,000 question for this pull request.

Under the assumption that the existing encoding rules in Handlebars.Net are correct (https://github.com/Handlebars-Net/Handlebars.Net/blob/master/source/Handlebars/IO/HtmlEncoder.cs):
The expected result is that the Chinese characters are encoded to &#xxx; when NoEscape = false.
This is what you should get if you cherry-pick only the commit UnencodedStatementVisitor resets value to previous.
@tonysneed Your tests will not pass as is. They will pass if you change Class.hbs to use double curly braces {{> properties}} instead of triple curly braces {{{> properties}}}.

Under the assumption that the encoding rules in the JavaScript repo are correct (https://github.com/handlebars-lang/handlebars.js/blob/master/lib/handlebars/utils.js):
The expected result is that the Chinese characters are kept unchanged regardless of the NoEscape setting.
This is the current behavior of this branch. Some deviations from the JS rules would still need cleanup work.

Maintainers (so @rexm and/or @zjklee) should decide the direction for this.
My own suggestion would be that I split the commit UnencodedStatementVisitor resets value to previous out into a new pull request. That commit can be released as a patch with a very low chance of breaking anything.
The change to the general handling of encoding would then be a separate, larger change, or possibly not done at all, according to the maintainers' decision.

oformaniuk · 2021-12-17T02:57:51Z

@tommysor, thanks for taking the time to create the PR.
I'd probably go with the safer option: first merge the non-breaking changes. Then, in a separate PR, introduce the breaking changes but hide them behind a feature toggle. The default behaviour should stay the same in the current version but can be changed in the next major release.

tommysor · 2021-12-20T14:37:42Z

Add feature toggle Compatibility.UseLegacyHandlebarsNetHtmlEncoding

true: No change from current behavior (default).
false: Use rules from https://github.com/handlebars-lang/handlebars.js/blob/master/lib/handlebars/utils.js
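A minimal sketch of how a toggle like this could select between the two rule sets, written in JavaScript purely for illustration. The option object, the camel-cased property name, and all function names here are hypothetical and do not reflect how Handlebars.Net wires this up.

```javascript
const jsEntities = {
  '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;',
  "'": '&#x27;', '`': '&#x60;', '=': '&#x3D;'
};
const legacyEntities = { '"': '&quot;', '&': '&amp;', '<': '&lt;', '>': '&gt;' };

// Rules from handlebars.js: only seven characters are escaped.
function encodeHandlebarsJsStyle(text) {
  return text.replace(/[&<>"'`=]/g, (chr) => jsEntities[chr]);
}

// Legacy-style rules: four named entities, plus decimal numeric entities
// for every UTF-16 code unit above 159 (U+00A0 and up).
function encodeLegacyStyle(text) {
  return text.replace(/["&<>]|[\u00a0-\uffff]/g, (chr) =>
    legacyEntities[chr] || `&#${chr.charCodeAt(0)};`);
}

// The toggle defaults to true, so existing behavior is kept unless the
// caller opts in to the handlebars.js rules by setting it to false.
function selectEncoder(compatibility) {
  const useLegacy = compatibility.useLegacyHandlebarsNetHtmlEncoding !== false;
  return useLegacy ? encodeLegacyStyle : encodeHandlebarsJsStyle;
}

const legacy = selectEncoder({});
const jsStyle = selectEncoder({ useLegacyHandlebarsNetHtmlEncoding: false });
console.log(legacy('ö='));  // &#246;=
console.log(jsStyle('ö=')); // ö&#x3D;
```

Keeping the default at true means nobody sees a change in output until they explicitly flip the toggle.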

oformaniuk

Can you please also add corresponding documentation to the README?

tommysor · 2021-12-22T14:52:24Z

> Can you please also add corresponding documentation to the README?

Added section to README.

oformaniuk · 2021-12-22T21:47:20Z

Rebase your branch to the latest master and it will merge automatically 👍

sonarcloud · 2021-12-22T22:13:41Z

Kudos, SonarCloud Quality Gate passed!

0 Bugs
0 Vulnerabilities
0 Security Hotspots
0 Code Smells

100.0% Coverage
16.6% Duplication

tommysor force-pushed the NoEscape_inconsistent_fix branch from 5fffc26 to 5b68f40 Compare November 28, 2021 10:35

tonysneed mentioned this pull request Dec 2, 2021

[Bug] Issue with final text being encoded TrackableEntities/EntityFrameworkCore.Scaffolding.Handlebars#184

Closed

tommysor mentioned this pull request Dec 17, 2021

UnencodedStatementVisitor resets value to previously #477

Merged

oformaniuk linked an issue Dec 18, 2021 that may be closed by this pull request

HandlebarsDotNet.Handlebars.Configuration.NoEscape Applied Inconsistently #468

Closed

tommysor force-pushed the NoEscape_inconsistent_fix branch from 5b68f40 to a2dd13a Compare December 20, 2021 14:24

tommysor changed the title ~~[WIP] Configuration.NoEscape inconsistent fix~~ HtmlEncoding consistent with rules in handlebars.js Dec 20, 2021

oformaniuk requested changes Dec 20, 2021

source/Handlebars/IO/HtmlEncoder.cs (outdated)

oformaniuk added enhancement handlebars lang compatibility labels Dec 20, 2021

oformaniuk reviewed Dec 21, 2021

source/Handlebars.Test/BasicIntegrationTests.cs

oformaniuk reviewed Dec 21, 2021

source/Handlebars/IO/HtmlEncoder.cs (outdated)

oformaniuk reviewed Dec 21, 2021

source/Handlebars/IO/HtmlEncoderLegacy.cs (outdated)

tommysor requested a review from oformaniuk December 22, 2021 15:09

oformaniuk approved these changes Dec 22, 2021

oformaniuk enabled auto-merge December 22, 2021 21:47

tommysor added 4 commits December 22, 2021 23:01

HtmlEncoder escape chars based on https://github.com/handlebars-lang/…

60b6506

…handlebars.js/blob/master/lib/handlebars/utils.js

Change tests that would fail if default for

aa087ce

UseLegacyHandlebarsNetHtmlEncoding was set to false.

Replace bool feature toggle with alternative implementation of HtmlEn…

7e81feb

…coder.

Make EncodeImpl<T> static.

cc80d19

Add unit tests for overloads.

Add readme section.

2fee90c

Run BasicIntegrationTests with new version of HtmlEncoder.

auto-merge was automatically disabled December 22, 2021 22:05
Head branch was pushed to by a user without write access

tommysor force-pushed the NoEscape_inconsistent_fix branch from e679fd4 to 2fee90c Compare December 22, 2021 22:05

oformaniuk enabled auto-merge December 22, 2021 22:08

oformaniuk merged commit 02e1794 into Handlebars-Net:master Dec 22, 2021

tommysor deleted the NoEscape_inconsistent_fix branch December 22, 2021 22:18

oformaniuk mentioned this pull request Dec 22, 2021

Fix Handlebars Encoding Issue TrackableEntities/EntityFrameworkCore.Scaffolding.Handlebars#200

Merged
