feature(xtask): add encoding detection to compare #1907

leops · 2022-01-07T09:52:50Z

Summary

On PowerShell the default behavior for pipes it to output an UTF-16 encoded file, so cargo xtask coverage --json > results.json will create a file that can't be decoded directly by cargo xtask compare since serde_json and Rust in general assumes an UTF-8 encoding.

This adds a simple encoding detection logic to the loading of test results file that detects (and skips over) Unicode Byte Order Mark, and re-encodes UTF-16 content into UTF-8 before handing it to serde_json

github-actions · 2022-01-07T10:06:52Z

Test262 comparison coverage results on windows-latest

Test result	`main` count	This PR count	Difference
Total	44951	44951	0
Passed	42866	42866	0
Failed	2084	2084	0
Panics	1	1	0
Coverage	95.36%	95.36%	0.00%

Boshen · 2022-01-07T10:13:31Z

Perhaps it is better to configure your Powershell to output utf-8 instead? I see a lot of people do this on stackoverflow 😃

ematipico · 2022-01-07T10:45:19Z

I'm torned here, both arguments are valid. Although, so far we haven't had this issue even though we have people that use Windows as their primary OS. So my question is, how likely people would encounter this issue?

We should also highlight the fact that these commands are there for us maintainers/contributors. If this error might occur often, especially for contributors, maybe it's worth fixing it.

Although this is also an information that we can add inside the CONTRIBUTING.md section. What do you think, team?

MichaReiser · 2022-01-07T10:49:42Z

I'm in favour of this change. This is an issue for everyone using the "classic" Windows PowerShell (not the cross-platform version) as explained in the stack overflow article that @Boshen linked.

It only adds little complexity and eases the setup of many. I also don't think that it should require engineers to change their dev tooling to work on Rome because they may then end up in a situation where they need UTF8 for Rome but UTF16 for some other project they're working on.

leops · 2022-01-07T10:57:07Z

I think it's probably a better solution to use a shell that emits UTF-8 files, but if anyone like me is also using PowerShell because it's the default shell that comes with Windows, then even the latest Windows 11 seems to ship with Windows PowerShell (5.1) which always emits a BOM that serde will fail on. Configuring Powershell is not that easy because besides creating a profile script you also need to change the execution policy to allow it to run, and if you go in there you're just better off installing PowerShell Core or a completely different shell anyway. Overall this just an annoying speed bump for the Windows-based contributor experience, not a major issue but it just adds unnecessary friction for first-time contributors.

This adds a simple encoding detection logic to the loading of test results file that detects (and skips over) Unicode Byte Order Mark, and re-encodes UTF-16 content into UTF-8 before handing it to serde_json

cloudflare-workers-and-pages · 2022-01-07T11:10:46Z

Deploying with Cloudflare Pages

Latest commit:	`b5df252`
Status:	✅ Deploy successful!
Preview URL:	https://2b7ff3a5.tools-8rn.pages.dev

View logs

xunilrj · 2022-01-07T16:00:24Z

xtask/src/compare/results.rs

+	file.read_to_end(&mut buffer)
+		.unwrap_or_else(|err| panic!("Can't read the file of the {} results: {:?}", name, err));
+
+	enum FileEncoding {


I am wondering here if we can use something like: https://github.com/kena0ki/aconv or https://gitlab.com/philbooth/unicode-bom

xunilrj · 2022-01-07T16:03:30Z

I'm in favour of this change [..] eases the setup of many.

+1

MichaReiser requested a review from xunilrj January 7, 2022 09:56

ematipico approved these changes Jan 7, 2022

View reviewed changes

feature(xtask): add encoding detection to compare

b5df252

This adds a simple encoding detection logic to the loading of test results file that detects (and skips over) Unicode Byte Order Mark, and re-encodes UTF-16 content into UTF-8 before handing it to serde_json

leops force-pushed the feature/compare-unicode branch from 913253c to b5df252 Compare January 7, 2022 11:10

leops merged commit d448edb into main Jan 7, 2022

leops deleted the feature/compare-unicode branch January 7, 2022 13:11

xunilrj reviewed Jan 7, 2022

View reviewed changes

ematipico mentioned this pull request Nov 28, 2022

🐛 Rome VS Code Extension crashes by inputting Japanese #3872

Closed

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feature(xtask): add encoding detection to compare #1907

feature(xtask): add encoding detection to compare #1907

leops commented Jan 7, 2022

github-actions bot commented Jan 7, 2022 •

edited

Loading

Boshen commented Jan 7, 2022

ematipico commented Jan 7, 2022

MichaReiser commented Jan 7, 2022

leops commented Jan 7, 2022 •

edited

Loading

cloudflare-workers-and-pages bot commented Jan 7, 2022 •

edited

Loading

xunilrj Jan 7, 2022 •

edited

Loading

xunilrj commented Jan 7, 2022

feature(xtask): add encoding detection to compare #1907

feature(xtask): add encoding detection to compare #1907

Conversation

leops commented Jan 7, 2022

Summary

github-actions bot commented Jan 7, 2022 • edited Loading

Test262 comparison coverage results on windows-latest

Boshen commented Jan 7, 2022

ematipico commented Jan 7, 2022

MichaReiser commented Jan 7, 2022

leops commented Jan 7, 2022 • edited Loading

cloudflare-workers-and-pages bot commented Jan 7, 2022 • edited Loading

Deploying with Cloudflare Pages

xunilrj Jan 7, 2022 • edited Loading

Choose a reason for hiding this comment

xunilrj commented Jan 7, 2022

github-actions bot commented Jan 7, 2022 •

edited

Loading

leops commented Jan 7, 2022 •

edited

Loading

cloudflare-workers-and-pages bot commented Jan 7, 2022 •

edited

Loading

xunilrj Jan 7, 2022 •

edited

Loading