feature(xtask): add encoding detection to compare #1907
Test262 comparison coverage results on windows-latest
Perhaps it is better to configure your PowerShell to output UTF-8 instead? I see a lot of people do this on Stack Overflow 😃
I'm torn here; both arguments are valid. So far we haven't had this issue, even though we have people who use Windows as their primary OS. So my question is: how likely is it that people will encounter this issue? We should also highlight the fact that these commands are there for us maintainers/contributors. If this error might occur often, especially for contributors, maybe it's worth fixing. Although this is also information that we can add inside the
I'm in favour of this change. This is an issue for everyone using the "classic" Windows PowerShell (not the cross-platform version), as explained in the Stack Overflow article that @Boshen linked. It adds only a little complexity and eases setup for many. I also don't think working on Rome should require engineers to change their dev tooling, because they may then end up in a situation where they need UTF-8 for Rome but UTF-16 for some other project they're working on.
I think it's probably a better solution to use a shell that emits UTF-8 files, but if anyone, like me, is also using PowerShell because it's the default shell that comes with Windows, then even the latest Windows 11 seems to ship with Windows PowerShell (5.1), which always emits a BOM that serde will fail on. Configuring PowerShell is not that easy, because besides creating a profile script you also need to change the execution policy to allow it to run, and if you go in there you're better off installing PowerShell Core or a completely different shell anyway. Overall this is just an annoying speed bump for the Windows-based contributor experience: not a major issue, but it adds unnecessary friction for first-time contributors.
file.read_to_end(&mut buffer)
    .unwrap_or_else(|err| panic!("Can't read the file of the {} results: {:?}", name, err));

enum FileEncoding {
I am wondering here if we can use something like: https://github.com/kena0ki/aconv or https://gitlab.com/philbooth/unicode-bom
+1
Summary
In PowerShell, the default behavior for output redirection is to write a UTF-16 encoded file, so
cargo xtask coverage --json > results.json
will create a file that can't be decoded directly by
cargo xtask compare
since serde_json, and Rust in general, assume UTF-8 encoding.

This adds simple encoding detection to the loading of the test results file: it detects (and skips over) a Unicode byte order mark, and re-encodes UTF-16 content into UTF-8 before handing it to serde_json.
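
For readers who want a concrete picture of the approach described in the summary, here is a minimal sketch of BOM-based detection using only the standard library. It is not the code from this PR: the `DetectedEncoding` enum and the `detect_encoding`, `decode_utf16`, and `decode_to_string` helpers are hypothetical names chosen for illustration, and the sketch assumes that input without a BOM is UTF-8. It also does not distinguish a UTF-32 BOM from a UTF-16 one.

```rust
/// Hypothetical encoding tag. The PR's own `FileEncoding` enum is truncated
/// in the diff shown on this page, so these variants are an assumption.
#[derive(Debug, PartialEq)]
enum DetectedEncoding {
    Utf8,
    Utf16Le,
    Utf16Be,
}

/// Inspect the leading bytes for a byte order mark. Returns the detected
/// encoding and the number of BOM bytes to skip. Assumes UTF-8 when no BOM
/// is present.
fn detect_encoding(bytes: &[u8]) -> (DetectedEncoding, usize) {
    match bytes {
        [0xEF, 0xBB, 0xBF, ..] => (DetectedEncoding::Utf8, 3),
        [0xFF, 0xFE, ..] => (DetectedEncoding::Utf16Le, 2),
        [0xFE, 0xFF, ..] => (DetectedEncoding::Utf16Be, 2),
        _ => (DetectedEncoding::Utf8, 0),
    }
}

/// Re-encode a UTF-16 byte stream (BOM already removed) into a UTF-8 `String`.
/// `to_unit` turns each byte pair into a code unit with the right endianness.
fn decode_utf16(body: &[u8], to_unit: fn([u8; 2]) -> u16) -> Result<String, String> {
    if body.len() % 2 != 0 {
        return Err("UTF-16 input has an odd number of bytes".to_owned());
    }
    let units: Vec<u16> = body
        .chunks_exact(2)
        .map(|pair| to_unit([pair[0], pair[1]]))
        .collect();
    String::from_utf16(&units).map_err(|err| err.to_string())
}

/// Skip the BOM, if any, and return the content as a UTF-8 `String`.
fn decode_to_string(bytes: &[u8]) -> Result<String, String> {
    let (encoding, bom_len) = detect_encoding(bytes);
    let body = &bytes[bom_len..];
    match encoding {
        DetectedEncoding::Utf8 => std::str::from_utf8(body)
            .map(str::to_owned)
            .map_err(|err| err.to_string()),
        DetectedEncoding::Utf16Le => decode_utf16(body, u16::from_le_bytes),
        DetectedEncoding::Utf16Be => decode_utf16(body, u16::from_be_bytes),
    }
}
```

Under these assumptions, the buffer filled by `file.read_to_end(&mut buffer)` would be passed through `decode_to_string` and the resulting string handed to serde_json, so a results file written by Windows PowerShell and one written by a UTF-8 shell would both parse.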