-
-
Notifications
You must be signed in to change notification settings - Fork 8.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[JENKINS-48923] Core should use UTF-8 by default #3231
Conversation
Although I generally I agree with the change, it may have a serious impact on instances using another encoding, because they may improperly read previously saved files after the migration. It needs to be really well justified and tested if we want to merge it |
// It's not possible to retrieve the charset that the writer is using; | ||
// however, for all uses of this constructor, the writer is an instance | ||
// of StringWriter, so it's okay to assume UTF-8. | ||
this(new WriterOutputStream(w), StandardCharsets.UTF_8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The WriterOutputStream
constructor needs to specify the encoding too, I think, or you will get junk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Added.
@@ -1697,15 +1709,14 @@ protected final void execute(@Nonnull RunExecution job) { | |||
long start = System.currentTimeMillis(); | |||
|
|||
try { | |||
Computer computer = Computer.currentComputer(); | |||
if (computer != null) { | |||
setCharset(computer.getDefaultCharset()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is what I think is really wrong. We should change the charset of all builds to be UTF-8, unconditionally, so that miscellaneous messages can always be safely written to the log file; but make sure that anything which copies external process output (namely, Launcher
) transcodes that output on either the master or remote side.
Cf. my corresponding changes for Pipeline in jenkinsci/workflow-job-plugin#27 and jenkinsci/durable-task-plugin#29, with integration test (ShellStepTest.encoding
) in jenkinsci/workflow-durable-task-step-plugin#21.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am sure you are right. Ideally only UTF-8 would be used, though I was hesitant to go through and rip out anything relating to charsets. Right now I have changed the implementations of Computer.getDefaultCharset() to return StandardCharsets.UTF_8
, so effectively this is what is being done, but without removing the scaffolding just yet.
* @since TODO | ||
*/ | ||
public final void setCharset(Charset charset) { | ||
this.charsetInstance = charset; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not enough. onLoad
must also restore charsetInstance
based on the persisted charset
, or getCharset
will wind up always returning UTF-8 after a restart.
This PR needs to address comments && there is a serious merge conflict there. |
See JENKINS-48923.
This started off as a follow-up to PR #3210 (see #3210 (comment) and PR #3224), but it appeared that fixing the one use of the default charset could be problematic, as there are several other uses of the default charset.
So, I have attempted to change the charset that Jenkins uses by default to UTF-8 across core.
This should be considered an experimental, work-in-progress PR.
Proposed changelog entries
Submitter checklist
* Use the
Internal:
prefix if the change has no user-visible impact (API, test frameworks, etc.)Desired reviewers
@daniel-beck
@oleg-nenashev