[JENKINS-71182] Correct Unicode behavior of XML_1_1 #7924

Merged
merged 5 commits into from May 5, 2023

jglick
Member

@jglick jglick commented May 2, 2023

See JENKINS-71182. There does not seem to be any good solution, until PrettyPrintWriter supports Unicode properly. #7875 (comment)

Testing done

See test cases. (Writing emojis was apparently not tested before, only reading.)

Proposed changelog entries

  • Fixed writing of emojis to XML (regression in 2.403).

Proposed upgrade guidelines

N/A

Maintainer checklist

Before the changes are marked as ready-for-merge:

  • There are at least two (2) approvals for the pull request and no outstanding requests for change.
  • Conversations in the pull request are over, or it is explicit that a reviewer is not blocking the change.
  • Changelog entries in the pull request title and/or Proposed changelog entries are accurate, human-readable, and in the imperative mood.
  • Proper changelog labels are set so that the changelog can be generated automatically.
  • If the change needs additional upgrade steps from users, the upgrade-guide-needed label is set and there is a Proposed upgrade guidelines section in the pull request title (see example).
  • If it would make sense to backport the change to LTS, a Jira issue must exist, be a Bug or Improvement, and be labeled as lts-candidate to be considered (see query).

@jglick
Member Author

jglick commented May 2, 2023

(Oh, and I tried to use WriterWrapper to add indentation to StaxWriter, without success—the > at the end of an element is written only after text content or an end element is encountered.)

@basil
Member

basil commented May 2, 2023

I filed x-stream/xstream#337 to fix XStream. Do you think it is worth temporarily inlining that patch into Jenkins while waiting for upstream to merge/release it, or do you think it is preferable to revert back to quirks mode?

@Bananeweizen
Contributor

I'm not sure if this is of any help here, but I have seen filtered streams/readers in other real world applications just stripping offending characters from the input stream before passing it to the XML handling (e.g. the NUL character that started these changes). Examples of software I have actually been using:

@NotMyFault NotMyFault added the regression-fix label (Pull request that fixes a regression in one of the previous Jenkins releases) May 3, 2023
@jglick
Member Author

jglick commented May 3, 2023

Do you think it is worth temporarily inlining that patch into Jenkins while waiting for upstream to merge/release it, or do you think it is preferable to revert back to quirks mode?

Seems simplest to me to just use quirks mode until a fix makes it into a release. The fail-fast behavior is nice to have but hardly essential, assuming the junit update is adopted. No strong opinion.

filtered streams/readers […] just stripping offending characters

Yes, and/or conversely rejecting the NUL character on output. Would add a little overhead. Again seems not worth the bother especially if a fixed XStream comes reasonably soon, but if someone wants to propose such a patch I will happily close this PR in favor of it.
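
For readers unfamiliar with the filtering approach discussed above, here is a minimal sketch of what it could look like. It is not part of this PR and not taken from any of the applications mentioned; the class name `NulStrippingReader` and the helpers `strip` and `requireNoNul` are hypothetical names chosen for illustration. The reader drops NUL (U+0000) characters before the text reaches an XML parser, and the output-side helper rejects them instead of writing them.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical illustration only: strips NUL characters from a character stream.
class NulStrippingReader extends FilterReader {
    NulStrippingReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c;
        do {
            c = super.read();
        } while (c == 0); // skip NUL, pass through everything else (including -1)
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (true) {
            int n = super.read(buf, off, len);
            if (n <= 0) {
                return n; // end of stream (-1) or nothing read
            }
            int kept = 0;
            for (int i = 0; i < n; i++) {
                char c = buf[off + i];
                if (c != 0) {
                    buf[off + kept++] = c; // compact the buffer in place
                }
            }
            if (kept > 0) {
                return kept;
            }
            // The whole chunk was NUL characters: read the next chunk.
        }
    }

    // Convenience helper for the sketch: run a string through the filter.
    static String strip(String input) {
        try (Reader r = new NulStrippingReader(new StringReader(input))) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[8];
            int n;
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Output-side counterpart: fail fast rather than write a NUL.
    static String requireNoNul(String text) {
        if (text.indexOf('\0') >= 0) {
            throw new IllegalArgumentException("NUL character is not allowed in XML output");
        }
        return text;
    }
}
```

Either half adds a pass over the text, which is the overhead mentioned in the comment above.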

@jglick
Member Author

jglick commented May 3, 2023

(Whatever the approach, let us get some fix into the next weekly.)

@basil
Member

basil commented May 3, 2023

My preference would be to inline the XStream fix, since that would provide additional real-world testing which would increase the likelihood that the fix would be accepted upstream.

@jglick
Member Author

jglick commented May 3, 2023

OK, I should be able to do that today.

}

private void writeText(final String text, final boolean isAttribute) {
text.codePoints().forEach(c -> {
Member Author

This and matching lines are the fix from the upstream PR.
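
To illustrate why iterating with `codePoints()` matters for emojis, here is a small standalone sketch. It is not the XStream or Jenkins code; `CodePointDemo`, `escapePerChar`, and `escapePerCodePoint` are hypothetical names, and the escaping rule is a simplified assumption. An emoji such as U+1F600 is stored in a Java `String` as two UTF-16 surrogate code units, so a writer that inspects one `char` at a time sees two values that each look invalid on their own, while a writer that inspects code points sees one valid character.

```java
// Hypothetical illustration only: char-at-a-time versus code-point-at-a-time writing.
class CodePointDemo {
    // Treats every UTF-16 code unit separately, so both halves of a
    // surrogate pair are escaped as if they were invalid characters.
    static String escapePerChar(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isSurrogate(c)) {
                out.append("&#x").append(Integer.toHexString(c)).append(';');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Treats each Unicode code point as a unit. A valid surrogate pair
    // arrives as a single supplementary code point and is written as is;
    // only an unpaired surrogate is escaped.
    static String escapePerCodePoint(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp >= 0xD800 && cp <= 0xDFFF) {
                out.append("&#x").append(Integer.toHexString(cp)).append(';');
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        String emoji = "\uD83D\uDE00"; // U+1F600 as a surrogate pair
        System.out.println(escapePerChar(emoji));
        System.out.println(escapePerCodePoint(emoji).equals(emoji));
    }
}
```

With the per-code-point variant the emoji passes through unchanged, which is the behavior the new test cases in this PR exercise for writing.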

public static int XML_1_1 = 1;

private final QuickWriter writer;
private final FastStack elementStack = new FastStack(16);
Member Author

had to drop type parameter

}

@Override
public void startNode(final String name, final Class clazz) {
Member Author

had to switch to rawtypes

* Created on 07. March 2004 by Joe Walnes
*/

package hudson.util;
Member Author

edited from original, obviously

@@ -0,0 +1,340 @@
// TODO adapted from https://github.com/x-stream/xstream/blob/32e52a6519a25366bbb5774bb536b5e290b64a42/xstream/src/java/com/thoughtworks/xstream/io/xml/PrettyPrintWriter.java pending release of https://github.com/jenkinsci/jenkins/pull/7924
Member Author

Reformatting for Checkstyle, plus a couple minor changes noted inline.

@basil basil self-assigned this May 3, 2023
@jglick jglick changed the title from "[JENKINS-71182] Revert to XML_QUIRKS" to "[JENKINS-71182] Correct Unicode behavior of XML_1_1" May 3, 2023
* @author Joe Walnes
* @author Jörg Schaible
*/
class PrettyPrintWriter extends AbstractXmlWriter {
Member Author

dropped public on class and constructors

Member

@NotMyFault NotMyFault left a comment

Thank you, Jesse!

@basil
Member

basil commented May 4, 2023

This PR is now ready for merge. We will merge it after approximately 24 hours if there is no negative feedback. Please see the merge process documentation for more information about the merge process. Thanks!

@basil basil added the ready-for-merge label (The PR is ready to go, and it will be merged soon if there is no negative feedback) May 4, 2023
@basil basil merged commit 8d9e8b9 into jenkinsci:master May 5, 2023
16 checks passed
@jglick jglick deleted the emoji-JENKINS-71182 branch May 5, 2023 16:28
NotMyFault pushed a commit to NotMyFault/jenkins that referenced this pull request May 15, 2023