Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add jackson-dataformat-xml to Jackson 2 API plugin #82

Merged
merged 2 commits into from
Jul 3, 2021

Conversation

basil
Copy link
Member

@basil basil commented Jul 2, 2021

Adds jackson-dataformat-xml to the Jackson 2 API plugin for the benefit of other plugins that need it and want to get it through the Jackson 2 API plugin. This also bundles the following two new transitive dependencies:

[WARNING] Bundling transitive dependency woodstox-core-6.2.4.jar (via jackson-dataformat-xml)
[WARNING] Bundling transitive dependency stax2-api-4.2.1.jar (via jackson-dataformat-xml)

These are up-to-date versions of StAX 2 (which the Java Platform does not provide) and Woodstox (the recommended StAX implementation for jackson-dataformat-xml.

I included a new integration test with RealJenkinsRule that works against Jenkins 2.297 and earlier, as well as tip-of-trunk with jenkinsci/jenkins#5604. Without jenkinsci/jenkins#5604, the test fails on tip-of-trunk with the same error as in jenkinsci/azure-storage-plugin#194. So somehow we end up relying on the old version of Woodstox (3.x) from Jenkins core, even though jackson-dataformat-xml depends on a recent version of Woodstox as a compile-time dependency. I am not sure why. The test passes with regular JenkinsRule, so this must be a classpath issue with jenkins.war. This would be a useful starting point for anyone who is investigating removing Woodstox from Jenkins core.

CC @timja

Copy link
Member

@timja timja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I tried adding this here to see if it fixed it on it's own but it didn't, good to know you had the same result

@basil
Copy link
Member Author

basil commented Jul 2, 2021

To reproduce the Azure problem, add this on top of the present PR:

diff --git a/pom.xml b/pom.xml
index 4eba2d1..7f7b7a1 100644
--- a/pom.xml
+++ b/pom.xml
@@ -67,7 +67,7 @@
     <revision>2.12.4</revision>
     <changelist>-SNAPSHOT</changelist>
     <java.level>8</java.level>
-    <jenkins.version>2.222.4</jenkins.version>
+    <jenkins.version>2.300</jenkins.version>
     <jackson.version>2.12.3</jackson.version>
     <jackson-databind.version>${jackson.version}</jackson-databind.version>
   </properties>
@@ -89,8 +89,8 @@
     <dependencies>
       <dependency>
         <groupId>io.jenkins.tools.bom</groupId>
-        <artifactId>bom-2.222.x</artifactId>
-        <version>807.v6d348e44c987</version>
+        <artifactId>bom-2.289.x</artifactId>
+        <version>887.vae9c8ac09ff7</version>
         <scope>import</scope>
         <type>pom</type>
       </dependency>

BTW I see the root cause of the problem now, but I don't believe there is a practical solution. Perhaps @jglick would be interested in checking my analysis.

com.fasterxml.jackson.dataformat.xml.XmlFactory#_createParser(byte[], int, int, com.fasterxml.jackson.core.io.IOContext) creates a new Stax2ByteArraySource here:

https://github.com/FasterXML/jackson-dataformat-xml/blob/75d9056bf344d2c18192eb128f177ee21a586017/src/main/java/com/fasterxml/jackson/dataformat/xml/XmlFactory.java#L631

That is then pased to javax.xml.stream.XMLInputFactory#createXMLStreamReader(javax.xml.transform.Source).

If that XMLInputFactory happens to be com.ctc.wstx.stax.WstxInputFactory from woodstox-core, all is well.

If that XMLInputFactory happens to be com.sun.xml.internal.stream.XMLInputFactoryImpl from the Java Platform's rt.jar, we have problems. Because that implementation of XMLInputFactory contains

https://github.com/AdoptOpenJDK/openjdk-jdk8u/blob/64cc2bc73816f3e0a91776805482fee37fa37bbe/jaxp/src/com/sun/xml/internal/stream/XMLInputFactoryImpl.java#L282-L303

resulting in

java.lang.UnsupportedOperationException: Cannot create XMLStreamReader or XMLEventReader from a org.codehaus.stax2.io.Stax2ByteArraySource
	at com.sun.xml.internal.stream.XMLInputFactoryImpl.jaxpSourcetoXMLInputSource(XMLInputFactoryImpl.java:302)
	at com.sun.xml.internal.stream.XMLInputFactoryImpl.createXMLStreamReader(XMLInputFactoryImpl.java:143)
	at com.fasterxml.jackson.dataformat.xml.XmlFactory._createParser(XmlFactory.java:631)
	at com.fasterxml.jackson.dataformat.xml.XmlFactory._createParser(XmlFactory.java:29)
	at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:1124)
	at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3609)
[…]

So clearly com.fasterxml.jackson.dataformat.xml.XmlFactory is expecting com.ctc.wstx.stax.WstxInputFactory and is incompatible with com.sun.xml.internal.stream.XMLInputFactoryImpl. Now most users never get com.sun.xml.internal.stream.XMLInputFactoryImpl, but Jenkins plugins do. Why is this only an issue with Jenkins plugins? Because the way the XMLInputFactory is chosen:

https://github.com/FasterXML/jackson-dataformat-xml/blob/75d9056bf344d2c18192eb128f177ee21a586017/src/main/java/com/fasterxml/jackson/dataformat/xml/XmlFactory.java#L113

The use of XMLInputFactory.newInstance() ends up invoking ServiceLoader through this stack trace:

load:538, ServiceLoader (java.util)
run:347, FactoryFinder$1 (javax.xml.stream)
doPrivileged:-1, AccessController (java.security)
findServiceProvider:341, FactoryFinder (javax.xml.stream)
find:313, FactoryFinder (javax.xml.stream)
find:227, FactoryFinder (javax.xml.stream)
newInstance:154, XMLInputFactory (javax.xml.stream)

which ends up using Thread.currentThread().getContextClassLoader(). Normally this gives Jackson users com.ctc.wstx.stax.WstxInputFactory but this is problematic for Jenkins plugins as described here. The context class loader for the plugin is WebAppClassLoader, which doesn't have access to the woodstox-core dependency of the plugin (assuming Woodstox is removed from Jenkins core as in jenkinsci/jenkins#5559; with the status quo i.e. jenkinsci/jenkins#5604 we end up getting a version of com.ctc.wstx.stax.WstxInputFactory from an older version of Woodstox!).

This is a similar problem as described in JENKINS-58130 (comment):

Investigating this extensively, it would appear that the root of this exception is a fundamental problem in Jenkins or JAX-WS. Briefly, JAX-WS has a ServiceLoaderUtil class shared among the SAAJ, JAXB, and JAXWS libraries to invoke java's ServiceLoader and get implementation classes, but it only ever invokes ServiceLoader.load(Class spiClass), the method that tries to load the class from the current thread's context ClassLoader.

I'm not sure there is a practical solution here. In the case of e.g. jenkinsci/azure-storage-plugin#194 we are just consuming some upstream Microsoft Azure SDK that ends up calling Jackson to parse XML, so we can't change the way we invoke Jackson. And I can't think of any way to patch Jackson to work around this either. Patching javax.xml.stream.FactoryFinder sounds impractical as well.

Given the above, I think we have to accept that core will have to bundle Woodstox if we intend to have plugins that use Jackson to parse XML work with Jenkins' architecture. That is basically the status quo, with an ancient version of Woodstox (3.x). If we must do that, then I think we ought to ship the latest version of Woodstox rather than the ancient Woodstox 3.x that we're shipping today. At least that way what ServiceLoader returns will be something close to what the plugin's Jackson/Woodstox dependency tree is expecting, rather than the frighteningly old version of Woodstox in the current status quo.

@jglick
Copy link
Member

jglick commented Jul 2, 2021

we can't change the way we invoke Jackson

Sure you can. Just set the CCL before using the Azure SDK. jenkinsci/jira-plugin#353 for example

@basil
Copy link
Member Author

basil commented Jul 2, 2021

Of course, but I don't think that meaningfully changes my analysis. Yes every plugin that either directly or transitively calls Jackson to parse XML could wrap such calls in a try/finally like the one you linked to, but how would one ever enumerate all such calls in all plugins? Even usage-in-plugins won't be of much help here, since while it can identify plugins that directly or transitively depend on jackson-databind-xml, it can't print out a report of every call site in Jenkins plugin code that may end up directly or transitively invoking jackson-databind-xml. Moreover, even if it could, there could be hundreds of these call sites (there are a dozen or so plugins that directly or transitively consume jackson-databind-xml) and I doubt it is practical to go and add try/finally blocks to all call sites in all plugins.

Yes we could keep wstx-asl out of Jenkins core and patch the one call site found above in azure-storage, but who knows how many other bugs remain? Do we really want to just wait and see, and patch one plugin at a time as users encounter regressions? That doesn't seem like a great user experience to me.

So I think my conclusion from the above remains:

Given the above, I think we have to accept that core will have to bundle Woodstox if we intend to have plugins that use Jackson to parse XML work with Jenkins' architecture. If we must do that, then I think we ought to ship the latest version of Woodstox rather than the ancient Woodstox 3.x that we're shipping today. At least that way what ServiceLoader returns will be something close to what the plugin's Jackson/Woodstox dependency tree is expecting, rather than the frighteningly old version of Woodstox in the current status quo.

Can you see a flaw in this argument?

@jglick
Copy link
Member

jglick commented Jul 2, 2021

how would one ever enumerate all such calls in all plugins?

Are any plugins other than azure-storage known to be affected?

there are a dozen or so plugins that directly or transitively consume jackson-databind-xml

That does not mean they are using it.

@jglick
Copy link
Member

jglick commented Jul 2, 2021

Or why not just offer a patch to jackson-dataformat-xml? It is fault for assuming that it is being used with a particular contextClassLoader.

@basil
Copy link
Member Author

basil commented Jul 2, 2021

That does not mean they are using it.

Indeed. My point isn't that the risk is high. It's that there is no way to calculate the risk.

Are any plugins other than azure-storage known to be affected?

No, but that doesn't mean other plugins aren't affected. It is a matter of judging risk. If you think the risk is acceptable, then we can keep wstx-asl out of core and keep patching callers as we discover them. Hopefully there won't be too many and users won't be too upset at us. If that changes, we can always re-add wstx-asl to core.

Or why not just offer a patch to jackson-dataformat-xml?

Because as I wrote above:

I can't think of any way to patch Jackson

because Jackson is just using the obvious StAX API, which is where the problem really lies. (And the StAX API doesn't really provide many options to work around its deficiencies in Jackson) (Edit: Tim points out that XMLInputFactory#newFactory(String, ClassLoader) is available for use in Jackson.) And as I also wrote above:

Patching javax.xml.stream.FactoryFinder sounds impractical as well.

@timja
Copy link
Member

timja commented Jul 3, 2021

Are any plugins other than azure-storage known to be affected?

azure-artifact-manager is also broken,

Also:   java.lang.Exception: #block terminated with an error
		at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:99)
		at reactor.core.publisher.Flux.blockLast(Flux.java:2519)
		at com.azure.core.util.paging.ContinuablePagedByIteratorBase.requestPage(ContinuablePagedByIteratorBase.java:94)
		at com.azure.core.util.paging.ContinuablePagedByItemIterable$ContinuablePagedByItemIterator.<init>(ContinuablePagedByItemIterable.java:50)
		at com.azure.core.util.paging.ContinuablePagedByItemIterable.iterator(ContinuablePagedByItemIterable.java:37)
		at com.azure.core.util.paging.ContinuablePagedIterable.iterator(ContinuablePagedIterable.java:106)
		at com.microsoftopentechnologies.windowsazurestorage.service.DownloadFromContainerService.scanBlobs(DownloadFromContainerService.java:68)
		at com.microsoftopentechnologies.windowsazurestorage.service.DownloadFromContainerService.execute(DownloadFromContainerService.java:52)
		at com.microsoft.jenkins.artifactmanager.AzureArtifactManager.unstash(AzureArtifactManager.java:408)
		at org.jenkinsci.plugins.workflow.flow.StashManager.unstash(StashManager.java:154)
		at org.jenkinsci.plugins.workflow.support.steps.stash.UnstashStep$Execution.run(UnstashStep.java:76)
		at org.jenkinsci.plugins.workflow.support.steps.stash.UnstashStep$Execution.run(UnstashStep.java:63)
		at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
		at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
		at java.base/java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:264)
		at java.base/java.util.concurrent.FutureTask.run(FutureTask.java)
		at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
		at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.lang.UnsupportedOperationException: Cannot create XMLStreamReader or XMLEventReader from a org.codehaus.stax2.io.Stax2ByteArraySource
	at java.xml/com.sun.xml.internal.stream.XMLInputFactoryImpl.jaxpSourcetoXMLInputSource(XMLInputFactoryImpl.java:302)
	at java.xml/com.sun.xml.internal.stream.XMLInputFactoryImpl.createXMLStreamReader(XMLInputFactoryImpl.java:143)
	at com.fasterxml.jackson.dataformat.xml.XmlFactory._createParser(XmlFactory.java:631)
	at com.fasterxml.jackson.dataformat.xml.XmlFactory._createParser(XmlFactory.java:29)
	at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:1124)

Possibly others

@timja
Copy link
Member

timja commented Jul 3, 2021

As far as I can see the required API's are in the JDK?

I've submitted a patch to Jackson: FasterXML/jackson-dataformat-xml#480
and raised an issue with the azure SDK: Azure/azure-sdk-for-java#22764.

I can't see a nice way to plug this into the Azure SDK though, it's setup to handle all serialisation there, maybe they can give us a knob to tune though.

@basil
Copy link
Member Author

basil commented Jul 3, 2021

As far as I can see the required API's are in the JDK?

Ah yes, I missed that. XMLInputFactory#newFactory(String, ClassLoader) should work.

@basil
Copy link
Member Author

basil commented Jul 3, 2021

With my latest commit to this PR, the RealJenkinsRule-based test now passes on all versions of Jenkins core (those that bundle Woodstox and those that don't), because the test now explicitly sets the right class loader. @timja and @jglick, are we in agreement that this PR is ready to merge and release (the mechanics of who has permissions and time to do so notwithstanding)?

Copy link
Member

@jglick jglick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine. I am not a maintainer.

@timja
Copy link
Member

timja commented Jul 3, 2021

@jvz and @oleg-nenashev from memory

@jvz jvz merged commit cb832da into jenkinsci:master Jul 3, 2021
@jvz
Copy link
Member

jvz commented Jul 3, 2021

If you need a release soon, ask Oleg. Otherwise, I can handle it in a couple days.

@basil basil deleted the xml branch July 4, 2021 17:22
@oleg-nenashev
Copy link
Member

CC @batmat whose team is currently looking after this plugin at CloudBees.
I am taking a break from the Jenkins community these weeks (TBA announced publicly, but the Jenkins board and officers are informed).

@jglick
Copy link
Member

jglick commented Jul 6, 2021

Also @dwnusbaum is registered in RPU.

@dwnusbaum
Copy link
Member

I filed jenkins-infra/repository-permissions-updater#2014 to give release permissions to the appropriate people.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants