Improve build: add CI, run xalan-test in CI, download jars from Central #2

Closed
wants to merge 7 commits into from

Conversation

vlsi
Contributor

@vlsi vlsi commented Jul 25, 2022

This PR fixes the build configuration and removes many jars from the source repository and from the source distribution.

Here are the files that are still present:

  • tools/stylebook-1.0-b3_xalan-2.jar
  • tools/xalan2jdoc.jar
  • tools/xalan2jtaglet.jar

Issue: https://issues.apache.org/jira/browse/XALANJ-2634
Sample CI log: https://github.com/vlsi/xalan-java/runs/7495309789?check_suite_focus=true#step:4:1

@vlsi vlsi changed the base branch from master to xalan-j_2_7_x July 25, 2022 07:33
@vlsi vlsi changed the base branch from xalan-j_2_7_x to xalan-j_2_7_1_maint July 25, 2022 07:33
@vlsi vlsi force-pushed the fix_build branch 2 times, most recently from 111f0f5 to 1c8dd1b Compare July 25, 2022 08:01
@vlsi vlsi force-pushed the fix_build branch 4 times, most recently from 0c79f4d to 54da5bf Compare July 25, 2022 08:20
@vlsi vlsi changed the title Improve build: add CI, download jars from Central Improve build: add CI, run xalan-test in CI, download jars from Central Jul 25, 2022
Contributor

@mukulga mukulga left a comment


I think we shouldn't make some of the changes suggested for build.xml. These have never been problems with the Xalan-J build.

@carlosame

Hello,

I'm the maintainer of EchoXSL, a recent fork of Apache Xalan-J. I'm happy that the Xalan Project wants to produce an additional release with a fix for CVE-2022-34169, however the way it is done in this PR would be problematic for modular JDKs.

As explained in css4j#2, the vulnerability in fact belongs to Apache BCEL, and Xalan is vulnerable because it bundles an old version of it in the jar (together with java-cup etc.). However, other software also depends on BCEL or java-cup, and in modular projects this would lead to a split-package problem. Moreover, if you just download the vulnerable BCEL jar file and add its packages, Xalan would still be vulnerable.
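
To make the split-package concern concrete, here is a minimal Java sketch that is not taken from Xalan or EchoXSL. It derives package names from jar entry names and reports the packages that two jars both contain. The class `SplitPackageSketch`, its methods, and the sample entry lists are hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative only: finds packages that appear in two different jars. */
final class SplitPackageSketch {
    /** Maps "org/apache/bcel/classfile/JavaClass.class" to "org.apache.bcel.classfile". */
    static Set<String> packagesOf(List<String> entryNames) {
        Set<String> pkgs = new TreeSet<>();
        for (String name : entryNames) {
            int slash = name.lastIndexOf('/');
            if (name.endsWith(".class") && slash > 0) {
                pkgs.add(name.substring(0, slash).replace('/', '.'));
            }
        }
        return pkgs;
    }

    /** Returns the packages present in both entry lists. */
    static Set<String> splitPackages(List<String> jarA, List<String> jarB) {
        Set<String> shared = packagesOf(jarA);
        shared.retainAll(packagesOf(jarB));
        return shared;
    }

    public static void main(String[] args) {
        // Hypothetical entry lists: a jar that bundles BCEL classes, and a BCEL jar.
        List<String> bundlingJar = Arrays.asList(
                "org/apache/xalan/xsltc/compiler/XSLTC.class",
                "org/apache/bcel/classfile/JavaClass.class");
        List<String> bcelJar = Arrays.asList(
                "org/apache/bcel/classfile/JavaClass.class",
                "org/apache/bcel/Constants.class");
        System.out.println(splitPackages(bundlingJar, bcelJar));
    }
}
```

On the module path, a non-empty result would mean that two modules contain the same package, which the Java module system does not allow.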

My suggestion would be to fix the vulnerability in the BCEL project, and then set it up as a dependency in Xalan. This is the approach that EchoXSL followed, and it works fine. The resulting Maven POM should include something like this in the dependencies section:

    <dependency>
      <groupId>org.apache.bcel</groupId>
      <artifactId>bcel</artifactId>
      <version>6.5.1</version>
      <scope>compile</scope>
    </dependency>

Feel free to reuse the Gradle build in EchoXSL if it is of any help.

@vlsi
Contributor Author

vlsi commented Jul 25, 2022

however the way it is done in this PR would be problematic for modular JDKs.

@carlosame , I did not intend to fix CVEs in this PR. I just wanted to add CI so all the further modifications could be tested.

@vlsi
Contributor Author

vlsi commented Jul 25, 2022

I'm the maintainer of EchoXSL

It was tempting to rip out build.xml and replace it with Gradle, yet that sounds like too big a change for fixing a CVE :)

@vlsi
Contributor Author

vlsi commented Jul 25, 2022

@mukulga , would you please clarify which changes you suggest skipping?

@mukulga
Contributor

mukulga commented Jul 25, 2022

@mukulga , would you please clarify which changes you suggest skipping?

Why would the changes described as "Remove bootclasspath customization" in the PR for build.xml be needed? Maybe another XalanJ committer can weigh in on this point as well.

@vlsi
Contributor Author

vlsi commented Jul 25, 2022

What is the need for bootclasspath customization?

As far as I remember, bootclasspath removal was needed when I tried building Xalan with Java 11.
On the other hand, if we use Java 1.8 for the build, then bootclasspath could still be there: https://github.com/vlsi/xalan-java/actions/runs/2732942617 (it is the same change, except that I excluded the "Remove bootclasspath customization" commit)

@carlosame

@carlosame , I did not intend to fix CVEs in this PR. I just wanted to add CI so all the further modifications could be tested.

To "fix" the CVE, all you need to do is to remove the BCEL packages from the jar, and then list BCEL as a dependency. Now the CVE belongs to somebody else.

And to be friendly to modular JDKs, you have to do the same for the rest of the foreign packages that are currently shipped with Xalan. All of this could be done in this PR.

@vlsi
Contributor Author

vlsi commented Jul 25, 2022

All of this could be done in this PR.

Replacing the embedded dependencies with external ones is not a backward-compatible change, so I would refrain from that for the next release.

@vlsi
Contributor Author

vlsi commented Jul 27, 2022

I reverted the bootclasspath-related changes for now since they are not needed for Java 8.

Contributor

@mukulga mukulga left a comment


I'd say that the PR change described as "Remove -static option from JLex call" shouldn't be made. This has not been an issue with XalanJ builds earlier.

It seems that you're suggesting upgrading JLex to version 1.2.6 (which is the latest JLex version, as of 2003). Why should we do this JLex version upgrade for XalanJ when the JLex jar stored in the XalanJ repo is serving the current XalanJ codebase well? I'd suggest that we not upgrade the versions of XalanJ's supporting libraries if XalanJ has no known issues with the library versions it currently uses.

@vlsi
Contributor Author

vlsi commented Jul 27, 2022

I'd say that the PR change described as "Remove -static option from JLex call" shouldn't be made. This has not been an issue with XalanJ builds earlier.

Where did you get JLex.jar then?
The one from https://www.cs.princeton.edu/~appel/modern/java/JLex/ does not support -static option.

@mukulga
Contributor

mukulga commented Jul 27, 2022

Where did you get JLex.jar then? The one from https://www.cs.princeton.edu/~appel/modern/java/JLex/ does not support -static option.

I can see a JLex jar available at https://github.com/apache/xalan-java/tree/xalan-j_2_7_1_maint/tools. I guess we should use this version for the next XalanJ release, unless there's a known JLex issue with the current XalanJ codebase.

@vlsi
Contributor Author

vlsi commented Jul 27, 2022

@mukulga , the ASF policy forbids including binary (compiled, non-text) code in the source artifacts.

https://www.apache.org/legal/release-policy.html#source-packages

Every ASF release MUST contain one or more source packages, which MUST be sufficient for a user to build and test the release provided they have access to the appropriate platform and tools. A source release SHOULD not contain compiled code.
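
One way to check a source archive against this rule could look like the sketch below. It assumes that "compiled code" can be approximated by the `.jar` and `.class` extensions, which is a simplification. The class and method names are hypothetical, and the sample archive is built in memory so the example runs without any files.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Illustrative only: lists entries of a zip that look like compiled code. */
final class SourceZipCheckSketch {
    static List<String> findCompiledEntries(InputStream zipStream) {
        List<String> hits = new ArrayList<>();
        try (ZipInputStream zin = new ZipInputStream(zipStream)) {
            for (ZipEntry e = zin.getNextEntry(); e != null; e = zin.getNextEntry()) {
                String name = e.getName().toLowerCase(Locale.ROOT);
                // Assumption: the extension is a good enough proxy for "compiled".
                if (name.endsWith(".jar") || name.endsWith(".class")) {
                    hits.add(e.getName());
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return hits;
    }

    /** Builds a tiny in-memory zip with one-byte entries under the given names. */
    static byte[] sampleZip(String... names) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(buf)) {
            for (String n : names) {
                zout.putNextEntry(new ZipEntry(n));
                zout.write(0);
                zout.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        byte[] zip = sampleZip("build.xml", "tools/JLex.jar", "src/Foo.java");
        System.out.println(findCompiledEntries(new ByteArrayInputStream(zip)));
    }
}
```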

@vlsi
Contributor Author

vlsi commented Jul 27, 2022

@mukulga , please check https://lists.apache.org/thread/otx07h6vbjrsqd9r9sqpcpjscvjwtmfc

Roy, 2012-03-27: Please point those packages out to me and I will ask Joe to give me root
access again so that I can go through and personally delete them from
our dist directories. Seriously. I am so tired of having to send these
emails, write the documentation, and then watch Java projects to do the
wrong things again and again. It is time for the sledgehammer.

@vlsi
Contributor Author

vlsi commented Jul 27, 2022

In this case I can replace "retrieval of the official JLex" with "download from https://github.com/apache/xalan-java/blob/xalan-j_2_7_1_maint/tools/JLex.jar"; however, that is really moot license-wise.

I think the proper resolution is to embed the JLex sources into the xalan-java source code; however, I'm inclined to think that belongs in another PR.

@vlsi
Contributor Author

vlsi commented Jul 27, 2022

An alternative option is to comment out the call to "jlex" and postpone the decision until xpath.lex needs to be modified.

@vlsi
Contributor Author

vlsi commented Jul 27, 2022

I commented out the call to JLex.jar, so JLex.jar is kept intact in the Git repository; however, it is not included in -src.zip, and it is not used during the build.

@vlsi
Contributor Author

vlsi commented Jul 27, 2022

By the way, I would say XPathLexer violates that binary rule as well.
This code looks pretty much like unreadable compiled code to me:

static private int yy_cmap[] = unpackFromString(1,65538,
"54:9,27:2,54,27:2,54:18,27,17,53,54,15,54:2,55,25,26,1,3,11,4,13,2,56:10,10" +
",54,18,16,19,54,12,44,57:3,46,57:3,51,57:4,48,52,43,57,47,50,45,57:3,49,57:" +
"2,41,54,42,54,58,54,35,38,29,5,21,39,33,36,6,57,20,37,8,28,9,30,57,31,32,23" +
",34,7,40,24,22,57,54,14,54:58,60,54:8,57:23,54,57:31,54,57:58,58:2,57:11,58" +
":2,57:8,58,57:53,58,57:68,58:9,57:36,58:3,57:2,58:4,57:30,58:56,57:89,58:18" +
",57:7,58:62,60:70,54:26,60:2,54:14,58:14,54,58:7,57,58,57:3,58,57,58,57:20," +
"58,57:44,58,57:7,58:3,57,58,57,58,57,58,57,58,57:18,58:13,57:12,58,57:66,58" +
",57:12,58,57:36,58:14,57:53,58:2,57:2,58:2,57:2,58:3,57:28,58:2,57:8,58:2,5" +
"7:2,58:55,57:38,58:2,57,58:7,57:38,58:73,57:27,58:5,57:3,58:46,57:26,58:6,5" +
"7:10,58:21,59:10,58:7,57:71,58:2,57:5,58,57:15,58,57:4,58,57,58:15,57:2,58:" +
"9,59:10,58:523,57:53,58:3,57,58:26,57:10,58:4,59:10,58:21,57:8,58:2,57:2,58" +
":2,57:22,58,57:7,58,57,58:3,57:4,58:34,57:2,58,57:3,58:4,59:10,57:2,58:19,5" +
"7:6,58:4,57:2,58:2,57:22,58,57:7,58,57:2,58,57:2,58,57:2,58:31,57:4,58,57,5" +
"8:7,59:10,58:2,57:3,58:16,57:7,58,57,58,57:3,58,57:22,58,57:7,58,57:2,58,57" +
":5,58:3,57,58:34,57,58:5,59:10,58:21,57:8,58:2,57:2,58:2,57:22,58,57:7,58,5" +
"7:2,58:2,57:4,58:3,57,58:30,57:2,58,57:3,58:4,59:10,58:21,57:6,58:3,57:3,58" +
",57:4,58:3,57:2,58,57,58,57:2,58:3,57:2,58:3,57:3,58:3,57:8,58,57:3,58:45,5" +
"9:9,58:21,57:8,58,57:3,58,57:23,58,57:10,58,57:5,58:38,57:2,58:4,59:10,58:2" +
"1,57:8,58,57:3,58,57:23,58,57:10,58,57:5,58:36,57,58,57:2,58:4,59:10,58:21," +
"57:8,58,57:3,58,57:23,58,57:16,58:38,57:2,58:4,59:10,58:145,57:46,58,57,58," +
"57:2,58:12,57:6,58:10,59:10,58:39,57:2,58,57,58:2,57:2,58,57,58:2,57,58:6,5" +
"7:4,58,57:7,58,57:3,58,57,58,57,58:2,57:2,58,57:2,58,57,58,57:2,58:9,57,58:" +
"2,57:5,58:11,59:10,58:70,59:10,58:22,57:8,58,57:33,58:310,57:38,58:10,57:39" +
",58:9,57,58,57:2,58,57:3,58,57,58,57:2,58,57:5,58:41,57,58,57,58,57,58:11,5" +
"7,58,57,58,57,58:3,57:2,58:3,57,58:5,57:3,58,57,58,57,58,57,58,57,58:3,57:2" +
",58:3,57:2,58,57,58:40,57,58:9,57,58:2,57,58:2,57:2,58:7,57:2,58,57,58,57:7" +
",58:40,57,58:4,57,58:8,57,58:3078,57:156,58:4,57:90,58:6,57:22,58:2,57:6,58" +
":2,57:38,58:2,57:6,58:2,57:8,58,57,58,57,58,57,58,57:31,58:2,57:53,58,57:7," +
"58,57,58:3,57:3,58,57:7,58:3,57:4,58:2,57:6,58:4,57:13,58:5,57:3,58,57:7,58" +
":3,54:12,58:2,54:98,58:182,57,58:3,57:2,58:2,57,58:81,57:3,58:13,54:2672,58" +
":1008,54:17,58:64,57:84,58:12,57:90,58:10,57:40,58:31443,57:11172,58:92,54:" +
"8448,58:1232,54:32,58:526,54:2,0:2")[0];
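
For readers wondering what the literal encodes: it appears to be a run-length-encoded table in which each comma-separated token is either `value` or `value:count`. The sketch below is a simplified, one-dimensional decoder written for illustration under that assumption. It is not the `unpackFromString` method from the generated lexer, and `PackedTableSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative decoder for the "value:count" run-length format seen above. */
final class PackedTableSketch {
    /** Expands e.g. "54:9,27:2,54" into nine 54s, two 27s and one 54. */
    static int[] unpack(String packed) {
        List<Integer> out = new ArrayList<>();
        for (String token : packed.split(",")) {
            int colon = token.indexOf(':');
            // A token without ':' stands for a single occurrence of the value.
            int value = Integer.parseInt(colon < 0 ? token : token.substring(0, colon));
            int count = colon < 0 ? 1 : Integer.parseInt(token.substring(colon + 1));
            for (int i = 0; i < count; i++) {
                out.add(value);
            }
        }
        int[] result = new int[out.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = out.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // The first few tokens of the literal above.
        int[] cmap = unpack("54:9,27:2,54,27:2,54:18,27");
        System.out.println(cmap.length + " entries; code point 9 maps to class " + cmap[9]);
    }
}
```

Under that reading, the first tokens assign tab (9), line feed (10), form feed (12), carriage return (13) and space (32) to the same character class, 27, which is consistent with the table being generator output rather than hand-written source.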

@vlsi
Contributor Author

vlsi commented Jul 27, 2022

I've created https://issues.apache.org/jira/browse/XALANJ-2635 regarding JLex.jar and XPathLexer.java

@jkesselm
Contributor

This seems to be mixing a number of issues -- the JLex question, the more general question of where the dependency jarfiles should live and how they are fetched, and running continuous integration. I'd be happier if we could divide these and address them separately.


XPathLexer appears to be generated from xpath.lex using JLex, according to the build.xml file. See the property ${generated.xpathlexer} and its usage. So we DO have checked-in source for that in the Xalan-Java project.

The problem is that we don't have either source, or a source, for JLex.

Last I checked, it wasn't clear who actually owns/maintains JLex. If anyone. The proper solution would be to find a supported (or at least clearly open-sourced) Java lex implementation compatible with any JLex quirks (and/or to rework the input to work with the new lex) and swap it in. That's a somewhat scary proposal, deserving its own work item.

Note that JLex and XPathLexer are part of the xsltc "compiled xslt processor" code, originally contributed to Apache by Sun Microsystems. I did a lot of the work to reconcile that code with Xalan and glue them together as a single system... but I didn't go very deeply into it at the time, just enough to sew the monster together. A significant portion of Sun's code was accepted unexamined before I even started that process, which is how JLex got brought in. Yes, these days Apache would insist on knowing the source of all the pieces, but things were a bit looser then; as long as Sun took responsibility for the code donation, we trusted that they had either written it themselves or sourced it ethically.

Good luck finding someone at Sun who remembers where they got JLex from, especially since they pretty much vanished from the Xalan project after the integration.

I think we just have to accept this as grandfathered code until/unless someone is willing to tackle either tracking it down (and dealing with any changes since we got our copy that might affect our use of it) or replacing it. Either way, I'd want to see that tested to death before committing to it.

@vlsi
Contributor Author

vlsi commented Jun 13, 2023

I'd be happier if we could divide these and address them separately.

Thank you for the review, however, I truly do not understand if you want an action from me or not.

I believe the commits are self-contained, so feel free to commit all of them or some of them to the main branch.

Last I checked, it wasn't clear who actually owns/maintains JLex

Please check https://issues.apache.org/jira/browse/XALANJ-2635 description. It includes the link to the official maintainer: https://www.cs.princeton.edu/~appel/modern/java/JLex/

I believe JLex is not connected with Sun Microsystems, so "Good luck finding someone at Sun who remembers where they got JLex from" comment does not apply.

@jkesselm
Contributor

jkesselm commented Jun 13, 2023

Re XALANJ-2635: Assuming that is the same JLex we're using, which I haven't verified, the "official maintainer" hasn't touched that page in two decades. I can try pinging them.... Copying the source for this tool locally is theoretically permitted by its license, but since it isn't one of the "standard" opensource licenses we might need an official OK.

I see java_cup.jar is on Maven. If JLex got posted to Maven, or github, would that satisfy your request for provenance? (That would also make it more visible/available for non-Xalan users, of course.)

Lemme take a longer look at this.

@jkesselm
Contributor

OK, I'm a bit confused.

For java_cup.jar, you aren't downloading it from Maven (even though it's listed as a Maven project); you're explicitly downloading from java_cup's page at Technische Universität München.

I'm not following why the same basic solution -- but download and build rather than download and untar -- wouldn't be the right answer for JLex, fetching https://www.cs.princeton.edu/~appel/modern/java/JLex/Archive/1.2.6/Main.java or whichever other specific release is desired.

Yes, there is risk that the JLex page Goes Away at some point. But it appears to be equivalent to the risk for java_cup.

What am I missing?

@vlsi
Contributor Author

vlsi commented Jun 14, 2023

wouldn't be the right answer for JLex, fetching https://www.cs.princeton.edu/~appel/modern/java/JLex/Archive/1.2.6/Main.java or whichever other specific release is desired.

Please read #2 (comment)
Apparently, JLex.jar within xalan-java is modified, so there's no way to fetch a pre-built jar.
I would suggest integrating JLex in a source form, and building it during xalan build, however, it would be too many changes for the current PR which focuses on adding CI.

But it appears to be equivalent to the risk for java_cup.

xalan-java uses java-cup version 11b or something like that.
That version is not available on Central, so the only way to download it is to fetch from the project webpage and/or ask the maintainers to publish on Central.

@jkesselm
Contributor

jkesselm commented Jun 14, 2023 via email

@vlsi
Contributor Author

vlsi commented Jun 14, 2023

Have we tested with the newer java_cup release?

Frankly speaking, I do not find the question relevant to this PR.

Whenever possible, I tried to avoid doing unnecessary modifications, so I selected the versions that were the same or close to the previously used versions.

I believe the existing java_cup was 11b, and I download 11b.
For java_cup, that is the latest version.

What do you want to know by asking "tested with the newer java_cup release"?

There's no "newer java_cup" release.


and it's too messy to apply programmatically to a downloaded copy?

Would you please discuss JLex.jar patching in https://issues.apache.org/jira/browse/XALANJ-2635?
I do not see how the question is related to this PR.

@jkesselm
Contributor

This is resolved in the migration from Ant to Maven, now in progress.

Since JLex wasn't available in Maven Central, I went with JFlex instead, modifying the grammar to perform the lookahead via the regular expressions rather than by digging into the lex system's internal variables. Both necessary and cleaner, and quite possibly more performant though I haven't attempted to test that.
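
The general idea of expressing lookahead in the pattern itself can be illustrated with `java.util.regex`. This is only an analogy: it is not JFlex syntax and not the actual grammar change, which is not shown in this thread. The rule that a name counts as a function name when an opening parenthesis follows is an assumption made for the example, and `LookaheadSketch` is a hypothetical name.

```java
import java.util.regex.Pattern;

/** Illustrative only: lookahead written into the pattern instead of lexer state. */
final class LookaheadSketch {
    // (?=...) checks that "(" follows, optionally after whitespace, without consuming it.
    private static final Pattern FUNCTION_NAME =
            Pattern.compile("[A-Za-z_][\\w.-]*(?=\\s*\\()");

    /** True when the input begins with a name that is directly followed by "(". */
    static boolean startsWithFunctionName(String input) {
        return FUNCTION_NAME.matcher(input).lookingAt();
    }

    public static void main(String[] args) {
        System.out.println(startsWithFunctionName("count(//item)"));
        System.out.println(startsWithFunctionName("child::para"));
    }
}
```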

@jkesselm
Contributor

The download aspects of this will be subsumed under the migration to Maven-based builds.

Running CI looks useful. However, note that we are considering moving xalan-test into the test directories of xalan-java, so its invocation would change.

@jkesselm
Contributor

Status check: This appears to be outdated, and to overlap with other work in progress.

I believe we have already dealt with CI as a separate change.

I am dealing with bootclasspath. This was needed in earlier Javas because Sun insisted on shipping versions of Xerces and Xalan in the standard libraries as org.apache.* (without repackaging them), preventing users from running the new code unless bootclasspath was prefixed or the endorsed-directories mechanism (java.endorsed.dirs) was used (which is essentially a more official version of the same thing). Since the introduction of JAXP and TrAX (circa Java 1.5-1.6), the Java libraries have been changed to ship with the Apache code moved to com.sun.org.apache.*, removing that conflict. There may still be users who have the old workarounds in place, so we should tolerate running in that mode (see recent discussion of Version)... but we don't need to use it ourselves.
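
A quick way to see which implementation JAXP actually picks at run time is to print the class name of the factory it returns. The sketch below is illustrative: `TransformerOriginSketch` and `isJdkInternal` are hypothetical names, and the prefix test simply relies on the `com.sun.org.apache.*` repackaging described above.

```java
import javax.xml.transform.TransformerFactory;

/** Illustrative only: reports whether JAXP resolved to the JDK's repackaged copy. */
final class TransformerOriginSketch {
    /** True when the class name lives under the JDK's repackaged namespace. */
    static boolean isJdkInternal(String className) {
        return className.startsWith("com.sun.org.apache.");
    }

    public static void main(String[] args) {
        // Standard JAXP lookup; the result depends on what is on the classpath.
        String impl = TransformerFactory.newInstance().getClass().getName();
        System.out.println(impl + (isJdkInternal(impl) ? " (JDK built-in)" : " (not the JDK copy)"));
    }
}
```

With a standalone Xalan jar on the classpath and no bootclasspath or endorsed-directory tricks, the printed name would be expected to fall under `org.apache.*` rather than `com.sun.org.apache.*`.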

The Maven build prototype fixes most or all of the binary dependencies by downloading from Central. I expect to merge that soon.

(Best practice is one issue per PR, though sometimes that issue unavoidably subsumes multiple tightly interlinked sub-issues, as is the case with the Maven migration uber-PR. Separation lets us discuss, refine, and approve them individually.)

@jkesselm jkesselm closed this Nov 30, 2023
@jkesselm
Contributor

(Closing. Your more recent CI work is in other PRs, and I think the downloading-dependencies thing is going to be addressed by the Maven cutover before too much longer, so the Ant-based version is, I think, more of a distraction than a help. If you really feel it's needed as a stopgap, please open a PR for that change separately and we can discuss.)
