Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove a few duplicate jars in distributions #6522

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

wetneb
Copy link
Sponsor Member

@wetneb wetneb commented Apr 9, 2024

For #6515.

There are still a lot more duplicates that could be removed, but somehow I can't avoid to ship two copies of butterfly and its dependencies without making the app crash.

At least this helps reduce the size of our distributions somewhat:

  • openrefine-linux-3.9-SNAPSHOT.tar.gz: from 153M to 146M
  • openrefine-mac-3.9-SNAPSHOT.dmg: from 290M to 282M (uncompressed)
  • openrefine-win-with-java-3.9-SNAPSHOT.zip: from 194M to 188M

(sizes are computed with #6513 applied).

The jars that are no longer duplicated are:

localizer-1.31.jar
listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar
jetty-util-10.0.16.jar
jetty-io-10.0.16.jar
jetty-http-10.0.16.jar
guava-33.1.0-jre.jar
failureaccess-1.0.2.jar
checker-qual-3.42.0.jar (from 3 to 2 duplications)

The change feels fairly safe but still I wouldn't include it in 3.8.

For OpenRefine#6515.

There are still a lot more duplicates that could be removed,
but somehow I can't avoid to ship two copies of butterfly and
its dependencies without making the app crash.
Copy link
Member

@tfmorris tfmorris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minimizing on-disk size of kits isn't the only measure to consider here. Declaring Jetty as a provided dependency is fine because that's an intrinsic part of the implementation, but I'm wondering about Guava and Apache Commons Lang3 which seem more like an internal implementation details than things which should be exported. Having them exposed to extensions increases coupling when I think we want to be moving the opposite direction.

@wetneb
Copy link
Sponsor Member Author

wetneb commented Apr 10, 2024

I see where you are coming from with this… I definitely agree those dependencies shouldn't be part of the extension interface. I wouldn't want external extensions to rely on those jars being present in the main tool.
But for bundled extensions, I see little point in shipping those duplicate jars, given that we're synchronizing dependency versions anyway. Do you think this would be setting a bad example for extension developers out there? I didn't consider that.

@tfmorris
Copy link
Member

I think it's better to have consistent rules for everyone. Guava and its dependencies is only 3.3MB, so not a big deal in the grand scheme of things. There also may be straightforward ways for the extensions to not depend on Guava (I haven't looked).

I'm happy to have Jetty & the Wikipedia localizer package be declared as always being exported and documented as part of the extension API.

@tfmorris
Copy link
Member

Actually, looking at the GData extension, it doesn't use Guava at all. It's being pulled in as a transitive dependency of the OpenRefine main module which is tagged as provided, but for some reason Guava is listed as compile. Sorting that out for GData (and the database extension, if the same is true there), might be another solution to this piece of the puzzle.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants