Remove no-jdk distributions for 8.0 release #65109

Closed
4 tasks done
mark-vieira opened this issue Nov 17, 2020 · 17 comments · Fixed by #76896
Labels
:Delivery/Packaging RPM and deb packaging, tar and zip archives, shell and batch scripts Team:Delivery Meta label for Delivery team

Comments

@mark-vieira
Contributor

mark-vieira commented Nov 17, 2020

When we started bundling a JDK into Elasticsearch with 7.0 we also introduced a "no-jdk" distribution for folks that had no intention of using the included JDK. In hindsight this added complexity and confusion around our distribution variants for little benefit. We only provide no-jdk variants of our archive distributions; for packages (rpm/deb) and Docker we only publish artifacts with an included JDK. For the release of 8.0 we want to unify this and remove the no-jdk variant altogether so that all Elasticsearch distributions will include a bundled JDK.

There are a few items to consider:

  • Remove generation of no-jdk artifacts from the build.
  • Remove publishing of no-jdk artifacts from release manager.
  • Update download webpage to no longer link to no-jdk distributions and remove no-jdk page.
  • Remove any documentation specifically related to no-jdk distribution (if any).
@mark-vieira mark-vieira added the :Delivery/Packaging RPM and deb packaging, tar and zip archives, shell and batch scripts label Nov 17, 2020
@elasticmachine elasticmachine added the Team:Delivery Meta label for Delivery team label Nov 17, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-delivery (Team:Delivery)

@ari

ari commented Dec 14, 2020

How will this work for third party packaging like FreeBSD?

@mark-vieira
Contributor Author

@ari I'm not sure, do you mind elaborating on this? I presume many *nix distributions build their own packages for Elasticsearch. If they currently do so from the no-jdk distributions they would simply have to adapt to use the ones with a bundled JDK and rip it out if they don't need it.

@ari

ari commented Dec 17, 2020

@mark-vieira Sure. I use FreeBSD and the packages there have suddenly started displaying messages on startup that elasticsearch will soon be no longer supported unless it is run with your custom bundled JDK.

I can't guess at your reasons to make the artifacts non-portable (one of the key benefits of the JVM), but you appear to have encoded these warnings inside the application rather than just in the bundling. You'll require users to download several hundred MB of JVM even if they must then discard that for their own environment.

Surely there is a distinction in your process between releasing elasticsearch itself and then subsequently creating whatever bundles you think will be popular with your users?

@mark-vieira
Contributor Author

@ari I presume FreeBSD is simply building their package based on the existing distribution which does not include a JDK. The only difference going forward will be that they will need to update their process to use the "default" archive, which includes the JDK bundled inside. They are then free to discard it as part of their packaging process.

Right now the default archives (which include a JDK) are around 150MB larger and we have plans to investigate reducing this delta even further to some level which would be negligible.

The intent here is to make deploying Elasticsearch as frictionless as possible. Packaging it with an existing compatible and tested JDK aids users in that goal, and we encourage folks to use the included JDK. We later introduced a "no-jdk" variant which in hindsight we regretted, as it introduced significant confusion for users and complexity around releasing.

Nothing will change with behaviour here. There is still no requirement to use the bundled JDK. You are free to delete it after download, and third-party vendors like FreeBSD will likely do so when building their own packages.
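
For packagers who want to drop the bundled JDK at extraction time rather than delete it afterwards, a minimal Python sketch follows. It assumes the archive has a single top-level directory with the JDK in a child directory named `jdk` (the folder name matches the install path quoted later in this thread, but the archive layout itself is an assumption). The function names are hypothetical and not part of any Elasticsearch tooling.

```python
import tarfile
from pathlib import PurePosixPath


def is_bundled_jdk_member(name: str, jdk_dir: str = "jdk") -> bool:
    """Return True if an archive member lives under <top-level>/<jdk_dir>/.

    Assumes one top-level directory with the JDK directly beneath it.
    """
    parts = PurePosixPath(name).parts
    return len(parts) >= 2 and parts[1] == jdk_dir


def extract_without_jdk(archive_path: str, dest: str) -> list:
    """Extract every member except the bundled JDK; return the extracted names."""
    with tarfile.open(archive_path) as tar:
        wanted = [m for m in tar.getmembers() if not is_bundled_jdk_member(m.name)]
        tar.extractall(dest, members=wanted)
        return [m.name for m in wanted]
```

This still downloads the full archive, so it addresses the on-disk footprint only, not the download size discussed above.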

@ari

ari commented Dec 18, 2020

FreeBSD does not bundle separate distributions, rather they are just a script wrapper around your artifacts. So you'll be requiring every user to download a JDK which will not run. Plus your startup messaging makes it sound like elasticsearch will ONLY work with an embedded JDK.

I'm not sure what release complexity it added, but if possible please consider keeping the bare artifacts without JDK as an option even if you bury that on your website to avoid confusing users who don't know what Java is.

@mark-vieira
Contributor Author

FreeBSD does not bundle separate distributions, rather they are just a script wrapper around your artifacts. So you'll be requiring every user to download a JDK which will not run.

Understood, as mentioned, there are things we can do to reduce that size. Also, is there a reason why the bundled JDK isn't, or couldn't simply be used?

your startup messaging makes it sound like elasticsearch will ONLY work with an embedded JDK

The intention is only to indicate that the currently used distribution will no longer be available with the 8.0 release. We can potentially improve this message to clarify that. Additionally, in Elasticsearch 8.0 this message will never appear, since all distributions will include an embedded JDK.

I'm not sure what release complexity it added, but if possible please consider keeping the bare artifacts without JDK as an option

Every variant of Elasticsearch we support incurs a build and testing cost. Eliminating the no-jdk variant effectively cuts the number of artifacts we need to test and support in half, which we see as a justifiable cost given the only side effect is a larger distributable, and we intend to mitigate that at some point as mentioned before.

At this point we should be focusing the discussion on any problematic side effects that might arise from this change. So far, it seems the main concern is that the artifact size will increase. Is that correct, or do you anticipate that using the distribution which includes a bundled JDK poses other problems aside from just including superfluous content for those users supplying their own JRE?

@ari

ari commented Dec 21, 2020

Also, is there a reason why the bundled JDK isn't, or couldn't simply be used?

Why? A Windows or Linux JDK binary will not run on BSD.

So far, it seems the main concern is that the artifact size will increase. Is that correct, or do you anticipate that using the distribution which includes a bundled JDK poses other problems aside from just including superfluous content for those users supplying their own JRE?

  1. It adds significant confusion to running the application when the bundled binaries will not actually work on some platforms.
  2. It seems like your application is no longer portable. Maybe you don't care and an ecosystem of just linux, windows and osx is all you care about. But I feel like we are in 1985 when vendors would say "stop running toy operating systems and just use Windows"
  3. Your team may then decide that you could get performance or other benefit from customising the JDK or environment which means it really would not run on a generic JVM and you'll reject any patch which fixes that compatibility as no longer supported.

@mark-vieira
Contributor Author

mark-vieira commented Dec 21, 2020

Why? A Windows or Linux JDK binary will not run on BSD.

🤦 of course.

It seems like your application is no longer portable.

There is no guarantee that Elasticsearch is portable. If it runs on unsupported platforms that have a JVM implementation, great, but there is no explicit support for this. In addition to the JVM, Elasticsearch contains other native code, such as our machine learning engine, which is built per-platform.

Maybe you don't care and an ecosystem of just linux, windows and osx is all you care about.

It's not a matter of "caring" about those platforms. They are simply not officially supported as we have no test coverage for them. If you are running Elasticsearch on FreeBSD you are accepting the risk that there might be unknown issues with that configuration.

Your team may then decide that you could get performance or other benefit from customising the JDK or environment which means it really would not run on a generic JVM and you'll reject any patch which fixes that compatibility as no longer supported.

We have no intention of this being the case. Bringing your own JVM will continue to work and we have a full compatibility matrix with various JVMs specifically for this purpose.

Shipping a "generic" distribution without a JDK in the fashion you recommend brings along implications that Elasticsearch will run just fine on any system for which a JVM exists and that simply is not the case. We don't want to give folks the wrong impression regarding platform support here.

@NerdSec

NerdSec commented Jun 21, 2021

I stumbled on this while searching for how to remove the embedded Java in Elasticsearch. The embedded Java causes issues in vulnerability assessments.
Should there be an installation flag or environment variable that disables installation of the embedded JDK? I understand the intention of it being easy to boot and use, but it would make sense to have the option to not use the embedded JDK or not have it installed at all. Reduce the attack surface if we can.
This will allow an installed and maintained version of Java to be used without causing security issues in the long run.

@bytebilly
Contributor

Hi @NerdSec thanks for the feedback. I'd like to get more details about your concern if possible.

Which kind of security issue does the bundled JDK cause? Generally it is always up to date with the latest release, could you point us to some example of vulnerability findings and scanning tools that we could reproduce?

Are the findings still a concern even if the bundled JDK is not used by Elasticsearch? You can use an external JDK to run Elasticsearch even if the bundled one is in place, so it should not have any impact. It is just as easy as setting the ES_JAVA_HOME environment variable before running Elasticsearch. Since there is no suid binary, what is the threat of having it on disk? Would deleting the bundled JDK folder be a solution?
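
As a rough illustration of setting ES_JAVA_HOME before launch, here is a minimal Python sketch. Only the variable name comes from this thread. The helper names are hypothetical, and the `bin/elasticsearch` launcher path under the install directory is an assumption.

```python
import os
import subprocess
from typing import Mapping, Optional


def env_with_external_jdk(java_home: str,
                          base_env: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment with ES_JAVA_HOME set to java_home."""
    env = dict(os.environ if base_env is None else base_env)
    env["ES_JAVA_HOME"] = java_home
    return env


def start_elasticsearch(es_home: str, java_home: str) -> subprocess.Popen:
    """Start Elasticsearch against an external JDK.

    Assumes the launcher lives at <es_home>/bin/elasticsearch.
    """
    launcher = os.path.join(es_home, "bin", "elasticsearch")
    return subprocess.Popen([launcher], env=env_with_external_jdk(java_home))
```

Copying the environment instead of mutating `os.environ` keeps the override scoped to the Elasticsearch process.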

Thanks.

@NerdSec

NerdSec commented Jun 22, 2021

Yes, it is up to date for the version of Elasticsearch that it ships with, but the problem we face is that frequent updates of Elasticsearch are not possible. So, over time, the bundled version of Java does show some vulnerabilities.

I agree that setting the ES_JAVA_HOME variable resolves this by forcing Elasticsearch to use a managed version of the JDK; in such scenarios the Java binary simply resides on disk and is not used by any process. The risk is minimal, but it is still a risk if an attacker is aware of the ES version and the corresponding JDK shipped with it. In such a scenario, the attacker can hypothetically pivot by leveraging another process. This is highly unlikely and depends on the attacker leveraging an LFI or something similar to run something on the ES nodes. Removing the JDK is obviously an option, but I was not sure if that may cause any issues with the ES setup.

Also, for every VA cycle we are stuck with exception management! 😄

Here's a reference finding for 7.10.

Plugin Name: Oracle Java SE 1.7.0_291 / 1.8.0_281 / 1.11.0_10 / 1.15.0_2 Information Disclosure (Jan 2021 CPU)
Family: Misc.
Severity: Medium
Plugin Text: Plugin Output: The following vulnerable instance of Java is installed on the remote host :
  Path              : /usr/share/elasticsearch/jdk
  Installed version : 1.15.0_1
  Fixed version     : 1.7.0_291 / 1.8.0_281 / 1.11.0_10 / 1.15.0_2
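
To make the finding concrete, the sketch below shows one way such a version check might be reproduced in Python. The version strings are taken from the finding above. The comparison rule (match the fixed version on the same release line, then compare numerically) is an assumption for illustration and not a description of how the scanner works.

```python
import re


def parse_java_version(text: str) -> tuple:
    """Turn a string like '1.15.0_1' into (1, 15, 0, 1) for comparison."""
    return tuple(int(p) for p in re.split(r"[._]", text.strip()))


def is_flagged(installed: str, fixed_versions: str) -> bool:
    """True if installed is older than the fixed version on its release line.

    fixed_versions is the '/'-separated list from the finding. Matching on
    the first two components (e.g. 1.15) is an assumption.
    """
    current = parse_java_version(installed)
    for candidate in fixed_versions.split("/"):
        fixed = parse_java_version(candidate)
        if fixed[:2] == current[:2]:
            return current < fixed
    return False
```

With the values from the finding, `is_flagged("1.15.0_1", "1.7.0_291 / 1.8.0_281 / 1.11.0_10 / 1.15.0_2")` evaluates to true because 1.15.0_1 precedes 1.15.0_2.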

@bytebilly
Contributor

If Elasticsearch runs on an external JVM, you can safely remove the bundled JDK folder.
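
A cautious way to script that removal is sketched below in Python. It deletes the `jdk` folder only when ES_JAVA_HOME is set and points somewhere outside it. The function name is hypothetical, and the `jdk` folder location under the install directory is assumed from the path in the scan finding above.

```python
import os
import shutil
from typing import Mapping


def remove_bundled_jdk(es_home: str, env: Mapping[str, str]) -> bool:
    """Delete <es_home>/jdk only if ES_JAVA_HOME points somewhere else.

    Returns True if the folder was removed, False if nothing was done.
    """
    bundled = os.path.realpath(os.path.join(es_home, "jdk"))
    external = env.get("ES_JAVA_HOME")
    if not external:
        return False  # no external JDK configured, keep the bundled one
    external = os.path.realpath(external)
    # Refuse if the "external" JDK is the bundled one or lives inside it.
    if external == bundled or external.startswith(bundled + os.sep):
        return False
    if not os.path.isdir(bundled):
        return False
    shutil.rmtree(bundled)
    return True
```

A package upgrade may put the folder back, so a check like this would likely need to run again after each upgrade.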

I see how this issue could turn into a compliance problem rather than a real risk, but I strongly suggest keeping Elasticsearch up to date, as it may also have security flaws that have been fixed in later releases.
We are working to ensure that upgrading to newer versions is even more consistent than it is today, so hopefully in the future this will not be a blocker anymore.

Could you please share which is the tool/setup you use for vulnerability checking? Thanks!

@NerdSec

NerdSec commented Jun 22, 2021

I agree. We upgrade ES when there is a serious CVE identified on the release we run, or every 3-4 months. But the upgrade on VMs is painful. ECK can solve the problem in the long run, but running on K8s has its limitations around the storage drivers.

I used Tenable to do the scan. You could even use Nessus OSS to get the same results.

@mark-vieira
Contributor Author

Update download webpage to no longer link to no-jdk distributions and remove no-jdk page.

@bytebilly this bit is being taken care of, right?

@bytebilly
Contributor

Yes, changes to the download page have been staged and will go live soon with 8.0.0 alpha2.

@pauldraper

pauldraper commented Jan 30, 2022

This....is truly unfortunate.

You have a 500MB download/install size, half of which is just the JDK.

:/
