Remove classpath scanning to improve JerseyService startup speed #10005
This reduces the startup time of the JerseyService by 83% on my development machine when running the production artifacts. (3.5s vs 20.5s) Since scanning package paths for resources is pretty slow in Jersey, this replaces the usage of ResourceConfig#packages with ResourceConfig#registerClasses. We now scan for resource classes by using the classgraph library.
With the server startup now being way faster, we run into a race condition where the full-backend tests run before the index ranges have been calculated. That fails any test that searches for messages.
- We now wait for the ES container to be started before starting the Graylog node
- We now wait until the index ranges for the default index have been calculated before declaring the Graylog node as started
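The second wait above amounts to polling a readiness condition with a timeout. A minimal sketch of that idea, using only the JDK, could look like the following. The names `StartupWait`, `waitFor`, and `indexRangesCalculated` are made up for illustration and are not the helpers used in the full-backend tests; the condition here is simulated with a timer instead of a real index range lookup.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of the Graylog test framework.
final class StartupWait {
    private StartupWait() {}

    /**
     * Polls the condition until it returns true or the timeout elapses.
     * Returns true if the condition was met in time, false on timeout or interrupt.
     */
    static boolean waitFor(BooleanSupplier condition, Duration timeout, Duration interval) {
        final long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Assumption: stands in for "index ranges for the default index exist".
        final long readyAt = System.nanoTime() + Duration.ofMillis(200).toNanos();
        BooleanSupplier indexRangesCalculated = () -> System.nanoTime() - readyAt >= 0;

        boolean started = waitFor(indexRangesCalculated, Duration.ofSeconds(5), Duration.ofMillis(50));
        System.out.println(started ? "node ready" : "timed out");
    }
}
```

Only declaring the node as started once the condition holds keeps the tests from depending on how long the server happens to take to boot.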
@dennisoelkers @mpfz0r @thll The server startup is now so fast that the full-backend tests failed due to a race condition. 😂 I pushed a fix for that. (see: 832cfd6)
Code looks good and I didn't find any issues when trying it out 👍 I would approve the PR, but as we've started a discussion about abandoning the scanning and binding resource classes explicitly instead, I'll leave it be for now.
This avoids the class scanning overhead completely and also allows us to disable resources via feature flags or binding profiles in the future.
@dennisoelkers @mpfz0r @thll I updated the PR to replace package scanning with explicit resource bindings.
This needs more work. The API browser is broken with this because it's also doing package scanning.
This makes the API documentation browser work with the explicitly bound resource classes we now use.
@dennisoelkers @mpfz0r @thll Okay, this should be ready for review again. I adjusted the API documentation browser to use the list of resource classes instead of doing package scanning.
 *
 * @param restResourceClass the resource to add
 */
protected void addSystemRestResource(Class<?> restResourceClass) {
Would it make sense to extend Graylog2Module with something like CoreModule that PluginModule does not extend from and put addSystemRestResource in there? I am trying to avoid plugin authors making use of this method just because it's available and e.g. suggested in the IDE.
Yeah, that's a good point.
PluginModule currently extends Graylog2Module and lots of modules in the server extend Graylog2Module. We also have several modules in the server which extend PluginModule.
So we would need to introduce a CoreModule that extends Graylog2Module and then switch all server modules, those that extend Graylog2Module and those that extend PluginModule, over to CoreModule to get access to the API resource binding helper. That means we probably have to move some helpers from PluginModule up to Graylog2Module, depending on what the server modules that currently extend PluginModule need.
Correct?
Sounds like a separate PR. 😄
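To make the proposed hierarchy concrete, here is a rough sketch of how it could be laid out. It uses plain JDK classes as simplified stand-ins: the real Graylog2Module and PluginModule are Guice modules, CoreModule does not exist yet, and `ClusterStatsResource` and `ServerBindings` are hypothetical names invented for this example.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Simplified stand-ins that only model the inheritance, not the Guice wiring.
final class ModuleHierarchyDemo {

    /** Shared base: helpers that both server and plugin modules may use. */
    static abstract class Graylog2Module {
        protected abstract void configure();
    }

    /** Plugin-facing base: deliberately has no addSystemRestResource(). */
    static abstract class PluginModule extends Graylog2Module {
    }

    /** Proposed server-only base that carries the resource binding helper. */
    static abstract class CoreModule extends Graylog2Module {
        private final Set<Class<?>> systemRestResources = new LinkedHashSet<>();

        protected void addSystemRestResource(Class<?> restResourceClass) {
            systemRestResources.add(restResourceClass);
        }

        Set<Class<?>> systemRestResources() {
            return Collections.unmodifiableSet(systemRestResources);
        }
    }

    /** Hypothetical REST resource, for illustration only. */
    static final class ClusterStatsResource {
    }

    /** Hypothetical server module that registers its resources explicitly. */
    static final class ServerBindings extends CoreModule {
        @Override
        protected void configure() {
            addSystemRestResource(ClusterStatsResource.class);
        }
    }

    public static void main(String[] args) {
        ServerBindings module = new ServerBindings();
        module.configure();
        System.out.println(module.systemRestResources());
    }
}
```

Because PluginModule and CoreModule are siblings in this layout, a class extending PluginModule cannot call addSystemRestResource(), so the IDE would not suggest it to plugin authors.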
Alright, I am fine with doing that in a separate PR.
This reduces the startup time of the JerseyService by 83% on my development machine when running the production artifacts. (3.5s vs 20.5s)
Since scanning package paths for resources is pretty slow in Jersey, we now register all REST resources explicitly via addSystemRestResource() in the Guice bindings. This will also allow us to disable resources via feature flags and runtime profiles in the future.
I compared the output of the PrintModelProcessor with both implementations and we get the same resources registered with the new one. (tested in a production artifact deployment including enterprise plugins)
In addition to the resource changes I also had to adjust the full-backend tests to get rid of a race condition that was masked by the slow server startup before.
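As an illustration of why an explicit list makes it possible to disable resources later, the sketch below collects resource classes in a set and filters them before they would be handed to the HTTP layer. It is a JDK-only approximation under stated assumptions: the PR uses Guice bindings rather than this registry class, and `ExplicitResourceRegistry`, `enabledResources`, `StreamsResource`, and `ExperimentalResource` are hypothetical names.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical registry; stands in for the Guice-based resource bindings.
final class ExplicitResourceRegistry {
    /** Hypothetical resource classes, for illustration only. */
    static final class StreamsResource {
    }

    static final class ExperimentalResource {
    }

    // Insertion-ordered and duplicate-free, so registering a class twice is harmless.
    private final Set<Class<?>> resources = new LinkedHashSet<>();

    void addSystemRestResource(Class<?> restResourceClass) {
        resources.add(restResourceClass);
    }

    /** Returns the registered classes accepted by the given feature-flag check. */
    Set<Class<?>> enabledResources(Predicate<Class<?>> isEnabled) {
        return resources.stream()
                .filter(isEnabled)
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args) {
        ExplicitResourceRegistry registry = new ExplicitResourceRegistry();
        registry.addSystemRestResource(StreamsResource.class);
        registry.addSystemRestResource(ExperimentalResource.class);

        // Assumption: a feature flag that turns off one resource class.
        Set<Class<?>> disabled = Set.of(ExperimentalResource.class);
        Set<Class<?>> enabled = registry.enabledResources(c -> !disabled.contains(c));

        System.out.println(enabled.size() + " of 2 resources enabled");
    }
}
```

In a Jersey application the filtered set could then be passed to ResourceConfig#registerClasses, so no package scanning is needed at startup.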