Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JCRVLT-517 fix package cache init in OSGi containers #136

Merged
merged 5 commits into from May 5, 2021

Conversation

kwin
Copy link
Member

@kwin kwin commented Apr 28, 2021

get rid of lazy-loading of package cache in packages()
improve test coverage
fix another issue in FSInstallState not correctly persisting excludes

get rid of lazy-loading of package cache in packages()
improve test coverage
fix another issue in FSInstallState not correctly persisting excludes
@kwin kwin force-pushed the bugfix/JCRVLT-517-init-package-cache-in-osgi branch from 717e0ea to 0364b48 Compare April 28, 2021 19:38
Copy link
Contributor

@tripodsan tripodsan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kwin I'm not comfortable with this change, I think we added the lazy loading for a reason. so that bundle activation is fast.....

@kwin
Copy link
Member Author

kwin commented Apr 29, 2021

Any reason why lazy loading was not used in the other constructor?

@tripodsan
Copy link
Contributor

Any reason why lazy loading was not used in the other constructor?

which other constructor?

@kwin
Copy link
Member Author

kwin commented Apr 29, 2021

which other constructor?

Also lazy loading is not at all implemented correctly. It needs to be done IMHO on almost all operations (not only on packages and contains but also all register method which should throw exceptions in case of duplicate package ids). Lazy loading also requires proper synchronization, as multiple methods may be called from different threads. Also bundle activation is not deferred by long-running component activators.

Why do you think that long-running activate() methods in DS are a problem?

Copy link
Contributor

@tripodsan tripodsan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok then :-)

@stoerr
Copy link
Contributor

stoerr commented Apr 29, 2021

I did consider getting rid of lazy loading completely when doing #135 , but I was afraid the original author might have had a good reason to do it. The problem is that the method loadPackages() is a quite expensive method, as it accesses the filesystem and opens all files to collect metadata. If the filesystem is a network filesystem (which might even make sense for distributing packages across instances), then this can even block indefinitely. That's fine in tests (at least within Jackrabbit Vault the FSPackageRegistryIT and your new test seem to be the only places the additional constructors are used), but that might not be OK during OSGI activation. Could that block your bundles from coming up? And the registry might even not be used at all later...

You could consider encapsulating stateCache to switch it completely to lazy loading to be consistent (in the original code getInstallState / setInstallState methods also didn't call the loadPackages()), but that does change the behavior of the public constructors somewhat, so it might theoretically break e.g. a test someone did in a project using Vault, though that's admittedly a strange case. Not sure whether it's worth the risk.

@@ -185,6 +182,7 @@ public void activate(BundleContext context, Config config) {
log.info("Jackrabbit Filevault FS Package Registry initialized with home location {}", this.homeDir.getPath());
this.scope = InstallationScope.valueOf(config.scope());
this.securityConfig = new AbstractPackageRegistry.SecurityConfig(config.authIdsForHookExecution(), config.authIdsForRootInstallation());
loadPackageCache();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think doing file system accesses within the OSGI activation is dangerous - please see my comment #136 (comment) .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for not loading the cache here is because package operations are no frequent use case while the activate happens at any start of the system. This was designed for a Cloud-Service scenario with frequent starts & stops. Any expensive operation should be cut out of the startup procedure. The non-osgi initialization on the other hand is only required for initialization of an 'immutable' state of this is registry which would then only be read at system runtime (reused state) to validate dependencies for the mutable part being handled by the jcr registry. In case we have new scenarios we should make sure to not sacrifice those performance aspects.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarification - installation of packages happens with an osgi startup at the moment as other mechanisms may write to repositorylike repo init but the accessible containers would only initialize the cache when there is package installation activity which is a rare scenario and is not supposed to delay startup

Copy link
Member Author

@kwin kwin Apr 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loading the package cache is crucial before running ExecutionPlanBuilder (look at https://github.com/apache/sling-org-apache-sling-jcr-packageinit/pull/7/files). But I will come up with a fix which prevents reading the state from FileSystem in case the FSPackageRegistry is just referenced but no methods are called (as it is the case sometimes in https://github.com/apache/sling-org-apache-sling-jcr-packageinit/blob/master/src/main/java/org/apache/sling/jcr/packageinit/impl/ExecutionPlanRepoInitializer.java


@Override
public int hashCode() {
final int prime = 31;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for fun: I always thought using 31 as a factor is a quite bad choice, as it easily creates conflicts. https://stackoverflow.com/a/2816747/21499 explains why I think 92821 is much better when you are using that pattern. :-)

@kwin
Copy link
Member Author

kwin commented Apr 29, 2021

I just checked again and the default timeout in Felix SCR for activate methods is 5 seconds (https://github.com/apache/felix-dev/blob/3e5671ae7e5107f4f849ef9d5f0a89b1ba9d7439/scr/src/main/java/org/apache/felix/scr/impl/manager/ScrConfiguration.java#L60), there are other alternatives outlined at https://stackoverflow.com/a/57250631.


@Override
public String toString() {
return "FSInstallState [" + (packageId != null ? "packageId=" + packageId + ", " : "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! I like it when you're adding toString() wherever it makes sense. That's often more pleasant in the debugger.

@stoerr
Copy link
Contributor

stoerr commented Apr 29, 2021

While we're at it: there are some log messages in FSPackageRegistry that could log the file / package they apply to to make it easier to find which of the potentially gazillion packages they apply to. I don't find a way to add a comment to these lines here in the PR since there are no changes near those, but I would suggest extending the log messages at lines 471, 581 (also typo in NOTREGISTERD), 684, 695, 704, 727 with some information about the package concerned. Thank you so much!

Copy link
Contributor

@DominikSuess DominikSuess left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't introduce costly synchronous processing in the osgi startup as package activities are rather a rare scenario in contrast to system start-up in containerized Cloud-Service setups.

Copy link
Contributor

@tripodsan tripodsan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given the comments from the others, please remove the package loading from the activation method.

@kwin kwin force-pushed the bugfix/JCRVLT-517-init-package-cache-in-osgi branch from c4df25c to dc5fdd5 Compare May 4, 2021 10:10
reintroduce lazy loading
refactor tests
@kwin kwin force-pushed the bugfix/JCRVLT-517-init-package-cache-in-osgi branch from dc5fdd5 to e16c467 Compare May 4, 2021 10:12
@kwin kwin requested a review from tripodsan May 4, 2021 10:13
@sonarcloud
Copy link

sonarcloud bot commented May 4, 2021

Copy link
Contributor

@tripodsan tripodsan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't really judge the changes... but I like the decoupling of the install state.

@kwin kwin merged commit 3dba5a2 into master May 5, 2021
@kwin kwin deleted the bugfix/JCRVLT-517-init-package-cache-in-osgi branch May 5, 2021 06:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants