Skip to content

Split out EntitlementsCache #127774

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed

Conversation

prdoyle
Copy link
Contributor

@prdoyle prdoyle commented May 6, 2025

Splits out the caching of entitlements information out of PolicyManager into a separate EntitlementsCache component. This allows us to have different behaviour for production and unit tests:

  • Production can cache per module, while tests can cache per class because there are no modules in tests
  • Tests can clear the cache (as demonstrated in this PR) whenever appropriate, while the production one is never cleared
  • Tests can have additional "always allowed" rules, on top of the usual ones, that cause isTriviallyAllowed to return true for test harness code like gradle and junit.

Future work

  • Refine the criteria for isAlwaysAllowed. It currently just checks for org.gradle and org.junit packages.
  • Prepopulate the cache, rather than computing it lazily.

@prdoyle prdoyle self-assigned this May 6, 2025
@prdoyle prdoyle requested a review from a team as a code owner May 6, 2025 18:06
@prdoyle prdoyle added >non-issue auto-backport Automatically create backport pull requests when merged v8.19.0 v9.1.0 :Core/Infra/Entitlements Entitlements infrastructure v8.18.2 v9.0.2 labels May 6, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Core/Infra Meta label for core/infra team label May 6, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple quick thoughts

@@ -843,6 +803,10 @@ private static boolean isTriviallyAllowed(Class<?> requestingClass) {
generalLogger.debug("Entitlement trivially allowed from system module [{}]", requestingClass.getModule().getName());
return true;
}
if (moduleEntitlementsCache.isAlwaysAllowed(requestingClass)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was to move the logic in isTriviallyAllowed, so that the test infra can add "is this a test class". Otherwise this additional condition is only ever exercised in tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. A branch that's dead in production feels wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little awkward because isTriviallyAllowed checks SYSTEM_LAYER_MODULES. PolicyManager seems like a good home for logic that's the same in production and unit tests.

Let me try ending it with an unconditional call to moduleEntitlementsCache and see if you like how that looks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't SYSTEM_LAYER_MODULES be defined inside this lookup?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could I guess. That seemed like part of PolicyManager's responsibility, along with similar things like DEFAULT_FILESYSTEM_CLASS and MODULES_EXCLUDED_FROM_SYSTEM_MODULES, but yeah, it could go elsewhere.

/**
* Quickly provides entitlement info for a given class. Performance-critical.
*/
public interface EntitlementsCache {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we've described this as a cache, but it's not really. We happen to lazily populate it, but it's a system of record: what are the entitlements available for a given class. The fact we lazily populate it is an implementation detail. Can we make this more of a lookup? So rather than "computeIfAbsent", just lookup for a given class. How that get's populated is an implementation detail, in prod we will use a lazily populated map, in tests a map that we clear between tests (maybe).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that would be nice. I initially avoided the name "cache" but I think we still want a lazily-initialized map of modules that have default entitlements, since the returned ModuleEntitlements object is module-specific. So I went with "cache" after all.

Definitely open to improvements here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The picture I'm getting here is that you want this new swappable class to do two things:

  • IsTriviallyAllowed
  • getEntitlements

The testing and production versions of this can share code to whatever extent that makes sense.

Is that what you're picturing?

Let me capture some thoughts on this...

In the current scheme, getEntitlements occurs in two steps:

  1. Requesting class -> PolicyScope
  2. PolicyScope -> ModuleEntitlements

#1 is different in production vs testing, and #2 is not, which is why I made an effort to leave #2 inside PolicyManager.

Similarly, isTriviallyAllowed has two parts:

  1. The actual trivially-allowed rules from production
  2. Additional allowances for the test framework

My first try at this refactoring was attempting to make these distinctions as clear as possible, so it was obvious what's the same vs what differs. I thought folks diving into this in the future might want to know when they're in the "mock world" where things are potentially less realistic.

If we simply make isTriviallyAllowed and getEntitlements swappable, then that distinction is not as clear, though the code probably ends up simpler.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO wholly replacing these methods makes the code more clear. Having the implementation split across a pluggable thing and PolicyManager makes it more difficult to know where to look. eg by having a clear place where isTriviallyAllowed is implemented (on this interface) a dev can locate the code to walk through easier. That isn't to say isTriviallyAllowed shouldn't be split, we certainly don't want to duplicate production checks in tests, but eg having the test impl of this interface extend the production one (or even just having the prod class be the base and the test variant extends it) and then calling super would make things more clear, IMO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't mean "make the code more clear" in the abstract. I meant specifically clarifying the distinction between what's the same versus what's different in tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is, as clean as it might appear to swap out these two methods for testing, that does not align with our understanding of what's actually being swapped out. For testing, we do not want wholly novel implementations of these two methods, because they will be mostly the same between prod and test.

The requirements are:

  • Each test should have its own file structure represented in the permissions, and if there are multiple nodes in the test, they should all have their files represented
  • Tests require some additional classes to be trivially allowed
  • Tests need to map Class -> Scope differently due to the lack of modules

The code should look like what I just wrote, or else it's not clearly describing our understanding of the situation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it should be clear what additions/changes the test code is doing relative to production. What you described can be done through inheritance of the production class and calling super, right?

@prdoyle
Copy link
Contributor Author

prdoyle commented May 12, 2025

The team consensus seems to be that we're not going with this approach. Continuing with #128004.

@prdoyle prdoyle closed this May 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged :Core/Infra/Entitlements Entitlements infrastructure >non-issue Team:Core/Infra Meta label for core/infra team v8.18.2 v8.19.0 v9.0.2 v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants