Issue 35680 intermittent 401 on public pages after push publishing may be caused by empty permission list being cached during permission reference rebuild #35684
Claude finished @freddyDOTCMS's task in 1m 55s

PR Review: current state

Scoping the review to the PR as it stands (narrow guard at

Worth a second look

1. Comment overstates the pre-fix behavior
It wasn't permanent.

2. The line-814 caller can still poison the cache

    bitPermissionsList = loadPermissions( permissionable );
    permissionCache.addToPermissionCache( permissionable.getPermissionId(), bitPermissionsList );

3. Test will leak an orphan
After
Either is a few lines and stops slow DB pollution across CI runs.

4. Test asymmetry: no positive assertion
The test asserts the empty case isn't cached but never asserts the non-empty walk-up still is cached. A future change that "fixes" the guard too aggressively (e.g. someone moves the check up and accidentally short-circuits a non-empty result) would pass this test. Add a second case that creates a permissionable whose walk-up returns a real list and asserts

Nits

Not flagging (already settled in thread)

Net

The fix's direction is correct and minimal. The four items above are the ones I'd want addressed before merge: (1) and (3) are the most material (misleading comment, DB pollution), (2) and (4) are insurance against re-regression.
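The test asymmetry in item 4 can be illustrated with a pair of checks, one negative and one positive. This is a minimal sketch in plain Java, not the PR's test: `FakeCache`, `loadAndMaybeCache`, and both check methods are hypothetical stand-ins, and a real test would exercise `PermissionBitFactoryImpl` against the actual permission cache.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class WalkUpCachingChecks {

    // Hypothetical in-memory stand-in for the permission cache.
    static final class FakeCache {
        private final Map<String, List<String>> entries = new HashMap<>();

        void put(String key, List<String> value) { entries.put(key, value); }

        List<String> get(String key) { return entries.get(key); }
    }

    // Assumed shape of the guard: write to the cache only when the walk-up found something.
    static List<String> loadAndMaybeCache(FakeCache cache, String key, List<String> walkUpResult) {
        if (!walkUpResult.isEmpty()) {
            cache.put(key, walkUpResult);
        }
        return walkUpResult;
    }

    // Negative case: an empty walk-up result leaves the cache untouched.
    static void emptyWalkUpIsNotCached() {
        FakeCache cache = new FakeCache();
        loadAndMaybeCache(cache, "orphan", List.<String>of());
        check(cache.get("orphan") == null, "empty result must not be cached");
    }

    // Positive case: a non-empty walk-up result is still cached.
    static void nonEmptyWalkUpIsCached() {
        FakeCache cache = new FakeCache();
        List<String> inherited = List.of("READ");
        loadAndMaybeCache(cache, "page", inherited);
        check(inherited.equals(cache.get("page")), "non-empty result must be cached");
    }

    private static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        emptyWalkUpIsNotCached();
        nonEmptyWalkUpIsCached();
    }
}
```

With only the first check, a guard that never caches anything would still pass; the second check is what catches an over-aggressive guard.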
@claude take into account in your review that in dotCMS every asset resolves through the hierarchy to the System Host, which always carries
Claude finished @freddyDOTCMS's task in 1m 16s

Re-review given the "System Host always has inheritable perms" invariant

You're right: that invariant materially changes the analysis. Conceding where my prior review was off and tightening the parts that still stand.

Revised: Major #1, negative-cache regression

Largely withdrawn. If the walk-up loop in

The one residual question is whether

Still standing: Major #2, line 814 inconsistency

This one isn't resolved by the invariant.

    bitPermissionsList = loadPermissions( permissionable );
    permissionCache.addToPermissionCache( permissionable.getPermissionId(), bitPermissionsList );

If

Given the invariant ("empty walk-up = broken state, never desired"), the cleanest expression is to push the guard into the cache so it can't be reintroduced:

    // PermissionCacheImpl.addToPermissionCache
    protected List<Permission> addToPermissionCache(final String key, List<Permission> permissions) {
        if (permissions == null || permissions.isEmpty()) {
            return Collections.emptyList(); // refuse to poison cache with empty walk-up
        }
        cache.put(primaryGroup + key, permissions, primaryGroup);
        return permissions;
    }

Then delete

Still standing: Major #3, comment + root cause
Still standing minor points (unaffected by the invariant)

Net

With the invariant in mind, the fix's direction is unambiguously correct and the negative-cache concern largely dissolves. The one thing I'd still push to land is moving the guard into
@claude we are really not completely sure what causes the problem, which is why it is better not to change the return value for now
Claude finished @freddyDOTCMS's task in 28s

Acknowledged: keeping the change minimal is the right call

Fair point, and I think it's the right call given the uncertainty. Conceding the suggestion.

Why your reasoning holds up: The proposal to move the guard into

Point 2 is the one we haven't verified. The current PR's scoping (guard only at line 1725, the one site we have evidence is causing the bug) is the conservative move. If the line-814 path turns out to also be implicated, that can be a follow-up PR with its own evidence. Tearing out

Net: the PR as it stands is defensible. It is a narrow fix targeting the observed symptom and leaves the broader cache machinery alone until we understand it better. Approving the scoping decision. The minor test points (cleanup of the orphan
Summary
- Guard the permissionCache.addToPermissionCache() call in PermissionBitFactoryImpl.loadPermissions() to skip caching when the walk-up returns an empty permission list
- Add PermissionBitFactoryImplTest verifying that an empty walk-up result is not written to cache

Context

During Push Publishing, resetPermissionReferences() is called for every existing HTML page in the bundle. This deletes each page's permission_reference row and broadcasts a cluster-wide cache invalidation. When a request arrives before the reference is rebuilt, loadPermissions() falls through to the walk-up (_loadParentPermissions()), which calls page.getParentPermissionable() to traverse the parent hierarchy.

If getParentPermissionable() returns null at that instant (transient DB state, folder cache miss under load), the walk-up loop never executes and returns an empty list. The previous code cached that empty result unconditionally, so every subsequent request for the page was served [] from cache, bypassing the walk-up entirely. The result was a persistent 401 for anonymous users until the next resetPermissionReferences() evicted the entry.

The fix refuses to cache empty walk-up results. If the walk-up fails transiently, the next request retries it rather than being served a stale empty result.
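The retry behavior described above can be sketched in isolation. The following is an illustrative model only, not the dotCMS code: `EmptyResultGuardSketch`, its nested `Loader`, and the string-based permissions are hypothetical names chosen for the example, and the walk-up is reduced to a function that may return an empty list while the parent is transiently unresolvable.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class EmptyResultGuardSketch {

    // Hypothetical stand-in for the permission loader plus its cache.
    static final class Loader {
        private final Map<String, List<String>> cache = new HashMap<>();
        private final Function<String, List<String>> walkUp;
        private int walkUpCalls = 0;

        Loader(Function<String, List<String>> walkUp) {
            this.walkUp = walkUp;
        }

        List<String> loadPermissions(String permissionId) {
            List<String> cached = cache.get(permissionId);
            if (cached != null) {
                return cached; // cache hit: the walk-up is skipped
            }
            walkUpCalls++;
            List<String> resolved = walkUp.apply(permissionId);
            // The guard: only non-empty results are cached, so a transient
            // failure is retried on the next request instead of being pinned.
            if (resolved != null && !resolved.isEmpty()) {
                cache.put(permissionId, resolved);
            }
            return resolved == null ? List.<String>of() : resolved;
        }

        boolean isCached(String permissionId) { return cache.containsKey(permissionId); }

        int walkUpCalls() { return walkUpCalls; }
    }

    public static void main(String[] args) {
        // Simulates the parent being unresolvable for the first request only.
        boolean[] parentResolvable = {false};
        Loader loader = new Loader(
                id -> parentResolvable[0] ? List.of("READ") : List.<String>of());

        System.out.println(loader.loadPermissions("page-1")); // transient failure, nothing cached
        parentResolvable[0] = true;
        System.out.println(loader.loadPermissions("page-1")); // retried, now cached
        System.out.println(loader.loadPermissions("page-1")); // served from cache
        System.out.println("walk-up calls: " + loader.walkUpCalls());
    }
}
```

Without the guard, the first empty result would be stored and every later call would return it from the cache, which is the persistent-401 pattern the PR describes.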
Checklist
This PR fixes: #35680