Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SLING-7754: implementation of Resource Based Queues #12

Closed
wants to merge 4 commits into from

Conversation

mpetria
Copy link
Contributor

@mpetria mpetria commented Oct 10, 2018

No description provided.

@mpetria mpetria requested a review from tmaret October 10, 2018 14:07

package org.apache.sling.distribution.queue.impl;

public interface DistributionQueueProviderFactory {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, the API (although internal) is meant to abstract various implementations of queue providers, by default the historical Sling Jobs based one and the new queue implementation. The need for the API makes sense, but the API may leak some details from the new queue implementation, the serviceName and requirement to release a cleanup thread.

I suggest to modify the API and implementation a little bit, such that the API fits the factory concept.

The DistributionQueueProviderFactory would only contain the signature

DistributionQueueProvider getProvider(String agentName);

Instead of registering one ResourceQueueCleanupTask per queue, we'd register a single ResourceQueueCleanupTask to cleanup all queues. Registration and unregistration would be done via the ResourceQueueProviderFactory service either by registering/unregistering the cleanupTask service explicitly (as it is done now via the activate/deactivate methods) or by letting the ResourceQueueProviderFactory implements Runnable and schedule it using the whiteboard pattern.

The ResourceQueueCleanupTask would use the QUEUES_ROOT as entry point instead of a agent queue path.

The session performing the cleanup in ResourceQueueCleanupTask would be obtained via

    resolverFactory.getServiceResourceResolver(null);

import java.util.Iterator;
import java.util.Stack;

public class ResourceIterator implements Iterator<Resource> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, this iterator is central to the resource based queue implementation and has to I. reverse the natural order for sling:OrderedFolder which puts the head items/folders at the "end" of folder and II. flatten the date based hierarchy of sling:OrderedFolder. It is also configurable, thus I suggest to add a few tests to cover its behaviour.

buf.append(")");

try {
QueryManager qManager = session.getWorkspace().getQueryManager();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, the query computes the number of leaf nodes which translates directly to the count of entries in a queue.
This query seems to be the only dependency on the JCR API, ideally we could avoid it to only leverage Sling APIs as suggested by @bdelacretaz in SLING-7754.

I could think of the following approaches

A) Issuing the same XPATH query via the {{org.apache.sling.api.resource.ResourceResolver#findResources}} API and iterate over the returned Iterator in order to get the count.

B) Implement a recursive descent similar to the ResourceIterator (potentially optimised since leaf order does not matter).

Both A) and B) imply iterating over the entire queue to find the count. Iterating over the entire queue via Sling APIs will be slower than via the JCR NodeIterator#getSize API as the implementation of the latter is optimised.

Assuming A) or B) are indeed a performance issue, we could think of an ad-hoc optimisation on our side, by caching the count in the repository. The cached count could be evicted after a certain amount of time, thus yielding A) or B) on the next demand for the count value. It would certainly be more complex and ad-hoc.

Instead of going with an ad-hoc implementation, The Sling query API could be extended to allow returning an optimised result set count.

@tmaret
Copy link
Member

tmaret commented Nov 26, 2019

Closing this PR since an resource based queues have been added in 6a3670c`

@tmaret tmaret closed this Nov 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants