Added pause/resume sync to ClusterNode #236

Closed
wants to merge 1 commit into from

Conversation

tduffey commented May 6, 2025

This PR adds three methods to ClusterNode:

  1. ClusterNode#pauseSync() acquires the syncLock so that no new synchronization can start. If a synchronization is currently running, or another thread otherwise holds the lock, the call blocks until the lock is free; it throws a ClusterException if the thread is interrupted while waiting.
  2. ClusterNode#pauseSync(long) does the same as method 1 but waits at most the specified number of milliseconds for the lock. It returns true if the lock was acquired, otherwise false.
  3. ClusterNode#resumeSync() releases the lock so that the normal synchronization process can resume.
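
As a rough illustration of the three methods above (not the code in this PR), the pause/resume pattern can be sketched with a binary semaphore. The class name `PausableSync`, the `trySync()` helper, and the use of `java.util.concurrent.Semaphore` and `IllegalStateException` are assumptions made for this sketch; the real `ClusterNode` has its own lock type and throws `ClusterException`.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the sync-related part of ClusterNode. */
class PausableSync {
    // Binary semaphore standing in for syncLock (assumption: not the real lock type).
    // A semaphore may be released by a thread other than the one that acquired it.
    private final Semaphore syncLock = new Semaphore(1);
    private int syncCount;

    /** Blocks until the lock is free, then holds it so no new sync can start. */
    void pauseSync() {
        try {
            syncLock.acquire();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            // The PR describes a ClusterException here; this sketch uses a JDK exception.
            throw new IllegalStateException("Interrupted while pausing sync", e);
        }
    }

    /** Waits at most timeoutMillis for the lock; returns true if it was acquired. */
    boolean pauseSync(long timeoutMillis) {
        try {
            return syncLock.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while pausing sync", e);
        }
    }

    /** Releases the lock. Call only after a successful pauseSync. */
    void resumeSync() {
        syncLock.release();
    }

    /**
     * Simplified sync: runs only if the lock is free and returns false while paused.
     * A real sync would typically block instead; skipping keeps the sketch easy to test.
     */
    boolean trySync() {
        if (!syncLock.tryAcquire()) {
            return false;
        }
        try {
            syncCount++; // placeholder for applying journal updates
            return true;
        } finally {
            syncLock.release();
        }
    }

    int getSyncCount() {
        return syncCount;
    }
}
```

One caveat of this sketch: calling `resumeSync()` without a matching successful `pauseSync` would add a second permit to the semaphore, so callers need to pair the two calls carefully.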

One use case for this is performing a hot backup of a running node. We can pause synchronization to ensure this node does not try to update the index while we perform our backup action. Note that this only handles updates coming into the cluster from other nodes -- you'd still have to make sure the local node isn't performing any of its own content/index updates. It is incomplete in that regard but gets us part of the way there.
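
A minimal sketch of how a backup routine might use the timed variant, assuming the semantics described above. The `SyncControl` interface and the `HotBackup` helper are hypothetical names introduced only for this example; they are not part of Jackrabbit. The point it shows is that `resumeSync()` belongs in a `finally` block so that synchronization resumes even if the backup fails.

```java
/** Hypothetical view of the two methods a backup routine would need. */
interface SyncControl {
    boolean pauseSync(long timeoutMillis);
    void resumeSync();
}

/** Hypothetical helper that runs a backup action while sync is paused. */
final class HotBackup {
    private HotBackup() {
    }

    /**
     * Returns false without running the backup if sync could not be paused in time.
     * Otherwise runs the backup and always resumes sync afterwards.
     */
    static boolean runWithSyncPaused(SyncControl node, long timeoutMillis, Runnable backup) {
        if (!node.pauseSync(timeoutMillis)) {
            return false; // lock not acquired: do not back up, do not resume
        }
        try {
            backup.run();
            return true;
        } finally {
            node.resumeSync(); // runs even if the backup throws
        }
    }
}
```

As the description notes, this only keeps incoming cluster updates away; the caller would still have to prevent local content and index updates for the duration of the backup.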

See https://issues.apache.org/jira/browse/JCR-5142

thomasmueller (Member) commented May 7, 2025

Hi,

Thanks a lot for your contribution! What is missing is:

  • a Jira issue. It should include the motivation for the change.
  • test cases

One important aspect is that Apache Jackrabbit is not currently in active development. Most of the active development currently happens on Apache Jackrabbit Oak, not Jackrabbit. So the obvious question would be: could you use Oak? I know it doesn't have the same set of features, so it might not be an option for you.

tduffey (Author) commented May 7, 2025

Hi @thomasmueller, thanks for your response. We cannot use Oak because we rely on another system that depends on Jackrabbit 2 due to a workspace requirement.

Our motivation is to scale Jackrabbit 2 better through clustering, and that means finding a way to do a hot backup, because the last thing we want to do is take a node offline during a scale-up event.

I'll post a JIRA ticket. We have some dev horsepower and could potentially redirect it (for example, adding workspace capabilities to Oak or backporting the Elastic index to JR2) if given some direction, since there's a lot going on here.

tduffey (Author) commented May 19, 2025

@thomasmueller I submitted another PR that might be a better solution, as it is very little code and uses an existing "feature" of ClusterNode: #243

If you wouldn't mind taking a look, I think we can close this PR and consider the other one instead.

tduffey (Author) commented Jun 10, 2025

#243 is simpler and gets me what I need.

tduffey closed this Jun 10, 2025