
InterProcessMutex leaves empty node #29

Closed
ph1lm opened this issue Feb 14, 2012 · 23 comments

@ph1lm

ph1lm commented Feb 14, 2012

Below is groovy code I used for testing:

// Imports added for completeness (assuming Curator 1.x, Netflix-era packages):
import com.netflix.curator.framework.CuratorFramework
import com.netflix.curator.framework.CuratorFrameworkFactory
import com.netflix.curator.framework.recipes.locks.InterProcessMutex
import com.netflix.curator.retry.RetryNTimes

CuratorFramework framework = CuratorFrameworkFactory.builder().
        connectString(System.getProperty('zookeeper.ensemble', config?.zookeeper?.ensemble ?: 'localhost:2281')).
        sessionTimeoutMs(config?.zookeeper?.session?.timeout ?: 2 * 5000).
        connectionTimeoutMs(config?.zookeeper?.connect?.timeout ?: 2 * 5000).
        retryPolicy(new RetryNTimes(0, 0)).
        build();
framework.start();

def getInteger(property, defaultValue) {
    System.getProperty(property, "${defaultValue}") as Integer
}

getInteger('zookeeper.lock.quantity', 1).times {
    def mutex = new InterProcessMutex(framework, '/some/test');
    mutex.acquire()
    try {
        Thread.sleep(getInteger('zookeeper.lock.sleep', 0))
    } finally {
        mutex.release()
    }
}

framework.close()

After execution, I connect to ZK using the CLI:

[zk: localhost:2281(CONNECTED) 7] ls2 /some/test
[]
cZxid = 0x100000004
ctime = Tue Feb 14 17:20:05 EET 2012
mZxid = 0x100000004
mtime = Tue Feb 14 17:20:05 EET 2012
pZxid = 0x100000006
cversion = 2
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 0

The node /some/test still exists even after the session disconnects.
That may cause a problem: if these leftover nodes are never cleared, their parent may accumulate so many children that it exceeds the size limit.

Jars used:
curator-client-1.1.0.jar
curator-framework-1.1.0.jar
curator-recipes-1.1.0.jar

@Randgalt
Contributor

This is actually an inherent problem with ZooKeeper. The desired behavior is for the parent node to be deleted once it has no children, but there is no way to do that atomically with the deletion of the last child - at least not from the client. Server support is needed for this; a feature enhancement is being discussed here: https://issues.apache.org/jira/browse/ZOOKEEPER-723
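To illustrate, here is a minimal sketch of the race against the raw ZooKeeper API (assuming a connected org.apache.zookeeper.ZooKeeper handle named zk):

// Check-then-delete from the client is not atomic: between getChildren()
// and delete(), another client that already verified the parent exists
// may be about to create a lock child under it.
def children = zk.getChildren('/some/test', false)
if (children.isEmpty()) {
    // If the other client's create() loses this race it fails with
    // NoNodeException; if it wins, this delete() fails with
    // NotEmptyException. Either way, the cleanup cannot be made safe
    // without server-side support.
    zk.delete('/some/test', -1)   // -1 = ignore the node's version
}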

FYI - it's my understanding that these garbage nodes don't add much overhead.

@artemip
Contributor

artemip commented Jun 19, 2012

I've experienced an issue with Curator where, after many InterProcessMutex locks for unique paths, the leftover garbage nodes have caused a very large snapshot file to be generated. This causes leader election in ZK to fail while trying to transfer the snapshot between servers, unless initLimit and syncLimit in ZK configs are bumped up considerably.

The fix I've implemented in our code is to simply delete the node that is created to support the lock (right above the leaf nodes) once release() is called. This way, locks no longer leave behind garbage that accumulates over time.
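Roughly, a sketch of the idea (releaseAndCleanup is just an illustration, not Curator API):

import com.netflix.curator.framework.CuratorFramework
import com.netflix.curator.framework.recipes.locks.InterProcessMutex
import org.apache.zookeeper.KeeperException

// After releasing, opportunistically delete the lock's parent znode.
// NotEmptyException means another process is using the lock again;
// NoNodeException means someone else already cleaned up. Both are
// safe to ignore.
def releaseAndCleanup(InterProcessMutex mutex, CuratorFramework client, String lockPath) {
    mutex.release()
    try {
        client.delete().forPath(lockPath)
    } catch (KeeperException.NotEmptyException ignore) {
        // lock is in use - leave the parent alone
    } catch (KeeperException.NoNodeException ignore) {
        // already gone
    }
}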

Does this approach make sense? If so, would it be something that I should try to implement in Curator?

@Randgalt
Contributor

I'd like to see what you've done. From what I've tried, there's no way to safely remove the parent as other processes might be assuming that it exists.

@artemip
Contributor

artemip commented Jun 19, 2012

What I've implemented to account for this is some simple retry logic.

Consider the following situation:

Process A: Taking the lock
Process B: Releasing the lock

A: Check to see that /namespace/#{lock_id} exists, and if not, create it. (Let's say it already exists.)
B: Delete the ephemeral node /namespace/#{lock_id}/FOO
B: Delete /namespace/#{lock_id}, since it is now empty
A: Try to create /namespace/#{lock_id}/BAR
--> NoNodeException with message "/namespace/#{lock_id} does not exist."

The simple solution would be for Process A to keep retrying acquire() until success (or up to X retries) in the event of a NoNodeException (I see no other circumstances in which this exception can arise in InterProcessMutex). Does this seem like an adequate solution on our end?
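A sketch of that retry loop (ensureLockParent is a hypothetical callback standing in for whatever recreates the parent):

import com.netflix.curator.framework.recipes.locks.InterProcessMutex
import org.apache.zookeeper.KeeperException

// Keep retrying acquire() when the lock's parent is deleted out from
// under us, recreating it between attempts.
def acquireWithRetry(InterProcessMutex mutex, int maxRetries, Closure ensureLockParent) {
    for (int i = 0; i < maxRetries; i++) {
        try {
            mutex.acquire()
            return
        } catch (KeeperException.NoNodeException e) {
            ensureLockParent()   // hypothetical: recreate /namespace/#{lock_id}
        }
    }
    throw new IllegalStateException('lock parent kept disappearing')
}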

In terms of implementing something like this in Curator, here is my (untested) idea:

  1. Process A calls mutex.acquire()
  2. In or after the 'ensurePath' step in LockInternals (which makes sure that the path exists, creating nodes as appropriate), set the ACL of /namespace/#{lock_id} to disallow DELETE globally (a rough sketch follows this list)
  3. Upon successful acquisition, remove the NO-DELETE flag. If another process (B) has set the flag after step 2 (i.e., tried to acquire the lock), skip this step; only the last process calling acquire() should be able to remove the flag. (Not sure how to go about this.)
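For step 2, the raw ZooKeeper call might look like the sketch below (untested, and assuming open ACLs to begin with). One wrinkle: in ZooKeeper, the DELETE permission on a node governs deleting its children, so the no-delete flag would actually have to go on the parent of #{lock_id}:

import org.apache.zookeeper.ZooDefs
import org.apache.zookeeper.data.ACL

// Permission mask with everything except DELETE, applied to /namespace
// so that its children (the #{lock_id} nodes) cannot be deleted.
int perms = ZooDefs.Perms.ALL & ~ZooDefs.Perms.DELETE
def noDelete = [new ACL(perms, ZooDefs.Ids.ANYONE_ID_UNSAFE)]
zk.setACL('/namespace', noDelete, -1)   // -1 = ignore aclVersion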

@Randgalt Randgalt reopened this Jun 21, 2012
@Randgalt
Contributor

That's an interesting idea, using the ACL. I'm not sure a general-purpose library like Curator can do that, though, as the client may have its own ACL for that node. I'll think about it a bit.

Another avenue: I've been thinking recently that EnsurePath and the create method creatingParentsIfNeeded() are a bit backwards. Instead of pre-checking for the parent paths, they should be reactive, i.e. only create the paths on exception. This way, deleting the parent node is safe, as subsequent locks will get a NoNodeException, which will cause them to re-create the parents.
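In Curator terms, the reactive variant would behave like this sketch (path and mode are illustrative):

import org.apache.zookeeper.CreateMode

// With a reactive creatingParentsIfNeeded(), a deleted parent is simply
// rebuilt when the create fails, instead of being pre-checked on every
// acquire.
String ourPath = client.create()
                       .creatingParentsIfNeeded()
                       .withMode(CreateMode.EPHEMERAL_SEQUENTIAL)
                       .forPath('/some/test/lock-')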

@artemip
Contributor

artemip commented Jun 21, 2012

After attempting to implement the solution posted above (to fix things on my end), I found that for any instance of EnsurePath, ensure() only does work the first time it is called (subsequent calls are no-ops).
Because of this, I ended up implementing something along the lines of 'ZookeeperClient.newNamespaceAwareEnsurePath(path).ensure(ZookeeperClient)' in the retry logic for the acquisition step (pretty much exactly the same solution you posted above, but outside of Curator).
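Concretely, each retry builds a fresh EnsurePath so the parent really is recreated (sketch, assuming Curator 1.x):

import org.apache.zookeeper.KeeperException

try {
    mutex.acquire()
} catch (KeeperException.NoNodeException e) {
    // a fresh EnsurePath instance each time, since a used one is a no-op
    client.newNamespaceAwareEnsurePath('/namespace/lock_id')
          .ensure(client.zookeeperClient)
    mutex.acquire()   // retry now that the parent exists again
}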

Why is this the case? Wouldn't it be safe to assume that a path may need to be created more than once?

If this can be changed, might it be a good idea for ensurePath.ensure() to be called in both circumstances (i.e. both preemptive and reactive)?

If this (or the reactive-only version) can be implemented, I think the only missing piece would be for the InterProcessMutex, or the client, to manually remove unneeded znodes on lock release.

@Randgalt
Contributor

The reason is performance: it would be expensive to check for the parent paths on every call.

@artemip
Contributor

artemip commented Jun 21, 2012

Right, makes sense.

But, what happens in the event of something like this?

Processes A, B, C:

A: Release lock on #{lock_id}
A: Delete the ephemeral node /namespace/#{lock_id}/FOO
A: Delete /namespace/#{lock_id}, since it is now empty

B: Try to get lock for /namespace/#{lock_id}. NoNodeException occurs, so run EnsurePath.ensure()

C: Acquire, then release lock
C: Delete /namespace/#{lock_id}, since it is now empty

B: Try to get lock for /namespace/#{lock_id}. NoNodeException occurs, so run EnsurePath.ensure(), which is now a NOP. #{lock_id} doesn't get created, and process B loops eternally (until another lock on #{lock_id} is attempted by another process).

Thank you for your help, by the way!

@ph1lm
Author

ph1lm commented Jun 21, 2012

Guys,
Just in case you don't find a solution for this:
We solved the issue by using a hash of the lock key string instead of the key string itself, thus limiting the lock space to a reasonable, bounded number of nodes. e.g.:
def lockPath = "/our/lock/items/${item.hashCode() % maxNumberOfLocks}"
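One caveat: hashCode() can be negative, so the bucket should be normalized, e.g.:

// keep buckets in [0, maxNumberOfLocks) even for negative hashCodes
def bucket = ((item.hashCode() % maxNumberOfLocks) + maxNumberOfLocks) % maxNumberOfLocks
def lockPath = "/our/lock/items/${bucket}"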

@Randgalt
Contributor

@artemip - I'd end up either not using EnsurePath or heavily modifying it. Let me try some ideas and I'll report back here.

@artemip
Contributor

artemip commented Jun 21, 2012

Awesome, thank you.

@Randgalt
Contributor

I don't see any way to do this with good guarantees. So, a workaround occurred to me: why not have a reaper thread that periodically checks the registered nodes and deletes any that are empty? Here's what I'm thinking of: https://gist.github.com/2970233

Thoughts? Should I add this?

@jlaban

jlaban commented Jun 22, 2012

It still has the potential race condition where another process taking that lock might have the directory yanked out from under it, right? (Though I think it greatly reduces the probability.)

@ph1lm that's an interesting approach too. Shouldn't really be done in Curator of course.

@Randgalt
Contributor

As to the race - if the recipes/usages are written correctly, there should be no race. FYI - I rewrote creatingParentsIfNeeded() as I talked about above. So if the Reaper deletes the parent, the lock recipes in Curator will be fine, as they will just re-make the parents when the error is caught.

@jlaban

jlaban commented Jun 22, 2012

Oh ok, I thought you were going to create the reaper instead of the above change. My bad.

@Randgalt
Contributor

So, to be clear, Curator users will need to create/start the Reaper. Curator won't do it by default. If you like, I can have it in tomorrow's release.

@jlaban

jlaban commented Jun 23, 2012

@artemip ?

@Randgalt
Contributor

Didn't get to it today. I tried, I swear ;) I'll have it by Monday hopefully.

@artemip
Contributor

artemip commented Jun 23, 2012

I think this is a great solution. Would the Reaper be configurable to only listen to and clean up certain directories in ZK? It would be annoying if it consistently removed nodes that are meant to be temporarily empty.

@Randgalt
Contributor

Yes - you give the Reaper the paths to check.
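Usage would look roughly like this (sketch; the threshold value is illustrative):

import com.netflix.curator.framework.recipes.locks.Reaper

// The application owns the Reaper's lifecycle: create it, start it,
// register the lock parents to watch, and close it on shutdown.
def reaper = new Reaper(framework, 60 * 1000)   // check roughly every minute
reaper.start()
reaper.addPath('/some/test')
// ... later, on shutdown:
reaper.close()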

@artemip
Contributor

artemip commented Jul 4, 2012

Thank you very much for the fix - exactly what we needed.

I do have some concerns though:

  • The currently available Reaper modes do not account for the scenario where a path is removed before the Reaper gets a chance to delete it. Would it be possible to implement a mode like REAP_UNTIL_GONE (or similar)? With this mode, the Reaper would only remove nodes that exist, and would forget paths that do not (as opposed to REAP_UNTIL_DELETE, where the Reaper only forgets a path after it successfully deletes it); a usage sketch follows this list. If this is something I can do myself and send you a pull request for, that would be great.
  • There seems to be a possible memory leak with the 'activePaths' variable. In our application, we never know when a path should be removed from the Reaper, so this set would grow indefinitely unless the application explicitly tells the Reaper to remove paths once it knows they no longer need to be monitored. Ideally, once a path is deleted, the Reaper should drop all references to it. Is this something that can be made configurable as well?
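For the first point, usage of the proposed mode might look like this (hypothetical until implemented):

// Proposed REAP_UNTIL_GONE: forget the path as soon as it is observed to
// be missing, instead of retrying until this Reaper deletes it itself.
reaper.addPath('/some/test', Reaper.Mode.REAP_UNTIL_GONE)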

Again, thank you very much for your help.

Let me know if there's anything I can do to assist you with this.

@Randgalt
Contributor

Randgalt commented Jul 4, 2012

Good ideas. Please try to do the implementation and submit a pull request.

@Randgalt Randgalt reopened this Jul 4, 2012
@artemip
Contributor

artemip commented Jul 5, 2012

Pull request submitted: #102

Thanks

@Randgalt Randgalt closed this as completed Jul 6, 2012