Skip to content

Remove lock from URManager’s state only if lock znode deletion has succeeded#2206

Merged
eolivelli merged 3 commits intoapache:masterfrom
reddycharan:fixzklossissue
Jun 12, 2020
Merged

Remove lock from URManager’s state only if lock znode deletion has succeeded#2206
eolivelli merged 3 commits intoapache:masterfrom
reddycharan:fixzklossissue

Conversation

@reddycharan
Copy link
Contributor

Descriptions of the changes in this PR:

  • in ZkLedgerUnderreplicationManager.releaseUnderreplicatedLedger remove ‘lock’
    from ‘heldLocks’ only if lock znode deletion has succeeded.
  • This is needed because, if RW.logBKExceptionAndReleaseLedger fails to delete
    the lock znode, then it needs to be tried once more in the finally block of
    'rereplicate(long ledgerIdToReplicate)’ before giving up and shutting down RW.

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much this patch is critical?
Should be roll a 4.10.1 hotfix release?

@reddycharan
Copy link
Contributor Author

How much this patch is critical?
Should be roll a 4.10.1 hotfix release?

This is not critical fix. Probability of happening is low, we internally configured number of zkOperations retry count to 3, but where as in the community version, it is set to infinite number.

@Ghatage
Copy link
Contributor

Ghatage commented Nov 15, 2019

HI @reddycharan is the MockZooKeeperClient class in this change a dupe of this already existing MockZooKeeperClient class?
It seems there might be some functions which are missing from either, do you think it will be a good idea to consolidate them and use them across test cases? Potentially even for #901 ?
@eolivelli what do you think?

@reddycharan
Copy link
Contributor Author

@Ghatage it is not dupe. I'm creating new class to mimic (mock/inject) certain behavior needed for this particular testcase. Scope of this Mock class is local to this testsuite. So it is not meant to be merged.

@eolivelli
Copy link
Contributor

Yes, we can have two classes, they are doing different things.
Having a single MockZookeeperClient can complicate things and maybe affect each other test suites.
So I'd stay as Charan is proposing

/*
* Start RW.
*/
ReplicationWorker rw = new ReplicationWorker(baseConf, bkWithMockZK, false, NullStatsLogger.INSTANCE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to shutdown this object and the other ones in a finallyblock in order to teardown properly the test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eolivelli fixed it

@eolivelli
Copy link
Contributor

@reddycharan do you have time to move forward this work?
It would be awesome to have this fix in 4.11

…cceeded. (apache#462)

- in ZkLedgerUnderreplicationManager.releaseUnderreplicatedLedger remove ‘lock’
 from ‘heldLocks’ only if lock znode deletion has succeeded.
- This is needed because, if RW.logBKExceptionAndReleaseLedger fails to delete
 the lock znode, then it needs to be tried once more in the finally block of
 'rereplicate(long ledgerIdToReplicate)’ before giving up and shutting down RW.
@reddycharan
Copy link
Contributor Author

hey @eolivelli can you approve this PR.

Also, btw, can I use this "Squash and Merge" option for merging code to master?

@eolivelli
Copy link
Contributor

We are using the script not this button.
I will merge your patch.
It is better to not self merge if not needed

Thank you very much!

@eolivelli
Copy link
Contributor

I can't merge it now because I don't have my laptop. I will do it as soon as possible

@eolivelli eolivelli added this to the 4.11.0 milestone Jun 12, 2020
@eolivelli eolivelli merged commit 0bea124 into apache:master Jun 12, 2020
Ghatage pushed a commit to Ghatage/bookkeeper that referenced this pull request Jun 19, 2020
…cceeded




Descriptions of the changes in this PR:

- in ZkLedgerUnderreplicationManager.releaseUnderreplicatedLedger remove ‘lock’
 from ‘heldLocks’ only if lock znode deletion has succeeded.
- This is needed because, if RW.logBKExceptionAndReleaseLedger fails to delete
 the lock znode, then it needs to be tried once more in the finally block of
 'rereplicate(long ledgerIdToReplicate)’ before giving up and shutting down RW.

Reviewers: Enrico Olivelli <eolivelli@gmail.com>

This closes apache#2206 from reddycharan/fixzklossissue
Ghatage pushed a commit to Ghatage/bookkeeper that referenced this pull request Oct 6, 2020
…cceeded




Descriptions of the changes in this PR:

- in ZkLedgerUnderreplicationManager.releaseUnderreplicatedLedger remove ‘lock’
 from ‘heldLocks’ only if lock znode deletion has succeeded.
- This is needed because, if RW.logBKExceptionAndReleaseLedger fails to delete
 the lock znode, then it needs to be tried once more in the finally block of
 'rereplicate(long ledgerIdToReplicate)’ before giving up and shutting down RW.

Reviewers: Enrico Olivelli <eolivelli@gmail.com>

This closes apache#2206 from reddycharan/fixzklossissue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants