New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Get currentEnsemble from ledger rather than directly from metadata #1586
Conversation
This will allow us to override the currentEnsemble being used during recovery without modifying the metadata itself. Master issue: apache#281
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
I think thr suggestions I made are around code that you are going to change soon, so this change looks good to me
+1
@@ -1215,7 +1215,7 @@ private boolean isWritesetWritable(DistributionSchedule.WriteSet writeSet, | |||
|
|||
int nonWritableCount = 0; | |||
for (int i = 0; i < sz; i++) { | |||
if (!bk.getBookieClient().isWritable(getLedgerMetadata().currentEnsemble.get(i), key)) { | |||
if (!bk.getBookieClient().isWritable(getCurrentEnsemble().get(i), key)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sample currentEnsemble before starting the loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am also in favor of saving the ensemble to a variable before starting the loop. However I think the logic here needs to be revisited, just to make sure ensemble changes won't impact the logic here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed. it won't affect the logic. this code is an advisary thing to apply backpressure. The ensemble could still change between a writeset being seen as writable, and the pendingAddOp being triggered, but this won't cause a problem.
@@ -138,7 +138,7 @@ long getEntryId() { | |||
void sendWriteRequest(int bookieIndex) { | |||
int flags = isRecoveryAdd ? FLAG_RECOVERY_ADD | FLAG_HIGH_PRIORITY : FLAG_NONE; | |||
|
|||
lh.bk.getBookieClient().addEntry(lh.getLedgerMetadata().currentEnsemble.get(bookieIndex), | |||
lh.bk.getBookieClient().addEntry(lh.getCurrentEnsemble().get(bookieIndex), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sample the list outside of this method ? in the loop I mean
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -266,7 +266,7 @@ public void writeComplete(int rc, long ledgerId, long entryId, BookieSocketAddre | |||
int bookieIndex = (Integer) ctx; | |||
--pendingWriteRequests; | |||
|
|||
if (!lh.getLedgerMetadata().currentEnsemble.get(bookieIndex).equals(addr)) { | |||
if (!lh.getCurrentEnsemble().get(bookieIndex).equals(addr)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
for (int i = 0; i < metadata.currentEnsemble.size(); i++) { | ||
lh.bk.getBookieClient().readLac(metadata.currentEnsemble.get(i), | ||
for (int i = 0; i < currentEnsemble.size(); i++) { | ||
lh.bk.getBookieClient().readLac(currentEnsemble.get(i), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is perfect to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eolivelli can you elaborate why you made 'perfect' comment just for this? Just curious. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jvrao I was exaggerating :-)
Only just because in the last past days for 2/3 times I had requested Ivan to sample currentEnsemble out of the loop, this time I did not add the comment.
@@ -1215,7 +1215,7 @@ private boolean isWritesetWritable(DistributionSchedule.WriteSet writeSet, | |||
|
|||
int nonWritableCount = 0; | |||
for (int i = 0; i < sz; i++) { | |||
if (!bk.getBookieClient().isWritable(getLedgerMetadata().currentEnsemble.get(i), key)) { | |||
if (!bk.getBookieClient().isWritable(getCurrentEnsemble().get(i), key)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am also in favor of saving the ensemble to a variable before starting the loop. However I think the logic here needs to be revisited, just to make sure ensemble changes won't impact the logic here.
List<BookieSocketAddress> getCurrentEnsemble() { | ||
// Getting current ensemble from the metadata is only a temporary | ||
// thing until metadata is immutable. At that point, current ensemble | ||
// becomes a property of the LedgerHandle itself. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know LedgerHandle
is the mix of write and read handle. However this method is exposed to ReadOnlyHandle. It is a bit confused when we are saying current ensemble
for a readonly handle. so I would suggest adding more comments here to clarify the meanings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There'll be a big comment on the ReadOnlyLedgerHandle implementation. The whole class hierarchy here is broken. LedgerHandle should derive from ReadOnlyLedgerHandle, not vice versa. ReadOnlyLedgerHandle shouldn't do recovery. Recovery should be a separate operation that occurs before the ledger handle is even created. However, currently this is a big change because of the lack of separation between the *Op classes and LedgerHandle. This getCurrentEnsemble() is the ugliness I was referring to on dev@ when I said the solution isn't nice. The add op has to call back to the ledger handle to get the new ensemble after a failure. It would be better if, when an ensemble change occurs, all outstanding pending add ops are passed the new ensemble by the ledger handle. I might actually change it do that. The ops shouldn't depend on LedgerHandle. The objects they require should be passed in as parameters. There's so much janky stuff in this code, it can be painful to not fix it all at once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ivankelly we all understand the problem and motivation behind this. however that's a different topic to address all these. so let's not try to couple that here.
Here what I am asking is adding a comment to make thing clear. That's it. I want people who might be working on this file or when people look back at the git commit, we know exactly what was the thought of getCurrentEnsemble
.
AddPendingOp now gets updates to the ensemble via "unsetSuccess..." Added a comment to getCurrentLedger about ROLedgerHandle's usage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very clearer +1
Checkstyle is not happy |
@eolivelli a lot of things are unhappy, and once I started looking into them, there's some more fundamental issues. Looking into them now. |
This will allow us to override the currentEnsemble being used during
recovery without modifying the metadata itself.
Master issue: #281