New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make LedgerHandle injectable #1589
Conversation
This patch removes the hard dependency of LedgerHandle on BookKeeper. This allows us to create an instance of LedgerHandle independently, with a fully mocked BookieClient or LedgerManager and be clear on where the objects used by the LedgerHandle are coming from.
What is the use-case of this change? |
@jvrao testability of the client. In particular, I need fine grain control of when "bookies" and the "ledger-manager" respond to demonstrate a data loss bug (will push later today). Currently, to get a ledger handle, you have to create a BookKeeper object, which creates zookeeper connections, bookie connections and a whole bunch of crap. These can be mocked with mockito, but this mocking is brittle; It's unclear as you are writing which objects you need to mock on the bookkeeper object. There's no defined interface between LedgerHandle and BookKeeper, so it ended up reaching into it's internals all over the place. With this change, the interface between LedgerHandle and BookKeeper is explicit. If it changes in a breaking way, it will break at compile time, not when you run the tests. I going to make mock BookieClient, BookieWatcher, and LedgerManager implementations, which can be used directly with LedgerHandle. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
frankly speaking, I can't see the connection between this change and making ledger metadata immutable. The problem I see is we are accessing variables directly which make testing/mocking a bit hard. However if we just avoid direct access to variables and use getters, it would be more friendly to testing, and that's probably good enough for ledger metadata immutable change.
anyway I might be misreading this PR. if there is a strong reason about this PR for immutable metadata, please clarify it.
} | ||
|
||
/** | ||
* Allow to extend BookKeeper for mocking in unit tests. | ||
*/ | ||
@VisibleForTesting | ||
BookKeeper() { | ||
conf = new ClientConfiguration(); | ||
internalConf = ClientInternalConf.defaultValues(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we construct a default conf already, better to use it rather than call defaultValues
.
internalConf = ClientInternalConf.fromConfig(conf)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed.
metadataDriver = null; | ||
readSpeculativeRequestPolicy = Optional.absent(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all the assignments in this constructor are for testing purpose. when you remove them, you basically changing the bookkeeper client behavior. I don't think it is a good idea do to so. in general, please don't do any behavior changes in a code refactor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is only used in one test, and I've rejigged that test to pass.
@@ -611,7 +544,7 @@ int getReturnRc(int rc) { | |||
} | |||
} | |||
|
|||
void scheduleBookieHealthCheckIfEnabled() { | |||
void scheduleBookieHealthCheckIfEnabled(ClientConfiguration conf) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since you are introducing an internal conf, why not make the behavior consistent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to change this in a second pass where I make BookKeeper use ClientContext too. This way you can use a fully fledged bk client with all the outcalls mocked. It would also get rid of the need for a separate MockBookKeeper implementation.
@@ -101,9 +106,18 @@ | |||
|
|||
static final long PENDINGREQ_NOTWRITABLE_MASK = 0x01L << 62; | |||
|
|||
final ClientInternalConf conf; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the motivation behind this PR. However I think this changes the memory footprint for a client that opens many ledgers. Because before this change, LedgerHandle points back to BK, where it is the shared place to access shared resources. Now we have increased so many fields for each ledger. I am kind of hesitated on this change.
I think what you need to change is just when LedgerHandle try to access shared resources in bk
, it calls a method in bk
, not access the field in bk
directly. In this way, it always provides a clean path for mocking.
so I am not convinced (an more worried) on this change and I don't think it is really related to making ledger metadata immutable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what you need to change is just when LedgerHandle try to access shared resources in bk, it calls a method in bk, not access the field in bk directly. In this way, it always provides a clean path for mocking.
What i want to do is get rid of the dependency on the concrete BookKeeper object. How we do it isn't so important. If you don't want these extra fields, we can create a ClientContext interface, which can be used to pull the stuff from bookkeeper. I think moving all the individual config flags to a single object still makes sense though.
so I am not convinced (an more worried) on this change and I don't think it is really related to making ledger metadata immutable.
It's only indirectly related to immutable metadata, but it does kinda block it. I think I found a data loss bug in deferred failure handling. I don't want to proceed with the metadata stuff until that bug is verified, fixed, and we have a solid test in place for it. Testing the bug depends on being able to control bookie, client and metadata interaction very tightly, which means injecting all of these. Hence this patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
either ClientContext
or ClientResources
work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Will rework it in morning
@@ -68,6 +71,8 @@ | |||
boolean completed = false; | |||
|
|||
LedgerHandle lh; | |||
BookieClient bookieClient; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same concern as LedgerHandle. All these shared resources probably should come from one place, rather than introducing more fields here. Especially this is in critical write path.
@@ -68,6 +71,11 @@ | |||
private final BitSet heardFromHostsBitSet; | |||
private final Set<BookieSocketAddress> sentToHosts = new HashSet<BookieSocketAddress>(); | |||
LedgerHandle lh; | |||
private final BookieClient bookieClient; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same concern as LedgerHandle. don't think it is a good idea to add 4 fields for each read op.
@@ -56,6 +59,14 @@ | |||
|
|||
final long requestTimeNano; | |||
private final LedgerHandle lh; | |||
private final BookieClient bookieClient; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same concern as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, just one comment that needs clarification.
import org.apache.bookkeeper.meta.LedgerManager; | ||
import org.apache.bookkeeper.proto.BookieClient; | ||
|
||
interface ClientContext { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add some comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
} | ||
}).parallelRead(true).readBatchSize(newConf.getRecoveryReadBatchSize()); | ||
@Override | ||
public void operationComplete(int rc, Void result) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this changes the testing behavior. the original testing behavior is parallelRead
only enabled on LedgerRecoveryOp
. but this PRs changes the behavior to apply to all behaviors on newBk
, hences all recovery ops will inherit the settings. I am not sure is it okay or not, but I am not very conformatable about such behavior change in a refactor PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect that it's not actually important for the test, but I've changed it to the exact original behaviour in any case.
@eolivelli can you review this? @jvrao can anyone from Salesforce review this? @jiazhai can you also spend some time on reviewing this? there is a bit code churn here, just wanna to have more eyes on it. |
I am looking. It is a big change |
@@ -1046,6 +1047,7 @@ private void replicateLedgerFragment(LedgerHandle lh, | |||
SingleFragmentCallback cb = new SingleFragmentCallback( | |||
resultCallBack, | |||
lh, | |||
bkc.getMainWorkerPool(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getClientCtx?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eolivelli as I said in another comment, this patch I only want to focus on the LedgerHandle injectability. This line change is just because it was previously using the raw field. In a later pass, i'd like to get rid of all the methods like getMainWorkerPool from bookkeeper completely, but doing all that now would make an already huge patch even bigger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay.
Patch looks good to me. Great work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left just a nit. Overall seems okay
overall lgtm. |
@jiazhai overall picture is to make metadata immutable on the client side (#281), which needs to be done to allow metadata upgrade to be done cleanly, and has been an outstanding bit of tech debt since 2014. This particular change was to allow a test to be written that triggered a data loss bug that I discovered during the immutable metadata work. The rest of the stack of patches is immutable metadata work. |
okay from me |
@dlg99 Can you please review? |
This patch removes the hard dependency of LedgerHandle on BookKeeper.
This allows us to create an instance of LedgerHandle independently,
with a fully mocked BookieClient or LedgerManager and be clear on
where the objects used by the LedgerHandle are coming from.