
A wide variety of bug fixes and performance improvements related to multi region configurations #892

Merged

merged 35 commits into apple:release-6.0 on Nov 10, 2018

Conversation

3 participants
@etschannen
Contributor

etschannen commented Nov 2, 2018

No description provided.

etschannen added some commits Nov 2, 2018

account for the overhead of tags on a message when batching transactions into a version

increased the default location cache size
removed the countHealthyTeams check, because it was incorrect if it triggered during the wait(yield()) at the top of team tracker
fix: if the team started unhealthy and initialFailureReactionDelay was ready, we would not send relocations to the queue

print wrong shard size team messages in simulation
fix: trackShardBytes was called with the incorrect range, resulting in incorrect shard sizes

reduced the size of shard tracker actors by removing unnecessary state variables. Because we have a large number of these actors, these extra state variables add up to a lot of memory
fix: nested MultiCursors would improperly hang on getMore, because an inner pop of cursors would not be detected by the outer instance
clients cache storage server interfaces individually, instead of as a team. This is needed because in fearless every shard has storage servers from two separate teams, leading to a lot of possible combinations

allAlternatives failed logic was simplified, because we are already doing a global rate limiting, so a per shard limit is unnecessary
reduced unnecessary state variables in waitMetrics requests
fix: if a storage server already exists in a remote region after converting to fearless, it did not receive mutations between the known committed version and the recovery version
fix: even if a peek cursor cannot find a local set for the most recent data, it may still be able to find data from older log sets
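The shard-tracker commit above notes that per-actor state multiplies across a large actor population. A minimal sketch of the arithmetic, with purely illustrative struct layouts (these are not FoundationDB's actual tracker fields), shows why trimming even a couple of members matters:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layouts: each shard tracker actor carries its own state,
// so every extra member is paid once per shard.
struct TrackerBefore {
    int64_t shardBegin;       // state that is actually needed
    int64_t shardEnd;
    int64_t lastBytes;        // cached metric
    int64_t lastBytesPerKSec; // redundant: derivable elsewhere
    double  lastUpdateTime;   // redundant: only used transiently
};

struct TrackerAfter {
    int64_t shardBegin;
    int64_t shardEnd;
    int64_t lastBytes;
};

// With an assumed 500k shards, removing two 8-byte members saves ~8 MB
// before even counting the per-actor frame overhead.
constexpr long long kShards = 500000;
constexpr long long savedBytes =
    kShards * (long long)(sizeof(TrackerBefore) - sizeof(TrackerAfter));
```

The real savings in the PR also include the actor frame itself, which holds every `state` variable alive across `wait()` points, so each removed variable shrinks every live frame.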

@etschannen etschannen requested a review from ajbeamon Nov 2, 2018

@alexmiller-apple alexmiller-apple self-requested a review Nov 2, 2018

@ajbeamon
Contributor

ajbeamon left a comment

I've looked over all but 3 of the commits and had one question that I figured I'd go ahead and post.

self->dbSizeEstimate->set( self->dbSizeEstimate->get() + metrics.bytes - shardSize->get().get().bytes );

shardSize->set( metrics );
} catch( Error &e ) {


@ajbeamon

ajbeamon Nov 2, 2018

Contributor

It's not obvious to me why this can be removed. I see that there is one wait on a call to waitStorageMetrics. Is the idea that it can't throw any errors where we would retry them?


@etschannen

etschannen Nov 5, 2018

Contributor

Yes, waitStorageMetrics is not transactional, and is not subject to our normal retry loop. The only reason it is a function of a transaction is to use the client's cache of storage server locations. It would probably be more appropriate to make it a function of DatabaseContext.
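The distinction being made can be sketched in plain C++ (the types and names below are illustrative stand-ins, not FoundationDB's APIs): errors thrown inside a transactional body are fed back to the retry loop and the body re-runs, whereas a non-transactional call like waitStorageMetrics throws straight to its caller and the loop never sees the error.

```cpp
#include <cassert>
#include <stdexcept>

// Illustrative stand-in for a retriable transaction error.
struct RetriableError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

static int attempts = 0;

// A "transactional" body: may throw a retriable error and expects to be
// re-executed by the surrounding retry loop.
int readSomething() {
    if (++attempts < 3) throw RetriableError("not_committed");
    return 42;
}

// The standard retry-loop shape: retriable errors are swallowed and the
// body re-runs. A real loop would call tr.onError(e) to back off and
// reset the transaction before retrying.
int runTransaction() {
    for (;;) {
        try {
            return readSomething();
        } catch (const RetriableError&) {
            // back off, reset, retry
        }
    }
}

// By contrast, a non-transactional helper (the waitStorageMetrics case)
// would be called outside this loop: whatever it throws propagates
// directly to the caller, so the loop's retry semantics never apply.
```

This is why removing the catch block in the diff is safe: the only wait in that path is not governed by the transaction retry contract in the first place.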

@@ -51,6 +51,7 @@ ClientKnobs::ClientKnobs(bool randomize) {
init( DEFAULT_MAX_BACKOFF, 1.0 );
init( BACKOFF_GROWTH_RATE, 2.0 );
init( RESOURCE_CONSTRAINED_MAX_BACKOFF, 30.0 );
init( PROXY_COMMIT_OVERHEAD_BYTES, 23 ); //The size of serializing 7 tags (3 primary, 3 remote, 1 log router) + 2 for the tag length


@alexmiller-apple

alexmiller-apple Nov 2, 2018

Contributor

Why is this a knob? In what situation would we wish to change this value?


@etschannen

etschannen Nov 5, 2018

Contributor

My thought was that we may want to set it back to zero in case there is a problem with the change. In retrospect, it is hard to imagine a problem caused by the batching that would require changing it.
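The value 23 in the knob diff follows directly from the arithmetic in its comment, assuming each serialized tag occupies 3 bytes (an assumption implied by the numbers, not stated outright in the diff):

```cpp
#include <cassert>

// 7 tags (3 primary, 3 remote, 1 log router), each assumed to serialize
// to 3 bytes, plus 2 bytes for the tag length prefix.
constexpr int kTags = 7;
constexpr int kBytesPerTag = 3;   // assumption implied by 7*x + 2 == 23
constexpr int kTagLengthBytes = 2;
constexpr int kOverhead = kTags * kBytesPerTag + kTagLengthBytes;
```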


Void _ = wait(timeKeeperSetVersion(self));

loop {
state Reference<ReadYourWritesTransaction> tr = Reference<ReadYourWritesTransaction>(new ReadYourWritesTransaction(self->cx));
loop {
try {
if(!g_network->isSimulated()) {


@alexmiller-apple

alexmiller-apple Nov 2, 2018

Contributor

Leave a comment noting:

  1. This is done just to provide an arbitrary logged transaction every ~10s.
  2. This should be replaced by something that aggregates these statistics in a way that could be monitored and reacted to programmatically.
@@ -849,11 +849,15 @@ void ILogSystem::MultiCursor::advanceTo(LogMessageVersion n) {
}

Future<Void> ILogSystem::MultiCursor::getMore(int taskID) {
auto startVersion = cursors.back()->version();


@alexmiller-apple

alexmiller-apple Nov 2, 2018

Contributor

Isn't this auto just Version?


@etschannen

etschannen Nov 5, 2018

Contributor

It is actually a LogMessageVersion, but I changed it for you.
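The reason the cursor's version is not a plain Version can be sketched as follows: a LogMessageVersion orders log messages by a (version, sub-sequence) pair, so multiple messages at the same commit version remain totally ordered. The field and type names below are an illustrative sketch, not FoundationDB's exact definition:

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Sketch of a two-part log message version: messages at the same commit
// version are distinguished and ordered by a sub-sequence number.
struct LogMessageVersion {
    int64_t version = 0;
    int32_t sub = 0;
    bool operator<(const LogMessageVersion& r) const {
        // Lexicographic order: compare version first, then sub-sequence.
        return std::tie(version, sub) < std::tie(r.version, r.sub);
    }
};
```

Writing `auto` there hides this, which is why the reviewer's guess of `Version` was a reasonable misreading.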

etschannen added some commits Nov 8, 2018

fix: data distribution would not always add all subsets of emergency teams
fix: data distribution would not stop tracking bad teams after all their data was moved to other teams
fix: data distribution did not properly handle a server changing locality such that the teams it used to be on no longer satisfy the policy
fix: workers could only create a shared transaction log for one store type. This resulted in the old store type being used for new transaction logs after configuration changes which changed the store type
@ajbeamon

Contributor

ajbeamon commented Nov 8, 2018

This also needs release notes updates.

@etschannen etschannen merged commit 4bfb05f into apple:release-6.0 Nov 10, 2018
