Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set logs as getting flushed before releasing lock, race condition fix #1909

Closed
wants to merge 1 commit into from

Conversation

reidHoruff
Copy link
Contributor

@reidHoruff reidHoruff commented Feb 23, 2017

Relating to #1903:

In MaybeFlushColumnFamilies() we want to modify the 'getting_flushed' flag before releasing the db mutex when SwitchMemtable() is called.

The following 2 actions need to be atomic in MaybeFlushColumnFamilies()

  • getting_flushed is false on oldest log
  • we determine that all CFs can be flushed to successfully release oldest log
  • we set getting_flushed = true on the oldest log.

  • getting_flushed is false on oldest log
  • we determine that all CFs can NOT be flushed to successfully release oldest log
  • we set unable_to_flush_oldest_log_ = true on the oldest log.

In the 2pc case:

T1 enters function but is unable to flush all CFs to release log
T1 sets unable_to_flush_oldest_log_ = true
T1 begins flushing all CFs possible

T2 enters function but is unable to flush all CFs to release log
T2 sees unable_to_flush_oldes_log_ has been set so exits

T3 enters function and will be able to flush all CFs to release oldest log
T3 sets getting_flushed = true on oldest log
T3 begins flushing all remaining CFs to release logs

Important piece of atomicity: Once we determine that we will be able to flush sufficient CFs to release the oldest log, we must set getting_flushed on that log BEFORE doing the flushing.

@reidHoruff reidHoruff closed this Feb 23, 2017
@reidHoruff reidHoruff reopened this Feb 24, 2017
@reidHoruff reidHoruff changed the title et logs as getting flushed before releasing lock, race condition fix Set logs as getting flushed before releasing lock, race condition fix Feb 24, 2017
@facebook-github-bot
Copy link
Contributor

@reidHoruff has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@siying siying left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why tests didn't trigger. Maybe you'll need to recreate a new one to trigger it.

Copy link
Contributor

@al13n321 al13n321 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits. The non-2pc path lgtm. I'll let @siying approve the 2pc stuff.

db/db_impl.cc Outdated
@@ -5058,6 +5058,26 @@ void DBImpl::MaybeFlushColumnFamilies() {
return;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems equivalent to lines 5050-5079 but shorter and (imo) clearer:

if (allow_2pc() &&
    oldest_log_with_uncommited_prep > 0 &&
    oldest_log_with_uncommited_prep <= oldest_alive_log) {
  if (unable_to_flush_oldest_log_) {
    // we already attempted to flush all column families dependent on
    // the oldest alive log but the log still contained uncommited transactions.
    // the oldest alive log STILL contains uncommited transaction so there
    // is still nothing that we can do.
    return;
  } else {
    // the oldes log contains uncommitted transactions, so we won't be able to get rid of it just yet.
    // let's flush its memtables anyway (btw, why?).
    Log(InfoLogLevel::WARN_LEVEL, immutable_db_options_.info_log,
        "Unable to release oldest log due to uncommited transaction");
    unable_to_flush_oldest_log_ = true;
  }
} else {
  alive_log_files_.begin()->getting_flushed = true;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's flush its memtables anyway (btw, why?).

We flush the memtables we can now and the remaining tables later. It makes sense intuitively to me but maybe its not actually optimal behavior? It also reduces the chance of another transaction coming in and 'preparing' into a memtable that we could have flushed now will be unable to flush in the future. This guarantees that we are making progress on releasing the log.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

db/db_impl.cc Outdated
@@ -5058,6 +5058,26 @@ void DBImpl::MaybeFlushColumnFamilies() {
return;
}

// we only mark this log as getting flushed if we have successfully
// flushed all data in this log. If this log contains outstanding prepred
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"prepred" - did you mean "prepared"?

@siying
Copy link
Contributor

siying commented Mar 2, 2017

@reidHoruff I already approved the diff. Are you going to find a way to get all the tests run and commit it?

@reidHoruff
Copy link
Contributor Author

I will see that the sandcastle people have to say again.

@facebook-github-bot
Copy link
Contributor

@reidHoruff has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@siying
Copy link
Contributor

siying commented Mar 2, 2017

@reidHoruff maybe you can try to create another pull request just to trigger the tests.

@facebook-github-bot
Copy link
Contributor

@reidHoruff updated the pull request - view changes - changes since last import

@facebook-github-bot
Copy link
Contributor

@reidHoruff has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants