Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os/bluestore: refactor bluestore_sync_submit_transaction #11537

Merged
merged 7 commits into from Nov 1, 2016

Conversation

liewegas
Copy link
Member

No description provided.

@liewegas
Copy link
Member Author

unit test failure is fixed in latest master

txc->kv_submitted = true;
int r = db->submit_transaction(txc->t);
assert(r == 0);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liewegas .
I think we in kv_sync_thread, we must make sure the transaction which include new nid_max must update and then update nid_max.
Like:
nid_max_tmp =
encode(nid_max_tmp, bl)
submit_transaction(t)

nid_max = nid_max_tmp;
Avoid later txc first submit and the txc include new nid_max later submit.


virtual bool supports_parallel_transactions() {
return false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liewegas . I don't know for extentfreelistmanager why don't support parallel transaction? Hope your explain. Thanks!

@liewegas
Copy link
Member Author

Oh sorry, forgot to answer this.

The extent manager stores the freelist as offset=length k/v pairs. When
you take something off the freelist, it usually means deleing an old pair
and reinserting a new (smaller) one. This works as long as these updates
are ordered. But if parallel racing threads and doing these updates, the
updates won't make any sense. E.g.,

start with 0=100
txn1
rm 0=100
set 1=99

txn2
rm 1-99
set 2=88

If they commit in order all is well. If the order is reversed the
freelist is corrupted.

This was one of the motivations for the bitmap implementation, which uses
a commutative merge operator (xor).

@majianpeng
Copy link
Member

@liewegas . Thanks your detail explain. Even if it don't use parallel submit transaction, we also don't make tx1 must firstly submit to kv_queue.Is it so?

@@ -67,6 +67,10 @@ class BitmapFreelistManager : public FreelistManager {
void release(
uint64_t offset, uint64_t length,
KeyValueDB::Transaction txn) override;

bool supports_parallel_transactions() override {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we inline this?

@varadakari
Copy link
Contributor

LGTM

@varadakari
Copy link
Contributor

varadakari commented Oct 21, 2016

@majianpeng _txc_finish_io() will make sure we are submitting the transaction in sequence.

}
{
std::lock_guard<std::mutex> l(kv_lock);
kv_queue.push_back(txc);
kv_cond.notify_one();
if (!txc->kv_submitted) {
kv_queue_unsubmitted.push_back(txc);
++txc->osr->kv_committing_serially;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liewegas What is the need of this atomic ? I can see it is just incremented and decremented , not checked anywhere.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liewegas Sorry missed the check , gave my comment above.

dout(20) << __func__
<< " last_{nid,blobid} exceeds max, submit via kv thread"
<< dendl;
} else if (txc->osr->kv_committing_serially) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liewegas I think this may hurt the parallelism, may be we should guard against a number rather than even 1 ?
I will run through my test and see how frequent we are hitting this.

@liewegas
Copy link
Member Author

liewegas commented Oct 25, 2016 via email

@liewegas
Copy link
Member Author

we should only fall into this case for low values of prealloc* tunables.
I set them low in store_test.cc to exercise this path but in the real
world they will be quite bit (1024 or something).

And drop bluestore_sync_transaction, at least for now.

The key change here is to make a per-txc flag indicating
whether the txn was already submitted.  This allows us
to make a choice between sync and not-sync on a per-txn
basis.

Signed-off-by: Sage Weil <sage@redhat.com>
This ensures the txn will not commit before an update to the
global max.

Signed-off-by: Sage Weil <sage@redhat.com>
It doesn't support unordered transactions.

Signed-off-by: Sage Weil <sage@redhat.com>
- single case to cover increases ahead of schedule or just in time
- update global max only after txn commits, eliminating race
- drop unneeded id_lock
- improve t naming

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
@liewegas liewegas merged commit 0b7577f into ceph:master Nov 1, 2016
@liewegas liewegas deleted the wip-bluestore-parallel branch November 1, 2016 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants