os: write file journal optimization #6484
Force-pushed from 3d2b66d to 7b0dc69.
@liewegas, please review it.
@@ -71,6 +72,8 @@ class Journal {

   virtual bool should_commit_now() = 0;

   virtual int _op_journal_transactions_prepare(list<ObjectStore::Transaction*>& tls, bufferlist& tbl) = 0;
can we call this prepare_entry() so that it looks like submit_entry (they are now coupled)?
also, while we are here, let's make the tbl arg a bufferlist *tbl so that it's clear it's an output argument (most of this old code doesn't follow the coding style, but we should update it when it's convenient to do so)
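To illustrate the output-argument convention suggested here, a minimal sketch follows. It is not Ceph code: `Buffer` stands in for bufferlist, `Txn` stands in for ObjectStore::Transaction, and the body and return value of `prepare_entry` are hypothetical.

```cpp
#include <list>
#include <string>

// Stand-ins for illustration only; these are not the real Ceph types.
using Buffer = std::string;            // plays the role of bufferlist
struct Txn { std::string payload; };   // plays the role of ObjectStore::Transaction

// The output argument is passed by pointer, so the call site has to write
// &tbl, which makes it visible that tbl is being written to.
int prepare_entry(const std::list<Txn*>& tls, Buffer* tbl) {
  tbl->clear();
  for (const Txn* t : tls) {
    tbl->append(t->payload);
  }
  // Hypothetical return value: the encoded length.
  return static_cast<int>(tbl->size());
}

// Small helper showing what the call site looks like.
int encode_two() {
  Txn a{"abc"}, b{"de"};
  std::list<Txn*> tls{&a, &b};
  Buffer tbl;
  return prepare_entry(tls, &tbl);
}
```

With a reference parameter the call would read `prepare_entry(tls, tbl)`, which gives no hint at the call site that `tbl` is modified.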
Force-pushed from 7b0dc69 to 5be4b71.
Ah, that looks much better!
@liewegas, the assert in the FileJournal throttle seen in http://pulpito.ceph.com/sage-2015-11-11_14:51:49-rados-wip-sage-testing---basic-multi/ was introduced by this PR, because peek_write().bl.length() is no longer the original length. I will push a new patch.
Repushed the code, sorry for the bug.
diff --git a/src/os/FileJournal.cc b/src/os/FileJournal.cc
index 4fa7ae8..c09b62c 100644
--- a/src/os/FileJournal.cc
+++ b/src/os/FileJournal.cc
@@ -893,7 +893,7 @@ int FileJournal::prepare_multi_write(bufferlist& bl, uint64_t& orig_ops, uint64_
       // throw out what we have so far
       full_state = FULL_FULL;
       while (!writeq_empty()) {
-        put_throttle(1, peek_write().bl.length());
+        put_throttle(1, peek_write().orig_len);
         pop_write();
       }
       print_header(header);
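The following sketch shows why releasing the throttle with the buffer's current length can go wrong. It assumes the throttle was charged with the length before padding, and that the buffer is padded before it is queued. Apart from `orig_len`, all names here (`Throttle`, `WriteItem`, `prepare`) are hypothetical and do not reflect the actual FileJournal implementation.

```cpp
#include <cstdint>
#include <string>

// Illustrative byte counter; the real throttle would assert instead of
// returning false when more is released than was taken.
struct Throttle {
  uint64_t bytes = 0;
  void take(uint64_t n) { bytes += n; }
  bool put(uint64_t n) {
    if (n > bytes) return false;  // over-release
    bytes -= n;
    return true;
  }
};

struct WriteItem {
  std::string bl;      // buffer as queued, possibly padded
  uint64_t orig_len;   // length that was charged to the throttle
};

// Pad the buffer to a multiple of `align`, remembering the original length.
WriteItem prepare(const std::string& data, uint64_t align) {
  WriteItem w{data, data.size()};
  uint64_t rem = w.bl.size() % align;
  if (rem != 0) {
    w.bl.append(align - rem, '\0');
  }
  return w;
}
```

In this model, an item charged as `orig_len` bytes must also be released as `orig_len` bytes. Releasing `bl.size()` after padding asks for more than was taken.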
Force-pushed from 5be4b71 to d4db6b2.
Thank you! I will retest.
Force-pushed from d4db6b2 to 755e9bc.
@liewegas, I am so sorry, there are 3 places calling
Currently there is a single write thread for the file journal, so it can become a bottleneck. It is important to keep the logic of the journal write thread simple. Because of how transactions are encoded, the bufferlist to be written is almost never aligned, so the journal write calls rebuild_aligned almost every time. Because of memory fragmentation, the bufferlist crc and rebuild become a bottleneck. This change moves that complex logic out of the journal write thread.
Signed-off-by: Xinze Chi <xinze@xsky.com>
Reviewed-by: Haomai Wang <haomai@xsky.com>
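The idea in this commit message can be sketched roughly as follows, under heavy simplification. This is not the FileJournal code: `Entry`, `toy_checksum`, the `Journal` struct and its `write_thread_step` are hypothetical, `std::string` stands in for bufferlist, and the checksum is a toy rather than a real CRC. The point is only that padding and checksumming happen in `prepare_entry`, in the submitting thread, so the single write thread just drains a queue.

```cpp
#include <cstdint>
#include <deque>
#include <string>
#include <utility>

struct Entry {
  std::string data;    // already padded to the alignment
  uint32_t checksum;   // computed before queueing
  uint64_t orig_len;   // payload length before padding
};

// Toy checksum for illustration; not a real CRC.
uint32_t toy_checksum(const std::string& s) {
  uint32_t h = 0;
  for (unsigned char c : s) h = h * 31 + c;
  return h;
}

// Intended to run in the submitting thread: does the expensive work.
Entry prepare_entry(const std::string& payload, uint64_t align) {
  Entry e{payload, 0, payload.size()};
  uint64_t rem = e.data.size() % align;
  if (rem != 0) {
    e.data.append(align - rem, '\0');
  }
  e.checksum = toy_checksum(e.data);
  return e;
}

struct Journal {
  std::deque<Entry> writeq;
  std::string disk;  // stands in for the journal file

  void submit_entry(Entry e) { writeq.push_back(std::move(e)); }

  // Body of the single write thread: no rebuild and no checksum here.
  void write_thread_step() {
    while (!writeq.empty()) {
      disk += writeq.front().data;
      writeq.pop_front();
    }
  }
};
```

A real implementation would also need locking around the queue; that is omitted here to keep the sketch short.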
Passed testing. I think this should get a second set of eyes on it, and/or another run through for good measure. http://pulpito.ceph.com/sage-2015-11-12_19:19:23-rados-wip-sage-testing---basic-multi/
@@ -1959,7 +1959,10 @@ int FileStore::queue_transactions(Sequencer *posr, list<Transaction*> &tls,
     journal->throttle();
     //prepare and encode transactions data out of lock
     bufferlist tbl;
     int data_align = _op_journal_transactions_prepare(o->tls, tbl);
     int orig_len = -1;
     if (journal && journal->is_writeable()) {
This test was already done on line 1956.
Good catch.
Signed-off-by: Xinze Chi <xinze@xsky.com>
Force-pushed from 755e9bc to c6a2ec2.
os: write file journal optimization Reviewed-by: Sage Weil <sage@redhat.com> Reviewed-by: David Zafman <dzafman@redhat.com>