Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: write file journal optimization #6484

Merged
merged 2 commits into from Nov 15, 2015

Conversation

XinzeChi
Copy link
Contributor

@XinzeChi XinzeChi commented Nov 6, 2015

Currently, there is single write thread for file journal, so it would be bottleneck.
It is important to keep logic of the journal write thread simple. According to the
implementation of transaction encoding, it is almost impossible that the write
bufferlist would be align. So write journal would call rebuild_aligned almost every time.
Because of the memory fragmentation, the bufferlist crc and rebuild would be bottleneck.

My implementation would move the complex logic out of journal write thread.

Signed-off-by: Xinze Chi xinze@xsky.com
Reviewed-by: Haomai Wang haomai@xsky.com

@XinzeChi
Copy link
Contributor Author

XinzeChi commented Nov 6, 2015

@liewegas , please review it.

@@ -71,6 +72,8 @@ class Journal {

virtual bool should_commit_now() = 0;

virtual int _op_journal_transactions_prepare(list<ObjectStore::Transaction*>& tls, bufferlist& tbl) = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we call this prepare_entry() so that it looks like submit_entry (they are now coupled)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, while we are here, let's make the tbl arg a bufferlist *tbl so that it's clear it's an output argument (most of this old code doesn't follow the coding style, but we should update it when it's convenient to do so)

@liewegas
Copy link
Member

liewegas commented Nov 8, 2015

Ah, that looks much better!

@XinzeChi
Copy link
Contributor Author

@liewegas , http://pulpito.ceph.com/sage-2015-11-11_14:51:49-rados-wip-sage-testing---basic-multi/ assert at filejournal throttle is introduced by this PR. Because the peek_write().bl.length() is not the orig length. I would push the new patch.

@XinzeChi
Copy link
Contributor Author

repush the code, sorry for the bug.

diff --git a/src/os/FileJournal.cc b/src/os/FileJournal.cc
index 4fa7ae8..c09b62c 100644
--- a/src/os/FileJournal.cc
+++ b/src/os/FileJournal.cc
@@ -893,7 +893,7 @@ int FileJournal::prepare_multi_write(bufferlist& bl, uint64_t& orig_ops, uint64_
        // throw out what we have so far
        full_state = FULL_FULL;
        while (!writeq_empty()) {
-'         put_throttle(1, peek_write().bl.length());
+'         put_throttle(1, peek_write().orig_len);
          pop_write();
        }  
        print_header(header);

@liewegas
Copy link
Member

Thank you! I will retest.

On Wed, 11 Nov 2015, Chi Xinze wrote:

@liewegas , http://pulpito.ceph.com/sage-2015-11-11_14:51:49-rados-wip-sage-testing---basic-multi/ assert at filejournal throttle is introduced
by this PR. Because the peek_write().bl.length() is not the orig length. I would push the new patch.

?
Reply to this email directly or view it on GitHub.[AAabh6lxtTQSPGRLT7zoqPagqkY2gywtks5pFDFigaJpZM4Gdb9b.gif]

@XinzeChi
Copy link
Contributor Author

@liewegas , I am so sorry that there some 3 places calling put_throttle(1, peek_write().bl.length());.
I only fix one last commit. now I have fix all. Thank you for your patience.

Currently, there is single write thread for file journal, so it would be bottleneck.
It is important to keep logic of the journal write thread simple. According to the
implementation of transaction encoding, it is almost impossible that the write
bufferlist would be align. So write journal would call rebuild_aligned almost every time.
Because of the memory fragmentation, the bufferlist crc and rebuild would be bottleneck.

My implementation would move the complex logic out of journal write thread.

Signed-off-by: Xinze Chi <xinze@xsky.com>
Reviewed-by: Haomai Wang <haomai@xsky.com>
@liewegas
Copy link
Member

passed testing. I think this should get a second set of eyes on it, and/or anothre run through for good measure. http://pulpito.ceph.com/sage-2015-11-12_19:19:23-rados-wip-sage-testing---basic-multi/

@ghost ghost changed the title os: write file journal optimazation os: write file journal optimization Nov 13, 2015
@@ -1959,7 +1959,10 @@ int FileStore::queue_transactions(Sequencer *posr, list<Transaction*> &tls,
journal->throttle();
//prepare and encode transactions data out of lock
bufferlist tbl;
int data_align = _op_journal_transactions_prepare(o->tls, tbl);
int orig_len = -1;
if (journal && journal->is_writeable()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was already done on line 1956.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch.

Signed-off-by: Xinze Chi <xinze@xsky.com>
liewegas added a commit that referenced this pull request Nov 15, 2015
os: write file journal optimization

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: David Zafman <dzafman@redhat.com>
@liewegas liewegas merged commit 16f23ab into ceph:master Nov 15, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants