Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os/blestore/NVMEDevice: fix the I/O logic for read #13971

Merged
merged 1 commit into from Mar 20, 2017

Conversation

optimistyzy
Copy link
Contributor

@optimistyzy optimistyzy commented Mar 15, 2017

Aio_submit will submit both aio_read/write, and also there
are synchronized read and random read, so we need to
handle the read I/O completion in a correct way.

Also fix the exception logic in command send, make the style
consistent.
Signed-off-by: optimistyzy optimistyzy@gmail.com

@optimistyzy optimistyzy force-pushed the 0315_1 branch 4 times, most recently from 4d3d127 to ed7b015 Compare March 15, 2017 07:03
@optimistyzy
Copy link
Contributor Author

This patch is not ready for review, need to be tested. Thanks.

@liewegas liewegas changed the title Bluestore,NVMEDEVICE: fix the I/O logic for READ operation os/blestore/NVMEDevice: fix the I/O logic for read Mar 15, 2017
@liuhongtong
Copy link

I have test this pr based on commit f46b327 of master branch.
everything looks well.
seq read: 1 hour
rand read: 18 hours

root@node3:ceph(master)$ ceph -s
2017-03-17 09:52:02.469118 7f98e567b700 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
2017-03-17 09:52:02.471024 7f98e567b700 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
cluster 5a027377-d1aa-41aa-900c-cb96f2a12e84
health HEALTH_WARN
mon.node3 low disk space
monmap e2: 1 mons at {node3=192.168.200.213:6789/0}
election epoch 4, quorum 0 node3
mgr no daemons active
osdmap e18: 2 osds: 2 up, 2 in
flags sortbitwise,require_jewel_osds,require_kraken_osds,require_luminous_osds
pgmap v24506: 768 pgs, 2 pools, 92232 MB data, 23062 objects
95042 MB used, 652 GB / 745 GB avail
768 active+clean
root@node3:ceph(master)$ ceph df
2017-03-17 09:52:09.907172 7f65cff14700 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
2017-03-17 09:52:09.910367 7f65cff14700 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
745G 652G 95042M 12.45
POOLS:
NAME ID USED %USED MAX AVAIL OBJECTS
rbd 0 0 0 608G 0
test 1 92232M 12.89 608G 23062
root@node3:ceph(master)$ ceph osd df
2017-03-17 10:06:45.977757 7fa14b2fc700 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
2017-03-17 10:06:45.983255 7fa14b2fc700 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
ID WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS
0 0.36389 1.00000 372G 44307M 329G 11.61 0.93 373
1 0.36389 1.00000 372G 50735M 323G 13.30 1.07 395
TOTAL 745G 95042M 652G 12.45
MIN/MAX VAR: 0.93/1.07 STDDEV: 0.84

@optimistyzy
Copy link
Contributor Author

@liupan1111 @liuhongtong @yuyuyu101 I think that this patch can solve the rand read crash related issue

while (t->return_code > 0)
ioc.cond.wait(l);
while (t->return_code > 0) {
ioc.aio_wait();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this isn't a safe check. we may enter the loop while return_code == 0, but before wait other thread change this and signal. at last we are stuck into wait infinitely!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I re-visited the code, and think that it is a safe check and will not cause the issue you mentioned.

std::unique_lock<std::mutex> l(ctx->lock);
ctx->cond.notify_all();
// read submitted by AIO
if(!task->return_code) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why could judge whether is a AIO read by return_code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the code. For AIO read, we set the initialized value of return_code to 0.

@optimistyzy optimistyzy force-pushed the 0315_1 branch 4 times, most recently from cdf40c7 to dceb911 Compare March 20, 2017 05:38
@optimistyzy
Copy link
Contributor Author

@yuyuyu101

Aio_submit will submit both aio_read/write, and also there
are synchronized read and random read, so we need to
handle the read I/O completion in a correct way.

Since random read has its own ioc, so the
num_reading for ioc will be at most 1, which will be easy
to handle in io_complete. And we need only to differentiate
whethere it is an aio_read.

Also fix the exception logic in command send, make the style
consistent.

Signed-off-by: optimistyzy <optimistyzy@gmail.com>
@yuyuyu101 yuyuyu101 merged commit fe66443 into ceph:master Mar 20, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants