New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
blk/aio: fix long batch (64+K entries) submission. #56352
base: main
Are you sure you want to change the base?
Conversation
c29dd7b
to
bc78b3b
Compare
jenkins test make check |
jenkins test api |
@@ -25,33 +25,43 @@ int aio_queue_t::submit_batch(aio_iter begin, aio_iter end, | |||
int r; | |||
|
|||
aio_iter cur = begin; | |||
struct aio_t *piocb[aios_size]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem, as I understand it, is very simple: aios_size
got an overflow.
I think solution should be to either:
- change
uint16_t
touint32_t
(preferred) - count elements between begin and end; use
alloca
to allocate piocb.
The only benefit of current solution is that it does minimize stack allocation, but at cost of added complexity. I do not think its worth it.
Note that for every entry in piocb
we will be reading at least 4K from the drive. I think we can afford having entire piocb
on stack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aclamk - you might be missing one more aspect - amount of chunks (and hence amount IOs issued in a single batch) in an object is determined (to some degree) by cluster's users. So it becomes possible for them to use "bad" objects which could DDOS OSDs or do something inappropriate.
And this patch provides a sort of protection against that - at aio level at least..
Given that I'd prefer not to [ab]use stack for this task.
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
Due to aios_size being a uint16 and the source value for the actual call being an int there was a possible overflow. This was "fixed" with an assert, however that still causes a crash. This commit removes the need for aios_size completely by iterating over the list and submitting it in max_iodepth batches. Fixes: https://tracker.ceph.com/issues/46366 Signed-off-by: Robin Geuze <robin.geuze@nl.team.blue> (cherry picked from commit f87db49)
444a0ac
to
3666d9d
Compare
jenkins test api |
src/blk/aio/aio.cc
Outdated
cur->aio.aiocb.aio_sigevent.sigev_notify = SIGEV_KEVENT; | ||
cur->aio.aiocb.aio_sigevent.sigev_notify_kqueue = ctx; | ||
cur->aio.aiocb.aio_sigevent.sigev_value.sival_ptr = &(*cur); | ||
r = aio_read(&cur->aio.aiocb); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'aio_read'? How on earth POSIXAIO could have ever worked with ceph? I refuse to believe that there were never single aio request to write one chunk.
Proposal:
Lets replace implementation here with #error Implementation removed.
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's funny. Replaced with aio_write()
src/blk/aio/aio.cc
Outdated
r = io_submit(ctx, toSubmit, (struct iocb**)(piocb + itemCount0)); | ||
if (r >= 0 && r < toSubmit) { | ||
r = -EAGAIN; | ||
itemCount0 += (r > 0 ? r : 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Easy task for an optimizer: itemCount0 += 0;
, as r==-EAGAIN
:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believed I've fixed that... Let me check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missed the push, should be fine now
41b2fa9
to
4790da3
Compare
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
4790da3
to
acf7f15
Compare
jenkins test make check |
jenkins test make check |
This PR is under test in https://tracker.ceph.com/issues/65796. |
This is a revival of #44065
Includes some fixes and UT.
Fixes: https://tracker.ceph.com/issues/64966
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
x
between the brackets:[x]
. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test dashboard cephadm
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox
jenkins test windows
jenkins test rook e2e