Question: will io_uring "block" current thread when reading regular file? #826

Closed
YjyJeff opened this issue Mar 14, 2023 · 7 comments

Comments

@YjyJeff

YjyJeff commented Mar 14, 2023

I am trying to build a database on top of io_uring, so I need to perform a lot of IO operations on regular files. After a lot of searching, I am still confused about how io_uring handles regular files.

According to Lord of the io_uring, io_uring works fine for regular files. However, after reading about the worker pool in io_uring, I think reading a regular file will "block" the current thread (I am a kernel newbie, so correct me if I am wrong 0.0):

  1. After submission, io_uring tries a non-blocking read on the regular file.
  2. Because it is a regular file, a non-blocking read always reports the file as ready (see "Non blocking IO with regular file").
  3. The read then executes fully inline, reading from the file system on the thread that called io_uring_enter. The "blocking" happens here.

Based on this assumption, I wrote a simple program that submits multiple read operations to io_uring in a batch and waits for the completions at the end. I ran it under perf:

sudo perf stat -a -e 'io_uring:io_uring_*' -- ./demo

I do not observe any hits on io_uring:io_uring_poll_arm or io_uring:io_uring_queue_async_work. I then profiled the program with a flame graph, which shows that all of the CPU time is spent reading from the file system inside the io_uring_enter syscall. Both profiles support my hypothesis.

So, since reading from the file system is fairly expensive, will it block the thread (so that it cannot handle tasks woken up by a socket, etc.)? Should we mark read operations on regular files as async so that they are put on the worker pool?

Thanks!

@axboe
Owner

axboe commented Mar 14, 2023

Your analysis is wrong. io_uring does not rely on poll for regular files, as that obviously doesn't work since they are always ready for reading/writing. Instead it does non-blocking issues of IO, and will retry them from io-wq if that fails. This is true for O_DIRECT on any kernel, for buffered reads for quite a while, and for buffered writes since a few releases ago. What is true for all kernels is that you don't need to mark requests async to offload them: io_uring is aware of its limitations in those kernels and will offload them upfront as need be.

Please remember to include a kernel version and storage stack setup in your description. Things change for the better over time, so it's hard to advise generically in case you are running an ancient kernel.

@YjyJeff
Author

YjyJeff commented Mar 14, 2023

@axboe I am using Ubuntu 20.04 with kernel version 5.15.0-67-generic. Five NVMe SSDs form a RAID0 array, and btrfs is the file system. Files are opened without O_DIRECT, so these are buffered reads. I will try a newer kernel.

@axboe
Owner

axboe commented Mar 14, 2023

raid0 got nowait support fairly recently - definitely after 5.15, but not long after... I don't fully recall, will have to check the git logs for that. I think it was 6.0.

@beef9999

@YjyJeff Try coroutines built on io_uring; it will feel like writing a traditional libc program, and you won't need to care about the underlying details. This is one of our experiences from database work.

@YjyJeff
Author

YjyJeff commented Mar 14, 2023

@axboe What should I observe in the perf output if io_uring is working well? Should I see io_uring:io_uring_queue_async_work?

Instead it does non-blocking issues of IO, and will retry them from io-wq if that fails

BTW: what does "it does non-blocking issues of IO" mean, and when does it fail? I searched a lot and did not find an answer 😭 Could you describe it briefly? Sorry for the dumb question.

@GalaxySnail
Contributor

I assume that it's doing something like:

struct iovec iov = {.iov_base = buf, .iov_len = buflen};
ssize_t ret = preadv2(fd, &iov, 1, offset, flags | RWF_NOWAIT);
if (ret < 0) {
    if (errno == EAGAIN) {
        run_in_io_wq_thread_pool(...);
    } else {
        return -1;
    }
}
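
For anyone who wants to try the pattern above from user space, here is a minimal sketch. It is only an analogy, not what the kernel does internally: the helper name read_nowait_then_block is made up for illustration, the blocking pread() merely stands in for the io-wq worker (which would run on a different thread), and treating EOPNOTSUPP and ENOSYS like EAGAIN is my own assumption to cover file systems or kernels where a nonblocking attempt isn't possible.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for preadv2() in glibc */
#endif
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_NOWAIT
#define RWF_NOWAIT 0x00000008 /* value from the Linux UAPI headers */
#endif

/*
 * Hypothetical helper: attempt a non-blocking read first and fall back to a
 * blocking read if the data cannot be returned without waiting.
 *
 * *offloaded is set to true when the fallback path was taken. Returns the
 * number of bytes read, or -1 with errno set. Note that the non-blocking
 * attempt may return fewer bytes than requested if only part of the range
 * is available without waiting.
 */
ssize_t read_nowait_then_block(int fd, void *buf, size_t len, off_t offset,
                               bool *offloaded)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };

    *offloaded = false;

    /* Fast path: succeeds only if the read can complete without blocking. */
    ssize_t ret = preadv2(fd, &iov, 1, offset, RWF_NOWAIT);
    if (ret >= 0)
        return ret;

    /* Any error other than "would block" or "nowait unsupported" is real. */
    if (errno != EAGAIN && errno != EOPNOTSUPP && errno != ENOSYS)
        return -1;

    /* Slow path: stand-in for handing the request to a worker thread. */
    *offloaded = true;
    return pread(fd, buf, len, offset);
}
```

Calling it on a file whose data is already in the page cache should normally take the fast path and leave *offloaded false; a cold file, or a storage stack without nowait support, should take the fallback.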

@axboe
Owner

axboe commented Mar 14, 2023

Yep, what @GalaxySnail said, just on the kernel side. There are some exceptions for cases where we know it doesn't work (e.g. we can check upfront); those go straight to io-wq, as a nonblocking attempt isn't possible.

In general, any io-wq activity is not ideal. Seeing some on storage workloads isn't a major cause for concern, however; that can happen even in cases that fully support nonblocking issue. Examples include the queue being quiesced for whatever reason, or running out of resources.

@YjyJeff YjyJeff closed this as completed Mar 14, 2023