Low performance with volumes on btrfs #6862
This should all be kernel-side - under the hood, these are just bind mounts into the mount namespace of the container. I don't think Podman itself does (or can do) anything about performance here - if there are issues, they're likely on the kernel's side.
While I agree with @mheon that podman probably cannot fix the problem by itself, I also disagree with the decision to close this issue. In fact, the problem seems to emerge from the interaction of podman with the kernel (or btrfs): btrfs works well when used on the host directly.
@matpen Sure, we can reopen it, but you would need to tell us what we are doing wrong with the mount point. If you look at how the mount point is being handled in the container's mount namespace, and tell us what we are doing wrong, we would have a chance of fixing it. None of us use BTRFS, so us figuring this out is not likely, and we cannot even identify whether this is a Podman issue. One thing to check would be whether this works better with Docker. It could be that BTRFS does not work well when used in namespaces.
@rhatdan Thank you for reopening the issue and supporting this investigation! I am definitely willing to do all the work necessary to set up a reproducible case and collect the required information. All I ask from the podman team is a few pointers as I make progress, specifically for those podman features I have limited understanding of.
For example, it seems like you already have a good idea of how to "look" at the mount namespace; do you mind suggesting it here? Maybe
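For anyone following along, here is a minimal sketch of one way to inspect a container's mount namespace (the container name `mariadb` is an assumption, and `nsenter` needs root):

```shell
# Sketch: inspect how a volume is mounted inside a container's mount
# namespace. The container name "mariadb" is an assumption.

# PID of the container's init process
PID=$(podman inspect --format '{{.State.Pid}}' mariadb)

# List the mount table as seen from inside the container (needs root)
nsenter --target "$PID" --mount findmnt

# Or read the raw per-process mount table directly; filtering for
# "btrfs" shows any btrfs-backed mounts visible to the container
grep btrfs "/proc/$PID/mountinfo"
```

Reading `/proc/$PID/mountinfo` has the advantage of also showing mount propagation flags (shared/private/slave), which can matter for bind mounts.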
This too is worth testing: I plan to set up a droplet from scratch, with two identical volumes running
I was thinking
@rhatdan Thank you for the pointer. So here is the result for the actual container that is presenting the problem:
Ok, that is showing the overlay mount, but not the volume mounts inside of the container.
My bad, here is the correct data:
Does the same performance degradation happen if subvolumes are not involved?
@mheon No. The btrfs volume makes it slow. I just performed extensive tests, and am about to post the results. |
As mentioned in my earlier comments, today I also took the time to setup a droplet from scratch and perform some comparisons. The droplet is a standard droplet with 1 CPU, Ubuntu 18.04 and kernel 4.15.0-66-generic. The test involves a simple set of SQL queries that create a new database and some tables, and import about 35 MB of CSV data into them. The command to launch the container is:
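The exact command is not preserved in this extract; a hypothetical invocation for such a test might look like the following (image tag, names, volume path, and password are all assumptions, not the reporter's actual values):

```shell
# Hypothetical reproduction setup; every name here is a placeholder.
# A named volume whose storage lives on the filesystem under test:
podman volume create dbdata

# Run mariadb with its data directory on that volume
podman run -d --name mariadb-test \
  -e MYSQL_ROOT_PASSWORD=secret \
  -v dbdata:/var/lib/mysql \
  mariadb:10.4
```

The key variable in the comparison is only where the volume's backing storage lives (btrfs vs. ext4); everything else stays identical between runs.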
After launching the container, I manually
The four tests refer to the same container, but the data residing on
Of course, the tests have been repeated multiple times, and at different times of the day, to exclude other effects.
And here is what I can comment so far:
Since this is just a testing droplet, I will go ahead and perform more aggressive tests, like mounting
This does look like a kernel issue, but can you try running a similar test with just cgroups and no container runtime at all (to help narrow down where the issue is coming from)? You could do this with something like:

$ for ss in /sys/fs/cgroup/*; do mkdir $ss/test-performance; done
$ echo $$ > /sys/fs/cgroup/blkio/test-performance/tasks
$ # spawn mariadb in this cgroup
$ # run the test
$ # repeat, adding the shell to a new cgroup each time and restarting mariadb inside this cgroup

(Writing the shell's PID into the cgroup's tasks file is what actually moves the shell, and thus its children, into the cgroup.) This would help figure out whether this is a derivative of the cgroup issue we had several years ago (that you linked in this thread). Alternatively, can you provide an example
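One quick sanity check before spawning mariadb is to confirm the shell actually landed in the new cgroup (a minimal sketch, assuming the cgroup v1 layout used in the commands above):

```shell
# Show which cgroups the current shell belongs to; on cgroup v1 each
# line of this file is hierarchy-id:controllers:path.
cat /proc/self/cgroup

# The test-performance path should appear once the shell was added
if grep -q 'test-performance' /proc/self/cgroup; then
  echo "shell is in the test cgroup"
else
  echo "shell is NOT in the test cgroup"
fi
```

Any mariadb process started from that shell inherits the same cgroup membership, so checking the shell is sufficient.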
@cyphar Thank you for joining this discussion, and for your valuable input: this is exactly the kind of help I was hoping to get. So I went on and, based on your directions, I prepared a scripted test that I ran multiple times today:
The results are attached. These include some statistics, too, such as the total and average test duration, and variance. And here is what seems to emerge:
I am eager to receive your comments on the results. Also, at this link I found that the DO engineers suggest tweaking the "queue depth", which reminds me of the scheduler-related solutions that have been suggested in the past. However, I am unsure which values would be sensible to try.
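On the queue-depth point: the relevant knobs live under /sys/block. A small sketch for inspecting them before changing anything (the device name `vda` is typical on a DigitalOcean droplet, but that is an assumption, hence the loop over all devices):

```shell
# Show the active I/O scheduler (the bracketed entry) and the request
# queue depth for every block device on the system.
for q in /sys/block/*/queue; do
  echo "== $q =="
  cat "$q/scheduler"      # e.g. "[mq-deadline] none"
  cat "$q/nr_requests"    # queue depth for this device
done
```

Writing a new value to `nr_requests` requires root, e.g. `echo 256 > /sys/block/vda/queue/nr_requests`; the setting does not persist across reboots.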
I'm pretty sure this is an SQL on Btrfs issue, not a podman on Btrfs issue. But to isolate this, it would be helpful if someone could put together a generic test case, i.e. a list of reproduction steps, that makes it easy for a person unfamiliar with podman to test for performance differences between different setups; i.e. it would be filesystem-agnostic. An additional test case might then do some basic SQL tasks and time them. Something like this:
Maybe it should be 1000 containers. That may not be typical, but the idea might be to go an order of magnitude greater, in order to help expose problems. If 1000 containers is not so uncommon, then maybe the test case should test creating, modifying, and destroying 10000 containers.

And for SQL testing, it really should try to isolate it to specifically SQL, i.e. it would do no timing while a container is being created or destroyed. The idea of doing this with podman is as an expedient, to make it easier for those unfamiliar with such things to do the testing, including making it easier to automate in something like openQA - and then we have a basis to also do automated regression tests for these things and get an early warning if something has gone wrong somewhere.

Time to run a series of SQL tests (perhaps start with sqlite, since it's common, and also used in Firefox - there's a dual use for such a test).

The synthetic nature of such tests is not necessarily a negative. The lack of detailed timing information for every conceivable test is also not necessarily a negative. Such tests won't estimate whether a particular configuration runs a particular database any slower or faster, but they might help expose edge cases and regressions. If there's a big difference between two configurations, we can get more detailed information by running such benchmarks concurrently with eBPF tracing via bcc-tools like btrfsslower, fileslower, and biolatency, and figure out where any time discrepancies lie.
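The container-churn part of that idea could be sketched roughly like this (the count, image, and naming scheme are placeholders, not a polished benchmark):

```shell
# Rough timing harness for container create/destroy churn.
# N and the image are placeholders; scale N up to stress the setup.
N=100
start=$(date +%s)
for i in $(seq 1 "$N"); do
  c=$(podman create --name "churn-$i" alpine true)
  podman rm "$c" >/dev/null
done
end=$(date +%s)
echo "created+destroyed $N containers in $((end - start))s"
```

Running the same loop with the container storage on btrfs and then on ext4 would give the kind of filesystem-agnostic comparison described above, without involving any SQL at all.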
@cmurf Thank you for sharing some ideas. I would appreciate it if you had the chance to test them in your environment, so as to have a comparison. Here are some thoughts:
The tests in my latest comments are pretty much automatic: simply launch the script, and it will run the tests by itself. Sure, some setup is needed to prepare the environment, e.g. installing MySQL etc.
The tests include a case without containers, so it's just "raw" SQL on different filesystems. The tests only measure the SQL query, and not the time required to set up the test (e.g. launching and destroying containers).
A friendly reminder that this issue had no activity for 30 days. |
I just finished a long debugging process with the cloud provider, but we could not track down the problem either. Unfortunately, the tests appear to lead to random results and are very complex to set up. All that we know so far is that
I am sorry that I do not have much more to report: I, too, hoped that someone could add more insight. If you would rather close this issue, I will re-open it at a later point if I can provide more data.
I don't believe this is a Podman issue, since the BTRFS filesystem is just being bind mounted into the container. This is a BTRFS issue and should be taken up with the OS vendor.
@matpen Did you report this somewhere else? I tried to import a 2.5 GB SQL dump today and it did not finish in hours. I first tried in Docker and then tested on a native MySQL Server 8 on Ubuntu, with the same result.
@jhit thank you for reporting your experience with this! I have moved away from that setup since my last comment in this issue, so unfortunately I have nothing new to share. IIRC I also have not reported this anywhere else... but I would be interested in the conversation, if you decide to do so. |
@matpen Ok. Will see if I find some time to start a conversation in the btrfs community. Will post a link here if I get to it. |
Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug
Description
I am experiencing low performance when containers use volumes backed by a btrfs filesystem. Since I started to use podman, I have been puzzled by my applications being slow when deployed (DigitalOcean droplet) but pretty fast in my development environment (laptop). Now I have finally found a reproducer, which I will try to explain below.
The issue appears particularly with a mariadb container, where I am importing some data from CSV files. Performing the same query on DigitalOcean (inside the container) is 6 to 10 times slower than on my laptop (again, inside the container). Today I also compared with the same query performed directly on the host (droplet) and by using another volume backed by ext4: sure enough, the issue only appears inside the container when the volume is backed by btrfs.
I therefore performed some research, and came across this thread, this bug and this issue, which more or less describe my situation, but for docker. Unfortunately, the solutions described therein do not work in my case:
and from here:
however, these files do not exist on my system. See below for more connected info.
however, none of these work in my case. I am only given the options `none` and `mq-deadline`, which do not seem to affect performance at all, even after restarting the containers.

As mentioned, I also tried a volume backed by another filesystem (`ext4`) and the performance increased to the expected level. This is unfortunate, as I was planning to leverage some btrfs features like RAID and snapshots.

Having run out of ideas, I turn to the experts hoping to receive some insight.
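To separate the filesystem from the container runtime, a crude write-latency comparison can be run directly against the two mount points. This is only a sketch (the paths are assumptions); `oflag=dsync` forces every write to reach the device, which is closer to a database's fsync-heavy pattern than a plain `dd` throughput test:

```shell
# Compare synchronous small-write behaviour of two directories, e.g. a
# btrfs-backed volume path and an ext4 path. Paths are assumptions.
bench_dir() {
  dir=$1
  # 1000 x 16 KiB synchronous writes approximate fsync-heavy DB I/O;
  # dd prints its timing summary on stderr, hence the redirect.
  dd if=/dev/zero of="$dir/bench.tmp" bs=16k count=1000 oflag=dsync 2>&1 \
    | tail -n 1
  rm -f "$dir/bench.tmp"
}

bench_dir /mnt/data      # btrfs-backed volume (assumed path)
bench_dir /var/tmp       # root filesystem, for comparison
```

If the btrfs directory is dramatically slower here even without podman involved, that points at the filesystem or block layer rather than the container stack.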
Steps to reproduce the issue:
See above description.
Describe the results you received:
Low performance during SQL queries.
Describe the results you expected:
Same performance as on the host.
Additional information you deem important (e.g. issue happens only occasionally):
Although I did not measure yet, mongodb and other processes seem to be affected, too.
However, I did accurately measure the performance of all other possibly involved components (CPU, RAM, even the very same volume using `dd`) and everything seems nominal: only the SQL query is reproducibly slow.

Output of `podman version`:

Output of `podman info --debug`:
Package info (e.g. output of `rpm -q podman` or `apt list podman`):

Additional environment details (AWS, VirtualBox, physical, etc.):
The issue appears on a DigitalOcean droplet, where both the images and the podman volumes are stored on a btrfs filesystem (`/mnt/data`) on an "attached storage volume".