Socket bufs lock support (tcp performance decrease after migration fix) #1568
2811a09 to a0e14e3
Codecov Report
@@             Coverage Diff              @@
##           criu-dev    #1568      +/-   ##
============================================
- Coverage     69.07%   68.86%   -0.21%
============================================
  Files           136      137       +1
  Lines         33248    33391     +143
============================================
+ Hits          22967    22996      +29
- Misses        10281    10395     +114
a0e14e3 to 715de39
Now that the kernel patch is in net-next (https://lore.kernel.org/linux-mips/162807840601.323.12461942403559350811.git-patchwork-notify@kernel.org/), we can start reviewing this PR. Note that on older kernels this functionality and its test are automatically disabled by the corresponding kerndat checks.
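To illustrate what such a feature check could look like, here is a minimal sketch. It is not CRIU's actual kerndat code: the function name probe_so_buf_lock is hypothetical, and the fallback option number 72 is the asm-generic value (a few architectures use a different number). It assumes an unsupported kernel rejects the option with ENOPROTOOPT. Note that upstream kernel headers name the option SO_BUF_LOCK.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef SO_BUF_LOCK
#define SO_BUF_LOCK 72 /* assumption: asm-generic value, differs on a few arches */
#endif

/*
 * Hypothetical probe: returns 1 if the kernel knows SO_BUF_LOCK,
 * 0 if it rejects the option with ENOPROTOOPT, -1 on any other error.
 */
int probe_so_buf_lock(void)
{
	int val = 0;
	socklen_t len = sizeof(val);
	int ret;
	int sk;

	/* The option lives at SOL_SOCKET level, so any socket type will do. */
	sk = socket(AF_UNIX, SOCK_DGRAM, 0);
	if (sk < 0)
		return -1;

	if (getsockopt(sk, SOL_SOCKET, SO_BUF_LOCK, &val, &len) == 0)
		ret = 1;
	else if (errno == ENOPROTOOPT)
		ret = 0;
	else
		ret = -1;

	close(sk);
	return ret;
}
```

A caller would run the probe once at startup and skip both dumping and restoring the lock mask when it returns 0.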
A friendly reminder that this PR had no activity for 30 days.
LGTM, but
Sure,
Enjoy your quest!
@Snorch could you rebase the PR and fix what you wanted to fix?
715de39 to fcc2322
Thanks for the reminder! Rebased and removed the "vz_" prefix.
fcc2322 to 6f1dcad
6f1dcad to 9b33b06
@checkpoint-restore/maintainers Please review this; I believe it's important, as it fixes a network (tcp) performance decrease after migration.
Same error as in #1628 (comment) on circleci: test-local-gcc
Can you try to rerun the failed tests? If you click on 'Details' of the failed circleci test, there should be a button to rerun the tests. Usually I can trigger reruns but it does not work this time. Not sure why.
9b33b06 to 7f344a7
I've rebased and pushed, but when just re-running the circleci test I was seeing the error again and again...
It is really strange. It only happens on this PR but seems unrelated to this PR.
Not only this one. It also happens on #1628. Both are mine, obviously I'm damn lucky =) (upd: let me try rerunning after a rebase on master)
We also want to c/r socket buf locks (SO_BUF_LOCKS), which are also implicitly set by setsockopt(SO_{SND,RCV}BUF*), so we need to order these two properly. That's why we need to wait for sk_setbufs to finish. And there is not much point in setting buffer sizes asynchronously anyway. Reviewed-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
This is a new kernel feature to let criu restore sockets with kernel auto-adjusted buffer sizes. Reviewed-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
When one sets socket buffer sizes with setsockopt(SO_{SND,RCV}BUF*), the kernel sets the corresponding SOCK_SNDBUF_LOCK or SOCK_RCVBUF_LOCK flags on struct sock. It means that such a socket with an explicitly changed buffer size cannot be auto-adjusted by the kernel (e.g. if there is free memory the kernel can auto-increase default socket buffers to improve performance). (see tcp_fixup_rcvbuf() and tcp_sndbuf_expand()) CRIU always changes buf sizes on restore, which means that all sockets receive lock flags on struct sock and become non-auto-adjusted after migration. In some cases it can decrease performance of network connections quite a lot. So let's c/r socket buf locks (SO_BUF_LOCKS), so that sockets for which auto-adjustment is available do not lose it. Reviewed-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Just set all possible values 0-3 and check that they persist. Reviewed-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Reviewed-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
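The test idea from the commits above (set every lock mask from 0 to 3 and check that it persists) can be sketched roughly as follows. This is not the test added by the PR: the function name check_buf_lock_roundtrip is hypothetical, there is no checkpoint/restore step between the set and the get, and the fallback constants are assumptions taken from the upstream kernel headers (option SO_BUF_LOCK, asm-generic number 72, flag bits 1 and 2). An AF_UNIX socket is used for simplicity; the option is handled at SOL_SOCKET level, so a TCP socket should behave the same way.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef SO_BUF_LOCK
#define SO_BUF_LOCK 72 /* assumption: asm-generic value */
#endif
#ifndef SOCK_SNDBUF_LOCK
#define SOCK_SNDBUF_LOCK 1
#endif
#ifndef SOCK_RCVBUF_LOCK
#define SOCK_RCVBUF_LOCK 2
#endif
#define BUF_LOCK_MASK (SOCK_SNDBUF_LOCK | SOCK_RCVBUF_LOCK)

/*
 * Hypothetical check: returns 0 if every mask 0..3 reads back unchanged,
 * 1 if the kernel does not support SO_BUF_LOCK, -1 on mismatch or error.
 */
int check_buf_lock_roundtrip(void)
{
	int ret = 0;
	int sk = socket(AF_UNIX, SOCK_STREAM, 0);

	if (sk < 0)
		return -1;

	for (int val = 0; val <= BUF_LOCK_MASK; val++) {
		int got = -1;
		socklen_t len = sizeof(got);

		if (setsockopt(sk, SOL_SOCKET, SO_BUF_LOCK, &val, sizeof(val)) ||
		    getsockopt(sk, SOL_SOCKET, SO_BUF_LOCK, &got, &len)) {
			/* Old kernels reject the unknown option. */
			ret = (errno == ENOPROTOOPT) ? 1 : -1;
			break;
		}
		if (got != val) {
			ret = -1;
			break;
		}
	}

	close(sk);
	return ret;
}
```

In a real c/r test the set would happen before the dump and the get after the restore, so that the value has to survive the image round trip.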
7f344a7 to d33cb4d
For some reason your runs have 'EC2: true' and you are running on much newer CPUs. It works when running on Google Cloud Engine but it fails when running on EC2:
-BOOT_IMAGE=/boot/vmlinuz-5.4.0-1025-aws root=PARTUUID=ea9aeff6-01 ro console=tty1 console=ttyS0 nvme_core.io_timeout=4294967295 panic=-1
+BOOT_IMAGE=/boot/vmlinuz-5.4.0-1021-gcp root=PARTUUID=2707a646-6a60-476b-805c-ac3388be9375 ro console=ttyS0 panic=-1
-Vendor ID: GenuineIntel
-CPU family: 6
-Model: 85
-Model name: Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz
-Stepping: 4
-CPU MHz: 3104.109
-BogoMIPS: 4999.99
-Hypervisor vendor: KVM
-Virtualization type: full
+Vendor ID: GenuineIntel
+CPU family: 6
+Model: 63
+Model name: Intel(R) Xeon(R) CPU @ 2.30GHz
+Stepping: 0
+CPU MHz: 2299.998
+BogoMIPS: 4599.99
+Hypervisor vendor: KVM
+Virtualization type: full
The first one (the '-' lines) is always from the logs of your CI run. Do you have anything specially configured for Circle CI?
With the newer CPUs the test injection fails for some reason.
I believe I did nothing special in my Circle CI setup... Probably they just attach the same node for reruns, so if a PR happened to get a new CPU on its first test run it would continue running on it. Just guessing... E.g. my last PR, #1631, passed all tests.
@mihalicyn your review has been requested. Can you please take a look? @avagin You also requested changes. Is the PR ready from your point of view? I also see a LGTM from @0x7f454c46. I would say it is almost ready to be merged.
Thanks!
When one sets socket buffer sizes with setsockopt(SO_{SND,RCV}BUF*), the kernel sets the corresponding SOCK_SNDBUF_LOCK or SOCK_RCVBUF_LOCK flags on struct sock. It means that such a socket with an explicitly changed buffer size cannot be auto-adjusted by the kernel (e.g. if there is free memory the kernel can auto-increase default socket buffers to improve performance). (see tcp_fixup_rcvbuf() and tcp_sndbuf_expand())
CRIU always sets buf sizes on restore, which means that all sockets receive lock flags on struct sock and become non-auto-adjusted after migration. In some cases it can decrease performance of network connections quite a lot.
So let's c/r socket buf locks (SO_BUF_LOCKS), so that sockets for which auto-adjustment is available do not lose it.
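The dump/restore ordering described above can be sketched as follows. This is a simplified illustration and not CRIU's code: struct sk_bufs, dump_bufs and restore_bufs are hypothetical names, plain SO_SNDBUF/SO_RCVBUF are used instead of the *FORCE variants so that no privileges are needed, and the fallback option number 72 for SO_BUF_LOCK (the upstream header name) is the asm-generic value. Sizes are halved on restore on the assumption that the kernel doubles the value passed to setsockopt().

```c
#include <errno.h>
#include <sys/socket.h>

#ifndef SO_BUF_LOCK
#define SO_BUF_LOCK 72 /* assumption: asm-generic value */
#endif

/* Hypothetical image of the buffer state of one socket. */
struct sk_bufs {
	int sndbuf;
	int rcvbuf;
	int locks; /* SOCK_SNDBUF_LOCK | SOCK_RCVBUF_LOCK bits */
};

static int get_int(int sk, int opt, int *val)
{
	socklen_t len = sizeof(*val);

	return getsockopt(sk, SOL_SOCKET, opt, val, &len);
}

static int set_int(int sk, int opt, int val)
{
	return setsockopt(sk, SOL_SOCKET, opt, &val, sizeof(val));
}

/* Returns 0 on success, 1 if SO_BUF_LOCK is unsupported, -1 on error. */
int dump_bufs(int sk, struct sk_bufs *b)
{
	if (get_int(sk, SO_SNDBUF, &b->sndbuf) ||
	    get_int(sk, SO_RCVBUF, &b->rcvbuf))
		return -1;
	if (get_int(sk, SO_BUF_LOCK, &b->locks))
		return errno == ENOPROTOOPT ? 1 : -1;
	return 0;
}

/* Returns 0 on success, 1 if SO_BUF_LOCK is unsupported, -1 on error. */
int restore_bufs(int sk, const struct sk_bufs *b)
{
	/* Step 1: sizes. As a side effect the kernel sets both lock flags. */
	if (set_int(sk, SO_SNDBUF, b->sndbuf / 2) ||
	    set_int(sk, SO_RCVBUF, b->rcvbuf / 2))
		return -1;
	/* Step 2: only now put back the dumped mask, undoing that side effect. */
	if (set_int(sk, SO_BUF_LOCK, b->locks))
		return errno == ENOPROTOOPT ? 1 : -1;
	return 0;
}
```

Doing step 2 before step 1 would be pointless, because setting the sizes afterwards would set the lock flags again.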
Origin of the problem:
We have a customer at Virtuozzo who mentioned that an nginx server becomes slower after migration. It is especially easy to notice when you wget a big file via localhost from the same container which was just migrated.
By strace-ing all nginx processes I see that before c/r the nginx worker process sends data to a local wget in big chunks of ~1.5 MB, but after c/r it only manages to send small chunks of ~64 KB (to a remote wget it is around ~128 KB). That can explain the decrease in download speed.
Before:
sendfile(12, 13, [7984974] => [9425600], 11479629) = 1440626 <0.000180>
After:
sendfile(8, 13, [1507275] => [1568768], 17957328) = 61493 <0.000675>
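As a quick sanity check of the two strace lines above, the sketch below recomputes the per-call chunk size from the file offsets and compares the two calls. The struct and function names are hypothetical, and two single samples are of course only an illustration of the effect, not a measurement.

```c
/* One sendfile() call as printed by strace (hypothetical helper type). */
struct sendfile_call {
	long long start_off; /* file offset before the call */
	long long end_off;   /* file offset after the call */
	long long ret;       /* bytes reported as sent */
	double secs;         /* time spent in the syscall */
};

/* Values copied from the traces above. */
static const struct sendfile_call before_cr = { 7984974, 9425600, 1440626, 0.000180 };
static const struct sendfile_call after_cr = { 1507275, 1568768, 61493, 0.000675 };

/* Bytes the file offset advanced; should match the return value. */
long long offset_delta(const struct sendfile_call *c)
{
	return c->end_off - c->start_off;
}

/* How many times larger one chunk is than the other. */
double chunk_ratio(const struct sendfile_call *a, const struct sendfile_call *b)
{
	return (double)a->ret / (double)b->ret;
}

/* Bytes per second achieved inside a single call. */
double call_rate(const struct sendfile_call *c)
{
	return (double)c->ret / c->secs;
}
```

With these numbers the offsets agree with the return values, and the chunk sent before c/r is roughly 23 times larger than the one sent after, while the call itself takes less time.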
As a POC (this way one can check it without a kernel patch) I just commented out all buffer-setting manipulations. With that, I got big chunks after c/r again, and download speed became the same (fast) as before migration.
This is still a draft, as the kernel patch has not yet reached the mainline kernel. See https://lkml.org/lkml/2021/7/30/346