Run file verification and package signature checking in parallel threads #703

Closed

@pmatilai (Member) commented May 10, 2019

While we're playing with parallelising things...

On my rusty old laptop, this tends to just slow things down unless
things are hot in cache, in which case it roughly halves the running time.

Might be worth more on an SSD-based setup; I'd be interested to hear results from such systems.
For now, this is just for giggles and benchmarking, not intended for merging.

Strictly speaking, this depends on PR #701 as we cannot fork while threading.

Build librpm with OpenMP support, refactor the file loop to use the
indexed rpmfiles objects instead of iterators.
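
The switch from iterators to indexed access matters because each worker can then own one index and write its result into its own slot. The sketch below is a Python stand-in for that pattern, not the librpm or OpenMP code from this PR: `verify_one`, `verify_all`, and the SHA-256 check are hypothetical names and choices made for illustration only.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def verify_one(path, expected_sha256):
    """Return True if the file's SHA-256 matches, False on mismatch or read error."""
    try:
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 16), b""):
                h.update(chunk)
        return h.hexdigest() == expected_sha256
    except OSError:
        return False


def verify_all(files, workers=4):
    """files is a list of (path, expected_digest); results come back in index order."""
    results = [None] * len(files)

    def work(i):
        path, digest = files[i]
        # Each index is written by exactly one task, so no locking is needed.
        results[i] = verify_one(path, digest)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Consume the iterator so any exception raised in a worker surfaces here.
        list(pool.map(work, range(len(files))))
    return results
```

Because the results are stored by index, a later serial pass can print them in the original file order regardless of which thread finished first.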

@pmatilai added the RFC label May 10, 2019
@ignatenkobrain
Contributor

If you could provide some script for testing this, I can do some benchmarks.

@pmatilai
Member Author

Dunno what there is to script. Try 'time rpm -Va' with and without, add "echo 1 > /proc/sys/vm/drop_caches" in between to see what happens with cold caches?
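
For anyone who does want it scripted, a minimal sketch of that procedure could look like the following. The helper names (`drop_caches`, `time_command`, `benchmark`) are made up here, and the `sync` before dropping caches is an added assumption rather than something from the comment above. Writing to `/proc/sys/vm/drop_caches` is Linux-only and needs root.

```python
import subprocess
import time

DROP_CACHES = "/proc/sys/vm/drop_caches"  # Linux only; writing requires root


def drop_caches(value="1"):
    """Flush dirty pages (assumption: sync first), then drop the page cache."""
    subprocess.run(["sync"], check=True)
    with open(DROP_CACHES, "w") as f:
        f.write(value + "\n")


def time_command(argv):
    """Run argv with output discarded; return (elapsed_seconds, exit_status)."""
    start = time.perf_counter()
    proc = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start, proc.returncode


def benchmark(argv, cold=True):
    """Time one run of argv, optionally starting from cold caches."""
    if cold:
        drop_caches()
    return time_command(argv)
```

Calling `benchmark(["rpm", "-Va"], cold=True)` followed by `benchmark(["rpm", "-Va"], cold=False)` as root would give one cold and one hot measurement. The exit status is returned rather than checked because `rpm -Va` exits non-zero whenever verification reports differences.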

@pmatilai
Member Author

Hmm, a couple of tests are failing:

  1. because errno isn't stored in the verify results, but that only affects the produced messages
  2. ghost etc skipping isn't handled in the second loop
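
Both failures come from splitting verification into a parallel checking pass and a serial reporting pass. A rough Python sketch of what the fix might involve, under loud assumptions: `VerifyResult`, `check`, `report`, and the `is_ghost` flag are hypothetical stand-ins, the `lstat` call is a placeholder for the real checks, and the skip rule is simplified to "never report ghost files".

```python
import os
from dataclasses import dataclass


@dataclass
class VerifyResult:
    path: str
    is_ghost: bool       # hypothetical stand-in for rpm's %ghost attribute
    failed: bool = False
    err: int = 0         # errno captured at failure time, 0 if none


def check(path, is_ghost):
    """First pass (could run in a worker): store the failure and its errno together."""
    res = VerifyResult(path, is_ghost)
    try:
        os.lstat(path)
    except OSError as e:
        res.failed = True
        res.err = e.errno  # saved now, since errno is gone by reporting time
    return res


def report(results):
    """Second pass (serial): apply the skip rule, then build messages from stored errno."""
    lines = []
    for r in results:
        if r.is_ghost or not r.failed:
            continue
        lines.append(f"{r.path}: {os.strerror(r.err)}")
    return lines
```

The point is only that anything the reporting loop needs, including the error code and the skip decision, has to travel with the per-file result instead of being read from global state.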

@ignatenkobrain
Contributor

With your PR (first run after dropping caches, second run with hot caches):

⋊> ~/P/u/rpm on master ⨯ time ./rpm -Va >/dev/null                                                                                                                          15:26:03
Command exited with non-zero status 1
297.33user 12.27system 1:03.35elapsed 488%CPU (0avgtext+0avgdata 46136maxresident)k
26104592inputs+128outputs (245major+17499minor)pagefaults 0swaps
⋊> ~/P/u/rpm on master ⨯ time ./rpm -Va >/dev/null                                                                                                                          15:27:09
Command exited with non-zero status 1
230.18user 6.50system 0:38.76elapsed 610%CPU (0avgtext+0avgdata 44048maxresident)k
0inputs+0outputs (0major+10576minor)pagefaults 0swaps

@ignatenkobrain
Contributor

And this is without your PR with dropping caches:

⋊> ~/P/u/rpm on master ⨯ time ./rpm -Va >/dev/null                                                                                                                          15:29:57
Command exited with non-zero status 1
53.21user 12.86system 1:51.69elapsed 59%CPU (0avgtext+0avgdata 42192maxresident)k
26088560inputs+128outputs (245major+16402minor)pagefaults 0swaps
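
To put the cold-cache runs side by side, the elapsed fields from the output above can be converted to seconds and compared. This is only arithmetic on the posted numbers; `elapsed_seconds` is a helper written for this sketch. No serial hot-cache run was posted, so only the cold comparison is possible.

```python
def elapsed_seconds(field):
    """Convert an elapsed field such as '1:51.69' (M:SS.ss) to seconds."""
    total = 0.0
    for part in field.split(":"):
        total = total * 60 + float(part)
    return total


serial_cold = elapsed_seconds("1:51.69")    # without the PR, caches dropped
parallel_cold = elapsed_seconds("1:03.35")  # with the PR, caches dropped
parallel_hot = elapsed_seconds("0:38.76")   # with the PR, hot caches

# Wall-clock ratio of the serial cold run to the parallel cold run.
cold_speedup = serial_cold / parallel_cold
print(f"cold: {serial_cold:.2f}s serial vs {parallel_cold:.2f}s parallel "
      f"({cold_speedup:.2f}x)")
```

On this NVMe system the parallel cold run takes about 63 s against about 112 s serially, with CPU utilisation reported as 488% versus 59%.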

@pmatilai
Member Author

Just to make sure, these are SSD numbers?

@ignatenkobrain
Contributor

Yes, an NVMe device, one of the fastest you can find in laptops.

@ignatenkobrain
Contributor

More specifically, SAMSUNG MZVLB512HAJQ

On my laptop this roughly doubles the runtime with cold cache, and
halves it with a hot cache: checking packages on loopback-mounted
Fedora-Server-dvd-x86_64-28-1.1.iso image takes about 33s serially and
60s in parallel when cold, and when hot, 20s serially and 10s parallel.
@pmatilai changed the title from "Run file verification in parallel threads" to "Run file verification and package signature checking in parallel threads" May 13, 2019
@pmatilai
Member Author

Added parallel signature checking with rpmkeys -K. For spinning disks, the results are (expectedly) quite similar: with a cold cache it's much slower, with a hot cache much faster.

@pmatilai
Member Author

As for the parallel signature checking, rpmkeys -Kv output is (obviously) busted in parallel mode; fixing that would take more than a one- or two-line adjustment. So that part too is certainly just for tyre kicking and benchmarking for now.

@pmatilai
Member Author

Okay so this is not going in at this point no matter what, closing. I'll leave the branch around for the time being though in case people want to play around.

@pmatilai closed this May 21, 2019
@pmatilai deleted the parallel-verify-pr branch June 21, 2021 11:59