
squashfuse Performance #665

Closed
hollowec opened this issue Aug 29, 2022 · 12 comments · Fixed by #673

@hollowec
Contributor

Version of Apptainer

$ apptainer --version
apptainer version 1.1.0~rc.2-1.el7

Expected behavior

When using SIF images with unprivileged Apptainer, execution time should be similar to unprivileged Singularity.

Actual behavior

Apptainer's move to squashfuse for unprivileged (user namespace) mounts of SIF images has significantly increased the execution time of some containers, compared to automatically unpacking SIF images to a temporary sandbox as unprivileged Singularity did. I believe this is primarily a concern for containers running multiple processes/threads, as it seems there is a single squashfuse process to handle all of the parallel I/O requests and decompression.

Steps to reproduce this behavior

apptainer run -i -c -e -B /tmp/atlasgen:/results -B /tmp docker://gitlab-registry.cern.ch/hep-benchmarks/hep-workloads/atlas-gen-bmk:v2.1 -W --threads 1 --events 200
This is an ATLAS event generation benchmark container that will run a process per logical core on the host. Execution times on a system with 2x AMD EPYC 7351 CPUs (64 logical cores total):
Singularity with user namespaces (unpack to sandbox)
Execution time: ~24 min

Apptainer with setuid (squashfs privileged mount)
Execution time: ~25 min

Apptainer with user namespaces (squashfuse mount)
Execution time: ~2 hours 50 minutes

During execution, I see the squashfuse process using 100% of a single CPU core during most of the run.
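For example, one way to watch this during a run (a sketch; pidstat comes from the sysstat package, and this assumes a single squashfuse process, as observed here):

$ pidstat -p $(pgrep -x squashfuse) 5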

Ideally the default behavior would be to revert to automatically unpacking SIF images when used unprivileged.

What OS/distro are you running

Scientific Linux 7

How did you install Apptainer

RPM from EPEL testing repo.

@DrDaveD added this to the 1.1.0 milestone Aug 31, 2022
@DrDaveD
Contributor

DrDaveD commented Aug 31, 2022

Thanks so much for this report and the details on your benchmark!

Indeed I was able to reproduce the issue and run many of my own measurements using your benchmark. I have access to 16-core nodes that have both local disk and Lustre; they have dual 2.6 GHz Intel E5-2650v2 CPUs and 128 GB of RAM. Because I had 16 cores I ran my tests with 32 events instead of 200, and I always pre-converted the container to the format being tested. I don't have access to a setuid-root installation on the machine, so I couldn't measure using kernel squashfs (hopefully I can get a sysadmin to cooperate for a test later). I also included testing the image from cvmfs at

/cvmfs/unpacked.cern.ch/gitlab-registry.cern.ch/hep-benchmarks/hep-workloads/atlas-sim-bmk:v2.1
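For anyone reproducing, the pre-conversions were along these lines (a sketch; the file names are illustrative):

$ apptainer build atlas.sif docker://gitlab-registry.cern.ch/hep-benchmarks/hep-workloads/atlas-gen-bmk:v2.1
$ apptainer build --sandbox atlas.sandbox/ atlas.sif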

These are the timings I found in minutes and seconds:

sandbox on local disk:  6:23
sandbox on lustre:      6:45 (only one node, not parallel launches)
sandbox on cvmfs:       9:33 (warm cache)
ext3 image on lustre:  14:27
sif image on lustre:   41:11

Clearly the time for squashfuse in that last measurement is unacceptable. However, the very good news is that there is an existing squashfuse pull request that adds multithreading support to the squashfuse_ll command. I measured the following with squashfuse_ll (after removing -o uid=NN,gid=NN because that's not supported):

unpatched squashfuse_ll: 13:06
patched squashfuse_ll:    6:35

So that makes a huge difference and I plan to include the patched squashfuse_ll in apptainer packaging for now until the new feature is distributed.
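For reference, testing squashfuse_ll standalone against a SIF looks roughly like this (a sketch; it assumes a squashfuse build with offset support, which apptainer's image mounting relies on, and OFFSET is the squashfs partition offset reported by apptainer sif list):

$ apptainer sif list atlas.sif
$ mkdir /tmp/sqmnt
$ squashfuse_ll -o offset=OFFSET atlas.sif /tmp/sqmnt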

@hollowec
Contributor Author

Thanks @DrDaveD! Let me know when there is a new Apptainer 1.1.0rc EL7 RPM with the patched squashfuse_ll included, and I will be happy to test. However, I would be somewhat concerned about including a patched/development release of squashfuse in a production Apptainer release, as it may lead to stability or other issues. If you decide to proceed that way, could you also please include an Apptainer option to disable the use of squashfuse and revert to the old automatic temporary-sandbox behavior, for example as implemented in my PR #668?
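For illustration, assuming the --unsquash action flag carried over from Singularity (it converts the SIF to a temporary sandbox before running), the desired fallback would look like:

$ apptainer run --unsquash atlas.sif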

@DrDaveD
Contributor

DrDaveD commented Sep 1, 2022

Normally I would also be concerned with using an unreleased patch in production code, but this has such a huge impact on the user experience with default apptainer 1.1.0 that I'm willing to risk it and work on fixing any problems that are discovered.

@DrDaveD
Contributor

DrDaveD commented Sep 2, 2022

The fix is in #673; it would be great if you could compile it from source and run your benchmark on it. Follow the updated instructions in INSTALL.md for including the enhanced-performance squashfuse_ll in an rpm.
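For anyone else testing, a rough sketch of building the PR from source (the exact rpm steps are in INSTALL.md; the branch name here is arbitrary):

$ git clone https://github.com/apptainer/apptainer.git && cd apptainer
$ git fetch origin pull/673/head:pr-673 && git checkout pr-673
$ ./mconfig && make -C builddir && sudo make -C builddir install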

@DrDaveD
Contributor

DrDaveD commented Sep 2, 2022

Oh, I forgot: instead of compiling it yourself, you can download an rpm (for now, until it gets cleaned up) from this Fedora Koji scratch build.

@hollowec
Contributor Author

hollowec commented Sep 3, 2022

Thanks @DrDaveD. I've installed the RPM from Koji, verified it contains squashfuse_ll, and have started some tests.
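The check was along these lines (a sketch):

$ rpm -ql apptainer | grep squashfuse_ll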

@hollowec
Contributor Author

hollowec commented Sep 9, 2022

Just an update: I can confirm the patched/multithreaded squashfuse_ll performance is considerably better, and runtimes for the above container are on par with unpacked SIF. Thanks!

@DrDaveD
Contributor

DrDaveD commented Sep 16, 2022

I redid the measurements using the same benchmark on a single node (rather than spreading them across comparable nodes), this time including kernel squashfs with setuid; the rest were non-setuid, all with apptainer-1.1.0-rc.3. These are the results, each the average of two runs (no configuration's two runs differed by more than 1%):

kernel squashfs, sif on lustre: 6:33
multithreaded squashfuse_ll:    6:29
sandbox on local disk:          6:21
sandbox on lustre:              6:32 (only one node, not parallel launches)
sandbox on cvmfs:               6:50 (warm cache)
fuse2fs with ext3 image:       14:10 
standard squashfuse_ll:        12:48
standard squashfuse:           41:33

The first 4 are nearly identical, and cvmfs is not far behind.
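For reference, each timing was taken roughly like this (a sketch; the image name, bind paths, and event count are illustrative):

$ /usr/bin/time -f %E apptainer run -i -c -e -B /tmp/atlasgen:/results -B /tmp atlas.sif -W --threads 1 --events 32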

@DrDaveD
Contributor

DrDaveD commented Sep 20, 2022

@hollowec I also tried the cms-gen-sim-bmk with the same parameters and the differences are not as dramatic. I ran a subset of the tests one time each and got the following results:

kernel squashfs, sif on lustre: 18:00
multithreaded squashfuse_ll:    17:57
sandbox on local disk:          17:42
sandbox on lustre:              17:53
standard squashfuse_ll:         18:07
standard squashfuse:            27:37

So the most dramatic change was from standard squashfuse to standard squashfuse_ll. Even the multithreading patch didn't make that much difference.

My question for you is: are there other benchmarks that I should be trying? Or is atlas-gen-bmk the most stressful of the benchmarks on code storage?

@hollowec
Contributor Author

Hi @DrDaveD. lhcb-gen-sim-bmk:v2.1 (options --threads 1 and --events 5) was another container that appeared to be significantly affected by the squashfuse performance issue.
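The invocation follows the same pattern as the atlas one above (the registry path is assumed from that pattern):

$ apptainer run -i -c -e -B /tmp/lhcbgen:/results -B /tmp docker://gitlab-registry.cern.ch/hep-benchmarks/hep-workloads/lhcb-gen-sim-bmk:v2.1 -W --threads 1 --events 5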

@hollowec
Contributor Author

hollowec commented Sep 21, 2022

FYI I've run tests against the complete HEPscoreBeta benchmark set (https://gitlab.cern.ch/hep-benchmarks/hep-score - atlas-gen-bmk, cms-gen-sim-bmk, and lhcb-gen-sim-bmk are part of this set), and since the introduction of the patched squashfuse_ll binary in the 1.1.0-rc.3 release, runtimes are very similar to those with temporary unpacked sandboxes.

@DrDaveD
Contributor

DrDaveD commented Sep 21, 2022

Thanks for that additional info. I ran lhcb-gen-sim-bmk:v2.1 -W --threads 1 --events 2 and got a bigger spread of results than with cms but not as big as with atlas:

kernel squashfs, sif on lustre: 13:20
multithreaded squashfuse_ll:    13:24
sandbox on local disk:          13:09
standard squashfuse_ll:         14:50
standard squashfuse:            26:27

The multithreaded squashfuse_ll does have a clear advantage over standard squashfuse_ll with this one.
