High memory consumption when cataloging images with many files #2159

Open
akpsgit opened this issue Sep 20, 2023 · 5 comments
Labels: bug, performance

Comments

akpsgit commented Sep 20, 2023

What happened:
While running an anchore/syft container in k8s with a 1GB memory limit, I noticed that it was killed on OOM while calculating the SBOM for the following demo image:
public.ecr.aws/ciscoeti/apa/bestbags-delivery@sha256:a560bdbb72f563b4be354f414aa812d442a9c1a0527d6687f95f67c8c57a65bc

I created a simple Go binary to profile syft package cataloging and understand what causes the high memory consumption:
https://gist.github.com/akpsgit/bc14660363f3217c9df8d1e8900076cf

Got the following memory peak towards the end of cataloging:

Alloc = 2 MiB	TotalAlloc = 4 MiB	Sys = 14 MiB	NumGC = 1
Alloc = 2 MiB	TotalAlloc = 4 MiB	Sys = 14 MiB	NumGC = 1
Alloc = 2 MiB	TotalAlloc = 4 MiB	Sys = 14 MiB	NumGC = 1
Alloc = 29 MiB	TotalAlloc = 118 MiB	Sys = 40 MiB	NumGC = 24
Alloc = 34 MiB	TotalAlloc = 200 MiB	Sys = 53 MiB	NumGC = 29
Alloc = 39 MiB	TotalAlloc = 294 MiB	Sys = 78 MiB	NumGC = 33
Alloc = 58 MiB	TotalAlloc = 372 MiB	Sys = 87 MiB	NumGC = 35
Alloc = 71 MiB	TotalAlloc = 559 MiB	Sys = 131 MiB	NumGC = 40
Alloc = 110 MiB	TotalAlloc = 701 MiB	Sys = 140 MiB	NumGC = 42
Alloc = 125 MiB	TotalAlloc = 893 MiB	Sys = 165 MiB	NumGC = 45
Alloc = 130 MiB	TotalAlloc = 1120 MiB	Sys = 227 MiB	NumGC = 48
Alloc = 133 MiB	TotalAlloc = 1335 MiB	Sys = 256 MiB	NumGC = 50
Alloc = 353 MiB	TotalAlloc = 1717 MiB	Sys = 383 MiB	NumGC = 52
Alloc = 816 MiB	TotalAlloc = 2474 MiB	Sys = 986 MiB	NumGC = 54
2023/09/20 16:57:04 More than 800 MB is allocated!!! dumping memory profile
Alloc = 782 MiB	TotalAlloc = 2753 MiB	Sys = 1084 MiB	NumGC = 55

From looking at the memory graph in go tool pprof -http=:8080 ./20230920165704-mem.prof:
In-use space:
[image: pprof in-use space graph]

Allocated space:
[image: pprof allocated space graph]

It looks like the issue might be related to the file tree squash or to the mimetype DetectReader().

Looking at the image content for anything unusual that could explain the high consumption, I found that it has ~40K files, which could explain the high usage in the tree squash and in mimetype DetectReader.

What you expected to happen:

Steps to reproduce the issue:

  1. Build and run the attached go gist: https://gist.github.com/akpsgit/bc14660363f3217c9df8d1e8900076cf
  2. Check for the following print "2023/09/20 16:57:04 More than 800 MB is allocated!!! dumping memory profile"
  3. Check that a memory profile was created in the directory (e.g. 20230920165704-mem.prof)
  4. Examine the profile: go tool pprof -http=:8080 ./20230920165704-mem.prof

Anything else we need to know?:
Might be related:
gabriel-vasile/mimetype#354

Environment:

  • Output of syft version:
docker run --rm -it  anchore/syft version
Application:     syft
Version:         0.90.0
BuildDate:       2023-09-11T21:22:00Z
GitCommit:       b82c0ffc3417bdc8c38f4633af95a668ec29fa35
GitDescription:  v0.90.0
Platform:        linux/amd64
GoVersion:       go1.21.0
Compiler:        gc
  • OS (e.g: cat /etc/os-release or similar):
    K8s kind, MacOS
akpsgit added the bug label Sep 20, 2023

tgerla commented Sep 21, 2023

Hi @akpsgit, thanks for this detailed report. We will put it in our backlog for investigation as soon as we can!

willmurphyscode self-assigned this May 24, 2024
willmurphyscode added the changelog-ignore label May 24, 2024
willmurphyscode commented

Thanks for the detailed report @akpsgit!

Here's how the profile looks now:
image

I think this was fixed by #2814.

Please let us know if we've missed something.


akpsgit commented May 26, 2024

Hello @willmurphyscode, thanks a lot for the update. I modified the profiling gist to work with the new code from v1.4.1, based on the example at https://github.com/anchore/syft/blob/main/examples/create_simple_sbom/main.go:
https://gist.github.com/akpsgit/62af6e4232e8ddfb88cc12d050200ef3

It looks like there is still a memory consumption peak of around 800 MB right at the end of SBOM generation for both versions:

v1.4.1:
Alloc = 42 MiB	TotalAlloc = 389 MiB	Sys = 85 MiB	NumGC = 38
Alloc = 43 MiB	TotalAlloc = 390 MiB	Sys = 85 MiB	NumGC = 38
Alloc = 43 MiB	TotalAlloc = 390 MiB	Sys = 85 MiB	NumGC = 38
Alloc = 45 MiB	TotalAlloc = 391 MiB	Sys = 85 MiB	NumGC = 38
Alloc = 47 MiB	TotalAlloc = 393 MiB	Sys = 85 MiB	NumGC = 38
Alloc = 48 MiB	TotalAlloc = 395 MiB	Sys = 85 MiB	NumGC = 38
Alloc = 83 MiB	TotalAlloc = 644 MiB	Sys = 117 MiB	NumGC = 45
Alloc = 99 MiB	TotalAlloc = 660 MiB	Sys = 117 MiB	NumGC = 45
Alloc = 101 MiB	TotalAlloc = 662 MiB	Sys = 117 MiB	NumGC = 45
Alloc = 103 MiB	TotalAlloc = 665 MiB	Sys = 121 MiB	NumGC = 45
Alloc = 104 MiB	TotalAlloc = 666 MiB	Sys = 121 MiB	NumGC = 45
Alloc = 105 MiB	TotalAlloc = 667 MiB	Sys = 121 MiB	NumGC = 45
Alloc = 107 MiB	TotalAlloc = 668 MiB	Sys = 125 MiB	NumGC = 45
Alloc = 109 MiB	TotalAlloc = 670 MiB	Sys = 125 MiB	NumGC = 45
Alloc = 68 MiB	TotalAlloc = 672 MiB	Sys = 129 MiB	NumGC = 46
Alloc = 69 MiB	TotalAlloc = 673 MiB	Sys = 129 MiB	NumGC = 46
Alloc = 69 MiB	TotalAlloc = 673 MiB	Sys = 129 MiB	NumGC = 46
Alloc = 122 MiB	TotalAlloc = 784 MiB	Sys = 146 MiB	NumGC = 47
Alloc = 95 MiB	TotalAlloc = 811 MiB	Sys = 146 MiB	NumGC = 48
Alloc = 103 MiB	TotalAlloc = 818 MiB	Sys = 146 MiB	NumGC = 48
Alloc = 104 MiB	TotalAlloc = 819 MiB	Sys = 146 MiB	NumGC = 48
Alloc = 104 MiB	TotalAlloc = 820 MiB	Sys = 146 MiB	NumGC = 48
Alloc = 105 MiB	TotalAlloc = 821 MiB	Sys = 146 MiB	NumGC = 48
Alloc = 106 MiB	TotalAlloc = 821 MiB	Sys = 146 MiB	NumGC = 48
Alloc = 106 MiB	TotalAlloc = 821 MiB	Sys = 146 MiB	NumGC = 48
Alloc = 132 MiB	TotalAlloc = 1237 MiB	Sys = 215 MiB	NumGC = 54
Alloc = 135 MiB	TotalAlloc = 1239 MiB	Sys = 215 MiB	NumGC = 54
Alloc = 135 MiB	TotalAlloc = 1240 MiB	Sys = 215 MiB	NumGC = 54
Alloc = 212 MiB	TotalAlloc = 1615 MiB	Sys = 296 MiB	NumGC = 57
Alloc = 464 MiB	TotalAlloc = 2177 MiB	Sys = 603 MiB	NumGC = 60
Alloc = 752 MiB	TotalAlloc = 3125 MiB	Sys = 1204 MiB	NumGC = 62 <------- Memory Peak
Alloc = 559 MiB	TotalAlloc = 3422 MiB	Sys = 1204 MiB	NumGC = 63 
Alloc = 38 MiB	TotalAlloc = 388 MiB	Sys = 77 MiB	NumGC = 43
Alloc = 38 MiB	TotalAlloc = 388 MiB	Sys = 77 MiB	NumGC = 43
Alloc = 40 MiB	TotalAlloc = 390 MiB	Sys = 77 MiB	NumGC = 43
Alloc = 42 MiB	TotalAlloc = 392 MiB	Sys = 77 MiB	NumGC = 43
Alloc = 43 MiB	TotalAlloc = 393 MiB	Sys = 77 MiB	NumGC = 43
Alloc = 88 MiB	TotalAlloc = 656 MiB	Sys = 117 MiB	NumGC = 50
Alloc = 91 MiB	TotalAlloc = 659 MiB	Sys = 117 MiB	NumGC = 50
Alloc = 94 MiB	TotalAlloc = 662 MiB	Sys = 117 MiB	NumGC = 50
Alloc = 96 MiB	TotalAlloc = 664 MiB	Sys = 117 MiB	NumGC = 50
Alloc = 99 MiB	TotalAlloc = 667 MiB	Sys = 117 MiB	NumGC = 50
Alloc = 100 MiB	TotalAlloc = 668 MiB	Sys = 117 MiB	NumGC = 50
Alloc = 102 MiB	TotalAlloc = 670 MiB	Sys = 121 MiB	NumGC = 50
Alloc = 103 MiB	TotalAlloc = 671 MiB	Sys = 121 MiB	NumGC = 50
Alloc = 103 MiB	TotalAlloc = 671 MiB	Sys = 121 MiB	NumGC = 50
Alloc = 71 MiB	TotalAlloc = 806 MiB	Sys = 150 MiB	NumGC = 53
Alloc = 78 MiB	TotalAlloc = 813 MiB	Sys = 150 MiB	NumGC = 53
Alloc = 81 MiB	TotalAlloc = 815 MiB	Sys = 150 MiB	NumGC = 53
Alloc = 81 MiB	TotalAlloc = 816 MiB	Sys = 150 MiB	NumGC = 53
Alloc = 81 MiB	TotalAlloc = 816 MiB	Sys = 150 MiB	NumGC = 53
Alloc = 82 MiB	TotalAlloc = 817 MiB	Sys = 150 MiB	NumGC = 53
Alloc = 82 MiB	TotalAlloc = 817 MiB	Sys = 150 MiB	NumGC = 53
Alloc = 83 MiB	TotalAlloc = 818 MiB	Sys = 150 MiB	NumGC = 53
Alloc = 83 MiB	TotalAlloc = 818 MiB	Sys = 150 MiB	NumGC = 53
Alloc = 148 MiB	TotalAlloc = 1001 MiB	Sys = 166 MiB	NumGC = 55
Alloc = 132 MiB	TotalAlloc = 1235 MiB	Sys = 207 MiB	NumGC = 59
Alloc = 133 MiB	TotalAlloc = 1236 MiB	Sys = 207 MiB	NumGC = 59
Alloc = 150 MiB	TotalAlloc = 1518 MiB	Sys = 248 MiB	NumGC = 62
Alloc = 811 MiB	TotalAlloc = 2702 MiB	Sys = 976 MiB	NumGC = 66 <------- Memory Peak
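
To compare runs without eyeballing the output, the peak can be pulled out of a captured log. This is a minimal sketch that assumes the line format shown above; `peakAlloc`, `sample`, and `lineRe` are hypothetical names and are not part of syft or the gist. It reports `Sys` alongside `Alloc`, since `Sys` can sit well above `Alloc` (1204 MiB vs 752 MiB at the v1.4.1 peak). Feed it one run at a time, otherwise it returns the maximum across all runs in the input.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// lineRe matches one stats line; trailing annotations such as
// "<------- Memory Peak" are ignored.
var lineRe = regexp.MustCompile(
	`^Alloc = (\d+) MiB\s+TotalAlloc = (\d+) MiB\s+Sys = (\d+) MiB\s+NumGC = (\d+)`)

// sample holds the values of one log line, in MiB (NumGC is a count).
type sample struct {
	Alloc, TotalAlloc, Sys, NumGC int
}

// peakAlloc returns the sample with the highest Alloc value.
// The boolean is false when no line in the input matches.
func peakAlloc(logText string) (sample, bool) {
	var peak sample
	found := false
	sc := bufio.NewScanner(strings.NewReader(logText))
	for sc.Scan() {
		m := lineRe.FindStringSubmatch(sc.Text())
		if m == nil {
			continue // not a stats line
		}
		var s sample
		// The regexp only admits digits, so Atoi can fail only on overflow.
		s.Alloc, _ = strconv.Atoi(m[1])
		s.TotalAlloc, _ = strconv.Atoi(m[2])
		s.Sys, _ = strconv.Atoi(m[3])
		s.NumGC, _ = strconv.Atoi(m[4])
		if !found || s.Alloc > peak.Alloc {
			peak = s
			found = true
		}
	}
	return peak, found
}

func main() {
	// Last three lines of the v1.4.1 run, copied from the log above.
	logText := "Alloc = 464 MiB\tTotalAlloc = 2177 MiB\tSys = 603 MiB\tNumGC = 60\n" +
		"Alloc = 752 MiB\tTotalAlloc = 3125 MiB\tSys = 1204 MiB\tNumGC = 62 <------- Memory Peak\n" +
		"Alloc = 559 MiB\tTotalAlloc = 3422 MiB\tSys = 1204 MiB\tNumGC = 63\n"

	if p, ok := peakAlloc(logText); ok {
		fmt.Printf("peak Alloc = %d MiB (Sys = %d MiB, NumGC = %d)\n", p.Alloc, p.Sys, p.NumGC)
	}
}
```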

Memory profile for v1.4.1:
image


akpsgit commented Jun 2, 2024

@willmurphyscode, can we please reopen it?

wagoodman commented

Thanks for the reproduction steps -- yeah, we can keep diving on this 👍

wagoodman reopened this Jul 9, 2024
wagoodman added the performance label and removed the changelog-ignore label Jul 9, 2024
Project status: Stalled