netboot: trim down heap allocations. #2655
Conversation
(Have to run; this is a copy from my internal copy. It should be in good first shape, so sending for early review.)
pkg/boot/netboot/ipxe/ipxe.go
Outdated
@@ -86,7 +87,42 @@ func (c *parser) getFile(surl string) (io.ReaderAt, error) {
	if err != nil {
		return nil, fmt.Errorf("could not parse URL %q: %v", surl, err)
	}
-	return c.schemes.LazyFetch(u)
+	// return c.schemes.LazyFetch(u)
The big question I have: we're doing heap avoidance but copying a file to tmpfs, and it's all RAM at that point.
If you could mmap, and then read into the file directly, avoiding the bounce buffers you are allocating, ISTM this would be faster. It's what I've done, anyway.
Overall, if you are copying from RAM to RAM, I'd consider that a problem and start there. But I know you've done a lot with mmap, so maybe there is something here I just don't know.
You are spot on, Ron. I think mmap is also a distraction -- you can use small buffers to stream reading from or writing to a file in tmpfs.
The memory number that really matters, the one that you need to track with these changes, is ultimately the amount of physical memory used by a full build. (You can do that with one of the VMs we have internally or the qemu stuff here.)
This code is one of the approaches I experimented with, and eventually went with. I think the big question you asked is misled by this code itself.
Let me attempt to explain how I arrived at this version, rather than others.
The problem: before this change, c.schemes.LazyFetch(u) returns an io.ReaderAt (with an in-heap byte slice to support seekable reads). That in-heap cache[1] is the main thing I am trying to get rid of. (The other optimizations became intuitive later on, after I selected the file-based approach.)
Approaches tried:
Breaking interface change: I tried changing LinuxImage's references to kernel and initrd from the current io.ReaderAt to io.Reader. Then we don't need the cache to support seekable reads. It did not go well: it is a breaking interface change, and seekable reads are relied on pervasively in other boot paths. It is a workable approach, but eventually I gave up. The volume of code turned out to be a lot, and, more importantly, I would have to read the data once into a giant byte slice and have all the code that previously read from an io.ReaderAt read from that pre-read slice instead. You can probably tell that this effectively creates another "copy" of the image in heap, which is what we are trying to get rid of in the first place (well, at least transiently, while the byte slice is still referenced).
Load contents into a file in tmpfs: reading the HTTP body io.Reader and writing it directly into a tmpfs file avoids maintaining a big slice in heap. In addition, with this I can skip calling curl.http.Fetch (which uses a caching reader that creates an in-heap slice to support io.ReaderAt). And since the contents are written into a tmpfs file, I can simply return the os.File disguised as an io.ReaderAt for kernel and initrd, so no breaking interface change is needed.
I have tried multiple ways of implementing lazy Opener with file-based io.ReaderAt. This version is the one I found simplest.
On mmap: if I go with the breaking interface change from io.ReaderAt to io.Reader for kernel and initrd (https://github.com/u-root/u-root/blob/main/pkg/boot/linux.go#L28), I will certainly mmap to avoid I/O. I tried it; specifically, it would happen around here: https://github.com/u-root/u-root/blob/main/pkg/boot/linux.go#L212-L233. But if the backend for the io.ReaderAt is a file in tmpfs, the more efficient way is to re-use that file, rather than mmap-read it into another file.
On RAM consumption of a full build: I don't know if we have any existing functions that directly let us query how much physical RAM our code is currently using. Probably not; we have the MMU enabled, after all. But I did print out the Go runtime's view of its virtual memory, and the sizes of the tmpfs files. These two are, in my view, the main consumers of RAM, and they should be a good reflection.
[1] LazyFetch(u) uses a caching reader, which maintains an in-heap byte slice (https://github.com/u-root/u-root/blob/main/pkg/curl/schemes.go#L216), and which eventually uses a uio.NewCachingReader(r) (https://github.com/u-root/u-root/blob/main/pkg/curl/schemes.go#L260). If you read through the processing code, that cache was not taken advantage of in the case of ipxe.
Does this clarify?
Here, within the 'open' function, this PR does an io.Copy from the HTTP body (io.Reader) to a tmpfs file (*os.File). It copies from the network to a tmpfs file (RAM).
I could mmap the newly created tmpfs file into a byte slice, io.Copy from the HTTP body (io.Reader) into the mapped slice, and then flush.
On the surface, we avoid I/O. But as you said, it is writing to RAM (a tmpfs file) anyway, so the I/O write is not penalized: we won't be dealing with any storage controller flushing to a physical device.
Opening a file and writing to the *os.File here could arguably be replaced with mmap and writing to the mapped slice, for the sake of reducing any extra heap allocations the read path may need. But from my observations, Go's implementation of io.Copy is quite lean, and I never observed large chunks of allocations lingering around, e.g. while downloading a > 1G payload.
Please explain that succinctly in the commit description.
Done. The commit message points 1), 2), 3) did intend to say just that. I updated 3) a bit to make it clearer that we kexec the file in tmpfs directly (re-use).
Can you fix the caching reader instead to do this rather than making targeted fixes that only help in specific cases? What you've done would make u-root more complicated (now there's a caching reader && this tmpfs logic, but the tmpfs logic is only in... ipxe and this catinitrd thing?)
I thought about this. Are you sure changing the CachingReader's buffer to an os.File is good for all cases? I assumed the in-heap allocation is still taken advantage of in the other cases.
I am not sure I agree this makes the u-root code more complicated, but it does add a new way of downloading and caching a file. It is new, and it does seem complicated. But the beauty of LazyOpenerAt is that it takes open func() (io.ReaderAt, error), which allows users to define their own open function. This PR simply instructs ipxe to use a new open function.
This PR does add new building blocks, targeted at a specific case. They are building blocks that users can opt in to use, or not.
But I agree we should keep building blocks lean and without duplication. If there is a specific suggestion on how to organize this differently, happy to take it!
u-root is now used in a lot of places; I made the change to ipxe as that is where it is needed for the case at hand. I did take a look at the other netboot schemes, like FIT; they download and process differently enough that I decided to go with only ipxe.
I made the change with the goal of extending the toolbox of building blocks that each boot use case can opt in to use. I hoped not to overstep on what is best for all cases, so I was hesitant to change the backend for CachingReader.
We can't be too shy to make changes because u-root is used in other places than here; we should just do some research before making them. https://github.com/search?q=uio.NewCachingReader&type=code -- it's only used in pkg/curl. So how about we delete it, and you make one called NewTmpfsCachingReader.
What were the cases that uio.NewCachingReader was initially written for?
I think what it's initially for is somewhat irrelevant -- the only thing that's relevant is what it's used for today. That search shows the only place it's used is pkg/curl, which is only AFAICT used for downloading stuff for kexecing (see here -- you have to enroll the github code search beta to actually find stuff). That tells me it's safe to replace.
As for the history, not that it's relevant, but the caching reader moved from pkg/pxe to pkg/uio in this commit. The schemes were still in pkg/pxe at the time, and that was the creation of the lazy stuff.
I think we want to avoid regressions, for one. It sounds like you're sure that what it was written for no longer holds today. Thanks for the note; I'll do a check too and update here, as I recall being told the caching reader was there for some good cases, e.g. vfile.
This patch trims down heap allocations when booting from entries created via pkg/boot/netboot.BootImages by:
1) eliminating the unused in-heap cache from uio caching readers;
2) moving in-heap image downloads directly into tmpfs files;
3) re-using the tmpfs cache downloaded in lazy openers, and kexec-ing the files directly.

Before this patch, when booting an image of the following size:

kernel 30409216 bytes
initrd 356650177 bytes

we observed around 1.7G of active heap allocations on top of a copy of the image in tmpfs, right before making the kexec load syscall. See:

```
...
[netboot] 2023/04/06 04:43:44 /tmpetc, 166
[netboot] 2023/04/06 04:43:44 /tmpkexec-image1189287700, 30409216
[netboot] 2023/04/06 04:43:44 /tmpkexec-image786188452, 356650177
[netboot] 2023/04/06 04:43:44 /tmpwatchdogd, 0
[netboot] 2023/04/06 04:43:44 MemStats: {Alloc:1705444048 TotalAlloc:2279514408 Sys:2335046200 Lookups:0 Mallocs:89140 Frees:11762 HeapAlloc:1705444048 HeapSys:2251948032 HeapIdle:545636352 HeapInuse:1706311680 HeapReleased:12288000}
[netboot] 2023/04/06 04:43:44 MemStats: HeapSys: 2251948032, HeapReleased: 12288000
[netboot] 2023/04/06 04:43:44 env: , sys: 2251948032, alloc: 1705444048, idel: 545636352, released: 12288000, inuse: 1706311680
...
```

With this patch, the aggressive heap allocations are trimmed down to single-digit MBs right before the kexec load syscall. The only outstanding RAM consumer is the image in tmpfs, so we effectively have one copy of the image in the system. See:

```
...
[netboot] 2023/04/14 04:35:36 /tmpcache-kernel1502474323, 30409216
[netboot] 2023/04/14 04:35:36 /tmpcombined-initrd3781511504, 356650177
[netboot] 2023/04/14 04:35:36 /tmpetc, 166
[netboot] 2023/04/14 04:35:36 /tmpwatchdogd, 0
[netboot] 2023/04/14 04:35:36 MemStats: {Alloc:1349056 TotalAlloc:7277808 Sys:15287304 Lookups:0 Mallocs:89069 Frees:78267 HeapAlloc:1349056 HeapSys:7962624 HeapIdle:5685248 HeapInuse:2277376 HeapReleased:2621440 HeapObjects:10802 StackInuse:425984 St}
[netboot] 2023/04/14 04:35:36 MemStats: HeapSys: 7962624, HeapReleased: 2621440
[netboot] 2023/04/14 04:35:36 env: , sys: 7962624, alloc: 1349056, idel: 5685248, released: 2621440, inuse: 2277376
...
```

Signed-off-by: David Hu <xuehaohu@google.com>
I think this is a case where I have to be aware that I know far less than the author of the code ... seems ok.
(fixing tests)
The fuzz builders have been failing, but the function does exist. Are we expected to update anything or build a new image for the fuzzers?
Codecov Report
Patch coverage:
Additional details and impacted files:

```
@@           Coverage Diff            @@
##            main    #2655     +/-  ##
==========================================
- Coverage  75.21%   75.20%   -0.01%
==========================================
  Files        413      413
  Lines      41794    41885     +91
==========================================
+ Hits       31436    31501     +65
- Misses     10358    10384     +26
```

... and 1 file with indirect coverage changes. View full report in Codecov by Sentry.
any idea why the fuzzing failed?
The Build Fuzzers check is failing, but the function does exist in the PR. It seems like some fuzzer-specific test config / image needs an update (?). Still looking.
The fuzzer build is hardcoded here: https://github.com/google/oss-fuzz/blob/master/projects/u-root/build.sh#L96-L111
Local execution of the fuzzing test using go test seems to work. I opened an issue, google/oss-fuzz#10112, as this seems related to the fuzzing build using libFuzzer.
Thanks! @RiSKeD do you have it documented somewhere how to run the fuzzer locally?
Proceeding for now; will address new responses from Chris as they come.
Any concern with my submitting this PR while the other thread works on getting the fuzzer working for this case? I also suspect the issue is that the fuzzer build could not pick up this PR's code correctly, so submitting would help verify that theory, as we can see whether the same ipxe fuzzer build works in a new PR.
The OSS-Fuzz folks do not know how to resolve it at the moment. But local runs did pass, as @RiSKeD mentioned, so this is specific to the fuzzer CI setup.
I do still want to respond but I was busy today first and then sick. I'll
respond tomorrow.
@10000TB For running the fuzzing tests locally, check out this document I've written about fuzzing: https://github.com/u-root/u-root/blob/main/docs/fuzzing/fuzzing.md
Refactor the build so it will pass with u-root/u-root#2655. Fixes #10112 Signed-off-by: AdamKorcz <adam@adalogics.com>
The CIFuzz build is now fixed by google/oss-fuzz#10121
@hugelgupf Chris, if you are not having a second look at this at the moment, can I propose this be submitted as-is? I am hoping to resolve a P0 with this.
Please still comment as you find time! Happy to send a follow-up PR for that if necessary.
If you'd mentioned the P0 timeline a bit earlier, I'd review faster. It's not appropriate to dismiss requested changes and merge anyway; the P0 is the only reason your commit is not getting reverted now. (My comments are, as they have already been, the same thing: please fix the caching reader generically, and make File lazy.)
Can you also reply to my questions, for clarification purposes? I was concerned whether the new behavior causes regressions for the original cases. Thanks. E.g. #2655 (comment)