
netboot: trim down heap allocations. #2655

Merged: 2 commits into main from davidhu-in-secuure-mem-optimizations, Apr 20, 2023

Conversation

@10000TB (Member) commented Apr 14, 2023

This patch trims down heap allocations when booting from entries created via

    pkg/boot/netboot.BootImages

by 1) eliminating the unused in-heap cache from uio caching readers, 2) moving image downloads from the heap directly into tmpfs files, and 3) re-using the downloaded tmpfs cache in lazy openers.

Before this patch, when booting an image of the following sizes

    kernel 30409216 bytes
    initrd 356650177 bytes

we observed around 1.7G of active heap allocations on top of the copy of the image in tmpfs, right before we make the kexec load syscall. See

```
...
[netboot] 2023/04/06 04:43:44 /tmpetc, 166
[netboot] 2023/04/06 04:43:44 /tmpkexec-image1189287700, 30409216
[netboot] 2023/04/06 04:43:44 /tmpkexec-image786188452, 356650177
[netboot] 2023/04/06 04:43:44 /tmpwatchdogd, 0
[netboot] 2023/04/06 04:43:44 MemStats:

{Alloc:1705444048 TotalAlloc:2279514408 Sys:2335046200 Lookups:0 Mallocs:89140 Frees:11762 HeapAlloc:1705444048 HeapSys:2251948032 HeapIdle:545636352 HeapInuse:1706311680 HeapReleased:12288000}

[netboot] 2023/04/06 04:43:44 MemStats: HeapSys: 2251948032, HeapReleased: 12288000
[netboot] 2023/04/06 04:43:44 env: , sys: 2251948032, alloc: 1705444048, idel: 545636352, released: 12288000, inuse: 1706311680
...
```

With this patch, we trim the aggressive heap allocations down to single-digit MBs right before the kexec load syscall. The only outstanding ram consumer is the image in tmpfs, so we effectively have one copy of the image in the system. See

```
...
[netboot] 2023/04/14 04:35:36 /tmpcache-kernel1502474323, 30409216
[netboot] 2023/04/14 04:35:36 /tmpcombined-initrd3781511504, 356650177
[netboot] 2023/04/14 04:35:36 /tmpetc, 166
[netboot] 2023/04/14 04:35:36 /tmpwatchdogd, 0
[netboot] 2023/04/14 04:35:36 MemStats:

{Alloc:1349056 TotalAlloc:7277808 Sys:15287304 Lookups:0 Mallocs:89069 Frees:78267 HeapAlloc:1349056 HeapSys:7962624 HeapIdle:5685248 HeapInuse:2277376 HeapReleased:2621440 HeapObjects:10802 StackInuse:425984 St}

[netboot] 2023/04/14 04:35:36 MemStats: HeapSys: 7962624, HeapReleased: 2621440
[netboot] 2023/04/14 04:35:36 env: , sys: 7962624, alloc: 1349056, idel: 5685248, released: 2621440, inuse: 2277376
...
```
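The core of the change can be sketched as follows (a minimal sketch with a hypothetical helper name; the real code is wired through pkg/boot/netboot/ipxe and pkg/uio): stream the HTTP body straight into a file under /tmp, which is tmpfs in u-root, and return the *os.File as the io.ReaderAt the boot path already expects.

```go
import (
	"fmt"
	"io"
	"net/http"
	"os"
)

// fetchToTmpfs is a sketch, not the PR's actual helper: it downloads
// url into a tmpfs-backed file and returns it as an io.ReaderAt.
func fetchToTmpfs(url, prefix string) (io.ReaderAt, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("GET %q: %s", url, resp.Status)
	}
	// /tmp is tmpfs in u-root, so the payload stays in ram but off
	// the Go heap.
	f, err := os.CreateTemp("", prefix)
	if err != nil {
		return nil, err
	}
	// io.Copy streams through a small internal buffer; the body is
	// never held in memory in full.
	if _, err := io.Copy(f, resp.Body); err != nil {
		f.Close()
		return nil, err
	}
	// *os.File implements io.ReaderAt, so callers keep seekable
	// reads without an in-heap cache.
	return f, nil
}
```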

@10000TB (Member Author) commented Apr 14, 2023

(Have to run; this is a copy of my internal patch. It should be in good initial shape, so sending for early review.)

pkg/boot/linux.go (review thread; outdated, resolved)
```diff
@@ -86,7 +87,42 @@ func (c *parser) getFile(surl string) (io.ReaderAt, error) {
 	if err != nil {
 		return nil, fmt.Errorf("could not parse URL %q: %v", surl, err)
 	}
-	return c.schemes.LazyFetch(u)
+	// return c.schemes.LazyFetch(u)
```
Member

the big question I have: we're doing heap avoidance but copying a file to tmpfs. But it's all ram at that point.

If you could mmap, and then read into the file directly, avoiding the bounce buffers you are allocating, ISTM this would be faster. It's what I've done anyway.

Overall, if you are copying from ram to ram, I'd consider that as a problem and start there. But I know you've done a lot with mmap, so maybe there is something here I just don't know.

Member

You are spot on, Ron. I think mmap is also a distraction -- you can use small buffers to stream reading from or writing to a file in tmpfs.
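For example (a sketch, not code from this PR), a fixed 64 KiB buffer bounds the copy's heap cost regardless of payload size:

```go
import "io"

// streamCopy copies src into dst through a fixed 64 KiB buffer; peak
// heap usage stays at len(buf), independent of the payload size.
func streamCopy(dst io.Writer, src io.Reader) error {
	buf := make([]byte, 64*1024)
	_, err := io.CopyBuffer(dst, src, buf)
	return err
}
```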

Member

The memory number that really matters, the one that you need to track with these changes, is ultimately the amount of physical memory used by a full build. (You can do that with one of the VMs we have internally or the qemu stuff here.)

@10000TB (Member Author) Apr 15, 2023

This code is one of the approaches that I experimented with, and eventually went with. I think this snippet itself is misleading with respect to the big question you asked.

Let me attempt to explain how I arrived at this version, rather than others.

The problem: before this change, c.schemes.LazyFetch(u) returns an io.ReaderAt (with an in-heap byte slice to support seekable reads). That in-heap cache[1] is the main thing I am trying to get rid of. (The other optimizations became intuitive later on, after I selected the file-based approach.)

Approaches tried:

Breaking interface change: I tried changing LinuxImage's references to kernel and initrd from io.ReaderAt to io.Reader, so that we would not need the cache to support seekable reads. It did not go well: it is a breaking interface change, and seekable reads are relied on pervasively in other boot paths. It is a workable approach, but I eventually gave up, as the volume of code turned out to be large. More importantly, I would have to read the payload once into a giant byte slice, and then have all the code that previously read an io.ReaderAt read from that pre-read slice instead. You can probably tell that this effectively creates another "copy" of the image in heap, the very thing we are trying to get rid of in the first place (well, at least transiently, while the byte slice is still referenced).

Load contents into a file in tmpfs: reading the http body (io.Reader) and writing it directly into a tmpfs file avoids maintaining a big slice in heap. In addition, with this, I can skip calling curl.http.Fetch (which uses the caching reader to create an in-heap slice to support io.ReaderAt). And since the contents are written into a tmpfs file, I can simply return the os.File disguised as an io.ReaderAt for kernel and initrd, so no breaking interface change is needed.

I tried multiple ways of implementing a lazy opener with a file-based io.ReaderAt. This version is the one I found simplest.
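The pattern in question, as a minimal sketch (mirroring the shape of pkg/uio's LazyOpenerAt; names abbreviated): the open function runs at most once, and its file-backed result serves all subsequent ReadAt calls.

```go
import "io"

// lazyOpenerAt defers open() until the first read and caches the
// result, so a download-to-tmpfs open function runs at most once.
type lazyOpenerAt struct {
	open func() (io.ReaderAt, error)
	r    io.ReaderAt
	err  error
}

func (l *lazyOpenerAt) ReadAt(p []byte, off int64) (int, error) {
	if l.r == nil && l.err == nil {
		l.r, l.err = l.open()
	}
	if l.err != nil {
		return 0, l.err
	}
	return l.r.ReadAt(p, off)
}
```

With the download-to-tmpfs open function plugged in, l.r ends up being an *os.File living in tmpfs.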

On mmap: if I went with the breaking interface change from io.ReaderAt to io.Reader for kernel and initrd (https://github.com/u-root/u-root/blob/main/pkg/boot/linux.go#L28), I would certainly mmap to avoid I/O; I tried it. Specifically, it would happen around here: https://github.com/u-root/u-root/blob/main/pkg/boot/linux.go#L212-L233. But if the backend of the io.ReaderAt is a file in tmpfs, the more efficient way is to re-use that file directly, rather than mmap-read it into another file.

On ram consumption of a full build: I don't know whether we have any existing functions that directly let us query how much physical ram our code is currently using. Probably not; we have the MMU enabled, after all. But I did print out the Go runtime's view of its virtual memory, plus the sizes of the tmpfs files. These two are, in my view, the main consumers of ram, and they should be a good reflection.
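The MemStats lines in the logs above come from the Go runtime's own accounting, roughly:

```go
import (
	"log"
	"runtime"
)

func logMemStats() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	log.Printf("MemStats: HeapSys: %d, HeapReleased: %d", m.HeapSys, m.HeapReleased)
}
```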

[1] LazyFetch(u) uses a caching reader, which maintains an in-heap byte slice: https://github.com/u-root/u-root/blob/main/pkg/curl/schemes.go#L216, which eventually uses uio.NewCachingReader(r): https://github.com/u-root/u-root/blob/main/pkg/curl/schemes.go#L260. If you read through the processing code, that cache is not taken advantage of in the ipxe case.

Does that clarify?

@10000TB (Member Author) Apr 15, 2023

Here, within the 'open' function, this PR does an io.Copy from the http body (io.Reader) to a tmpfs file (*os.File). It copies from the network to a tmpfs file (ram).

I could mmap the newly created tmpfs file into a byte slice, io.Copy from the http body (io.Reader) into the mapped slice, and then flush.

On the surface, that avoids I/O. But as you said, it is writing to ram (the tmpfs file) anyway, so the I/O write is not penalized: we are not dealing with any storage controller flushing to a physical device.

Opening a file and writing to the *os.File here could arguably be replaced with mmap and writing to the mapped slice, to reduce the extra heap allocations the read path may need. But from my observations, Go's implementation of io.Copy is quite lean, and I never observed large chunks of allocations lingering around, e.g. while downloading a > 1G payload.
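For reference, the mmap variant under discussion would look roughly like this (a sketch assuming the content length is known up front, e.g. from Content-Length; not what the PR does):

```go
import (
	"io"
	"os"
	"syscall"
)

// downloadViaMmap sizes the tmpfs file, maps it, and reads the HTTP
// body straight into the mapping, skipping intermediate copy buffers.
func downloadViaMmap(f *os.File, body io.Reader, size int64) error {
	if err := f.Truncate(size); err != nil {
		return err
	}
	buf, err := syscall.Mmap(int(f.Fd()), 0, int(size),
		syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
	if err != nil {
		return err
	}
	defer syscall.Munmap(buf)
	// io.ReadFull writes directly into the mapped file pages.
	_, err = io.ReadFull(body, buf)
	return err
}
```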

@10000TB (Member Author) Apr 16, 2023

> Please explain that succinctly in the commit description.

Done. The commit message's 1), 2), 3) did intend to say just that. I updated 3) a bit to make it clearer that we kexec the file in tmpfs (re-use it) directly.

> Can you fix the caching reader instead to do this rather than making targeted fixes that only help in specific cases? What you've done would make u-root more complicated (now there's a caching reader && this tmpfs logic, but the tmpfs logic is only in... ipxe and this catinitrd thing?)

I thought about this. Are you sure changing the CacheReader's buffer to an os.File is good for all cases? I assumed the in-heap allocation is still taken advantage of in other cases.

Not sure I agree this makes u-root more complicated. It does add a new way of downloading and caching a file... It is new, and may seem complicated. But the beauty of LazyOpenerAt is that it takes an open func() (io.ReaderAt, error), which lets the user define their own open function. This PR simply has ipxe provide a new open function.

This PR does add new building blocks targeted at a specific case. They are building blocks that users can opt in to use, or not.

But I agree we should keep the building blocks lean, with no duplication. If there are specific suggestions on how to organize this differently, I am happy to take them!

Now, u-root is used in a lot of places; I made the change to ipxe because that is where it is needed for the right case. I did take a look at other netboot schemes, like FIT; they download and process slightly differently, so I decided to go with only ipxe.

I made the change with the goal of extending the toolbox of building blocks that each boot use case can opt in to use. I did not want to overstep on what is best for all cases, so I was hesitant to change the backend of CacheReader.

Member

We can't be too shy to make changes just because u-root is used in places other than here; we should just do some research before making them. https://github.com/search?q=uio.NewCachingReader&type=code -- it's only used in pkg/curl. So how about we delete it, and you make one called NewTmpfsCachingReader.
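Something along these lines, perhaps (a rough sketch of the suggestion, not reviewed code): the same fill-on-demand contract as uio.NewCachingReader, but with the cache spilled into a temp file instead of a heap slice.

```go
import (
	"io"
	"os"
)

// TmpfsCachingReader caches everything read from src in a temp file
// (tmpfs when /tmp is tmpfs) instead of an in-heap byte slice.
type TmpfsCachingReader struct {
	src  io.Reader
	f    *os.File // file-backed cache; appended to via CopyN
	size int64    // bytes cached so far
	err  error    // sticky error from src (io.EOF once drained)
}

func NewTmpfsCachingReader(src io.Reader) (*TmpfsCachingReader, error) {
	f, err := os.CreateTemp("", "uio-cache")
	if err != nil {
		return nil, err
	}
	return &TmpfsCachingReader{src: src, f: f}, nil
}

// fill copies from src into the cache file until off bytes are cached.
// CopyN appends at the file's write offset; ReadAt below uses pread,
// so the two never disturb each other.
func (c *TmpfsCachingReader) fill(off int64) {
	for c.size < off && c.err == nil {
		var n int64
		n, c.err = io.CopyN(c.f, c.src, off-c.size)
		c.size += n
	}
}

// ReadAt implements io.ReaderAt on top of the file-backed cache.
// (A fuller version would surface a non-EOF source error on short reads.)
func (c *TmpfsCachingReader) ReadAt(p []byte, off int64) (int, error) {
	c.fill(off + int64(len(p)))
	if off >= c.size {
		if c.err != nil {
			return 0, c.err
		}
		return 0, io.EOF
	}
	return c.f.ReadAt(p, off)
}
```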

Member Author

What were the cases that uio.NewCachingReader was initially written for?

Member

I think what it was initially for is somewhat irrelevant -- the only thing that's relevant is what it's used for today. That search shows the only place it's used is pkg/curl, which AFAICT is only used for downloading stuff for kexecing (see here -- you have to enroll in the GitHub code search beta to actually find stuff). That tells me it's safe to replace.

As for the history, not that it's relevant, but the caching reader moved from pkg/pxe to pkg/uio in this commit. The schemes were still in pkg/pxe at the time, and that was the creation of the lazy stuff.

Member Author

I think we want to avoid regressions, for one. It sounds like you are sure that what it was originally written for no longer holds today. Thanks for the note; I'll do a check too and update here, as I recall being told the caching reader was there for some good cases, e.g. vfile.

@rminnich added the "Awaiting author" label on Apr 14, 2023

@10000TB force-pushed the davidhu-in-secuure-mem-optimizations branch 2 times, most recently from 7e5392b to 7f1557a on April 15, 2023 05:55
@10000TB added the "Awaiting reviewer" label and removed the "Awaiting author" label on Apr 15, 2023
@10000TB force-pushed the davidhu-in-secuure-mem-optimizations branch from 7f1557a to d2abf3e on April 15, 2023 06:38
@10000TB force-pushed the davidhu-in-secuure-mem-optimizations branch from d2abf3e to 9c05ccf on April 16, 2023 05:55
This patch trims down heap allocations when booting
from entries created via

   pkg/boot/netboot.BootImages

by 1) eliminating the unused in-heap cache from uio
caching readers, 2) moving image downloads from the
heap directly into tmpfs files, and 3) re-using the
downloaded tmpfs cache in lazy openers and kexec'ing
the files directly.

Before this patch, when booting an image
of the following sizes

    kernel 30409216 bytes
    initrd 356650177 bytes

we observed around 1.7G of active heap allocations
on top of the copy of the image in tmpfs, right before
we make the kexec load syscall. See

```
...
[netboot] 2023/04/06 04:43:44 /tmpetc, 166
[netboot] 2023/04/06 04:43:44 /tmpkexec-image1189287700, 30409216
[netboot] 2023/04/06 04:43:44 /tmpkexec-image786188452, 356650177
[netboot] 2023/04/06 04:43:44 /tmpwatchdogd, 0
[netboot] 2023/04/06 04:43:44 MemStats:

{Alloc:1705444048 TotalAlloc:2279514408 Sys:2335046200 Lookups:0 Mallocs:89140 Frees:11762 HeapAlloc:1705444048 HeapSys:2251948032 HeapIdle:545636352 HeapInuse:1706311680 HeapReleased:12288000}

[netboot] 2023/04/06 04:43:44 MemStats: HeapSys: 2251948032, HeapReleased: 12288000
[netboot] 2023/04/06 04:43:44 env: , sys: 2251948032, alloc: 1705444048, idel: 545636352, released: 12288000, inuse: 1706311680
...
```

With this patch, we trim the aggressive
heap allocations down to single-digit MBs
right before the kexec load syscall. The only
outstanding ram consumer is the image in tmpfs,
so we effectively have one copy of the image
in the system. See

```
...
[netboot] 2023/04/14 04:35:36 /tmpcache-kernel1502474323, 30409216
[netboot] 2023/04/14 04:35:36 /tmpcombined-initrd3781511504, 356650177
[netboot] 2023/04/14 04:35:36 /tmpetc, 166
[netboot] 2023/04/14 04:35:36 /tmpwatchdogd, 0
[netboot] 2023/04/14 04:35:36 MemStats:

{Alloc:1349056 TotalAlloc:7277808 Sys:15287304 Lookups:0 Mallocs:89069 Frees:78267 HeapAlloc:1349056 HeapSys:7962624 HeapIdle:5685248 HeapInuse:2277376 HeapReleased:2621440 HeapObjects:10802 StackInuse:425984 St}

[netboot] 2023/04/14 04:35:36 MemStats: HeapSys: 7962624, HeapReleased: 2621440
[netboot] 2023/04/14 04:35:36 env: , sys: 7962624, alloc: 1349056, idel: 5685248, released: 2621440, inuse: 2277376
...
```

Signed-off-by: David Hu <xuehaohu@google.com>
@10000TB force-pushed the davidhu-in-secuure-mem-optimizations branch from 9c05ccf to 0b3c4b9 on April 16, 2023 06:17
@rminnich (Member) left a comment

I think this is a case where I have to be aware that I know far less than the author of the code ... seems ok.

@10000TB (Member Author) commented Apr 16, 2023

(fixing tests)

@10000TB (Member Author) commented Apr 16, 2023

Fuzz builders have been failing:

```
+ compile_native_go_fuzzer /src/u-root/pkg/boot/netboot/ipxe FuzzParseIpxeConfig fuzz_ipxe_parse_config
# ipxe
./ipxe.go:99:23: c.schemes.LazyFetchWithoutCache undefined (type curl.Schemes has no field or method LazyFetchWithoutCache)
./ipxe.go:124:19: c.schemes.LazyFetchWithoutCache undefined (type curl.Schemes has no field or method LazyFetchWithoutCache)
./ipxe.go:154:29: undefined: boot.CatInitrdsWithFileCache
2023/04/16 20:27:51 failed to build packages:exit status 2
2023-04-16 20:27:53,144 - root - ERROR - Building fuzzers failed.
2023-04-16 20:27:53,144 - root - ERROR - Error building fuzzers for (commit: f7ff40ff0e8a8fb38c26904e3d60012f121a2c4e, pr_ref: refs/pull/2655/merge).
```

but the function does exist. Are we expected to update anything / build a new image for the fuzzers?

@codecov codecov bot commented Apr 16, 2023

Codecov Report

Patch coverage: 72.11%; project coverage change: -0.01% ⚠️

Comparison is base (f6a61cd) 75.21% compared to head (c62849d) 75.20%.

Additional details and impacted files
```
@@            Coverage Diff             @@
##             main    #2655      +/-   ##
==========================================
- Coverage   75.21%   75.20%   -0.01%
==========================================
  Files         413      413
  Lines       41794    41885      +91
==========================================
+ Hits        31436    31501      +65
- Misses      10358    10384      +26
```
Impacted Files Coverage Δ
pkg/uio/lazy.go 66.07% <54.54%> (-8.93%) ⬇️
pkg/boot/initrd.go 75.00% <61.53%> (-11.67%) ⬇️
pkg/boot/netboot/ipxe/ipxe.go 89.69% <80.76%> (-3.74%) ⬇️
pkg/curl/schemes.go 62.50% <84.61%> (+2.99%) ⬆️
pkg/boot/linux.go 53.57% <88.23%> (+1.97%) ⬆️

... and 1 file with indirect coverage changes


@rminnich (Member) left a comment

any idea why the fuzzing failed?

@10000TB (Member Author) commented Apr 17, 2023

The fuzzer build is failing with

```
# ipxe
./ipxe.go:99:23: c.schemes.LazyFetchWithoutCache undefined (type curl.Schemes has no field or method LazyFetchWithoutCache)
./ipxe.go:124:19: c.schemes.LazyFetchWithoutCache undefined (type curl.Schemes has no field or method LazyFetchWithoutCache)
./ipxe.go:154:29: undefined: boot.CatInitrdsWithFileCache
2023/04/17 13:31:51 failed to build packages:exit status 2
```

but the function does exist in this PR. It seems like some fuzzer-specific config / image needs an update (?)

still looking

@10000TB (Member Author) commented Apr 18, 2023

The fuzzer build is hardcoded here: https://github.com/google/oss-fuzz/blob/master/projects/u-root/build.sh#L96-L111

@RiSKeD commented Apr 18, 2023

Local execution of the fuzzing test using go test seems to work. I opened an issue, google/oss-fuzz#10112, as this seems to be related to the fuzzing build using libFuzzer.

@10000TB (Member Author) commented Apr 18, 2023

Thx! @RiSKeD, do you have it documented somewhere how to run the fuzzer locally?

@10000TB dismissed hugelgupf's stale review on April 18, 2023 21:27

Proceeding for now; will address new responses from Chris as they come.

@10000TB (Member Author) commented Apr 19, 2023

Any concern with submitting this PR while the other thread works on getting the fuzzer working for this case?

I also suspect the issue is that the fuzzer build could not pick up this PR's code correctly. Submitting this would help verify that theory, as we can see whether the same ipxe fuzzer build works on a new PR.

@10000TB added the "automerge" label on Apr 19, 2023
@10000TB (Member Author) commented Apr 19, 2023

OSS-Fuzz folks do not know how to resolve it at the moment. But local runs did pass, as @RiSKeD mentioned, so this is specific to the fuzzer CI setup.

@hugelgupf (Member) commented Apr 19, 2023 via email

@RiSKeD commented Apr 19, 2023

@10000TB For running the fuzzing tests locally, check out this document I've written about fuzzing: https://github.com/u-root/u-root/blob/main/docs/fuzzing/fuzzing.md.

DavidKorczynski pushed a commit to google/oss-fuzz that referenced this pull request Apr 19, 2023
Refactor the build so it will pass with
u-root/u-root#2655.

Fixes #10112

Signed-off-by: AdamKorcz <adam@adalogics.com>
@RiSKeD commented Apr 19, 2023

The CIFuzz build is now fixed by google/oss-fuzz#10121

@10000TB (Member Author) commented Apr 20, 2023

@hugelgupf Chris, if you are not having a second look at this at the moment, can I propose this be submitted as is? I am hoping to resolve a P0 with this.

@10000TB merged commit d44c006 into main on Apr 20, 2023
@10000TB (Member Author) commented Apr 20, 2023

Please still comment as you find time! Happy to send a follow-up PR if necessary.

@hugelgupf (Member)

> @hugelgupf Chris, if you are not having a second look at this at the moment, can I propose this be submitted as is? I am hoping to resolve a P0 with this.

If you'd mentioned the P0 timeline a bit earlier, I'd have reviewed faster. It's not appropriate to dismiss requested changes and merge anyway; the P0 is the only reason your commit is not getting reverted now. (My comments are, as they have been all along, the same thing: please fix the caching reader generically, and make File lazy.)

```go
// File returns the backing file of the io.ReaderAt if it
// is backed by an os.File. It opens the file if it has not
// already been opened.
func (loa *LazyOpenerAt) File() *os.File {
	if loa.r == nil && loa.err == nil {
		loa.r, loa.err = loa.open()
	}
	if f, ok := loa.r.(*os.File); ok {
		return f
	}
	return nil
}
```
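For context on why a lazy File() accessor is attractive here: once the backing io.ReaderAt is already a tmpfs file, a file-based kexec load path (kexec_file_load takes file descriptors) can presumably be handed the fd directly, with no read of the contents into the Go heap at all.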

@10000TB (Member Author) commented Apr 20, 2023

Can you also reply to my questions, for clarification purposes? I was concerned whether the new behavior causes regressions for the original cases. Thanks.

e.g. #2655 (comment)

@hugelgupf deleted the davidhu-in-secuure-mem-optimizations branch on August 2, 2023 16:49