AudioStreamWAV::load_from_file: Do not load file into memory #115235

Open
DeeJayLSP wants to merge 1 commit into godotengine:master from DeeJayLSP:wav-loader

@DeeJayLSP
Contributor

@DeeJayLSP DeeJayLSP commented Jan 22, 2026

Partly inspired by #96545 (this PR began as a split from that one before reducing its scope to the current state)

#93831 repurposed WAV importing code so it could be used to load WAV files.

It had a side effect though: all WAV loading, including importing, loads the entire file data into memory to read, even if you use load_from_file:

[Image: before]

I made some changes to separate the load procedure from load_from_buffer, so that load_from_file can read from a FileAccess again. It's now something like this:

[Image: after]

The current code reads the file frame by frame, which is significantly slower when reading directly from a file, so I borrowed a technique from dr_wav: reading in chunks of 4 KiB. This keeps read times where they are now.

According to the benchmarks below, this should minimize memory usage while keeping load times about the same.
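For illustration, the separation described above (one load procedure shared by the buffer path and the file path) could look roughly like the following standalone sketch. It is not the PR's code: it uses standard C++ streams instead of Godot's FileAccess, only checks the 12-byte RIFF/WAVE header, and every name in it (`WavInfo`, `read_u32_le`, `load_wav_header`, and the two wrappers) is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical result type for this sketch; not Godot's AudioStreamWAV.
struct WavInfo {
    bool valid = false;
    uint32_t riff_size = 0;
};

// Reads a little-endian 32-bit unsigned integer from four bytes.
uint32_t read_u32_le(const unsigned char *p) {
    assert(p != nullptr);
    return uint32_t(p[0]) | uint32_t(p[1]) << 8 | uint32_t(p[2]) << 16 | uint32_t(p[3]) << 24;
}

// Shared load procedure: it only sees a stream, so it never needs
// the whole file in memory.
WavInfo load_wav_header(std::istream &in) {
    WavInfo info;
    unsigned char h[12];
    in.read(reinterpret_cast<char *>(h), sizeof(h));
    if (in.gcount() != static_cast<std::streamsize>(sizeof(h))) {
        return info; // Too short (or the stream failed to open).
    }
    if (std::memcmp(h, "RIFF", 4) != 0 || std::memcmp(h + 8, "WAVE", 4) != 0) {
        return info; // Not a RIFF/WAVE container.
    }
    info.valid = true;
    info.riff_size = read_u32_le(h + 4);
    return info;
}

// Buffer path: wraps bytes that are already in memory.
// (std::istringstream copies the string; acceptable for a sketch.)
WavInfo load_from_buffer(const std::string &bytes) {
    std::istringstream in(bytes);
    return load_wav_header(in);
}

// File path: streams straight from disk without reading the file up front.
WavInfo load_from_file(const std::string &path) {
    std::ifstream in(path, std::ios::binary);
    return load_wav_header(in);
}
```

In this arrangement only `load_from_buffer` ever holds the full data, and only because its caller already had it.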

@Ivorforce
Member

We discussed this PR briefly in the audio meeting. Makes sense to avoid copying into a buffer for no reason.

Am I understanding it right that this is a performance optimization (reduced CPU load + RAM use)? If so, please provide benchmarks / profiles, as per our optimization guidelines.

We were also wondering if there could have been any reason for get_file_as_bytes to be used (caching perhaps?). Did you check if that may be the case?

@DeeJayLSP
Contributor Author

DeeJayLSP commented Jan 27, 2026

> Am I understanding it right that this is a performance optimization (reduced CPU load + RAM use)? If so, please provide benchmarks / profiles, as per our optimization guidelines.

I wouldn't say this is a performance optimization, but more like "changing to what it used to be" before #93831 in a non-breaking way. Mostly because I never understood why this wasn't done before.

> We were also wondering if there could have been any reason for get_file_as_bytes to be used (caching perhaps?). Did you check if that may be the case?

The original code always used a direct FileAccess before #93831, with no statement of why get_file_as_bytes was used. The change from opening a FileAccess into using get_file_as_bytes was never questioned either. Probably due to how it matched AudioStreamOggVorbis::load_from_file().

@Ivorforce
Member

> I wouldn't say this is a performance optimization, but more like "changing to what it used to be" before #93831 in a non-breaking way. Mostly because I never understood why this wasn't done before.

What's the motivation to change it if not for performance purposes? Every change comes with some risk, and we especially shouldn't be changing code because we don't understand what it does.

> The original code always used a direct FileAccess before #93831, with no statement of why get_file_as_bytes was used. The change from opening a FileAccess into using get_file_as_bytes was never questioned either. Probably due to how it matched AudioStreamOggVorbis::load_from_file().

That's a good indication that switching back is probably safe, but not a guarantee. cc @cherrythecool

@DeeJayLSP
Contributor Author

DeeJayLSP commented Jan 27, 2026

Compiled a build using scons target=editor production=yes.

For testing, I decided to import hundreds of WAV files (2.3 GiB in total, the largest file being 88.7 MiB):

| Build | Import time | Maximum memory usage |
| --- | --- | --- |
| Before | 13 s | 4.1 GiB |
| After | 24 s | 2.2 GiB |

In the end, it trades memory usage for higher import times. Although if we enabled import options like downsampling to mono or decreasing the mix rate, memory usage would be much higher while import times would be closer to each other.

I might do more benchmarks.

@cherrythecool
Contributor

> The original code always used a direct FileAccess before #93831, with no statement of why get_file_as_bytes was used. The change from opening a FileAccess into using get_file_as_bytes was never questioned either. Probably due to how it matched AudioStreamOggVorbis::load_from_file().

sorry for not explaining back then, but if i recall correctly, i used get_file_as_bytes due to some weird issue i was having where FileAccess and FileAccessMemory didn't work the same in some aspect? like there was something in the code that would trigger an assertion or error or something when using FileAccessMemory, but removing it seemed to cause no issues and never happened with regular FileAccess? i don't remember exactly why but if that isn't a problem anymore then i imagine switching things to use the actual file for reduced memory usage in importing would be an overall improvement 👍

@DeeJayLSP
Contributor Author

DeeJayLSP commented Jan 28, 2026

After a few more benchmarks, I realized that reading from a file is significantly slower, because FileAccess is read frame by frame, which costs far more than reading frame by frame from a memory buffer.

If reading directly from the file is desired nonetheless, I know two ways of speeding it up:

  1. Make a temporary buffer to copy large chunks of data into memory before encoding it to float. I know this is what dr_wav does normally with 4096 bytes at once.
  2. Implementing dr_wav itself, which is actually what #96545 ("Move WAV loading functionality to a wav_loader module with dr_wav (adds AIFF support)") does, and it would do way more than that.

With dr_wav, loading from file has about the same speed as loading from buffer.

@DeeJayLSP
Contributor Author

DeeJayLSP commented Jan 28, 2026

> 1. Make a temporary buffer to copy large chunks of data into memory before encoding it to float. I know this is what dr_wav does normally with 4096 bytes at once.

I managed to get this one working.

Instead of making it read frame by frame from a FileAccess, we copy a 4KiB chunk into a FixedVector each time, then convert from it.

In my tests, there is little difference in time between loading from buffer and from file. Actually, loading from file seems to be slightly faster.
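As a rough illustration of the chunked approach described above, here is a minimal standalone sketch that decodes 16-bit PCM by reading up to 4 KiB per call and converting from that scratch buffer. It is an assumption-laden stand-in, not the PR's implementation: it uses `std::istream` and `std::vector` rather than FileAccess and FixedVector, handles only little-endian signed 16-bit samples, and the names `decode_pcm16_chunked` and `decode_pcm16_from_bytes` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Decodes up to `sample_count` little-endian signed 16-bit samples from `in`
// into floats in [-1, 1), issuing one read per chunk instead of one per sample.
// `chunk_bytes` defaults to 4096, mirroring the 4 KiB figure mentioned above.
std::vector<float> decode_pcm16_chunked(std::istream &in, std::size_t sample_count,
                                        std::size_t chunk_bytes = 4096) {
    // The chunk must hold a whole number of 16-bit samples.
    assert(chunk_bytes >= 2 && chunk_bytes % 2 == 0);
    std::vector<float> out;
    out.reserve(sample_count);
    std::vector<unsigned char> chunk(chunk_bytes);
    std::size_t bytes_left = sample_count * 2;
    while (bytes_left > 0) {
        const std::size_t want = std::min(bytes_left, chunk_bytes);
        in.read(reinterpret_cast<char *>(chunk.data()), static_cast<std::streamsize>(want));
        const std::size_t got = static_cast<std::size_t>(in.gcount());
        // Convert only the whole samples that were actually read.
        for (std::size_t i = 0; i + 1 < got; i += 2) {
            const uint16_t raw = static_cast<uint16_t>(chunk[i] | (chunk[i + 1] << 8));
            out.push_back(static_cast<int16_t>(raw) / 32768.0f);
        }
        if (got < want) {
            break; // Truncated input: keep what was decoded so far.
        }
        bytes_left -= got;
    }
    return out;
}

// Convenience wrapper for trying the sketch on in-memory bytes without a file.
std::vector<float> decode_pcm16_from_bytes(const std::string &bytes,
                                           std::size_t chunk_bytes = 4096) {
    std::istringstream in(bytes);
    return decode_pcm16_chunked(in, bytes.size() / 2, chunk_bytes);
}
```

The scratch buffer stays the same size no matter how large the file is, so the only allocation that grows with the input is the decoded output.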

@DeeJayLSP
Contributor Author

DeeJayLSP commented Jan 28, 2026

Redid the benchmarks after the change, this time only comparing load times from buffer and from file:

| Build | Time from buffer | Time from file |
| --- | --- | --- |
| master | 14584 ms | 14536 ms |
| PR | 14656 ms | 14388 ms |

Skipped memory checks, as I don't believe the results would differ from the previous benchmark. But basically, the difference in time is negligible.

@DeeJayLSP DeeJayLSP force-pushed the wav-loader branch 2 times, most recently from 7f709a5 to 26d6a15 Compare January 28, 2026 19:36