Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory Usage Improvements #33

Merged
merged 11 commits into from
Sep 5, 2021
Merged

Memory Usage Improvements #33

merged 11 commits into from
Sep 5, 2021

Conversation

shirok1
Copy link
Contributor

@shirok1 shirok1 commented Aug 29, 2021

Long story short, I am developing a WPF app using this library as a use-and-throw-away spectrogram generator. After noticing an unreasonably high memory usage,

EDIT: @shirok1's original comment/PR addressed multiple independent topics and @swharden edited it to add expandable sections and link to individual issue pages.

1. Bug: ReadWAV returns wrong sample count (#34)

1. Wrong sampleCount expression in ReadWAV (readme.md)

int sampleCount = (int) (afr.Length / afr.WaveFormat.BitsPerSample / 8);

image

I guess this sampleCount is an estimation of the final size of the List<double> audio. However, 253296 is 64 times smaller than 16210944. So the most reasonable explanation is /8 is a typo of *8.

image

This "typo" causes the birth of garbage twice as large as the audio itself.

2. Feature: SpectrogramGenerator.Add() should accept IEnumerable (#35)

2. An unnecessary type requirement in signature of sg.Add()

image

List<double>.AddRange accepts an IEnumerable<double> as the parameter. However, sg.Add() asks for an explicit double[], which means that I can't directly use the List<double> but have to do a .ToArray() first, which obvious will waste the memory as large as the audio itself.

3. Feature: Support initializing with an existing List (#36)

3. The loss of possibility to DIRECTLY use the reference of an external List<double>

The way you design this generator is for continuous or even real-time spectro drawing. In this case, make List<double> newAudio read-only, init as an empty one and .AddRange() later is the best solution. However, in my case, I need to use SG to generate the spectro for only once. Which means that as the Bitmap is generated, both the audio in ReadWAV and the newAudio in SG will immediately become garbage.

The Solution I come up with is this: An addition Optional Argument to init newAudio. It keeps the forward compatibility, but also enable me to prevent meaningless object creation (and later Grow into the same size as audio, which is the real problem).
image

4. Question: How to use the debugger P.S. I tried to add a breakpoint directly in the "Definition", but this is shown.

image

Could you tell me the right way to do this?

Hopefully this PR be merged and a new version of Nuget package be released. May you have a wonderful day!

@shirok1
Copy link
Contributor Author

shirok1 commented Aug 29, 2021

I tried but don't know where to add explanation for the additional optional argument in readme.md (supposed to be something like shortcuts?

@shirok1 shirok1 changed the title Memory Usage Improve Memory Usage Improvements Aug 29, 2021
@shirok1
Copy link
Contributor Author

shirok1 commented Aug 30, 2021

@swharden 😅

@shirok1
Copy link
Contributor Author

shirok1 commented Sep 4, 2021

@swharden pls merge this...

@swharden
Copy link
Owner

swharden commented Sep 4, 2021

@swharden pls merge this...

I've been focusing the last few days on https://github.com/ScottPlot/ScottPlot/issues and https://github.com/ScottPlot/ScottPlot/pulls but I'll try to get to this by the end of the day 👍

@swharden
Copy link
Owner

swharden commented Sep 4, 2021

Hi @shirok1, thank you for your PR! I reviewed its content and will begin working on it now.

The original comment/PR conflated at least 3 distinct issues. I opened 3 issues to track progress toward each and edited your original comment to link to those issues. I hope you find this acceptable 👍

I'll refactor this PR to address just the IEnumerable suggestion (#35) and merge it in, then work on the other two issues separately.

@swharden
Copy link
Owner

swharden commented Sep 5, 2021

Regarding topic 1, I added new tests to verify that ReadWAV is functioning as expected. I used Python's scipy.io.wavfile and GoldWave to confirm the lengths of WAV files in the data folder.

[TestCase("cant-do-that-44100.wav", 44_100, 166_671, 1)]
[TestCase("03-02-03-01-02-01-19.wav", 48_000, 214_615, 1)]
[TestCase("qrss-10min.wav", 6_000, 3_600_000, 1)]
[TestCase("cant-do-that-11025-stereo.wav", 11_025, 41668, 2)]
[TestCase("asehgal-original.wav", 40_000, 1_600_000, 1)]
public void Test_AudioFile_LengthAndSampleRate(string filename, int knownRate, int knownLength, int channels)
{
string filePath = $"../../../../../data/{filename}";
(double[] audio, int sampleRate) = AudioFile.ReadWAV(filePath);
Assert.AreEqual(knownRate, sampleRate);
Assert.AreEqual(knownLength, audio.Length / channels);
}

This "typo" causes the birth of garbage twice as large as the audio itself

I think the misunderstanding is that if the data is two channels, it will return an array with data from both channels in the array. Perhaps it's simply twice as large as you expected because of this?

@swharden swharden merged commit 33415b3 into swharden:master Sep 5, 2021
swharden added a commit that referenced this pull request Sep 5, 2021
Makes it more explicit that this sample function only works with mono WAV files. #33
@shirok1
Copy link
Contributor Author

shirok1 commented Sep 5, 2021

Thank you for your work on manually splitting it into 3 issues! Therefor my commits will be much clarified.

This "typo" causes the birth of garbage twice as large as the audio itself

I think the misunderstanding is that if the data is two channels, it will return an array with data from both channels in the array. Perhaps it's simply twice as large as you expected because of this?

The garbage is generated when the array is Growing to a larger capacity, a new internal array will be created and the old one become garbage. The default action of List is to double its capacity. So, 1/64 + 1/32 + ... +1/4 + 1/2 + 1 ≈ 2... May be the final internal List is also counted as 'allocated by Grow'. Anyway, thanks a lot again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants