In "realtime audio spectrograph" there is an int fft_size. It defines the height of the image. But how does it divide the input array into blocks? #1

Depact · 2018-05-01T18:33:39Z

You copy fft_size elements from the unanalyzed_values array and process them. After that you delete fft_size / pixelsPerBuffer values from unanalyzed_values. It seems that fft_size / pixelsPerBuffer != fft_size. Also, if unanalyzed_values holds too many values, you delete some of them.
How do you know where a new audio sample starts and ends? Why should the piece of music that the code processes be the same length as fft_size and start at the same position?


swharden · 2018-05-21T18:38:52Z

Hi DeltaImpact, it's hard to piece together the high-level sequence of events from the code alone, so I hope this brief summary clarifies the topic and answers your question! I don't have the variable names memorized, but the terms I use should be general enough that you can match them to the code once you understand the concepts. Note that I am referencing the microphone spectrograph source code for this example.

New PCM audio values are appended to List<short> unanalyzed_values every time an audio buffer is completed:
- when waveIn.StartRecording() is called, the sound recorder starts filling-up the audio buffer
- every time the buffer is full, Audio_buffer_captured() gets run (which adds the latest buffer to unanalyzed_values)
- if a high frame-rate of updates is required, it's important to get a high rate of completed buffers per second (since the data is only updated when a buffer completes). That's why the waveIn.BufferMilliseconds = 1000 / buffer_update_hz line is important.
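The buffering steps above can be sketched as follows. This is a minimal illustration in Python rather than the project's C#, and it is not the project's actual code: the 8000 Hz sample rate, the 20 Hz update rate, the helper name `buffer_length_samples`, and the integer division are assumptions chosen for the example, and `on_buffer_captured` is only a stand-in for Audio_buffer_captured().

```python
def buffer_length_samples(sample_rate_hz, buffer_update_hz):
    """Return how many samples one completed buffer holds (hypothetical helper)."""
    # Mirrors waveIn.BufferMilliseconds = 1000 / buffer_update_hz.
    # Integer division is an assumption made for this sketch.
    buffer_ms = 1000 // buffer_update_hz
    return sample_rate_hz * buffer_ms // 1000


# Stand-in for List<short> unanalyzed_values.
unanalyzed_values = []


def on_buffer_captured(pcm_samples):
    """Stand-in for Audio_buffer_captured(): append the newest buffer."""
    unanalyzed_values.extend(pcm_samples)


# Assumed settings: 8000 Hz mono audio, 20 buffer updates per second.
samples_per_buffer = buffer_length_samples(sample_rate_hz=8000, buffer_update_hz=20)

# Simulate three completed buffers of silence arriving from the recorder.
for _ in range(3):
    on_buffer_captured([0] * samples_per_buffer)

print(samples_per_buffer, len(unanalyzed_values))
```

A higher buffer_update_hz gives shorter buffers, so new data reaches unanalyzed_values more often, which is what allows a higher display frame rate.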
timer1 periodically checks the unanalyzed PCM data and analyzes it once enough has accumulated:
- analysis always slightly lags behind recording, usually by 1-2 buffer lengths
- if analysis falls behind (CPU load issues), the oldest unanalyzed data is deleted to lighten the load. However, this should only occur in near-error conditions, and is a fail-safe to prevent slowness or crashing.
- each analysis pass processes the last "chunk" of audio, which could be the buffer size itself or some other size. The chunk is probably the FFT size.
- There is a 2:1 relationship between the input PCM data length and the output FFT length in my code. This is because I collapse the real and imaginary data into a single FFT array. A 1000-point PCM array analyzed in this way will produce an FFT with 500 points.
- FFT analysis benefits from "windowing" the sample so its edges approach zero. Check out this page on windowing, but note that you can do a pretty good job just by applying a triangle-shape window to the data prior to the FFT.
- when analysis completes, the analyzed data is deleted from the list of unanalyzed values
- spectrographs often benefit from overlapping "chunks" of sequential analysis, so it may be desirable to delete less data than you actually analyzed, forcing partial re-analysis of overlapping data. This produces "smoother" output spectrographs.
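The analysis steps above can be sketched as follows. This is a rough illustration in Python rather than the project's C#, not the project's actual code. The function names, the `step` and `max_pending` parameters, and the choice to read the oldest fft_size values first are assumptions for the example. A naive DFT stands in for a real FFT routine so the sketch needs only the standard library.

```python
import cmath
import math


def triangle_window(n):
    """Triangle-shaped weights that fall to zero at both edges (needs n > 1)."""
    return [1.0 - abs(2.0 * i / (n - 1) - 1.0) for i in range(n)]


def magnitude_spectrum(chunk):
    """Window the chunk and return magnitudes for the first len(chunk) // 2 bins.

    A naive O(n^2) DFT is used here as a stand-in for a real FFT routine.
    """
    n = len(chunk)
    windowed = [s * w for s, w in zip(chunk, triangle_window(n))]
    spectrum = []
    for k in range(n // 2):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc))  # real and imaginary parts collapse to one value
    return spectrum


def analyze_pending(unanalyzed_values, fft_size, step, max_pending):
    """Consume unanalyzed_values in place and return one spectrum per chunk."""
    # Fail-safe: if analysis has fallen far behind, drop the oldest samples.
    if len(unanalyzed_values) > max_pending:
        del unanalyzed_values[:len(unanalyzed_values) - max_pending]
    columns = []
    while len(unanalyzed_values) >= fft_size:
        columns.append(magnitude_spectrum(unanalyzed_values[:fft_size]))
        # Deleting fewer samples than were analyzed makes the next chunk overlap.
        del unanalyzed_values[:step]
    return columns


# Assumed settings: fft_size of 64 and a step of fft_size // 4 (75% overlap).
fft_size = 64
pending = [round(10000 * math.sin(2 * math.pi * 8 * t / fft_size))
           for t in range(160)]
columns = analyze_pending(pending, fft_size, step=fft_size // 4, max_pending=1000)
peak_bin = columns[0].index(max(columns[0]))
print(len(columns), len(columns[0]), peak_bin, len(pending))
```

In this sketch a chunk can begin at any sample: nothing aligns it with the recorder's buffers or with any musical boundary. Each pass reads fft_size values and then advances by `step`, so a step smaller than fft_size produces overlapping chunks.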

I hope these notes help! Let me know if you have additional questions.
Best,
Scott

swharden closed this as completed May 21, 2018
