Flat Spectrogram View (#82)
* init with what we had in our project

* Made it compatible with AudioKitUI Namespace and OS availability

* fixed compiler warning: var to let

* fixed hound linter line length. no semantics, just syntax.

* fixed hound trailing spaces and spaces close to colon. no semantics, just syntax.

* fixed hound linter spaces and unused var. no semantics, just syntax.

* fixed hound linter commenting and variable name length. no semantics, just syntax.

* fixed hound linter variable name and cyclomatic complexity. no semantics, just syntax.

* fixed hound linter warnings. no semantics, just syntax.

* fixed hound linter warnings. no semantics, just syntax.

* Refined model to make it easier to configure in the future. Moved minFreq and maxFreq to SpectrogramFFTMetaData

* removed trailing white spaces. fixed hound

I really need to step up in linting and integrate it better into Xcode.

* disable swiftlint for variable name id

* Commented and marked unused function

* Added documentation on dataflow, design decisions and a brief history of the class.

* Comment on why Int instead of UUID

* Some guards to not crash on empty measurements

* Fixed linter warning on big tuples

* renamed tuple and added comment. no semantics, just syntax.

* Moved magic numbers to let constants with description.

* Comments on members / properties

* made it available prior to iOS 17 using some vintage/deprecated modifiers of SwiftUI

* minor rename

---------

Co-authored-by: mahal raskin <mahal@raskinapps.ch>
aure and mahal committed Mar 18, 2024
1 parent b1fe2aa commit 6d08971
Showing 4 changed files with 645 additions and 0 deletions.
@@ -0,0 +1,147 @@
// Copyright AudioKit. All Rights Reserved. Revision History at http://github.com/AudioKit/AudioKitUI/
/*
Dataflow overview:
* FFTTap analyzes the sound and creates an array of frequencies and amplitudes
  several times per second. As soon as the data is ready, a new slice is instantiated.
  On init, the slice converts the array of measurements to an image and caches it.
  Converting the data and creating the image takes quite some time, so it is
  done only once per slice.
* Drawing is done using UIGraphicsImageRenderer with context.fill primitives.
  The results are cached as UIImage and laid out in the view.

Steps involved:
* FFTTap calls SpectrogramFlatModel with newly analyzed sound using
  ``SpectrogramFlatModel/pushData(_ fftFloats: [Float])``
* The model then creates a SpectrogramSlice and puts it into the queue.
* The body of this view watches this queue and shows all slices in the queue.
* Because the body, and therefore each slice, is redrawn on any update of
  the queue, drawing a slice should be fast. The current implementation
  of SpectrogramSlice caches an image of itself after drawing.
* The image is drawn pixel-aligned on a CGContext. The image is then resized
  to fit into this view.

Brief history of this class:
* The class was created using SpectrogramView as a starting point.
* SpectrogramView looked/looks like it came from a 90s Japanese synth,
  with a kind of 3D surface, which is cool. Most common spectrograms or sonographs
  have a flat look.
* The flat look makes it easier to analyze music, make voice fingerprints and compare bird songs.
* SpectrogramView had/has a major design flaw: on each update (as soon as new data arrived
  from the FFT), all slices were completely redrawn from raw data. All recent measurements (80)
  were converted from an array of measurements to Paths with all the lines.
* Measuring with Instruments showed that this takes a lot of time, therefore
  this implementation caches the resulting image.

Causes of inefficiency in this implementation:
* Each time a new slice arrives from FFTTap, the view gets a complete layout update.
* Rendering of new slices is done on a background thread and involves too many steps.
* The frame rate is defined by how many samples arrive per second. This looks ugly with fewer than 25 per second.
* It somehow doesn't show the frequency range that is selected, so some CPU time
  is wasted calculating things that aren't shown.
* Some arrays are iterated several times in a row where a single enumeration would do.

Possibilities to consider for a more energy-efficient implementation:
* Only calculate what is shown, enumerate the array only once (see comment on captureAmplitudeFrequencyData()).
* Make the layout independent of the sample rate; just move the slices left with a continuous, built-in animation.
* Lay out and draw the slices directly on a Canvas (instead of an HStack) and independently move the Canvas left.
* To make it look crisp, all images should be drawn and laid out pixel-aligned (integral size and position).
* Try .drawingGroup() to see whether it improves performance.
* Use ImageRenderer objectWillChange to create a stream of images.
* Use Apple's sample code for vDSP and Accelerate (macOS) and port it to iOS:
  https://developer.apple.com/documentation/accelerate/visualizing_sound_as_an_audio_spectrogram
* A spectrogram is actually a kind of heatmap, so use SwiftUI.Chart.
* Use a factory and emitter to emit new slice images (like in a particle system).
* Measure the performance impact of spreading the work across several threads versus combining it on the main thread.
* Use the Metal API with shaders, similar to what Apple's aurioTouch sample code did in OpenGL.
* Try to replace all CGPoint and [CGPoint] calculations using Accelerate or some other optimized library.
* Measure efficiency and check whether using only opaque colors in the gradient makes a difference.
* With all these possibilities for improving energy efficiency, don't forget the latency.
* Might be easy to make available in versions earlier than iOS 17, primarily because of .onChange(of:
*/

import AudioKit
import SwiftUI

/// Displays a rolling plot of the frequency spectrum.
///
/// Each slice represents a point in time, with the frequencies at that moment
/// shown from bottom to top. Each frequency cell is colored according to its amplitude.
/// The spectrum is shown on a logarithmic scale, so octaves are equally spaced.
///
/// This implementation is rather energy inefficient. You might not want to use it
/// as a central feature in your app. Furthermore, it's not scientifically correct: when displaying
/// white noise, it will not show a uniform distribution.

public struct SpectrogramFlatView: View {
    // this static var is a shortcut: better to have this in SpectrogramModel or SpectrogramFFTMetaData
    public static var gradientUIColors: [UIColor] = [
        (#colorLiteral(red: 0, green: 0, blue: 0, alpha: 0)),
        (#colorLiteral(red: 0.1411764771, green: 0.3960784376, blue: 0.5647059083, alpha: 0.6275583187)),
        (#colorLiteral(red: 0.4217140079, green: 0.6851614118, blue: 0.9599093795, alpha: 0.8245213468)),
        (#colorLiteral(red: 0.8122602105, green: 0.6033009887, blue: 0.8759307861, alpha: 1)),
        (#colorLiteral(red: 0.9826132655, green: 0.5594901443, blue: 0.4263145328, alpha: 1)),
        (#colorLiteral(red: 1, green: 0.2607713342, blue: 0.4242972136, alpha: 1))
    ]
    @StateObject var spectrogram = SpectrogramFlatModel()
    let node: Node
    let backgroundColor: Color

    /// put only one color into the array for a monochrome view
    public init(node: Node,
                amplitudeColors: [Color] = [],
                backgroundColor: Color = Color.black) {
        self.node = node
        if amplitudeColors.count > 1 {
            Self.gradientUIColors = amplitudeColors.map { UIColor($0) }
        } else if amplitudeColors.count == 1 {
            Self.gradientUIColors = [UIColor(backgroundColor), UIColor(amplitudeColors[0])]
        }
        self.backgroundColor = backgroundColor
    }

    public var body: some View {
        return GeometryReader { geometry in
            ZStack {
                backgroundColor
                    .onAppear {
                        spectrogram.updateNode(node)
                    }
                HStack(spacing: 0.0) {
                    ForEach(spectrogram.slices.items) { slice in
                        // flip it as the slice was drawn in the first quadrant
                        slice.scaleEffect(x: 1, y: -1)
                        // .border(.green, width: 2.0)
                    }
                    // flip it so the new slices come in right and move to the left
                    .scaleEffect(x: -1, y: 1)
                }
                // .border(.red, width: 5.0)
                .frame(maxWidth: .infinity, maxHeight: .infinity, alignment: .trailing)
            }.onAppear {
                spectrogram.sliceSize = calcSliceSize(fromFrameSize: geometry.size)
            }
            .onChange(of: geometry.size) { newSize in
                spectrogram.sliceSize = calcSliceSize(fromFrameSize: newSize)
            }
        }
    }

    func calcSliceSize(fromFrameSize frameSize: CGSize) -> CGSize {
        let outSize = CGSize(
            // Even with a non-integral slice width, the resulting image
            // is integral in size but resizable, and the HStack would then
            // lay the slices out stretched and not pixel-aligned.
            // That's why we ceil/floor the width: ceiling makes the slices a bit
            // more precise, floor makes them more energy efficient.
            // We did some measurements; the difference is hard to tell visually.
            width: floor(frameSize.width / CGFloat(spectrogram.slices.maxItems)),
            height: frameSize.height
        )
        return outSize
    }
}

// MARK: Preview

struct SpectrogramFlatView_Previews: PreviewProvider {
    static var previews: some View {
        return SpectrogramFlatView(node: Mixer())
    }
}
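The doc comment on SpectrogramFlatView says the spectrum is shown logarithmically so that octaves are equally spaced. The code that does this mapping is not among the files shown here, so the following is only a minimal sketch of how such a mapping could look. The function name `normalizedLogPosition` is hypothetical; the default bounds are the `minFreq` and `maxFreq` values from SpectrogramFFTMetaData.

```swift
import Foundation

/// Hypothetical helper (not part of this commit): maps a frequency to a
/// position between 0 (minFreq) and 1 (maxFreq) on a logarithmic axis.
func normalizedLogPosition(
    of frequency: Double,
    minFreq: Double = 48.0,
    maxFreq: Double = 13500.0
) -> Double {
    // log10(0) is not finite, so the lower bound has to be positive.
    precondition(minFreq > 0 && maxFreq > minFreq, "need 0 < minFreq < maxFreq")
    // Clamp so that out-of-range readings land on the edges of the axis.
    let clamped = min(max(frequency, minFreq), maxFreq)
    let span = log10(maxFreq) - log10(minFreq)
    return (log10(clamped) - log10(minFreq)) / span
}

// Each octave (a doubling of frequency) covers the same share of the axis.
let octaveLow = normalizedLogPosition(of: 220) - normalizedLogPosition(of: 110)
let octaveHigh = normalizedLogPosition(of: 440) - normalizedLogPosition(of: 220)
print(octaveLow, octaveHigh)
```

The precondition makes explicit why the metadata comments warn against a minimum frequency of 0: the logarithm has no finite value there, so the mapping would break.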
@@ -0,0 +1,170 @@
// Copyright AudioKit. All Rights Reserved. Revision History at http://github.com/AudioKit/AudioKitUI/
//

import AudioKit
import SwiftUI

/// Considerations for further development, depending on usage and requirements:
/// Make this struct public so the look can be configured. Define fftSize as an enum.
/// Also add something like a gain to adjust the sensitivity of the display.
struct SpectrogramFFTMetaData {
    // fftSize defines how finely the music is resolved in frequency.
    // The lower the value, the less detail:
    // * 1024: will receive about four analyzed frequencies between C2 and C3 (65Hz to 130Hz).
    //   New data comes roughly 21.5 times per second, every 46ms.
    // * 2048: will receive about eight analyzed frequencies between C2 and C3 (65Hz to 130Hz).
    //   New data comes roughly 11 times per second, every 93ms.
    // * 4096: will receive about 16 analyzed frequencies between C2 and C3 (65Hz to 130Hz).
    //   New data comes roughly 5.5 times per second, every 186ms.
    // Choose a higher value when you want to analyze low frequencies,
    // choose a lower value when you want fast response and a high frame rate on display.
    let fftSize = 2048

    // Lowest and highest frequencies shown.
    // We use 48Hz, which is a bit lower than G1. A1 would be 440Hz/8 = 55Hz.
    // The lowest human bass voice in choral music reaches down to C1 (32.7 Hz).
    // Don't go lower than 6.0; it just doesn't make sense and the display gets terribly distorted.
    // Don't use 0, as it breaks the display: log10(0) is undefined and this error is not handled.
    let minFreq: CGFloat = 48.0
    // We will not show anything above 13500 as it's not music anymore but just overtones and noise.
    let maxFreq: CGFloat = 13500.0

    // How/why can the sample rate be edited? Shouldn't this come from the node/engine?
    // If the sample rate is changed, does the displayed frequency range also have to be changed?
    // Took this from the existing SpectrogramView, will investigate later.
    let sampleRate: double_t = 44100
}

struct SliceQueue {
    var maxItems: Int = 120
    var items: [SpectrogramSlice] = []

    public mutating func pushToQueue(element: SpectrogramSlice) {
        enqueue(element: element)
        if items.count > maxItems {
            dequeue()
        }
    }

    private mutating func enqueue(element: SpectrogramSlice) {
        items.append(element)
    }

    private mutating func dequeue() {
        if !items.isEmpty {
            items.remove(at: 0)
        }
    }
}

/// Model for the SpectrogramFlatView. Connects to the audio node and receives FFT data.
class SpectrogramFlatModel: ObservableObject {
    /// A queue full of SpectrogramSlice
    @Published var slices = SliceQueue()
    /// Dimensions of the slices. Set prior to rendering to get slices that fit.
    var sliceSize = CGSize(width: 10, height: 250) {
        didSet {
            if xcodePreview { createTestData() }
        }
    }
    let nodeMetaData = SpectrogramFFTMetaData()
    let xcodePreview = ProcessInfo.processInfo.environment["XCODE_RUNNING_FOR_PREVIEWS"] == "1"
    var nodeTap: FFTTap!
    var node: Node?

    // Create a filled queue, always full of slices. This looks a bit better:
    // otherwise the display would move fast at the beginning and then
    // compress until the queue is full (looks funny though :-).
    // In the Xcode Preview case, the queue is filled by the
    // sliceSize didSet observer, typically triggered from the geometry reader.
    init() {
        if !xcodePreview {
            createEmptyData()
        }
    }

    // fill the queue with empty data so the layout doesn't start in the middle
    private func createEmptyData() {
        for _ in 0 ..< slices.maxItems {
            var points: [CGPoint] = []
            for index in 0 ..< 10 {
                let frequency = CGFloat(Float(index) * Float.pi)
                let amplitude = CGFloat(-200.0)
                points.append(CGPoint(x: frequency, y: amplitude))
            }
            // size and frequency don't really matter as it will all be black
            let slice = SpectrogramSlice(
                gradientUIColors: SpectrogramFlatView.gradientUIColors,
                sliceWidth: sliceSize.width,
                sliceHeight: sliceSize.height,
                fftReadingsFrequencyAmplitudePairs: points,
                fftMetaData: nodeMetaData
            )
            slices.pushToQueue(element: slice)
        }
    }

    private func createTestData() {
        let testCellAmount = 200
        for _ in 0 ..< slices.maxItems {
            var points: [CGPoint] = []
            // Lowest and highest frequency at full amplitude, to see the rendering show the full frequency spectrum.
            // CGPoint x: frequency, y: amplitude in -200 ... 0, where 0 is full volume.
            for index in 0 ... testCellAmount {
                // linear frequency range from 48 to 13500 in the number of steps we generate
                let frequency = 48.0 + CGFloat( index * (13500 / testCellAmount ))
                var amplitude = CGFloat.random(in: -200 ... 0)
                // add some silence to the test data
                amplitude = amplitude < -80 ? amplitude : -200.0
                points.append(CGPoint(x: frequency, y: amplitude))
            }
            let slice = SpectrogramSlice(
                gradientUIColors: SpectrogramFlatView.gradientUIColors,
                sliceWidth: sliceSize.width,
                sliceHeight: sliceSize.height,
                fftReadingsFrequencyAmplitudePairs: points,
                fftMetaData: nodeMetaData
            )
            slices.pushToQueue(element: slice)
        }
    }

    func updateNode(_ node: Node) {
        // Using a background thread to get data from FFTTap.
        // This doesn't make it more efficient but keeps the work
        // off the main thread, so it doesn't bother the user.
        if node !== self.node {
            self.node = node
            nodeTap = FFTTap(node, bufferSize: UInt32(nodeMetaData.fftSize * 2), callbackQueue: .global()) { fftData in
                self.pushData(fftData)
            }
            // Normalization would mean that in each slice the loudest reading has
            // amplitude 1.0, independent of what has happened before.
            // We don't want that, as we want absolute measurements that can be compared over time.
            nodeTap.isNormalized = false
            nodeTap.zeroPaddingFactor = 1
            nodeTap.start()
        }
    }

    func pushData(_ fftFloats: [Float]) {
        // Comes several times per second, depending on fftSize.
        // This call pushes new fftReadings into the queue.
        // The queue is observed by the view, and thus the view is updated.
        // The incoming array of floats contains 2 * fftSize entries, coded as real and imaginary parts:
        // the frequencies at the even indices and the amplitudes at the odd indices of the array.
        let slice = SpectrogramSlice(
            gradientUIColors: SpectrogramFlatView.gradientUIColors,
            sliceWidth: sliceSize.width,
            sliceHeight: sliceSize.height,
            fftReadings: fftFloats,
            fftMetaData: nodeMetaData
        )
        // We typically receive the callback on a background thread, where
        // the slice image was also rendered. To inform the UI we dispatch to the main thread.
        DispatchQueue.main.async {
            self.slices.pushToQueue(element: slice)
        }
    }

}
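The update rates quoted in the SpectrogramFFTMetaData comments can be reproduced with a little arithmetic. The sketch below assumes that the tap delivers one callback per filled buffer and that the buffer holds `fftSize * 2` samples, as passed to FFTTap in `updateNode(_:)`. The type `FFTUpdateTiming` and its properties are hypothetical and not part of this commit.

```swift
import Foundation

/// Hypothetical helper (not part of this commit) that reproduces the
/// update-rate figures quoted in the SpectrogramFFTMetaData comments.
struct FFTUpdateTiming {
    let fftSize: Int
    let sampleRate: Double

    /// Assumption: the tap buffer holds fftSize * 2 samples, as in
    /// updateNode(_:), and the tap calls back once per filled buffer.
    var bufferSize: Int { fftSize * 2 }

    /// How often new FFT data arrives per second.
    var updatesPerSecond: Double { sampleRate / Double(bufferSize) }

    /// Time between two updates in milliseconds.
    var millisecondsPerUpdate: Double { 1000.0 * Double(bufferSize) / sampleRate }
}

for size in [1024, 2048, 4096] {
    let timing = FFTUpdateTiming(fftSize: size, sampleRate: 44100)
    print(size, timing.updatesPerSecond, timing.millisecondsPerUpdate)
}
```

For the default `fftSize` of 2048 at 44100 Hz this gives roughly 10.8 updates per second, about 93 ms apart, which is in line with the figures in the comment and well below the 25 frames per second that the view's header comment mentions as the threshold for a smooth look.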

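SliceQueue is a FIFO capped at `maxItems`: each push appends a slice and drops the oldest one once the cap is exceeded. As a rough illustration of the same idea, detached from SpectrogramSlice so it can be run on its own, here is a generic sketch. The name `BoundedQueue` and its API are hypothetical and not part of this commit.

```swift
/// Hypothetical generic variant of SliceQueue (not part of this commit).
struct BoundedQueue<Element> {
    let maxItems: Int
    private(set) var items: [Element] = []

    init(maxItems: Int) {
        precondition(maxItems > 0, "maxItems must be positive")
        self.maxItems = maxItems
    }

    /// Appends the element and drops the oldest one once the cap is exceeded.
    mutating func push(_ element: Element) {
        items.append(element)
        if items.count > maxItems {
            items.removeFirst()
        }
    }
}

var queue = BoundedQueue<Int>(maxItems: 3)
for value in 1...5 {
    queue.push(value)
}
print(queue.items)
```

Removing the first element of an Array is linear in its length, just like `items.remove(at: 0)` in SliceQueue. With a cap of 120 slices that cost is likely negligible next to rendering the slice images.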