Recomposition / decluttering API #67

Open
@dy

The current API comprises various concepts and contexts; mixing them all together does not work well.
Let's analyze and clean them up, and figure out the core value of the package, as distinct from a heap of assorted audio aspects. Notes and ideas are collected along the way.

There are the following apparent contexts.

  • playback (stream to speaker)
  • recording (stream from mic)
  • manipulations
  • rendering & taking statistics (stream to analyzer)
  • reading & decoding (decode-stream from file)
  • saving & encoding (encode-stream to file)
  • generic streaming in (non-mic: web-audio, url, video/youtube, etc.)
  • generic streaming out (non-speaker: web-audio, renderer, generic observable/asyncIterator, icecast, p2p audio etc.)

Originally, each of these concerns was handled by a separate node in the audio-processing graph.
But they can be reclassified into:

  • create (from mic, file, buffer, encoded data, web-audio, url, video etc.)
  • read/output (to speaker, analyzer, buffer, encode, stream, observable, renderer, web-audio etc.)
  • manipulate (stack of string/array-like ops + audio-specific ops)
  • navigate (state of reading: seek, cues, skip, playback, rate, etc.)
  • sync/mix (video track, captions track, rendering? track, other tracks)

↑ With different flavors (type of data storage, time units convention, naming, stack of ops vs direct manipulations)
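
To make the reclassification concrete, here is a minimal sketch of how the five roles could sit on one container. Everything in it is an assumption for illustration: the `Track` class and its `from`, `seek`, `read`, `gain` and `mix` methods are hypothetical names, not the package's API, and it picks one flavor only (Float32Array storage, seconds for navigation, direct manipulations rather than a stack of ops).

```typescript
// All names here (Track, from, seek, read, gain, mix) are hypothetical.
class Track {
  readonly samples: Float32Array;
  readonly sampleRate: number;
  private position = 0; // navigation state, in samples

  private constructor(samples: Float32Array, sampleRate: number) {
    this.samples = samples;
    this.sampleRate = sampleRate;
  }

  // create: only the "from raw PCM data" case is sketched
  static from(samples: ArrayLike<number>, sampleRate = 44100): Track {
    return new Track(Float32Array.from(samples), sampleRate);
  }

  // navigate: move the read position, given in seconds
  seek(seconds: number): this {
    const index = Math.round(seconds * this.sampleRate);
    this.position = Math.min(Math.max(index, 0), this.samples.length);
    return this;
  }

  // read/output: copy up to `count` samples from the position, then advance
  read(count: number): Float32Array {
    const chunk = this.samples.slice(this.position, this.position + count);
    this.position += chunk.length;
    return chunk;
  }

  // manipulate: return a new track with every sample scaled
  gain(factor: number): Track {
    return new Track(this.samples.map((s) => s * factor), this.sampleRate);
  }

  // sync/mix: sum two tracks sample by sample, padding the shorter with silence
  mix(other: Track): Track {
    const out = new Float32Array(Math.max(this.samples.length, other.samples.length));
    out.set(this.samples);
    other.samples.forEach((s, i) => {
      out[i] += s;
    });
    return new Track(out, this.sampleRate);
  }
}

// Usage, with a toy sample rate of 4 Hz so that 0.5 s equals 2 samples.
const a = Track.from([0.5, 0.5, 0.5, 0.5], 4);
const b = Track.from([0.25, 0.25], 4);
const mixed = a.gain(0.5).mix(b);
const tail = mixed.seek(0.5).read(2);
```

The sketch mostly shows that navigation is the only role that needs mutable state on the container; the other roles can be plain functions over the sample data.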

Also, it's worth correlating with MDN Audio, which includes its own opinionated subset of operations.

Alternative audio modules (wad, aural, howler, ciseaux etc.) also each have their own subset of operations.

Consider possible concepts.

A. Audio (HTMLAudioElement) for node

! One possible value is to just provide a standard Audio container for node.

  • 👍 existing docs (MDN)
  • 👍 compatible with web-audio-api pattern
  • 👎 losing manipulations
  • 👎 that implies implementing a more generic Media class with a bunch of associated classes: AudioTrackList, AudioTrack, TimeRanges, MediaController, MediaError, MediaKeys, MediaDevices, MediaStream, TextTrack, VideoTrack, MediaKeySession. Overall it looks like an organic part of the browser API, not a standalone polyfill.

B. Manipulations toolkit 🌟

  • a simple decoded sync data container (AudioBuffer, Float32Array etc., similar to pxls) that takes in any PCM/float data (likely via audio-buffer-from).
  • for loading audio from a remote source, use audio-load. For recording and other streaming sources, use the corresponding packages.
  • basically extends AudioBuffer with a set of chainable methods (BTW! a class inherited from AudioBuffer stays compatible with the regular one!)
    • ~ the possible drawback: an AudioBuffer has a fixed size (length and channel count are read-only), so there is no easy way to trim/slice it in place etc.
    • ~ also, for long (45s+) sources the MDN docs recommend using MediaElementAudioSourceNode, which is a type of AudioNode.
  • for playing audio - use for example audio-play
  • 👍 ↑ this way, the package can be focused on manipulations only without cramming all into one, and go a bit deeper, eg.
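
A minimal sketch of concept B as a whole, under stated assumptions. `AudioBuffer` is not a global in node, so `PlainAudioBuffer` below is a hypothetical stand-in that mirrors only a few standard members (`length`, `sampleRate`, `numberOfChannels`, `duration`, `getChannelData`). `ToolkitBuffer` and its `slice`/`reverse` methods are likewise hypothetical names, not this package's API.

```typescript
// Stand-in for the Web Audio AudioBuffer, which is not a node global.
// Only the standard members used below are mirrored.
class PlainAudioBuffer {
  readonly length: number;
  readonly sampleRate: number;
  readonly numberOfChannels: number;
  protected channels: Float32Array[];

  constructor(options: { length: number; sampleRate: number; numberOfChannels?: number }) {
    this.length = options.length;
    this.sampleRate = options.sampleRate;
    this.numberOfChannels = options.numberOfChannels ?? 1;
    this.channels = Array.from(
      { length: this.numberOfChannels },
      () => new Float32Array(this.length),
    );
  }

  get duration(): number {
    return this.length / this.sampleRate;
  }

  getChannelData(channel: number): Float32Array {
    return this.channels[channel];
  }
}

// Hypothetical toolkit class: chainable methods on top of the container.
class ToolkitBuffer extends PlainAudioBuffer {
  // The length is fixed, so trimming allocates a new buffer.
  slice(start: number, end: number = this.length): ToolkitBuffer {
    const from = Math.max(0, Math.min(start, this.length));
    const to = Math.max(from, Math.min(end, this.length));
    const out = new ToolkitBuffer({
      length: to - from,
      sampleRate: this.sampleRate,
      numberOfChannels: this.numberOfChannels,
    });
    for (let c = 0; c < this.numberOfChannels; c++) {
      out.getChannelData(c).set(this.getChannelData(c).subarray(from, to));
    }
    return out;
  }

  // Sample data itself is writable, so reversing can happen in place.
  reverse(): this {
    for (let c = 0; c < this.numberOfChannels; c++) {
      this.getChannelData(c).reverse();
    }
    return this;
  }
}

// Usage: fill a 4-sample mono buffer, then trim and reverse in one chain.
const buf = new ToolkitBuffer({ length: 4, sampleRate: 4 });
buf.getChannelData(0).set([1, 2, 3, 4]);
const trimmed = buf.slice(1, 3).reverse();
```

Size-changing operations such as `slice` have to return a new instance, while same-size operations such as `reverse` can work in place and return `this`; either way the result is still an instance of the base container, which is the compatibility point noted above.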
