
SigMF currently lacks a way to represent stereo data #133

Closed
bhilburn opened this issue Jun 9, 2021 · 13 comments · Fixed by #138
@bhilburn
Contributor

bhilburn commented Jun 9, 2021

Per @Teque5's note in #117, SigMF currently lacks a way to represent stereo data (e.g., data traces from an oscilloscope). This data looks like sample pairs, just like complex sample pairs, but it is not actually complex data. In the o-scope file @Teque5 generated in that Issue, the pairs are defined as a complex pair to keep them aligned, but that's not semantically accurate.

I think adding the ability to support recordings of stereo real data would not be too hard of a lift from our current spec, and would be really useful (beyond just our own logo 😄).

What about just adding stereo as a top-level type? Modifying the ABNF from the spec gives us:

    dataset-format = (real / complex / stereo) ((type endianness) / byte)

    real = "r"
    complex = "c"
    stereo = "s"

    type = floating-point / signed-integer / unsigned-integer
    floating-point = "f32"
    signed-integer = "i32" / "i16"
    unsigned-integer = "u32" / "u16"

    endianness = little-endian / big-endian
    little-endian = "_le"
    big-endian = "_be"

    byte = "i8" / "u8"

This, for example, would mean that si16_le indicates stereo, signed 16-bit, little-endian data.
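To make the proposal concrete, here is a small sketch of a parser for the *proposed* grammar above (this is not part of the released SigMF spec, and `parse_dataset_format` is just an illustrative helper name):

```python
import re

# Mirrors the proposed ABNF: (r|c|s) prefix, then either a multi-byte type
# with mandatory endianness, or a single-byte type with no endianness.
DATASET_FORMAT = re.compile(
    r"^(?P<mode>[rcs])"                  # real / complex / stereo
    r"(?:(?P<type>f32|i32|i16|u32|u16)"  # multi-byte types...
    r"(?P<endianness>_le|_be)"           # ...require endianness
    r"|(?P<byte>i8|u8))$"                # byte types do not
)

def parse_dataset_format(fmt: str) -> dict:
    match = DATASET_FORMAT.match(fmt)
    if match is None:
        raise ValueError(f"invalid dataset format: {fmt!r}")
    return match.groupdict()

print(parse_dataset_format("si16_le"))
# {'mode': 's', 'type': 'i16', 'endianness': '_le', 'byte': None}
```

Note that `ci16` (no endianness) would be rejected, matching the ABNF's requirement that multi-byte types carry an endianness suffix.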

Thoughts?

@Teque5
Collaborator

Teque5 commented Jun 9, 2021

@bhilburn Generally I think the spec should expand to house an arbitrary number of channels of any type. Per the Opus spec §5.1.1.2, this should support at least 8 audio channels.

I was chatting with @ke8ctn about this a bit and we think the spec should allow multiple channels of any type with an interleaved structure. This would allow MIMO recordings or multichannel RF recordings as well as multichannel audio.

I know there was a discussion long ago about this where someone was advocating that separate channels should be in separate files, but I think there is good reason to put them in the same file.

I kinda doubt you want to overload the datatype field. Perhaps there should be a specific key for num_channels of type uint. I don't think there is anything special about audio that should give it a special identifier since you can just as easily listen to RF or seismic or other timeseries data that would be inside a SigMF dataset. This field could be optional (and therefore backwards-compatible) where if it is omitted only a single channel exists in the file.

With the mod as I suggest, the logo would just contain:

datatype = "i16_le"
num_channels = 2
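A reader built around this proposal could stay very simple. The sketch below (the `read_interleaved` helper, the hypothetical `num_channels` field, and the `logo.sigmf-data` filename are all illustrative, not existing SigMF API) assumes channels are interleaved sample-by-sample, so one C-order reshape untangles them:

```python
import numpy as np

def read_interleaved(path, dtype="<i2", num_channels=2):
    """Load interleaved samples (L0 R0 L1 R1 ...) as (samples, channels)."""
    flat = np.fromfile(path, dtype=dtype)
    return flat.reshape(-1, num_channels)

# Round-trip demo with the logo's proposed metadata (i16_le, 2 channels):
left = np.array([1, 2, 3], dtype="<i2")
right = np.array([10, 20, 30], dtype="<i2")
np.stack([left, right], axis=1).tofile("logo.sigmf-data")  # interleaves on disk

samples = read_interleaved("logo.sigmf-data", num_channels=2)
print(samples[:, 0])  # left channel  -> [1 2 3]
print(samples[:, 1])  # right channel -> [10 20 30]
```

Because `num_channels` would default to 1 when omitted, existing single-channel recordings would read back unchanged.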

@jacobagilbert
Member

Does the proposed multirecording extension not cover this?

@bhilburn
Contributor Author

Agreed, @jacobagilbert, I think it does. @Teque5, see #99.

So, I think the question comes down to: do we see stereo recordings as a specific instance of multirecordings, or as its own thing?

It sounds like @Teque5 thinks of it as the former. What are your thoughts, @jacobagilbert ?

@Teque5
Collaborator

Teque5 commented Jun 11, 2021

@jacobagilbert After reading that issue, it proposes a nearly identical idea: adding an integer channel key to global. I think num_channels or channel_count is less ambiguous, but you are correct - that implementation would also solve this stereo recording problem. If the maintainers are interested, I have some time today and could potentially write a PR to cover this.

@jacobagilbert
Member

jacobagilbert commented Jun 11, 2021

The number of channels would need to be inferred (from the length of the streams object?).

One limitation I can see is that multirecordings as it exists now (and it needs some changes anyway) does not allow n interleaved channels in a single dataset. If that is a hard requirement (looking at @Teque5), we could add an n_interleaved_channels or something to that extension or to core - something to distinguish 1 dataset of n interleaved channels from n datasets that are associated. If we do that, I believe the suggestion above about using an individual element datatype is the correct way.

@Teque5
Collaborator

Teque5 commented Jun 11, 2021

@jacobagilbert This is how I see it for multiple channels:

Multiple interleaved streams in one file

  • pros
    • simple <- very valuable
    • when playing to hardware, ideal scenario
  • cons
    • requires special i/o to untangle, no more np.fromfile()
  • recordings must be of identical length (unless we modify how captures annotates each channel)
    • when recording from hardware, will require additional buffers (maybe) to keep things aligned, especially if from multiple radios or multiple ADCs
    • recording across multiple time-synced machines is not possible; it would require post-processing to merge

Multiple streams in multiple files (a la multirecordings)

  • pros
    • when recording from hardware, ideal scenario
    • allows non-identical stream length
  • cons
    • complicated or at least not user-friendly <- very bad
    • what do you do when missing a part of a multirecording set?

Really I think the spec should support both options. I can easily envision scenarios where you may want annotations to span multiple channels or maybe you want them specific to a single channel.

@jacobagilbert jacobagilbert added this to the Future (v2.x) milestone Jun 11, 2021
@jacobagilbert
Member

jacobagilbert commented Jun 11, 2021

I agree, both should be supported - the option used will likely be dictated by the source (or consumer) of the data. For that reason I'm going to tag this as v2.x, though I think we can cover the second use case in 1.0.

Multiple interleaved streams essentially provides the ability to store 2D data - perhaps it should be generalized accordingly? np.fromfile() can still work provided it is followed up with a np.reshape().
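The generalized 2D view amounts to exactly one np.fromfile() plus one np.reshape(), for any channel count or element type. A sketch (the filename and the idea that the channel count comes from metadata are illustrative assumptions, not current SigMF behavior):

```python
import numpy as np

num_channels = 4  # would be read from the recording's metadata

# Fabricate a 4-channel complex recording: 8 cf32 values, interleaved
# across channels in C order on disk.
iq = (np.arange(8) + 1j * np.arange(8)).astype(np.complex64)
iq.tofile("mimo.sigmf-data")

# One read, one reshape: rows are sample instants, columns are channels.
data = np.fromfile("mimo.sigmf-data", dtype=np.complex64).reshape(-1, num_channels)
print(data.shape)  # (2, 4): 2 samples per channel, 4 channels
```

The same two calls cover the stereo case (num_channels = 2, real dtype), which is why no special audio identifier is needed in the datatype itself.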

@Teque5
Collaborator

Teque5 commented Jun 11, 2021

@jacobagilbert Yeah, loading won't be hard; it's pretty simple in every language. I just meant it won't be obvious.

@ke8ctn

ke8ctn commented Jun 11, 2021

Since I was tagged by @Teque5 above, I'll say that I've long been of the opinion that multi-channel support via interleaving is a missing feature of SigMF. Something like an optional core global field called core:num_channels with a default value of 1 would be very appealing. As @jacobagilbert already pointed out, it would not be at all difficult to implement the I/O as long as you know the number of channels, and the existing read_samples method could be adapted to account for this without much effort. The other metadata (new capture/annotation fields, maybe channel descriptions) would need some more thought, though.

The multirecording extension seems very useful for cases where the separate channels are more loosely related, such as when there is not a shared sample clock or datatype. It feels appropriate as an extension that gives application developers a consistent way to handle these cases.

Both would be great.

@bhilburn
Contributor Author

So, the multi-recordings extension is happening no matter what - it needs some changes, but it'll be part of v1.0.0. There are a ton of use-cases that need it, and I think it's fundamental to SigMF at this point. Per #99, in the next update, a channel index (or similar) field will also be added.

The discussion here, in my mind, is whether that (the multi-recordings extension) 👆👆 is sufficient to cover use cases like stereo recordings, or whether we need a better mechanism. I think @ke8ctn, @Teque5, and @jacobagilbert have all raised good points here, and it seems like the answer to that question is "no", and that we need another mechanism.

The code that I shared in the OP proposed adding a way to do stereo recordings, but as noted by everyone above, should really be generalized to be arbitrary interleaved streams.

My gut is that this ought to be a fairly easy change to make, and is backwards-compatible (all additions should default to Optional such that default parsing assumes current behavior). I'd like to include it in v1.0.0 if we can, both because of its utility, and also the fact that our own logo requires it 😄.

Will mark it back to v1.0.0 for now, and let's see if we can get it done in a way that will work. @Teque5, are you still up for giving it a go?

@Teque5
Collaborator

Teque5 commented Jun 15, 2021

Yea I can probably get to it this weekend. I'll do the final logo then too since a few people have checked my annotations now.

@Teque5
Collaborator

Teque5 commented Jun 19, 2021

PR is done. After (or if?) you merge, I will create another PR to easily convert audio files into SigMF.

@bhilburn
Contributor Author

bhilburn commented Jul 1, 2021

@Teque5 - Left some minor comments in the PR, and then I think we're good to merge. And that follow-on PR sounds excellent 🙂
