Eigenscape Raw Park 6 and 8 recordings are not really "Raw" #8

iranroman · 2022-03-15T15:58:25Z

TLDR: Do not use the Park 6 and Park 8 files in the Eigenscape Raw dataset. The files included in the dataset are in a different format.

@sakshamsingh1 brought to my attention that, in the Eigenscape Raw dataset, the recordings for Park 6 and 8 have 25 channels, instead of 32 channels.

In essence, Raw Park 6 and Park 8 files look more like B-format than A-format (raw format).

I talked with Marc Green, author of the Eigenscape dataset, and he confirmed that there was a mix-up when the A-format files were released and that the "raw" recordings for Park 6 and Park 8 are in fact wrong. The real recordings are missing and unavailable to the public.

This is something that micarraylib has no control over, so users are advised NOT to use those specific files.


iranroman · 2022-03-15T16:00:09Z

@sakshamsingh1, could you run a simple test to check if Park 6 and Park 8 files are perhaps the same in both versions of the Eigenscape dataset?

This would show that the format is "B" for these files in both the Raw and B-format versions of the dataset.
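One way to run such a test without decoding any audio is to compare the files byte for byte. The sketch below is only an illustration: the helper names `file_digest` and `same_file_contents` are hypothetical, and a byte-level match is a stricter condition than "same audio", since two files with identical samples but different headers would not match.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read 1 MiB at a time so multi-gigabyte recordings fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_file_contents(path_a, path_b):
    """Return True if both files are byte-for-byte identical."""
    # Cheap early exit: files of different sizes can never match.
    if Path(path_a).stat().st_size != Path(path_b).stat().st_size:
        return False
    return file_digest(path_a) == file_digest(path_b)
```

Calling `same_file_contents` on the Park 6 (or Park 8) file from each version of the dataset would then tell whether the two downloads contain the same file.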

sakshamsingh1 · 2022-03-16T18:46:07Z

Hi @iranroman, I think the Raw and B-format files for Park-6 and Park-8 have been flipped (i.e. Park-6(8)-Raw is in B-format and Park-6(8)-B is in Raw format), because Park-6(8)-Raw has 25 channels and Park-6(8)-B has 32 channels.

Below is the code that I used to confirm this.

from micarraylib.datasets import eigenscape_loader

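def get_eigen_name(cat, num, fmt):
    """Build the clip name used by each version of the dataset."""
    if fmt == 'A':
        return f"{cat}-0{num}-Raw"  # raw (A-format) naming
    return f"{cat}.{num}"  # B-format naming

# Replace these placeholders with the local folders holding each dataset version.
eigen_raw_path = "<EIGEN RAW PATH>"
eigen_path = "<EIGEN B PATH>"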
eigen_raw = eigenscape_loader.eigenscape_raw(download=False, data_home=eigen_raw_path)
eigen = eigenscape_loader.eigenscape(download=False, data_home=eigen_path)

num = 8
clip_a = get_eigen_name('Park',num,'A')
clip_b = get_eigen_name('Park',num,'B')

a_np = eigen_raw.get_audio_numpy(clip_a, fmt="A")
b_np = eigen.get_audio_numpy(clip_b, fmt="B")

print(a_np.shape, b_np.shape)
# (25, 28800000) (32, 28800000)
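The channel counts printed above can be turned into an explicit format check. This is a minimal sketch under two assumptions that are not stated in the snippet itself: raw Eigenmike audio has 32 capsule signals, and EigenScape B-format audio is 4th-order Ambisonics, which gives (4 + 1)^2 = 25 channels. The helper names are hypothetical and not part of micarraylib.

```python
def ambisonic_channels(order):
    """Channel count of a full-sphere Ambisonic signal of the given order."""
    return (order + 1) ** 2


# Assumptions for this sketch (see the lead-in): 32 raw capsule signals,
# and 4th-order Ambisonics for the B-format version.
RAW_CHANNELS = 32
B_FORMAT_CHANNELS = ambisonic_channels(4)  # 25


def guess_format(n_channels):
    """Guess 'A' (raw) or 'B' (Ambisonic) from the channel count alone."""
    if n_channels == RAW_CHANNELS:
        return "A"
    if n_channels == B_FORMAT_CHANNELS:
        return "B"
    return "unknown"


def check_clip(shape, expected_fmt):
    """Compare an array shape of (channels, samples) with the expected format.

    Returns a pair: (matches_expected, guessed_format).
    """
    found = guess_format(shape[0])
    return found == expected_fmt, found
```

With the shapes reported above, `check_clip((25, 28800000), "A")` returns `(False, "B")`: the clip loaded as raw has the channel count of a B-format file.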

marc1701 · 2022-03-17T20:37:57Z

Hi, author of EigenScape here. I've just checked my original files and it does look as though the raw Park 6 & 8 are missing, rather than flipped. I'm honestly not sure where those extra channels are coming from in this code snippet because I certainly don't have them! Perhaps the code is retrieving a different file?

iranroman · 2022-03-23T13:30:37Z

Hello @marc1701. Thanks a lot for joining the discussion.

Using micarraylib, I have confirmed that the observation by @sakshamsingh1 is true on my end. Moreover, if I load another Park file (in either EigenScape format) that is not 6 or 8 (5, for example) I do not see the same issue.

I also downloaded a fresh copy of Park.zip from Zenodo, and I'm sorry to inform you that Park.6.wav and Park.8.wav have 32 channels, while all the others have 25.

I think we found a real bug.
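For anyone who wants to repeat this check on an extracted Park.zip without loading the audio, the channel count can be read straight from each WAV header. The sketch below assumes plain RIFF/WAVE files (not RF64) and uses hypothetical helper names. It parses the `fmt ` chunk by hand because the standard-library `wave` module only accepts WAVE_FORMAT_EXTENSIBLE headers, which are common for multichannel files, from Python 3.12 onward.

```python
import struct
from collections import Counter
from pathlib import Path


def wav_channel_count(path):
    """Read the channel count from the 'fmt ' chunk of a RIFF/WAVE file."""
    with open(path, "rb") as f:
        start = f.read(12)
        if len(start) < 12 or start[:4] != b"RIFF" or start[8:12] != b"WAVE":
            raise ValueError(f"{path} is not a RIFF/WAVE file")
        while True:
            header = f.read(8)
            if len(header) < 8:
                raise ValueError(f"no 'fmt ' chunk found in {path}")
            chunk_id, chunk_size = struct.unpack("<4sI", header)
            if chunk_id == b"fmt ":
                # The chunk starts with the format tag (uint16),
                # followed by the channel count (uint16).
                _, n_channels = struct.unpack("<HH", f.read(4))
                return n_channels
            # Skip other chunks; they are padded to an even byte count.
            f.seek(chunk_size + (chunk_size & 1), 1)


def find_channel_outliers(folder):
    """Map file name -> channel count for files that differ from the majority."""
    counts = {p.name: wav_channel_count(p)
              for p in sorted(Path(folder).glob("*.wav"))}
    if not counts:
        return {}
    majority, _ = Counter(counts.values()).most_common(1)[0]
    return {name: n for name, n in counts.items() if n != majority}
```

Running `find_channel_outliers` on the folder extracted from Park.zip should list only the files whose channel count disagrees with the rest of the folder.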

marc1701 · 2022-03-31T10:53:29Z

Hi all - I have created a new version of the dataset on Zenodo (version 3), replacing Park.zip with a new version containing the correct 25-channel Ambisonic versions of Park.6.wav and Park.8.wav. I would appreciate it if someone could verify this via a download.

I am not sure how this issue crept in as the 'original' files I have on my hard drive are all the correct versions. My apologies for any problems this may have caused.

iranroman · 2022-03-31T14:42:34Z

Thank you very much @marc1701 for addressing this on the B-format version of the dataset. I have started the process to update the dataset loaders, first in soundata, then here in micarraylib.

For the A-format (raw mics) version of the dataset, is the verdict that the equivalent files are missing? Could they actually be the files previously mistaken in the B-format version?

marc1701 · 2022-04-11T18:43:55Z

Yes, I have checked the raw files and it does seem as though the two formats got mixed up. So the raw files are actually those mistakenly filed with the B-format, and the B-format files were filed as raw. Again, I'm not sure how this happened and can only apologise for the confusion!

Unfortunately it will be slightly more difficult for me to amend the raw files as these are hosted at the University of York rather than Zenodo so I will have to liaise with University admin to get them changed, which could be a pain. Might it be possible to engineer a workaround?
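One possible loader-side workaround is a small lookup that redirects requests for the affected clips to the distribution that actually holds the requested format. This is only a sketch of the idea and not necessarily what any library implements: the names `SWAPPED_CLIPS`, `clip_filename` and `locate_clip` are hypothetical, the file naming is taken from the snippet earlier in this thread, and the swap is assumed to apply to the archives as originally released (before the Zenodo version 3 fix).

```python
# Clips whose raw and B-format files were swapped in the original release,
# according to this thread.
SWAPPED_CLIPS = {("Park", 6), ("Park", 8)}


def clip_filename(category, number, fmt):
    """File stem used by each distribution (single-digit clip numbers)."""
    if fmt == "A":
        return f"{category}-0{number}-Raw"
    return f"{category}.{number}"


def locate_clip(category, number, fmt):
    """Return (distribution, file stem) that holds the requested format."""
    if fmt not in ("A", "B"):
        raise ValueError(f"unknown format: {fmt!r}")
    stored_as = fmt
    if (category, number) in SWAPPED_CLIPS:
        # The requested audio lives under the other format's name.
        stored_as = "B" if fmt == "A" else "A"
    distribution = "raw" if stored_as == "A" else "b-format"
    return distribution, clip_filename(category, number, stored_as)
```

Under these assumptions, `locate_clip("Park", 8, "A")` returns `("b-format", "Park.8")`, while an unaffected clip such as `locate_clip("Park", 5, "A")` returns `("raw", "Park-05-Raw")`.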

iranroman · 2022-04-21T15:11:36Z

We have engineered a solution with @marc1701 at soundata (see soundata/soundata#102 and soundata/soundata#98). If you are interacting with this dataset using micarraylib and/or soundata, you should be good.

iranroman mentioned this issue Mar 31, 2022: Eigenscape Park 6 and 8 recordings are not really B-format soundata/soundata#98 (Closed)

iranroman mentioned this issue Apr 18, 2022: Fix eigenscape soundata/soundata#102 (Merged)
