
metadata fields #12

Open
charlieroberts opened this issue Mar 30, 2021 · 14 comments

Comments

@charlieroberts

I have a proof-of-concept node.js script that parses the clean-samples quark and then lets you select which repos you'd like to download. Having the download size of each repo would be nice to help users make informed choices about what they're grabbing.

Which made me think that perhaps we should be adding more metadata in general, and perhaps most of this could be automatically added by the Python script so that it wouldn't be a burden on users adding sample banks. I would suggest as a possible starting point:

  1. Filesize
  2. Number of channels
  3. Sample Rate
  4. Bit depth
  5. Duration

This might enable more selective download scripts in the future e.g. "get all 16-bit mono samples that are under .5 seconds in duration from the repos by yaxu ". Is there a reason not to add more metadata?
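To make the idea concrete, here is a minimal sketch of such a selective download filter. The metadata shape and field names below are purely illustrative assumptions, not an agreed format:

```python
# Hypothetical per-sample metadata records; field names are illustrative only.
samples = [
    {"name": "bd/a.wav", "author": "yaxu", "filesize": 34120,
     "channels": 1, "samplerate": 44100, "bitdepth": 16, "duration": 0.31},
    {"name": "pad/b.wav", "author": "yaxu", "filesize": 812400,
     "channels": 2, "samplerate": 48000, "bitdepth": 24, "duration": 4.2},
]

# "all 16-bit mono samples that are under .5 seconds in duration by yaxu"
selected = [s for s in samples
            if s["bitdepth"] == 16
            and s["channels"] == 1
            and s["duration"] < 0.5
            and s["author"] == "yaxu"]

print([s["name"] for s in selected])  # → ['bd/a.wav']
```

A downloader could then fetch only the matching files instead of whole repos.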

@charlieroberts
Author

charlieroberts commented Mar 30, 2021

In a similar vein, perhaps the "dependencies" field of the top-level quark should contain a bit more information about each repo? Or is the expectation that the typical use case will be to download all sample banks? As the collection gets larger (which will be great!) this could become problematic...

@yaxu
Member

yaxu commented Mar 30, 2021

As I mused before, we're conflating different issues - how to replace the monolithic Dirt-Samples packs to welcome new users to tidal and friends, and how to browse and share samples more generally. This repo is really about the first issue, but the second issue is more interesting and I'm very happy to jump into that here regardless!

Yes, there's no real reason not to add this metadata and more; cuepoints (e.g. marking onsets in a breakbeat), spectral centroid, zero crossings etc. would also be very useful to have. I think it would be good to take a moment to find an existing standard to base this on though - do you know of one? When I look around for audio file metadata formats they seem more concerned with whether the sound file is 'christian rock' or 'experimental' than objective dsp measurements. I'm sure the MIR community will have something though..

Things like 'number of channels' are a bit different from 'spectral centroid', because the former is a property of the sound file itself, closer to metadata than to derived data. But it's still useful to have in the metadata file.. again it would be good to look at an existing standard, as channel count does not tell us enough for some cases - e.g. 8 channels could be octophonic in a ring, corners of a cube, or 7.1.

I think you're pointing out a flaw in the quark system really. The quark database is just a list of github links, and so when an end user clicks refresh, github gets polled for everything in the list. This takes a long time. No information is shown about a quark until you install it. It works OK for installing samples with superdirt, but doesn't meet your purpose, Charlie, of browsing samples ahead of installation.

So some design points/questions:

  1. The metadata files should be useful independently from the samples they refer to. That is, the metadata should include a stable URL to the location of the samples, like https://github.com/tidalcycles/sounds-repetition. Then the sound filename repetition/a.wav would be relative to that.
  2. The metadata files could then be cached and served from a single location, either into the same folder or into the same file. This could then serve as an index that is updated periodically (daily? hourly?)
  3. Should the samples then be cached and served centrally as well? Unsure about that.. Maybe not.
  4. Where do we do audio analysis? The python script is intended to be usable, but also to act as a reference implementation, a way to define the format rather than the only interface to it. There are python libraries for audio analysis too, but with supercollider to hand and in use by many live coding systems, it probably makes sense to start by doing the audio analysis there, adding extra metadata to files created with the python script. Regardless, getting python to read basic audio file attributes like number of channels is an easy win; I'm sure a nice library exists for that.

@telephon

No information is shown about a quark until you install it.

What about having one single TidalSamplesMetaData Quark? That could also have the URLs for the samples.

@telephon

4. Where do we do audio analysis?

There is a Quark : https://github.com/musikinformatik/SoundFileAnalysis/
You can write your analysis functions in supercollider, and also make it reproducible, or add analysis on the fly.

@charlieroberts
Author

Agreed there are multiple issues being conflated here. Maybe the quark could be auto-generated into its own repo, and this repo could be dedicated to more metadata? Or maybe it doesn't really matter, and all this text is light enough that it's not a concern.

I was trying to look through the AudioCommons work, but it's hard to find a "standard" specification on their website or in a quick look through the related publications. But here's some of the AudioCommons analysis that Freesound uses.

Including transients for beat-slicing sounds fantastic, and more detailed info on spatialization also would be great, although I imagine this could get very specific (six-channel diamond vs six-channel cube etc.) Perhaps something like SpatDif could work or at least serve as a model? Could also be overkill.

@charlieroberts
Author

what the hell is a six-channel cube? forgive my addled maths :)

@charlieroberts
Author

OK, dug a bit more through some of the AudioCommons "deliverables" (as opposed to the corresponding peer-reviewed publications) and it looks like the analysis that Freesound includes is most of the AudioCommons "spec". Here's a document that goes into detail on how each of these is measured.

Perhaps we stick with these for analysis purposes, but extend with more descriptors that are appropriate to our use case?

@yaxu
Member

yaxu commented Mar 31, 2021

Struggling to work out what the audio commons is.. Do you mean use their ontology, or their analysis tools as well?

@charlieroberts
Author

I guess the specific part of their ontology that relates to analysis. The entire ontology is quite large. But maybe there are other parts also worth adopting.

@charlieroberts
Author

Paging @axambo as she was involved in the AudioCommons project and might have some thoughts...

@yaxu
Member

yaxu commented Apr 1, 2021

Hi @axambo!
I guess I'm unsure how this relates to freesound.. Does freesound automatically do this analysis? Or is this something we can do with supercollider and then upload/update metadata in freesound?

@charlieroberts
Author

charlieroberts commented Apr 1, 2021

My impression is that Freesound automatically does this analysis. So, it would potentially be a duplication of "work" (computation time / energy) if we plan to try and upload everything to Freesound (maybe we should formalize how this would work if we're sure we want to do it?). And, given that we might be using the quark @telephon mentioned, the results might be slightly different from those Freesound provides.

But it does provide a common set of analysis descriptors to use as a template. I'm not attached to them; even in this brief discussion @yaxu has already pointed out two properties that don't seem to be addressed (transients / slices and spatial configuration) that seem like they'd have much more value for our typical use cases than a boominess coefficient. Although making a performance of only boomy sounds seems like a fun challenge :)

@yaxu
Member

yaxu commented Apr 1, 2021

I see, it took a bit of digging in the api (the docs are for an old version of the api) but yep the analysis is all there:
https://freesound.org/apiv2/sounds/565580/analysis/

I guess it takes a little while to run after an upload.

Do these tags actually relate to the audiocommons stuff? There's no ac: prefix.
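For anyone wanting to script against that endpoint, a small sketch using only the standard library. The `token` query parameter is Freesound's APIv2 token auth; the key value here is a placeholder you'd replace with your own:

```python
import json
import urllib.request

# Placeholder: register a free Freesound account to get a real API key.
API_TOKEN = "YOUR_FREESOUND_API_TOKEN"

def analysis_url(sound_id, token=API_TOKEN):
    """Build the apiv2 analysis endpoint URL for a given sound id."""
    return ("https://freesound.org/apiv2/sounds/%d/analysis/?token=%s"
            % (sound_id, token))

def fetch_analysis(sound_id, token=API_TOKEN):
    """Fetch and decode the analysis JSON (network call; needs a real token)."""
    with urllib.request.urlopen(analysis_url(sound_id, token)) as resp:
        return json.load(resp)
```

For example, `fetch_analysis(565580)` would retrieve the analysis linked above, once a real token is supplied.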

@charlieroberts
Author

Ack, yes, sorry, I misunderstood how the analysis was stored in the database. The AudioCommons descriptors are filters that you can use to search Freesound... I'm not sure how the analysis you pointed to is used behind the scenes to enable these. I guess once you have all that low-level analysis it's reasonably fast to calculate the higher-level descriptions? Seems like it's actually a one-to-one correspondence in many cases (albeit with different names).

I think we're more interested in the high-level AC descriptors than the low-level analysis anyways.
