RNG seed [formerly Reproducibility enhancements] #31
Comments
Yeah, agree 💯. I recently implemented this kind of thing over in entrofy, so it wouldn't be hard to do.
So, thoughts on the PipelineFactory and Pipeline objects? PipelineFactory is the iterator that yields a Pipeline, which can then be passed a data object to deform. Or do you see a simpler approach?
Well, if you're reconstructing a deformation pipeline from a muda output, it only has to generate a single example. Parameterizing each element of the pipeline according to its seed (and state number) ought to suffice, so we shouldn't need to generate multiple pipeline objects.
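To make the seed-plus-state-number idea concrete, here is a minimal sketch. It assumes each randomized pipeline element draws its states one after another from a single seeded stream, with one uniform draw per state. The helper `nth_state`, the `gain_db` field, and the gain range are hypothetical and not part of muda.

```python
import random


def nth_state(seed, index, low=-6.0, high=6.0):
    """Regenerate the `index`-th state drawn by a stream seeded with `seed`.

    Hypothetical helper: assumes exactly one uniform draw per state.
    """
    rng = random.Random(seed)
    # Discard the draws that precede the state we want.
    for _ in range(index):
        rng.uniform(low, high)
    return {"gain_db": rng.uniform(low, high)}


# The original run: five states drawn in order from one seeded stream.
rng = random.Random(1234)
original_states = [{"gain_db": rng.uniform(-6.0, 6.0)} for _ in range(5)]

# Reconstruction needs only the seed and the state number.
rebuilt = nth_state(seed=1234, index=3)
```

Under those assumptions, a single (seed, state number) pair per element is enough to recover one specific example without iterating a whole pipeline.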
I'm thinking of the scenario where I generate one pipeline and want to apply it to different audio-jams objects ... to do this currently, I have to keep making new Pipelines with
"Different" meaning totally different content? If that's the case, why would you care about porting over random parameters? If you want to reinstantiate a pipeline, random seeds and all, that can be done with the current serialization code (properly extended to include seeds).
Coming off of the discussion in #62, it seems like the more useful version of this idea is to reconstruct a specific deformation sequence from a previous run of muda. This is useful when you have the original audio, deformed jams, and want to rebuild the corresponding deformed audio. I'm having a hard time thinking of any other reproducibility use cases that can/should be powered by the deformation history of individual outputs. I specifically don't see the utility in reconstructing a muda pipeline from an output's deformation history. Given the interactions between union, bypass, and pipeline, I'm not sure this is even possible: you'll only get the deformers that actually executed to form this output, not the actual deformation stack. I think encouraging folks to try to abstract up from an instance to the pipeline is an anti-pattern; instead, we should encourage folks to save their pipeline objects alongside the outputs if they want to run further deformations on new data. So I suggest this issue be consolidated into two enhancements:
These two enhancements are independent. Because the deformation history never records randomized objects (only their deterministic parent class), and all state is preserved in the history, you can get reproducibility of randomized deformations for free even without storing the seed. (This, of course, is just for audio re-deformation, not for re-running a deformation sweep on a dataset.) @ejhumphrey @justinsalamon what do y'all think?
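A rough sketch of why the seed is not needed for re-deformation, assuming the history stores a deformer name plus the exact state that was applied. The functions `apply_gain`, `random_gain_run`, and `replay`, and the history layout, are illustrative stand-ins rather than muda's actual API or JAMS schema.

```python
import random


def apply_gain(samples, state):
    """Deterministic deformation: scale samples by the gain in `state`."""
    factor = 10.0 ** (state["gain_db"] / 20.0)
    return [s * factor for s in samples]


def random_gain_run(samples, rng):
    """Randomized run: draw a state, apply it, and record it in a history."""
    state = {"gain_db": rng.uniform(-6.0, 6.0)}
    # Only the deterministic deformer name and its state are recorded.
    history = [{"transformer": "Gain", "state": state}]
    return apply_gain(samples, state), history


def replay(samples, history):
    """Rebuild deformed audio from the recorded states alone; no RNG involved."""
    deformers = {"Gain": apply_gain}
    for entry in history:
        samples = deformers[entry["transformer"]](samples, entry["state"])
    return samples


audio = [0.0, 0.25, -0.5, 1.0]
deformed, history = random_gain_run(audio, random.Random())  # deliberately unseeded
rebuilt = replay(audio, history)
```

Because `replay` reads only the recorded state, the rebuilt audio matches the original run even though the generator was never seeded.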
+💯 for re-deformer. Indeed, it appears #62 surfaced precisely because I shared MUDA jams files (https://github.com/justinsalamon/UrbanSound8K-JAMS) to avoid having to distribute the augmented version of US8K we were using in our paper for reproducibility. Happy to put together a PR, but no cycles in the near horizon :'(
At least two ideas jump out at me re: reproducibility:

- `RandomDoAThing` deformers could optionally take `seed` params, but always use one internally (and serialize accordingly).
- `state`, which isn't the case for `RandomDoAThing` deformers, or (b) there's a higher-level object that combines state and pipeline as different objects. The difference here is small (and maybe semantic), but it's a difference between a class and an instance (the pipeline is the class, the state is the instance). This might have interesting repercussions for the design of the `Pipeline`, which is perhaps more aptly called a `PipelineFactory`.

Please yell if any of this is unclear, I'm kind of stream-of-consciousness working through the idea.
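For the first idea, a minimal sketch of a randomized deformer that accepts an optional seed, always holds a concrete one internally, and includes it when serialized. The class `RandomGain`, its parameters, and the JSON layout are hypothetical; this is not muda's serialization code.

```python
import json
import random


class RandomGain:
    """Hypothetical randomized deformer; not muda's actual class."""

    def __init__(self, n_samples, low=-6.0, high=6.0, seed=None):
        # Always hold a concrete seed, even when the caller omits one.
        if seed is None:
            seed = random.SystemRandom().randrange(2**32)
        self.n_samples = n_samples
        self.low = low
        self.high = high
        self.seed = seed

    def states(self):
        """Yield sampled states; a fresh RNG per call makes this repeatable."""
        rng = random.Random(self.seed)
        for _ in range(self.n_samples):
            yield {"gain_db": rng.uniform(self.low, self.high)}

    def serialize(self):
        """Dump the constructor parameters, seed included, as JSON."""
        return json.dumps({
            "n_samples": self.n_samples,
            "low": self.low,
            "high": self.high,
            "seed": self.seed,
        })

    @classmethod
    def deserialize(cls, text):
        """Rebuild an equivalent deformer from its JSON form."""
        return cls(**json.loads(text))


original = RandomGain(n_samples=3)  # no seed supplied by the caller
restored = RandomGain.deserialize(original.serialize())
```

With this layout, `restored.states()` yields the same sequence as `original.states()`, since the internally chosen seed travels with the serialized object.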