v0.9.0 #702

Merged
merged 133 commits into from
Apr 28, 2023

Conversation

sammlapp
Collaborator

This PR includes major new features and some breaking changes to OpenSoundscape, including

  • fully featured localization module for localizing sounds from a spatial array of synchronized recorders
  • class activation mapping for visualizing sample activation in deep learning models
  • refactoring of ml (formerly torch) modules including new sample module

sammlapp and others added 30 commits January 12, 2023 15:46
merge hotfix from master to develop
578: tutorial download links. Resolves #578
datasets now return Sample class
dataloaders should use opso.sample.collate_samples as collate_fn argument in order to properly collate data and labels for training/prediction

AudioSample object is created by AudioFileDataset (or AudioClipDataset) and passed to the preprocessor. Each Action now receives and returns the Sample object, which eliminates the ugly _extra_args implementation. The Action() class retains the simple, user-friendly ability to accept a function that acts on data (not a sample): the Action.go() method runs action_fn on sample.data, updates sample.data, and then returns the sample.

Also, train and predict now expect batches from dataloaders that have dictionary keys "samples" and "labels" (instead of "X" and "y").
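A minimal sketch of the Action/Sample pattern described above. The `Sample` and `Action` classes here are hypothetical stand-ins written for illustration, not the OpenSoundscape implementations; the point is only that the user-supplied function sees the data while `go()` receives and returns the whole sample.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Sample:
    """Hypothetical stand-in for a sample: data plus metadata."""
    data: Any
    labels: Optional[list] = None
    source: Optional[str] = None


class Action:
    """Hypothetical stand-in: wraps a function that acts on data only."""

    def __init__(self, action_fn: Callable[..., Any], **params):
        self.action_fn = action_fn
        self.params = params

    def go(self, sample: Sample) -> Sample:
        # Run the function on sample.data and store the result back on the
        # sample, so metadata such as labels travels through untouched.
        sample.data = self.action_fn(sample.data, **self.params)
        return sample


def scale(values, factor=1.0):
    """Example data-level function: knows nothing about Sample."""
    return [v * factor for v in values]


sample = Sample(data=[1.0, 2.0], labels=[0, 1], source="example.wav")
result = Action(scale, factor=2.0).go(sample)
print(result.data, result.labels)
```

Because the sample is updated and returned rather than replaced, no side channel for extra arguments is needed to carry labels or file paths through the pipeline.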

tutorials and tests will be broken; I haven't modified them
next: consider removing dependency of external package and implementing the cam class in opso instead (it has cv2 dependency and manipulates the model in ways we might want to avoid/control ourselves)
I refactored the use of DataLoader by changing the collate function to simply return the list of AudioSamples (rather than a dictionary of batched tensors for 'samples' and 'labels'). This allows us to retain information about the AudioSamples (especially important if they are modified during preprocessing). The collate_samples function is now used after iterating the dataloader (iterating the dataloader creates a list of AudioSamples).
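A rough sketch of the two-stage collation described above, using plain lists instead of tensors so it runs without PyTorch. `AudioSample`, `identity_collate`, and `collate_samples_sketch` are hypothetical names for illustration; the real `collate_samples` may differ in signature and return type.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AudioSample:
    """Hypothetical stand-in holding one clip's data, labels, and source."""
    data: List[float]
    labels: List[int]
    source: str


def identity_collate(batch):
    # Plays the role of the DataLoader's collate_fn in this sketch: keep
    # the sample objects intact instead of merging them into tensors.
    return list(batch)


def collate_samples_sketch(samples):
    # Applied after iterating the loader: batch data and labels together
    # only at the point where the model needs them.
    return {
        "samples": [s.data for s in samples],
        "labels": [s.labels for s in samples],
    }


batch = identity_collate([
    AudioSample([0.1, 0.2], [1, 0], "a.wav"),
    AudioSample([0.3, 0.4], [0, 1], "b.wav"),
])
collated = collate_samples_sketch(batch)
print(collated["labels"], [s.source for s in batch])
```

Keeping the batch as a list of sample objects means per-sample information (here, `source`) is still available after the batched values have been extracted.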

saliency_map now returns a list of AudioSamples as well. The returned samples have an attribute .activation_map (type ActivationMap) which can be plotted etc.
note that Actions now modify a sample in-place (updating its .data and maybe other attributes) - this is now reflected in the tests
Spectrogram's setattr raising AttributeError means that copy.deepcopy() will fail. As a workaround, if AttributeError is raised when trying to copy sample.data into sample.trace[], it assigns the original object instead of copying (if the object is immutable this isn't an issue). However, this isn't a good solution, and it should be changed when the immutable class implementation is addressed in #671
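The workaround described above could look roughly like the following. `FrozenData` and `copy_or_reference` are hypothetical names, and `FrozenData` raises `AttributeError` from `__deepcopy__` only to simulate the failure; the real class fails for its own internal reasons.

```python
import copy


class FrozenData:
    """Hypothetical immutable-style object whose deep copy fails."""

    def __init__(self, values):
        # Bypass the blocking __setattr__ once, during construction.
        object.__setattr__(self, "values", tuple(values))

    def __setattr__(self, name, value):
        raise AttributeError("FrozenData is immutable")

    def __deepcopy__(self, memo):
        # Simulated failure, standing in for the real copy error.
        raise AttributeError("cannot set attributes while copying")


def copy_or_reference(obj):
    # Prefer an independent copy; fall back to the original object when
    # copying raises AttributeError. Safe only if obj is truly immutable.
    try:
        return copy.deepcopy(obj)
    except AttributeError:
        return obj


frozen = FrozenData([1, 2, 3])
original_list = [1, 2]
trace = {
    "load": copy_or_reference(frozen),        # falls back to the same object
    "raw": copy_or_reference(original_list),  # gets an independent copy
}
```

The fallback shares a reference rather than a copy, which is why it is only harmless for objects that cannot be mutated afterwards.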
watch() will log histograms of parameter and gradient values every n epochs for each module in the torch model
this works now, and avoids an error on save by removing all forward/backward hooks from the saved model (unless the user specifies save_hooks=True in CNN.save()).
Torch modules should simply be "called", i.e. Module(input) rather than Module.forward(input). Calling forward() directly bypasses forward hooks.
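To illustrate why calling the module matters, here is a toy model of hook dispatch in pure Python. `TinyModule` is a hypothetical class that only mimics the general shape of the behaviour (hooks run from `__call__`, not from `forward`); it is not `torch.nn.Module` and omits most of what the real class does.

```python
class TinyModule:
    """Toy model of hook dispatch; not torch.nn.Module itself."""

    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        self._forward_hooks.append(hook)

    def forward(self, x):
        return x * 2

    def __call__(self, x):
        # Calling the module runs forward() and then every forward hook.
        out = self.forward(x)
        for hook in self._forward_hooks:
            hook(self, x, out)
        return out


seen = []
module = TinyModule()
module.register_forward_hook(lambda mod, inp, out: seen.append(out))

module(3)          # goes through __call__, so the hook fires
module.forward(4)  # skips __call__, so the hook is bypassed
print(seen)
```

Anything that relies on hooks, such as activation mapping or gradient logging, would therefore miss outputs produced through a direct `forward()` call.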
Implement wandb.watch in CNN.train()
cam module now has CAM class which stores and plots base image, activation maps per class, and guided back propagation per class

cnn now has method called generate_cams, which returns AudioSample objects with .cam as an instance of CAM

next steps: add examples to tutorials, add tests
Added from_url method to load audio from downloaded url data (following SoundFile documentation)

Also added methods to display an interactive audio widget. The audio now automatically displays as a widget in Jupyter notebooks, using IPython.display.Audio. The user can generate the widget (i.e. in a loop) by calling Audio.show_widget()
one test failing `test_generate_cams_num_workers` gives error about pickling when num_workers is 2, need to investigate
also test gillette for each receiver as reference

and remove warning for centering in soundfinder
also asserts dims in (2,3) for soundfinder
Adds full localization module for localizing sounds from time-synchronized recording arrays

Could use more test coverage and needs a demo notebook
these arguments got un-exposed during a refactor, but invert in particular is needed when loading models from old opso versions
also add sample module to __init__.py
@sammlapp sammlapp added this to the 0.9.0 milestone Apr 26, 2023
for some reason, ReadTheDocs was failing (ModuleNotFoundError) when running the ribbit tutorial notebook. I just copied all the cells to a new notebook then renamed it to the same name as the previous one, and it builds fine locally for me now.
update docs to reflect supported python versions

also update sentry-sdk to address #680

ran `poetry lock --no-update` because `poetry lock` hangs
this function was totally broken, but I didn't realize it because it was catching errors. I updated it to use AudioSample so that it is compatible with the current codebase.

I added the flag raise_errors and added tests with raise_errors=True.

I also changed the name of the `wandb` module to `logging` to avoid name conflicts with the wandb package.
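The `raise_errors` flag mentioned above addresses a general pitfall: a function that swallows exceptions can be entirely broken without anyone noticing. A minimal sketch of the pattern, where `process_all` is a hypothetical helper and not the function changed in this commit:

```python
def process_all(items, fn, raise_errors=False):
    """Hypothetical helper: apply fn to each item, collecting failures."""
    results, failed = [], []
    for item in items:
        try:
            results.append(fn(item))
        except Exception as exc:
            if raise_errors:
                # Surface the failure immediately, e.g. in tests.
                raise
            # Default behaviour: record the failure and keep going.
            failed.append((item, exc))
    return results, failed


results, failed = process_all(["1", "x", "3"], int)
print(results, [item for item, _ in failed])
```

Running the tests with `raise_errors=True` makes any exception fail the test directly instead of being hidden in the list of failed items.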
@syunkova syunkova self-requested a review April 28, 2023 04:42
Collaborator

typo: don't need a comma after 3.8 in line 20... doesn't matter much

@sammlapp sammlapp merged commit 03e8661 into master Apr 28, 2023
3 participants