Skip to content

PR to address the comments on the ESC50 recipe#2070

Merged
mravanelli merged 28 commits intospeechbrain:developfrom
ycemsubakan:esc50_comments
Jul 16, 2023
Merged

PR to address the comments on the ESC50 recipe#2070
mravanelli merged 28 commits intospeechbrain:developfrom
ycemsubakan:esc50_comments

Conversation

@ycemsubakan
Copy link
Collaborator

@ycemsubakan ycemsubakan commented Jul 10, 2023

This is a PR to address the comments of Mirco in #2066

Here are the original comments and the progress:

  1. Comments on /ESC50/classification/:
  • Please mention the performance and computing time for both CNN14 and conv2d-based models. [done] Additionally, include references to our papers, including Zhepei's paper. [done]
  • We need to make the HF repository (https://huggingface.co/speechbrain/soundrec-cnn14-esc50) public. Also, please add a working example, similar to other HF interfaces. It would be helpful to upload an audio signal as an example in the repository as well. [done]
    ---> note: this repository is called speechbrain/cnn14-esc50 now.
  1. Comments on /recipes/ESC50/interpret:
  • When introducing PIQ, please add a reference to our paper. Additionally, at the end, you can include a section on citing PIQ with the necessary BibTeX information. [done]

  • Create a new interface and a corresponding HF repository for the PIQ Model. In this case, the input should be a signal, and the output should be the audio for interpretation, correct? [done]

  • For this interface, please upload one or two example files (maybe one positive and one negative example) and briefly explain the outcomes in the text. [done]

  • In /ESC50/interpret/train_piq.py, please add missing docstrings. All functions should have at least a short description. [done]

  • The YAML files should use the pretrained model form the speechbrain HF repository (remember to make it public). Currently, it is using fpaissan/conv2d_us8k or cemsubakan/cnn14-esc50/ [done].

  • It would be helpful to write a short tutorial on interpretability with a special focus on PIQ [I think this might stay out of scope of this PR].

@ycemsubakan ycemsubakan changed the title [WIP] PR to address the comments [WIP] PR to address the comments on the ESC50 recipe Jul 10, 2023
@ycemsubakan ycemsubakan self-assigned this Jul 10, 2023
@mravanelli mravanelli self-requested a review July 10, 2023 21:05
@mravanelli mravanelli added the enhancement New feature or request label Jul 10, 2023
@ycemsubakan ycemsubakan added the ready to review Waiting on reviewer to provide feedback label Jul 15, 2023
@ycemsubakan ycemsubakan changed the title [WIP] PR to address the comments on the ESC50 recipe PR to address the comments on the ESC50 recipe Jul 15, 2023
@ycemsubakan ycemsubakan marked this pull request as ready for review July 15, 2023 05:58
@mravanelli
Copy link
Collaborator

Excellent work! I tested both the training recipes and HF interfaces and everything works. I only have the following comments:

  1. It looks like there are a couple of wrong links in the interpret/README.md. The log folder of PIQ is empty. The CNN14 log links seem to point to the conv2D one.
  2. It could be great to conclude this work with a nice tutorial on interpretability. We can merge this PR meanwhile, but we will need to add a link to the tutorial in the README files once available.

@mravanelli mravanelli merged commit db36842 into speechbrain:develop Jul 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request ready to review Waiting on reviewer to provide feedback

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants