
Add NeuralBlock replication functionality #79

Merged
merged 11 commits into from
May 25, 2020
Conversation

@pplantinga (Collaborator) commented May 14, 2020

This replaces the NeuralBlock class with a slightly cleaned-up version, as well as an additional ReplicateBlock class for replication. Remaining tasks:

  • Add difference and average combine functions.
  • Test TIMIT CTC example to ensure performance isn't harmed.

@pplantinga pplantinga added enhancement New feature or request refactor Edit code without changing functionality labels May 14, 2020
@pplantinga pplantinga requested a review from Gastron May 14, 2020 18:12
@pplantinga pplantinga self-assigned this May 14, 2020
@mravanelli (Collaborator)

Nice work! I saw that by default you combine the connections by summing them up. This only works if the dimensionalities are the same. A more general combination function would be one that:
1. sums, averages, subtracts, or concatenates the tensors if they have the same dimensionality;
2. performs a simple linear transformation before combining them if they don't.
These two points were implemented in the previous version. A more critical situation arises when the temporal resolution changes across blocks. In those cases, it's hard to say what the best approach is...
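A minimal, dependency-free sketch of the combination logic described above. Plain Python lists stand in for tensors, and the function name and mode strings are illustrative, not SpeechBrain's actual API; a real implementation would use a learned linear projection rather than raising on a shape mismatch.

```python
def combine(a, b, mode="sum"):
    """Combine two vectors with the given strategy (illustrative sketch)."""
    if mode == "concat":
        return a + b  # concatenation needs no matching length
    if len(a) != len(b):
        # A real implementation would apply a linear transformation here
        # to map one input to the other's dimensionality before combining.
        raise ValueError("shapes differ; apply a linear transform first")
    if mode == "sum":
        return [x + y for x, y in zip(a, b)]
    if mode == "avg":
        return [(x + y) / 2 for x, y in zip(a, b)]
    if mode == "diff":
        return [x - y for x, y in zip(a, b)]
    raise ValueError(f"unknown combine mode: {mode!r}")
```
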

@pplantinga (Collaborator, Author)

2- If the tensors don't have the same dimensionality, we perform a simple linear transformation before combining them

I'm not sure it's best to do this automatically. Perhaps we could provide an interface where the user specifies what happens, e.g. an additional parameter taking a function that is applied to the shortcut.
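One way the interface suggested here could look, sketched in plain Python: the caller passes an optional `shortcut_transform` callable (a hypothetical parameter name, not SpeechBrain's API) that reshapes the shortcut before it is combined with the block output.

```python
def replicate_forward(x, blocks, shortcut_transform=None):
    """Run x through a sequence of block functions with shortcut connections.

    shortcut_transform, if given, is a user-supplied hook (e.g. a linear
    projection) applied to the shortcut before combining.
    """
    out = x
    for block in blocks:
        shortcut = out
        out = block(out)
        if shortcut_transform is not None:
            shortcut = shortcut_transform(shortcut)
        if len(shortcut) == len(out):
            # Default combination: elementwise sum when shapes match.
            out = [a + b for a, b in zip(out, shortcut)]
    return out
```
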

@mravanelli (Collaborator)

mravanelli commented May 14, 2020 via email

@TParcollet (Collaborator)

Just a quick comment: I found NeuralBlock way more explicit than Nerve. Every deep learning researcher will know what a NeuralBlock is expected to do... but Nerve? :s

@pplantinga (Collaborator, Author) commented May 15, 2020

NeuralBlock way more explicit than Nerve

Hey thanks @TParcollet , this is valuable feedback. Of course we want to make SpeechBrain as clear and explicit as possible.

I think for the things that have creative names (i.e. Brain, Lobe, Nerve), the reason to prefer these over more "explicit" names is that they may not have an easy equivalent in the literature. This does mean more work for the user to understand how they work, but in these cases it may be necessary. In the case of Nerve, it currently reads the yaml to construct a block, which is pretty distinct from most modern machine learning systems.

Perhaps one way to make Nerve more explicit is to keep NeuralBlock as a separate module, and only call Nerve when we want a sequence of NeuralBlocks of the same type. We could also rename it "Replicate" or something similar. Would love to hear your feedback, Titouan, as to whether we need this replication functionality and whether you think there's a better way to do it.

@TParcollet (Collaborator)

I understand better now. Nonetheless, I agree that this should have a more explicit name. Remaining within the "brain" concept is fun, but it shouldn't come at the cost of ease of use. If this function will often be called, why not build its usage into its name? It replicates a block from a yaml file? Then ReplicateBlock?

@pplantinga pplantinga changed the title Add nerve class Add NeuralBlock replication functionality May 15, 2020
@Gastron (Collaborator) left a comment


It seems this functionality will be useful, as it's already used in two lobes. I have one minor docstring comment that should be addressed, but otherwise I support merging after that.

We should also note that this establishes "Block" as a term for a logical unit of neural computation (between a layer and a network). That terminology has been used elsewhere too, so that's fine by me. We should just be sure to use the term consistently; I've tried to highlight this in another comment.

@Gastron (Collaborator) commented May 15, 2020

Also you may consider "ReplicatedBlock" or "ReplicatingBlock" as well. "ReplicaBlock" for that Blade Runner feel. EDIT: Peter has pointed out to me that it is ReplicantBlock that I want.

@pplantinga pplantinga added the ready to review Waiting on reviewer to provide feedback label May 18, 2020
@mravanelli (Collaborator)

Peter, did you try adding some kind of shortcut connections to see what happens to the performance of our TIMIT_CTC example?

@pplantinga (Collaborator, Author)

I would not expect to see much difference when there's only two blocks, but I guess I can try.

@pplantinga pplantinga removed the ready to review Waiting on reviewer to provide feedback label May 18, 2020
@mravanelli (Collaborator)

@pplantinga do you want me to help do some tests with shortcut connections?

@pplantinga pplantinga mentioned this pull request May 20, 2020
@mravanelli (Collaborator)

I did some full TIMIT experiments with residual, skip, and dense connections (with 4 DNN blocks rather than just two), and, as expected, the performance doesn't change.

@mravanelli (Collaborator)

In general, I think it's much easier for users to just specify the combination with a string, e.g. combine_type=sum, add, diff, avg, concat, linear. These are the types users will use most frequently. If they want to specify their own combination function, we can still support a user-customized function as we do now. It's a bit annoying to add a combination function to the lobe just to switch from addition to difference.
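The string-plus-callable interface suggested above could be dispatched roughly like this (a plain-Python sketch; the table contents, `resolve_combine` name, and the omission of the `linear` mode, which needs learned parameters, are all illustrative assumptions, not SpeechBrain's implementation):

```python
# Dispatch table mapping combine_type strings to combination functions.
COMBINE_FNS = {
    "sum": lambda a, b: [x + y for x, y in zip(a, b)],
    "diff": lambda a, b: [x - y for x, y in zip(a, b)],
    "avg": lambda a, b: [(x + y) / 2 for x, y in zip(a, b)],
    "concat": lambda a, b: a + b,
}


def resolve_combine(combine_type):
    """Map a combine_type string to a function; pass callables through
    unchanged to support user-customized combination functions."""
    if callable(combine_type):
        return combine_type
    if combine_type not in COMBINE_FNS:
        raise ValueError(
            f"combine_type must be one of {sorted(COMBINE_FNS)} or a "
            f"callable, got {combine_type!r}"
        )
    return COMBINE_FNS[combine_type]
```
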

@mravanelli (Collaborator)

Also, we should raise an error if the shortcut name is not "residual", "skip", or "dense". In the current version, passing a random name doesn't complain.
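The requested validation is a one-liner; a sketch of how it might look (the set of names comes from this thread, the function name is hypothetical):

```python
VALID_SHORTCUTS = {"residual", "skip", "dense"}


def check_shortcut_type(shortcut_type):
    """Fail loudly instead of silently ignoring an unknown shortcut name."""
    if shortcut_type not in VALID_SHORTCUTS:
        raise ValueError(
            f"Unknown shortcut type {shortcut_type!r}; "
            f"expected one of {sorted(VALID_SHORTCUTS)}"
        )
```
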

@mravanelli (Collaborator)

When I run residual connections with "dnn_blocks=1" it raises an error:

  File "/network/tmp1/ravanelm/speechbrain_github/speechbrain/speechbrain/nnet/containers.py", line 147, in forward
    shortcut_inputs, outputs, init_params=init_params,
UnboundLocalError: local variable 'outputs' referenced before assignment

This is probably due to the fact that we decided to not support shortcut connections in the first block.
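This is the classic pattern where a variable is assigned only inside a conditional branch that never runs when there is a single block. A minimal sketch of a guard that avoids it (illustrative only, not the actual containers.py code): initialize `outputs` before the loop and skip the shortcut on the first block.

```python
def forward(x, blocks, shortcut_combine=None):
    """Run blocks in sequence with optional shortcut connections.

    'outputs' is initialized before the loop, so a single-block model
    (dnn_blocks=1) never references it before assignment.
    """
    outputs = x
    for i, block in enumerate(blocks):
        new_outputs = block(outputs)
        # Shortcut connections are only formed after the first block.
        if shortcut_combine is not None and i > 0:
            new_outputs = shortcut_combine(outputs, new_outputs)
        outputs = new_outputs
    return outputs
```
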

@pplantinga pplantinga added the ready to review Waiting on reviewer to provide feedback label May 22, 2020
@pplantinga pplantinga merged commit 7450c5f into master May 25, 2020
@pplantinga pplantinga deleted the nerve-branch branch May 25, 2020 15:47