
Allow Dendritic MLP to have multiple output heads #527

Merged: 10 commits merged into numenta:master on Jun 9, 2021

Conversation

akashvelu (Contributor)

These changes modify the dendritic_mlp class to support multiple output heads, all of which use the same hidden layer output.
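As a rough sketch of what "multiple output heads over a shared hidden layer" means, the pattern looks roughly like the following (the class and variable names here are illustrative stand-ins, not the dendritic_mlp code in this PR):

import torch
import torch.nn as nn

class MultiHeadMLP(nn.Module):
    def __init__(self, input_size, hidden_sizes, output_sizes):
        super().__init__()
        layers = []
        for hidden_size in hidden_sizes:
            layers.append(nn.Linear(input_size, hidden_size))
            layers.append(nn.ReLU())
            input_size = hidden_size
        self.trunk = nn.Sequential(*layers)
        # One linear head per requested output size, all fed by the same trunk.
        self.heads = nn.ModuleList(
            nn.Linear(input_size, out_size) for out_size in output_sizes
        )

    def forward(self, x):
        shared = self.trunk(x)          # same hidden representation for every head
        return tuple(head(shared) for head in self.heads)

net = MultiHeadMLP(input_size=8, hidden_sizes=(32, 32), output_sizes=(4, 1))
outputs = net(torch.randn(5, 8))        # tuple of tensors with shapes (5, 4) and (5, 1)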


@karangrewal left a comment


I left some minor comments. Before I approve, can you explain the rationale behind having multiple output heads and how you plan to use them? Is it so that you can index the head corresponding to a certain task during training, so that all other heads are left unmodified?

dendrite_weight_sparsity=0.95,
weight_sparsity=0.95, weight_init="modified", dendrite_init="modified",
freeze_dendrites=False, output_nonlinearity=None,
dendritic_layer_class=AbsoluteMaxGatingDendriticLayer,


I think the indentation here is fine, especially since that's how this Stack Overflow article does it.

Also, I copied this style from the model utils file.

Contributor Author


> I left some minor comments. Before I approve, can you explain the rationale behind having multiple output heads and how you plan to use them? Is it so that you can index the head corresponding to a certain task during training, so that all other heads are left unmodified?

Multiple output heads are used by the policy network, which outputs the mean and the diagonal log std of a Gaussian as separate heads.
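For context, a minimal sketch of how two such heads can parameterize a diagonal Gaussian policy (the tensors below are stand-ins for the two head outputs; names and sizes are illustrative, not taken from this PR):

import torch
from torch.distributions import Normal

batch_size, action_dim = 5, 3
mean = torch.randn(batch_size, action_dim)      # stand-in for the "mean" head output
log_std = torch.zeros(batch_size, action_dim)   # stand-in for the "log std" head output

dist = Normal(mean, log_std.exp())              # diagonal Gaussian over action dimensions
action = dist.sample()                          # sampled action, shape (5, 3)
log_prob = dist.log_prob(action).sum(dim=-1)    # joint log-probability per sample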

self._output_layers = nn.ModuleList()
for out_size in output_size:
    output_layer = nn.Sequential()
    output_linear = SparseWeights(module=nn.Linear(input_size, out_size),


shouldn't the size of the output layer be (hidden_sizes[-1], out_size)?

Contributor Author


input_size is updated on line 163 to be the last hidden size, so it should be the same.
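In other words, the pattern is roughly as in the simplified sketch below (not the file's exact code): input_size is rebound while the hidden layers are built, so each output head already maps from hidden_sizes[-1].

input_size = 10
hidden_sizes = (64, 32)
output_size = (4, 1)

for hidden_size in hidden_sizes:
    # ... a hidden layer mapping input_size -> hidden_size is built here ...
    input_size = hidden_size            # after the loop, input_size == hidden_sizes[-1]

head_shapes = [(input_size, out_size) for out_size in output_size]
print(head_shapes)                      # [(32, 4), (32, 1)]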

@akashvelu requested a review from subutai on June 8, 2021 at 19:56
self._output_layer.add_module("output_linear", output_linear)
self._single_output_head = not isinstance(output_size, Iterable)
if self._single_output_head:
    output_size = (output_size,)
Member


Can you update the docstring regarding the format of output_size?

Contributor Author


Yep, updated.

Member


👍

@akashvelu merged commit aace8fa into numenta:master on Jun 9, 2021