Allow Dendritic MLP to have multiple output heads #527
Conversation
I left some minor comments. Before I approve, can you explain the rationale behind having multiple output heads and how you plan to use them? Is it so that you can index the one corresponding to a given task during training, so that all other heads are left unmodified?
dendrite_weight_sparsity=0.95,
weight_sparsity=0.95, weight_init="modified", dendrite_init="modified",
freeze_dendrites=False, output_nonlinearity=None,
dendritic_layer_class=AbsoluteMaxGatingDendriticLayer,
I think the indentation here is fine, especially since that's how this Stack Overflow article does it.
Also, I copied this style from the model utils file.
> I left some minor comments. Before I approve, can you explain the rationale behind having multiple output heads and how you plan to use them? Is it so that you can index the one corresponding to a given task during training, so that all other heads are left unmodified?
Multiple output heads are used by the policy network, which outputs the mean and the diagonal log std of a Gaussian; each of those is a separate head.
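To make that use concrete, here is a minimal sketch (shapes and names are illustrative, not taken from the PR) of how two such head outputs parameterize a diagonal Gaussian policy:

```python
import torch

# Hypothetical shapes: a batch of 4 observations, a 2-dimensional action
# space. In the policy network described above, `mean` and `log_std`
# would each come from a separate output head of the MLP.
mean = torch.zeros(4, 2)
log_std = torch.zeros(4, 2)

# The two heads parameterize a diagonal Gaussian over actions.
dist = torch.distributions.Normal(mean, log_std.exp())
actions = dist.sample()                         # shape (4, 2)
log_probs = dist.log_prob(actions).sum(dim=-1)  # diagonal: sum per-dim log probs
```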
self._output_layers = nn.ModuleList()
for out_size in output_size:
    output_layer = nn.Sequential()
    output_linear = SparseWeights(module=nn.Linear(input_size, out_size),
Shouldn't the size of the output layer be (hidden_sizes[-1], out_size)?
input_size is updated on line 163 to be the last hidden size, so it should be the same.
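For context on that resolution, a minimal sketch of the construction pattern (simplified to plain nn.Linear layers; the actual code wraps layers in SparseWeights and uses dendritic layers, and the helper name here is hypothetical):

```python
import torch.nn as nn

def build_trunk_and_heads(input_size, hidden_sizes, output_size):
    hidden = nn.Sequential()
    for i, hidden_size in enumerate(hidden_sizes):
        hidden.add_module(f"linear_{i}", nn.Linear(input_size, hidden_size))
        hidden.add_module(f"relu_{i}", nn.ReLU())
        # Reassigned each iteration, so after the loop input_size equals
        # hidden_sizes[-1] -- which is why the heads below can use it directly.
        input_size = hidden_size
    heads = nn.ModuleList(
        [nn.Linear(input_size, out_size) for out_size in output_size]
    )
    return hidden, heads

trunk, heads = build_trunk_and_heads(8, [32, 16], output_size=(2, 2))
assert heads[0].in_features == 16  # same as hidden_sizes[-1]
```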
self._output_layer.add_module("output_linear", output_linear)
self._single_output_head = not isinstance(output_size, Iterable)
if self._single_output_head:
    output_size = (output_size,)
Can you update the docstring regarding the format of output_size?
Yep, updated.
👍
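For readers without the full diff, a self-contained sketch (hypothetical helper name, not the PR's code) of the output_size handling that the updated docstring documents:

```python
from collections.abc import Iterable

def normalize_output_size(output_size):
    """Accept the two documented formats for output_size: a single int
    (one output head) or an iterable of ints (one head per entry).
    Returns a tuple of head sizes plus a flag recording whether the
    caller asked for a single head."""
    single_output_head = not isinstance(output_size, Iterable)
    if single_output_head:
        output_size = (output_size,)
    return tuple(output_size), single_output_head

# Both documented formats are accepted:
assert normalize_output_size(10) == ((10,), True)
assert normalize_output_size((2, 2)) == ((2, 2), False)
```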
These changes modify the dendritic_mlp class to have multiple output heads, all of which use the same hidden-layer output.
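To summarize the shape of the change, a minimal toy version (a hypothetical stand-in class, not the PR's code, which uses SparseWeights and dendritic layers):

```python
import torch
import torch.nn as nn

class ToyMultiHeadMLP(nn.Module):
    """Simplified stand-in for the modified class: plain linear layers
    instead of sparse dendritic ones, but the same head structure."""

    def __init__(self, input_size, hidden_size, head_sizes):
        super().__init__()
        self.hidden = nn.Sequential(
            nn.Linear(input_size, hidden_size), nn.ReLU()
        )
        self._output_layers = nn.ModuleList(
            [nn.Linear(hidden_size, n) for n in head_sizes]
        )

    def forward(self, x):
        h = self.hidden(x)  # computed once per forward pass...
        return tuple(head(h) for head in self._output_layers)  # ...shared by every head

net = ToyMultiHeadMLP(input_size=8, hidden_size=16, head_sizes=(3, 3))
out_mean, out_log_std = net(torch.randn(5, 8))  # two heads, one shared trunk
```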