Refactor MLP #3257
Conversation
Have a couple of small stylistic comments here. Let's discuss offline? I'm not sure if this is standard style or not
self.model = nn.Sequential(*self.build_layers())
self.skip = nn.Linear(d_input, d_output) if skip_connection else None

def build_layers(self):
Can you add a docstring here?
input = x
for layer in self.model:
    x = layer(x)
    if isinstance(layer, nn.Linear):
Maybe better to add the activations into the sequential model directly? Is this style of adding activations via a loop standard elsewhere?
If we do that, we will have to use the torch activations instead of the deepchem activations, since you cannot build a sequential model with anything but nn.Module instances.
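For context, here is a minimal sketch of the two approaches being discussed. It is illustrative only, not the deepchem code; the layer sizes, activation choice, and the forward helper are placeholders.

import torch
import torch.nn as nn

# Option 1: put the activations into the Sequential itself. This only works
# with nn.Module activations such as nn.ReLU, not with arbitrary callables.
with_modules = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))

# Option 2: keep only the Linear layers in the Sequential and apply a plain
# callable activation (a stand-in here for a deepchem activation) in the
# forward loop, including after the final layer.
linears = nn.Sequential(nn.Linear(8, 16), nn.Linear(16, 4))
activation = torch.relu  # placeholder for a deepchem activation

def forward(x):
    for layer in linears:
        x = layer(x)
        if isinstance(layer, nn.Linear):
            x = activation(x)
    return x

out = forward(torch.randn(2, 8))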
LGTM. Feel free to merge.
Pull Request Template
Description
This PR allows the user to create an MLP with differently sized hidden layers by passing a tuple, which was not previously possible. It also adds an activation on the final layer, which would likely be expected. This change was made in an effort to unify the different feedforward network implementations in deepchem.
This PR also fixes a bug when a single hidden layer is selected: the current implementation produces a layer of shape [input_dim, hidden_dim] rather than the expected [input_dim, output_dim].
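As an illustration only (a hypothetical sketch, not the actual deepchem constructor; the helper name and signature are assumptions), building the Linear layers from a tuple of hidden sizes could look like this, which also handles the single-hidden-layer case by always ending at the output dimension:

import torch.nn as nn

def build_linear_layers(d_input, d_hidden, d_output):
    # Hypothetical helper: chain the layer widths
    # input -> each hidden size in the tuple -> output.
    dims = [d_input, *d_hidden, d_output]
    return [nn.Linear(dims[i], dims[i + 1]) for i in range(len(dims) - 1)]

# Differently sized hidden layers via a tuple:
layers = build_linear_layers(10, (64, 32), 2)  # 10->64, 64->32, 32->2
# A single hidden layer still ends at the output dimension:
layers = build_linear_layers(10, (64,), 2)     # 10->64, 64->2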
Type of change
Please check the option that is related to your PR.
Checklist
- Run yapf -i <modified file> and check no errors (yapf version must be 0.32.0)
- Run mypy -p deepchem and check no errors
- Run flake8 <modified file> --count and check no errors
- Run python -m doctest <modified file> and check no errors