
Support non-linearity function and module for PyTorch Module #928

Merged: 1 commit merged into master on Nov 19, 2019

Conversation

@yonghyuc (Contributor) commented Oct 8, 2019

Throw an error when non-linear functions from torch.nn are used.
Change to using non-linear functions from torch.nn.functional or torch.

Previously, MultiHeadedMLPModule assumed that the non-linear functions came from torch.nn, so the computational graph was built inside __init__ and the non-linearity module was added to the ModuleList (layer.add_module('non_linearity', hidden_nonlinearity)).

However, for consistency with other modules (similar to MLPModule), I changed how the non-linear functions are used: only the linear layers are created in __init__, and the non-linear functions are called in the forward function (x = self._hidden_nonlinearity(x)).

So now, we need to pass non-linear functions from torch or torch.nn.functional (e.g. torch.tanh or torch.nn.functional.relu) instead of modules from torch.nn (e.g. torch.nn.Tanh, torch.nn.ReLU).
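
For illustration, the difference between the two styles looks roughly like this (a minimal sketch, not the actual garage code; TinyMLP and its attribute names are made up for this example):

import torch
import torch.nn as nn
import torch.nn.functional as F

# Old style: the non-linearity is a torch.nn module added to the layer
# container in __init__, so it becomes part of the module tree.
old_layer = nn.Sequential()
old_layer.add_module('linear', nn.Linear(8, 16))
old_layer.add_module('non_linearity', nn.ReLU())

# New style: only the linear layer lives in __init__; the non-linearity is a
# plain function (torch.* or torch.nn.functional.*) applied in forward().
class TinyMLP(nn.Module):
    def __init__(self, hidden_nonlinearity=F.relu):
        super().__init__()
        self._linear = nn.Linear(8, 16)
        self._hidden_nonlinearity = hidden_nonlinearity

    def forward(self, x):
        return self._hidden_nonlinearity(self._linear(x))

out = TinyMLP()(torch.zeros(4, 8))  # shape (4, 16)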

@yonghyuc yonghyuc requested a review from a team as a code owner October 8, 2019 21:17
@ryanjulian changed the title from "Fix non-linearity function of multi_headed_mlp_module (#926)" to "Fix non-linearity function of multi_headed_mlp_module" on Oct 8, 2019
@codecov (bot) commented Oct 8, 2019

Codecov Report

Merging #928 into master will increase coverage by 0.02%.
The diff coverage is 97.72%.


@@            Coverage Diff            @@
##           master    #928      +/-   ##
=========================================
+ Coverage   85.27%   85.3%   +0.02%     
=========================================
  Files         160     160              
  Lines        7674    7661      -13     
  Branches      958     955       -3     
=========================================
- Hits         6544    6535       -9     
  Misses        937     937              
+ Partials      193     189       -4
Impacted Files Coverage Δ
src/garage/torch/modules/mlp_module.py 100% <100%> (+5.88%) ⬆️
...rc/garage/torch/modules/multi_headed_mlp_module.py 98.36% <97.29%> (+15.02%) ⬆️
src/garage/envs/grid_world_env.py 84.12% <0%> (-4.77%) ⬇️
src/garage/misc/tensor_utils.py 88.23% <0%> (-2.36%) ⬇️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ebea1dd...987dbaf.

@ryanjulian (Member) left a comment:

I'm a little confused about this PR.

The title suggests that it's fixing a non-linearity in multi_headed_mlp_module (perhaps the default was wrong).

The content seems to rewrite how heads in this module are created, and also adds some features (layer norms, selectable output nonlinearities).

Can you add some context behind the change and explain what it does?

@yonghyuc (Contributor, Author) commented Oct 9, 2019

> I'm a little confused about this PR.
> The title suggests that it's fixing a non-linearity in multi_headed_mlp_module (perhaps the default was wrong).
> The content seems to rewrite how heads in this module are created, and also adds some features (layer norms, selectable output nonlinearities).
> Can you add some context behind the change and explain what it does?

I updated the description above.
Basically, everything is the same except how we handle the non-linear functions for the hidden and output layers.

@avnishn (Member) left a comment:

@yonghyuc, the changes look to be making the format of this module more consistent with what Linda originally wrote for the other torch modules. I want to know whether there is merit to adding our activation functions to the nn.ModuleList, as opposed to using the torch.functional non-linearity functions directly in the forward method of the module.

One benefit that I see: if a user wants to quickly verify the structure of a module (e.g. a policy) by printing it out, then nonlinearities that are nn.Modules added to the nn.ModuleList will appear in the printed output, but torch.functional activation functions won't appear when we print out the module (see the sketch below). I'm wondering if you, @lywong92, and @krzentner can chime in on this design decision.
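
A minimal sketch of the print-out difference described above (illustrative only; FunctionalMLP is a made-up example, not code from this PR):

import torch.nn as nn
import torch.nn.functional as F

# Non-linearity as an nn.Module inside the container: it shows up in repr().
with_module = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
print(with_module)
# Sequential(
#   (0): Linear(in_features=8, out_features=16, bias=True)
#   (1): ReLU()
# )

# Non-linearity applied functionally in forward(): invisible to repr().
class FunctionalMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(8, 16)

    def forward(self, x):
        return F.relu(self.linear(x))

print(FunctionalMLP())
# FunctionalMLP(
#   (linear): Linear(in_features=8, out_features=16, bias=True)
# )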

Review thread on src/garage/torch/modules/multi_headed_mlp_module.py (outdated, resolved)
@yonghyuc yonghyuc force-pushed the fix_multi_headed_mlp_module branch 2 times, most recently from 0212391 to 0cee99f Compare October 14, 2019 21:45
@@ -118,5 +109,15 @@ def forward(self, input_val):
         x = input_val
         for layer in self._layers:
             x = layer(x)
 
-        return [layer(x) for layer in self._output_layers]
+        if self._hidden_nonlinearity:
A Member commented:

Just to be clear, what I'm trying to avoid is having to manually call the non-linearities. I think they should just be defined as computations in the nn.ModuleList.

A Contributor commented:

We should have removed the need for this loop, right?

@avnishn (Member) left a comment:

LGTM, we'll add support for both nn non-linearity modules and torch.functional non-linearity functions in a future PR.
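
One hypothetical way such dual support could look (a sketch of the idea only; _expand_nonlinearity is not part of garage or this PR):

import torch.nn as nn

def _expand_nonlinearity(nonlinearity):
    """Turn either kind of non-linearity argument into an nn.Module.

    Accepts an nn.Module subclass (e.g. nn.ReLU), an nn.Module instance, or a
    plain callable (e.g. torch.relu, F.relu), and returns a module that can be
    appended to an nn.Sequential.
    """
    if isinstance(nonlinearity, type) and issubclass(nonlinearity, nn.Module):
        return nonlinearity()  # a class such as nn.ReLU -> fresh instance
    if isinstance(nonlinearity, nn.Module):
        return nonlinearity  # already an instance
    if callable(nonlinearity):
        class _WrappedFunction(nn.Module):
            """Wrap a bare function so it behaves like a module."""

            def __init__(self, fn):
                super().__init__()
                self._fn = fn

            def forward(self, x):
                return self._fn(x)

        return _WrappedFunction(nonlinearity)
    raise ValueError('Unsupported non-linearity: {!r}'.format(nonlinearity))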

@ryanjulian added the backport-to-2019.10 label (Backport this PR to release-2019.10) Nov 1, 2019
@yonghyuc changed the title from "Fix non-linearity function of multi_headed_mlp_module" to "Support non-linearity function and module for PyTorch Module" on Nov 6, 2019
@yonghyuc yonghyuc force-pushed the fix_multi_headed_mlp_module branch 3 times, most recently from ea0c0a4 to 02d93f6 Compare November 6, 2019 23:41

"""
x = input_value
for layer in self._layers:
A Contributor commented:

You should be able to replace this function with return self._layers(input_values).

@yonghyuc (Contributor, Author) commented Nov 12, 2019:

self._layers is a ModuleList, so I think we need to use a for loop here (I checked the documentation). Or we could use Sequential to compute the layers like that; a small sketch of both options follows below.
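
For reference, the two options look roughly like this (an illustrative sketch, not the garage implementation):

import torch
import torch.nn as nn

layers = [nn.Linear(8, 16), nn.Linear(16, 4)]

# nn.ModuleList registers the sub-modules but has no forward() of its own, so
# it has to be iterated manually.
module_list = nn.ModuleList(layers)
x = torch.zeros(2, 8)
for layer in module_list:
    x = layer(x)

# nn.Sequential also registers the sub-modules and is itself callable, so the
# whole stack can be applied in one call.
sequential = nn.Sequential(*layers)
y = sequential(torch.zeros(2, 8))  # y has shape (2, 4)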

"""
return self.module(input_value)

# pylint: disable=missing-return-doc, missing-return-type-doc
A Contributor commented:

Perhaps we should actually globally allow not documenting __methods__? It's fairly obvious what __repr__, __getstate__, and __setstate__ do.

In the meantime, yeah, documenting looks cleaner than locally disabling the rule.

@yonghyuc yonghyuc force-pushed the fix_multi_headed_mlp_module branch 4 times, most recently from 8cbf97d to 57abfe1 Compare November 14, 2019 18:25
@yonghyuc added the ready-to-merge label and removed the backport-to-2019.10 label Nov 14, 2019
@ryanjulian (Member) left a comment:

LGTM -- I removed ready-to-merge so you can look at my last comments, but feel free to add it back when you're ready.

@ryanjulian added the backport-to-2019.10 and ready-to-merge labels Nov 15, 2019
@ryanjulian (Member) commented:

@Mergifyio backport release-2019.10

@mergify (bot) commented Nov 19, 2019:

Command backport release-2019.10: pending

Waiting for the pull request to get merged

Hey, we reacted but our real name is @Mergifyio

@mergify mergify bot merged commit f58b4bd into master Nov 19, 2019
@mergify mergify bot deleted the fix_multi_headed_mlp_module branch November 19, 2019 06:30
@ryanjulian (Member) commented:

@Mergifyio backport release-2019.10

mergify bot pushed a commit that referenced this pull request Nov 19, 2019
@mergify (bot) commented Nov 19, 2019:

Command backport release-2019.10: success

Backports have been created
The following pull requests have been created:

Hey, we reacted but our real name is @Mergifyio

Labels: backport-to-2019.10 (Backport this PR to release-2019.10), ready-to-merge
Projects: none yet
Linked issues: none yet
4 participants