
Implement Tanh Gelu #988

Merged
jjsjann123 merged 8 commits into 20_12_3_devel from rds_fast_gelu on Jul 16, 2021

Conversation

@rdspring1 (Collaborator) commented on Jul 8, 2021:

  • Double backward support for Gelu - [Original and Approximate] (see the sketch after this list)
  • Add approximate boolean flag to Gelu
  • Fast tanh gelu to eager mode - [CPU and CUDA, skip MKLDNN]
  • Fast Tanh Gelu to NvFuser composite ops
  • Pass PyTorch CI
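
For reference, a minimal sketch (not this PR's code, just stock PyTorch) of the two variants being wired up here: the exact erf-based GELU and its tanh approximation, plus a gradgradcheck call that exercises double backward.

```python
import math
import torch

def gelu_exact(x):
    # Exact GELU: x * Phi(x), where Phi is the standard normal CDF.
    return 0.5 * x * (1.0 + torch.erf(x / math.sqrt(2.0)))

def gelu_tanh(x):
    # Tanh approximation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3))).
    return 0.5 * x * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x.pow(3))))

x = torch.randn(1024, dtype=torch.double)
print((gelu_exact(x) - gelu_tanh(x)).abs().max())  # small, but not zero

# Double backward can be checked numerically with gradgradcheck.
xg = torch.randn(8, dtype=torch.double, requires_grad=True)
torch.autograd.gradgradcheck(gelu_tanh, (xg,))
```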

@jjsjann123 (Collaborator):

Is this one ready for review yet? It seems to be still in draft state.

@rdspring1 (Collaborator, Author):

I'm still working on passing the PyTorch CI.

Would adding a separate TanhGelu op be easier to upstream than adding an approximation flag to Gelu?

@jjsjann123 (Collaborator):

> Would adding a separate TanhGelu op be easier to upstream than adding an approximation flag to Gelu?

Is this because of the CI failures?
I don't think adding a separate TanhGelu would make it easier. If for some reason a new TanhGeluApprox gets CI green while the corresponding changes to TanhGelu don't, that means we are missing some CI test and will likely have to patch that later as well.

@jjsjann123 (Collaborator):

Linking the upstream PR: pytorch#61439

Commits:
  • Update Gelu documentation
  • Update test_nn with tanh gelu
  • Fix test_cpp_api_parity
  • Update torch overrides
  • Enable Gelu for Tensor Expressions
  • Tensor Expressions ignores the approximate flag
  • Add Gelu to UNTRACEABLE_FUNCTIONALS - test/test_fx
  • Update torch/onnx/symbolic_opset

@jjsjann123 (Collaborator) left a review comment:

LGTM.

aten/src/ATen/native/native_functions.yaml
@@ -3497,22 +3497,22 @@
     CPU: gelu_out_cpu
     CUDA: gelu_out_cuda

-- func: gelu(Tensor self) -> Tensor
+- func: gelu(Tensor self, bool approximate) -> Tensor

Inline review comment (Collaborator):

nitpick, should we default approximate to False just so that we don't have to update tests?
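
For illustration, a sketch of what defaulting the flag would mean for callers, assuming this PR's proposed gelu(Tensor self, bool approximate) signature; the bool flag is the proposal in this diff, not the released torch.nn.functional.gelu API.

```python
import torch
import torch.nn.functional as F

x = torch.randn(4)

# With approximate defaulting to False, existing call sites and tests
# keep the exact erf-based GELU without any changes:
y_exact = F.gelu(x)

# Opting in to the tanh approximation would then be explicit
# (hypothetical call under this PR's proposed bool flag):
# y_tanh = F.gelu(x, True)
```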

torch/csrc/jit/runtime/symbolic_script.cpp (outdated review comment, resolved)
@jjsjann123 (Collaborator):

Don't worry too much about the upstream XLA test. We can deal with that when we are actually upstreaming.

@rdspring1 (Collaborator, Author) commented on Jul 14, 2021:

I created an XLA PR - pytorch/xla#3039

@rdspring1 marked this pull request as ready for review on July 16, 2021 at 15:36.
@jjsjann123 (Collaborator):

Tests are passing. I think we are missing some BC allow-list changes, but those shouldn't matter. I'm merging this one.

@jjsjann123 merged commit 765276e into 20_12_3_devel on Jul 16, 2021.
@jjsjann123 deleted the rds_fast_gelu branch on July 16, 2021 at 20:03.