This repository has been archived by the owner on Nov 22, 2022. It is now read-only.

Optimize Gelu operator for caffe2 export #918

Closed
wants to merge 1 commit into from

@geof90 geof90 commented Aug 16, 2019

Summary:
TIL the ONNX->Caffe2 conversion is very memory inefficient: it creates an intermediate blob for each intermediate output. As a result, the Gelu operator expands into a lot of intermediate ops, each with its own output blob, since it does a bunch of math.

The fix is to use the Caffe2 Gelu operator, so that all of that computation is captured in a single op.
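
To illustrate the cost described above, here is a minimal pure-Python sketch, not the code from this PR. It assumes the tanh approximation of GELU (the PR does not state which formula the exported graph used), and the names `gelu_decomposed`, `gelu_fused` and `blobs` are hypothetical. The dictionary stands in for the blobs a converter would materialize, one per primitive op output.

```python
import math


def gelu_decomposed(x):
    """Tanh-approximation GELU written as one primitive op per step.

    Each step stores its output under its own name, mimicking a converter
    that allocates one blob per intermediate output (an assumption made
    for illustration only).
    """
    blobs = {}
    blobs["cube"] = x ** 3
    blobs["scaled_cube"] = 0.044715 * blobs["cube"]
    blobs["inner_sum"] = x + blobs["scaled_cube"]
    blobs["tanh_arg"] = math.sqrt(2.0 / math.pi) * blobs["inner_sum"]
    blobs["tanh"] = math.tanh(blobs["tanh_arg"])
    blobs["one_plus_tanh"] = 1.0 + blobs["tanh"]
    blobs["half_x"] = 0.5 * x
    blobs["out"] = blobs["half_x"] * blobs["one_plus_tanh"]
    return blobs["out"], blobs


def gelu_fused(x):
    """Same math as a single call: only the final output is kept."""
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))


if __name__ == "__main__":
    value, blobs = gelu_decomposed(1.0)
    print("decomposed outputs kept:", len(blobs))
    print("results agree:", abs(value - gelu_fused(1.0)) < 1e-12)
```

In this sketch the decomposed version keeps eight named outputs for a single activation, while the fused version keeps one. For real tensors every one of those outputs is a full-size buffer, which is the overhead that a single Gelu op avoids.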

https://pxl.cl/HzGf

Differential Revision: D16849396

@facebook-github-bot facebook-github-bot added the CLA Signed label Aug 16, 2019
geof90 added a commit to geof90/pytext that referenced this pull request Aug 16, 2019
Summary:
Pull Request resolved: facebookresearch#918

TIL ONNX->Caffe2 is very memory inefficient, it creates an intermediate blob for each intermediate output. So, the Gelu operator creates a lot of intermediate ops since it does a bunch of math.

Fix is to use the caffe2 Gelu operator, so all that computation is captured in a single op.

https://pxl.cl/HzGf

Differential Revision: D16849396

fbshipit-source-id: 83010d89723f6da26403c4d87eacd42998c1a54d
fbshipit-source-id: a17908daff58b2c005afbc72fcb8dc46c37d075d
@facebook-github-bot
This pull request has been merged in e32c2a5.
