optimizer exploration - v1 and v2 + fix position_weighted optimizer + decoupled weight decay #53881
Closed
… decoupled weight decay

Summary:
1. Fix the position_weighted optimizer: the position-weighted layer uses the default optimizer, but its gradient is actually a gradient_slice, which causes problems if not handled properly in the new optimizer. The fix is to use SparseAdagrad whenever the gradient is a gradient_slice.
2. Implement optimizer variants v1 and v2: a first momentum term without (v1) and with (v2) bias correction.
3. Implement decoupled weight decay in the new optimizer.

Test Plan:
buck test //caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test_2 -- test_mlp_optimization
buck test //caffe2/caffe2/python:optimizer_test -- TestDecayAdagrad
buck test //caffe2/caffe2/python/operator_test:decay_adagrad_test
ctr_mbl_feed work flow: f255731660
oc work flow: f255739503

Reviewed By: 0x10cxR1
Differential Revision: D26839668
fbshipit-source-id: 3e0a3646d8459c769caea19658217f1a32d539bb
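To make points 2 and 3 concrete: the update described combines an Adagrad-style second-moment accumulator with a first momentum term (v1 without bias correction, v2 with it) and applies weight decay directly to the parameter rather than folding it into the gradient. The following is a minimal NumPy sketch of that update rule, not the PR's actual Caffe2 operator; the function name, hyperparameter defaults, and signature are illustrative assumptions.

```python
import numpy as np

def decay_adagrad_step(param, grad, moment1, moment2, step,
                       lr=0.01, beta1=0.9, epsilon=1e-8,
                       weight_decay=0.01, bias_correction=True):
    """One update step (hypothetical sketch): Adagrad second moment,
    first momentum with optional bias correction (v1 vs v2), and
    decoupled weight decay applied straight to the parameter."""
    moment1 = beta1 * moment1 + (1.0 - beta1) * grad
    moment2 = moment2 + grad * grad  # Adagrad: running sum of squared gradients
    # v2 divides by (1 - beta1^t) to undo the zero-initialization bias;
    # v1 uses the raw first moment.
    m1_hat = moment1 / (1.0 - beta1 ** step) if bias_correction else moment1
    update = m1_hat / (np.sqrt(moment2) + epsilon)
    # Decoupled weight decay: shrink the parameter directly (AdamW-style)
    # instead of adding weight_decay * param to the gradient.
    param = param - lr * (update + weight_decay * param)
    return param, moment1, moment2
```

Coupling the decay to the gradient would also scale it by the adaptive denominator; decoupling keeps the shrinkage proportional to the parameter value regardless of gradient history.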
💊 CI failures summary (as of commit 929ae4b): 1 failed job on ci.pytorch.org. Details on the Dr. CI page.
This pull request was exported from Phabricator. Differential Revision: D26839668
Codecov Report

@@            Coverage Diff             @@
##           master   #53881      +/-  ##
=========================================
- Coverage   77.30%   77.30%    -0.01%
=========================================
  Files        1888     1888
  Lines      183589   183589
=========================================
- Hits       141923   141918        -5
- Misses      41666    41671        +5
lanlanfb added a commit to lanlanfb/pytorch that referenced this pull request on Mar 25, 2021:
… decoupled weight decay (pytorch#54042) — same summary, test plan, and Differential Revision (D26839668) as above. fbshipit-source-id: 8a2170e317e695b861b1b1e566beb82ae0f08836
facebook-github-bot pushed a commit that referenced this pull request on Mar 28, 2021:
… decoupled weight decay (#54042) — same summary, test plan, and Differential Revision (D26839668) as above. fbshipit-source-id: 2b6881c1a88540ef5766be40f5e80001257e2199
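The position_weighted fix in these commits hinges on the gradient being a gradient_slice: only some rows of an embedding-style parameter receive gradients, so the optimizer must update just those rows (SparseAdagrad) rather than touching every row. A minimal NumPy sketch of that row-wise Adagrad update, under assumed names and signatures (this is not the Caffe2 SparseAdagrad operator itself):

```python
import numpy as np

def sparse_adagrad_update(params, moment, indices, grad_values,
                          lr=0.1, epsilon=1e-8):
    """Hypothetical sketch: Adagrad applied only to the rows named by
    `indices` (a gradient slice). Rows that received no gradient are
    left untouched, which is why a dense default optimizer misbehaves
    on such parameters."""
    for idx, g in zip(indices, grad_values):
        moment[idx] += g * g                # per-row squared-gradient sum
        params[idx] -= lr * g / (np.sqrt(moment[idx]) + epsilon)
    return params, moment
```

A dense optimizer would instead apply its update (and any coupled weight decay) to all rows each step, silently decaying rows whose features never appeared in the batch.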