nn.Graph optimizer part 2: add L2, pass job complete, refactor #5604

Merged: oneflow-ci-bot merged 21 commits into master from fea/nn_graph/optimizer2 on Jul 28, 2021

@strint strint commented Jul 26, 2021

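  • Fix the test error that occurred in the Job Complete pass
  • Support L2 regularization and add L2 to SGD
  • Refactor the optimizer conf and the variable conf

Example variable op conf with the L2 coefficient set in the regularizer: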
device_tag: "cpu"
scope_symbol_id: 4611686018427473919
variable_conf {
  out: "out"
  shape {
    dim: 10
    dim: 4
  }
  data_type: kFloat
  initializer {
    empty_conf {
    }
  }
  regularizer {
    l1_l2_conf {
      l2: 0.7  # L2
    }
  }
}
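
For readers unfamiliar with the `l2` field, the sketch below shows the conventional way an L2 coefficient enters an SGD update: the term `l2 * weight` is added to the gradient before the step. This is a plain-Python illustration under that assumption, not OneFlow's kernel; `sgd_step_with_l2` and its parameter names are hypothetical.

```python
def sgd_step_with_l2(weights, grads, lr, l2):
    """Return updated weights after one SGD step with an L2 term.

    Assumed convention: the regularized gradient is grad + l2 * weight.
    """
    return [w - lr * (g + l2 * w) for w, g in zip(weights, grads)]


# With l2 = 0.7 (the value in the conf dump) and a zero gradient,
# each weight shrinks by the factor (1 - lr * l2).
updated = sgd_step_with_l2([1.0, -2.0], [0.0, 0.0], lr=0.1, l2=0.7)
```

With a zero gradient the update reduces to pure weight shrinkage, which makes the effect of the coefficient easy to check by hand.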

@strint strint added this to the v0.5.0 milestone Jul 26, 2021
@strint strint requested review from chengtbf and leaves-zwx July 26, 2021 09:07
@strint strint marked this pull request as ready for review July 26, 2021 15:29
@strint strint requested a review from oneflow-ci-bot July 26, 2021 15:42
@oneflow-ci-bot oneflow-ci-bot removed their request for review July 26, 2021 16:47
@strint strint requested a review from oneflow-ci-bot July 26, 2021 17:22
@github-actions

Speed stats:
GPU Name: GeForce GTX 1080 

PyTorch resnet50 time: 138.1ms (= 6905.2ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 124.1ms (= 6205.5ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
Relative speed: 1.11 (= 138.1ms / 124.1ms)

PyTorch resnet50 time: 82.6ms (= 4129.9ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 72.1ms (= 3607.2ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
Relative speed: 1.14 (= 82.6ms / 72.1ms)

PyTorch resnet50 time: 51.7ms (= 2585.5ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 52.2ms (= 2608.1ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
Relative speed: 0.99 (= 51.7ms / 52.2ms)

PyTorch resnet50 time: 47.1ms (= 2355.5ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 49.5ms (= 2475.3ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
Relative speed: 0.95 (= 47.1ms / 49.5ms)

PyTorch resnet50 time: 42.5ms (= 2126.8ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 58.8ms (= 2937.8ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
Relative speed: 0.72 (= 42.5ms / 58.8ms)
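
The arithmetic behind each entry can be reproduced with a short sketch. It assumes the bot rounds the per-iteration times to one decimal place before dividing, which matches the figures shown; `relative_speed` is a hypothetical helper, not part of the CI tooling.

```python
def relative_speed(pytorch_total_ms, oneflow_total_ms, iterations=50):
    """Return (pytorch_ms, oneflow_ms, ratio) for one benchmark entry.

    A ratio above 1.0 means OneFlow took less time per iteration.
    """
    pytorch_ms = round(pytorch_total_ms / iterations, 1)
    oneflow_ms = round(oneflow_total_ms / iterations, 1)
    return pytorch_ms, oneflow_ms, round(pytorch_ms / oneflow_ms, 2)


# Batch size 16 entry from the first report.
result = relative_speed(6905.2, 6205.5)
```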

@oneflow-ci-bot oneflow-ci-bot removed their request for review July 26, 2021 18:32
 if state_block.type == BlockType.PARAMETER:
-    self._var2var_op_name[state_block.origin] = (
+    self._variables_conf[state_block.origin] = VariableConfig(
         state_block.name_prefix + state_block.name
     )
Construct a VariableConf for each parameter.

for param in param_group.parameters:
    vars_conf[param].l2 = l2
Get the L2 value and store it in each parameter's variable conf.
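
A self-contained sketch of this propagation step follows. `VariableConfig` and `ParamGroup` here are hypothetical stand-ins rather than OneFlow's classes, and reading the coefficient from a `weight_decay` option is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class VariableConfig:  # hypothetical stand-in, not OneFlow's class
    name: str
    l2: float = 0.0


@dataclass
class ParamGroup:  # hypothetical stand-in for an optimizer param group
    parameters: list
    options: dict = field(default_factory=dict)


def apply_l2(param_groups, vars_conf):
    """Copy each group's L2 coefficient onto its parameters' configs."""
    for group in param_groups:
        l2 = group.options.get("weight_decay", 0.0)  # assumed option key
        for param in group.parameters:
            vars_conf[param].l2 = l2


vars_conf = {"w": VariableConfig("m.w"), "b": VariableConfig("m.b")}
groups = [ParamGroup(["w"], {"weight_decay": 0.7}), ParamGroup(["b"])]
apply_l2(groups, vars_conf)
```

Keeping the coefficient on the per-variable config means parameters in different groups can carry different L2 values.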

def build(self, x):
    out = self.m(x)
    out.backward()
    return out
Test graph construction with input + matmul + parameter + backward + output.

@@ -80,18 +129,18 @@ def __init__(self):
         self.add_optimizer("sgd0", sgd0)
         self.add_optimizer("sgd1", sgd1)
Test the configuration of multiple optimizers.
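
To illustrate what registering several named optimizers on one graph involves, here is a minimal sketch. `GraphSketch` is a hypothetical stand-in, not `nn.Graph`; the dict payloads replace real optimizer objects, and rejecting duplicate names is an assumed behavior rather than one confirmed by this PR.

```python
class GraphSketch:
    """Hypothetical stand-in for a graph that tracks named optimizers."""

    def __init__(self):
        self._optimizers = {}

    def add_optimizer(self, name, optimizer):
        # Assumed behavior: names must be unique within one graph.
        if name in self._optimizers:
            raise ValueError(f"optimizer {name!r} is already registered")
        self._optimizers[name] = optimizer


graph = GraphSketch()
graph.add_optimizer("sgd0", {"lr": 0.1})
graph.add_optimizer("sgd1", {"lr": 0.01})
registered = sorted(graph._optimizers)
```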

@strint strint changed the title nn.Graph optimizer part 2 nn.Graph optimizer par2: add l2, pass job complete, refactor Jul 27, 2021
@strint strint changed the title nn.Graph optimizer par2: add l2, pass job complete, refactor nn.Graph optimizer part 2: add L2, pass job complete, refactor Jul 27, 2021
@strint strint requested a review from oneflow-ci-bot July 28, 2021 02:22
@strint strint requested review from oneflow-ci-bot and removed request for oneflow-ci-bot July 28, 2021 02:39
@strint strint requested review from oneflow-ci-bot and removed request for oneflow-ci-bot July 28, 2021 02:46
@github-actions

CI failed, removing label automerge

@oneflow-ci-bot oneflow-ci-bot removed their request for review July 28, 2021 03:46
@strint strint requested a review from oneflow-ci-bot July 28, 2021 04:03
@strint strint requested review from oneflow-ci-bot and removed request for oneflow-ci-bot July 28, 2021 04:45
@oneflow-ci-bot oneflow-ci-bot removed their request for review July 28, 2021 06:29
@github-actions

Speed stats:
GPU Name: GeForce GTX 1080 

PyTorch resnet50 time: 139.2ms (= 6959.9ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 126.4ms (= 6318.6ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
Relative speed: 1.10 (= 139.2ms / 126.4ms)

PyTorch resnet50 time: 82.4ms (= 4118.6ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 74.8ms (= 3742.3ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
Relative speed: 1.10 (= 82.4ms / 74.8ms)

PyTorch resnet50 time: 60.1ms (= 3004.1ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 48.8ms (= 2438.2ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
Relative speed: 1.23 (= 60.1ms / 48.8ms)

PyTorch resnet50 time: 47.5ms (= 2373.1ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 41.8ms (= 2088.3ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
Relative speed: 1.14 (= 47.5ms / 41.8ms)

PyTorch resnet50 time: 42.8ms (= 2141.3ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 40.0ms (= 2002.4ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
Relative speed: 1.07 (= 42.8ms / 40.0ms)

@oneflow-ci-bot oneflow-ci-bot removed their request for review July 28, 2021 09:22
@strint strint requested a review from oneflow-ci-bot July 28, 2021 10:52
@github-actions

Speed stats:
GPU Name: GeForce GTX 1080 

PyTorch resnet50 time: 141.3ms (= 7067.4ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 127.5ms (= 6377.4ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
Relative speed: 1.11 (= 141.3ms / 127.5ms)

PyTorch resnet50 time: 84.4ms (= 4218.9ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 74.2ms (= 3707.8ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
Relative speed: 1.14 (= 84.4ms / 74.2ms)

PyTorch resnet50 time: 60.5ms (= 3024.3ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 48.6ms (= 2430.2ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
Relative speed: 1.24 (= 60.5ms / 48.6ms)

PyTorch resnet50 time: 47.9ms (= 2394.4ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 41.2ms (= 2057.9ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
Relative speed: 1.16 (= 47.9ms / 41.2ms)

PyTorch resnet50 time: 41.6ms (= 2079.5ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 41.0ms (= 2048.0ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
Relative speed: 1.02 (= 41.6ms / 41.0ms)

@oneflow-ci-bot oneflow-ci-bot merged commit 5628713 into master Jul 28, 2021
@oneflow-ci-bot oneflow-ci-bot deleted the fea/nn_graph/optimizer2 branch July 28, 2021 11:54