
[stall]Layer auto detect input size #688

Closed
wants to merge 14 commits

Conversation


@dcslin dcslin commented May 2, 2020

No description provided.


lgtm-com bot commented May 2, 2020

This pull request introduces 1 alert when merging dd0dc99 into e4082c6 - view on LGTM.com

new alerts:

  • 1 for Unnecessary pass


dcslin commented May 2, 2020

changes:

  1. an example of renaming `class Xxx(Operation)` to the private `class _Xxx(Operation)`, because `class Operation` is meant for internal use only. In user space, the term `Operation` should refer to the operation functions `def xxx(x): ...`, and these functions are what users should call.
  2. fixed a bug in `set_param` when a Tensor is given as a param.
  3. modified the `Linear` constructor to `(self, out_features, in_features=None, bias=True)`: `out_features` comes first, leaving `in_features` optional. When `Linear` is constructed with only `out_features`, its params `W` and `b` are not initialized; they are initialized after `set_params` or `forward`/`__call__` (see the sketch below).
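A minimal, library-agnostic sketch of the lazy initialization described in point 3 (NumPy stands in for SINGA tensors; `_init_params` and the initializer values are hypothetical, not the actual SINGA code):

```python
import numpy as np

class Linear:
    def __init__(self, out_features, in_features=None, bias=True):
        self.out_features = out_features
        self.bias = bias
        self.W = None                    # created lazily once in_features is known
        self.b = None
        if in_features is not None:
            self._init_params(in_features)

    def _init_params(self, in_features):
        self.W = np.random.randn(in_features, self.out_features) * 0.01
        if self.bias:
            self.b = np.zeros(self.out_features)

    def __call__(self, x):
        if self.W is None:               # first call: infer the input size from x
            self._init_params(x.shape[1])
        y = x @ self.W
        return y + self.b if self.bias else y
```

The key point is that `W` and `b` are only created once the input size is known, either from `in_features` or from the first input passed to `__call__`.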


nudles commented May 4, 2020

> changes:
>
> 1. an example of renaming `class Xxx(Operation)` to the private `class _Xxx(Operation)`, because `class Operation` is meant for internal use only. In user space, the term `Operation` should refer to the operation functions `def xxx(x): ...`, and these functions are what users should call.
>
> 2. fixed a bug in `set_param` when a Tensor is given as a param.
>
> 3. modified the `Linear` constructor to `(self, out_features, in_features=None, bias=True)`: `out_features` comes first, leaving `in_features` optional. When `Linear` is constructed with only `out_features`, its params `W` and `b` are not initialized; they are initialized after `set_params` or `forward`/`__call__`.

For the last point, it will break backward compatibility.
There are two solutions (a sketch of the second follows below):

  1. use `*args` and `**kwargs`.
  2. assume the old code passes (in_features, out_features, bias=True) and the new code passes (out_features, bias=True), then check whether in_features is None to decide the argument order.

In V4, we can update the API completely.
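For reference, a rough sketch of how solution 2 could look (hypothetical, not the code that was eventually merged):

```python
class Linear:
    def __init__(self, out_features, in_features=None, bias=True):
        # old call style Linear(in_features, out_features[, bias]) binds the
        # input size to out_features and the output size to in_features, so a
        # non-None in_features means the arguments arrived in the old order
        if in_features is not None:
            in_features, out_features = out_features, in_features
        self.in_features = in_features   # may stay None until the first forward pass
        self.out_features = out_features
        self.bias = bias
```

Solution 1 avoids the swap by inspecting `*args` instead, as sketched later in this thread.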


lgtm-com bot commented May 4, 2020

This pull request introduces 1 alert when merging 2b16b37 into e4082c6 - view on LGTM.com

new alerts:

  • 1 for Unnecessary pass


lgtm-com bot commented May 5, 2020

This pull request introduces 2 alerts when merging cd289d6 into e4082c6 - view on LGTM.com

new alerts:

  • 2 for Unnecessary pass


dcslin commented May 6, 2020

> > changes:
> >
> > 1. an example of renaming `class Xxx(Operation)` to the private `class _Xxx(Operation)`, because `class Operation` is meant for internal use only. In user space, the term `Operation` should refer to the operation functions `def xxx(x): ...`, and these functions are what users should call.
> >
> > 2. fixed a bug in `set_param` when a Tensor is given as a param.
> >
> > 3. modified the `Linear` constructor to `(self, out_features, in_features=None, bias=True)`: `out_features` comes first, leaving `in_features` optional. When `Linear` is constructed with only `out_features`, its params `W` and `b` are not initialized; they are initialized after `set_params` or `forward`/`__call__`.
>
> For the last point, it will break backward compatibility.
> There are two solutions:
>
> 1. use `*args` and `**kwargs`.
> 2. assume the old code passes (in_features, out_features, bias=True) and the new code passes (out_features, bias=True), then check whether in_features is None to decide the argument order.
>
> In V4, we can update the API completely.

ok, thanks.

Updated:
change No. 4: modified the `__init__` to parse `*args` and `**kwargs` for Linear, RNN and LSTM.
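For illustration (hypothetical values), both call styles should then construct the same layer:

```python
old_style = Linear(2, 3)   # old API: Linear(in_features, out_features)
new_style = Linear(3)      # new API: Linear(out_features); input size inferred later
```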

@chrishkchris chrishkchris changed the base branch from master to dev May 6, 2020 05:22
@chrishkchris chrishkchris changed the base branch from dev to master May 6, 2020 05:23

lgtm-com bot commented May 6, 2020

This pull request introduces 2 alerts and fixes 1 when merging cd289d6 into 536f7e4 - view on LGTM.com

new alerts:

  • 2 for Unnecessary pass

fixed alerts:

  • 1 for Unused local variable

@chrishkchris chrishkchris changed the base branch from master to dev May 6, 2020 05:38

lgtm-com bot commented May 6, 2020

This pull request introduces 2 alerts when merging cd289d6 into db1846d - view on LGTM.com

new alerts:

  • 2 for Unnecessary pass


lgtm-com bot commented May 7, 2020

This pull request introduces 3 alerts when merging e538fef into db1846d - view on LGTM.com

new alerts:

  • 2 for Unnecessary pass
  • 1 for Unused import


lgtm-com bot commented May 7, 2020

This pull request introduces 3 alerts when merging 4a98351 into db1846d - view on LGTM.com

new alerts:

  • 2 for Unnecessary pass
  • 1 for Unused import

@nudles nudles mentioned this pull request May 12, 2020
return {"W": self.W, "b": self.b}
else:
return {"W": self.W}

def set_params_initializer(self, **initializers):

Member: pass initializers as args of `__init__`.
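A possible shape of that suggestion, accepting the initializers in the constructor instead of a separate setter (hypothetical parameter names, not the actual SINGA API):

```python
class Linear:
    def __init__(self, out_features, in_features=None, bias=True,
                 W_initializer=None, b_initializer=None):
        # store the optional initializer callables; they are applied later,
        # once the input size is known and W (and b) are actually created
        self.W_initializer = W_initializer
        self.b_initializer = b_initializer
```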


nudles commented May 14, 2020

The following APIs should be backward compatible. Please test.

class Linear(Layer):
    def __init__(self, num_output, *args, bias=True, **kwargs):
        # the following block is for backward compatibility:
        # the old code calls Linear(2, 3) or Linear(2, 3, False)
        num_input = None
        if len(args) > 0:
            num_input = num_output
            num_output = args[0]
        if len(args) > 1:
            bias = args[1]

        self.num_input = num_input    # may stay None until the input size is known
        self.num_output = num_output
        self.bias = bias


class Conv2d(Layer):
    def __init__(self,
                 out_channels,
                 kernel_size,
                 *args,
                 stride=1,
                 padding=0,
                 dilation=1,
                 group=1,
                 bias=True,
                 pad_mode="NOTSET",
                 **kwargs):
        # the old code creates the layer like Conv2d(8, 16, 3) or Conv2d(8, 16, 3, stride=1)
        # the following code block is for backward compatibility
        in_channels = None
        if len(args) > 0:
            in_channels = out_channels
            out_channels = kernel_size
            kernel_size = args[0]
        if len(args) > 1:
            stride = args[1]
        if len(args) > 2:
            padding = args[2]

        self.in_channels = in_channels
        self.out_channels = out_channels
        self.kernel_size = kernel_size
        self.stride = stride
        self.padding = padding
        self.bias = bias
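A quick compatibility check along the lines of "please test" (illustrative values; assumes the constructors sketched above, including the small fixes, and an importable `Layer` base class):

```python
# old and new call styles should construct equivalent layers
old_linear = Linear(2, 3)          # old API: (num_input, num_output)
new_linear = Linear(3)             # new API: (num_output,); input size inferred later
assert old_linear.num_output == new_linear.num_output == 3
assert old_linear.num_input == 2 and new_linear.num_input is None

no_bias = Linear(2, 3, False)      # old API with a positional bias flag
assert no_bias.bias is False

old_conv = Conv2d(8, 16, 3)        # old API: (in_channels, out_channels, kernel_size)
new_conv = Conv2d(16, 3)           # new API: (out_channels, kernel_size)
assert old_conv.out_channels == new_conv.out_channels == 16
assert old_conv.kernel_size == new_conv.kernel_size == 3
assert old_conv.in_channels == 8 and new_conv.in_channels is None
```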


dcslin commented May 15, 2020

Updated the Linear constructor; tested OK.


lgtm-com bot commented May 15, 2020

This pull request introduces 3 alerts when merging 3d835a0 into db1846d - view on LGTM.com

new alerts:

  • 2 for Unnecessary pass
  • 1 for Unused import


joddiy commented May 18, 2020

Hi @dcslin, can I use this PR right now, or which operations can I use now? I need sonnx to support the new autograd API.

@dcslin dcslin changed the title from "Autograd refactor" to "[stall]Layer auto detect in features" May 18, 2020

dcslin commented May 18, 2020

> Hi @dcslin, can I use this PR right now, or which operations can I use now? I need sonnx to support the new autograd API.

Hi @joddiy, please refer to #697.

@dcslin dcslin changed the title from "[stall]Layer auto detect in features" to "[stall]Layer auto detect input size" May 18, 2020

joddiy commented May 18, 2020

> > Hi @dcslin, can I use this PR right now, or which operations can I use now? I need sonnx to support the new autograd API.
>
> Hi @joddiy, please refer to #697.

Thanks, Shicong. One thing to confirm: should the name of the operation be `_ReLU` or `ReLU`?


dcslin commented May 19, 2020

> > > Hi @dcslin, can I use this PR right now, or which operations can I use now? I need sonnx to support the new autograd API.
> >
> > Hi @joddiy, please refer to #697.
>
> Thanks, Shicong. One thing to confirm: should the name of the operation be `_ReLU` or `ReLU`?

I guess you are building the model? Then by convention we use `autograd.relu()`.


nudles commented May 20, 2020

> > > > Hi @dcslin, can I use this PR right now, or which operations can I use now? I need sonnx to support the new autograd API.
> > >
> > > Hi @joddiy, please refer to #697.
> >
> > Thanks, Shicong. One thing to confirm: should the name of the operation be `_ReLU` or `ReLU`?
>
> I guess you are building the model? Then by convention we use `autograd.relu()`.

I suggest using `layer.ReLU()` to avoid mixing operators and layers when constructing the model.
Refer to the first post in #696 (comment).
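A small illustration of the two styles being discussed, assuming the `autograd.relu` operator function and the proposed `layer.ReLU` class exist as named in this thread:

```python
from singa import autograd, layer   # layer.ReLU is the class suggested above

# operator style: call the stateless autograd function directly in forward()
def forward_with_operator(x):
    return autograd.relu(x)

# layer style: instantiate once, then call; the model is composed of layers
# only, and the Operation subclasses stay internal
relu = layer.ReLU()

def forward_with_layer(x):
    return relu(x)
```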


dcslin commented May 22, 2020

Some of the work has been merged into #697.
The rest needs to be revised as the API changed, so I am closing this PR.

@dcslin dcslin closed this May 22, 2020