This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

the grad of lars should be scaled in lbsgd #15102

Open
starimpact opened this issue May 30, 2019 · 7 comments

Comments

@starimpact
Contributor

starimpact commented May 30, 2019

    def _get_lars(self, weight, g, wd):
        """Returns a scaling factor for the learning rate for this layer
        default is 1
        """
        weight2 = self._l2norm(weight)
        grad2 = self._l2norm(g)

        grad2 = grad2 * (self.rescale_grad ** 2)  # suggested addition: use the rescaled gradient's norm

        lars = math.sqrt(weight2 / (grad2 + wd * weight2 + 1e-18))
        if lars < 0.01:
            lars = 0.01
        elif lars > 100:
            lars = 100
        return lars
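
For context, the optimizer multiplies gradients by rescale_grad (typically 1/batch_size in data-parallel training) before applying the update, so the norm used in the LARS ratio should be taken on the rescaled gradient; since grad2 is a sum of squares, the factor is rescale_grad**2. A minimal numpy sketch, illustrative only and not the library code, of how the rescaling changes the ratio:

    import math
    import numpy as np

    # Illustrative sketch only. Assumes _l2norm returns a sum of squares,
    # as the names weight2/grad2 and the final sqrt in _get_lars suggest.
    def lars_ratio(weight, grad, wd, rescale_grad=1.0, eps=1e-18):
        weight2 = float(np.sum(weight * weight))                 # ||w||^2
        grad2 = float(np.sum(grad * grad)) * rescale_grad ** 2   # ||rescale_grad * g||^2
        lars = math.sqrt(weight2 / (grad2 + wd * weight2 + eps))
        return min(max(lars, 0.01), 100)                         # same clipping as above

    w = np.random.randn(4096).astype('float32')
    g = np.random.randn(4096).astype('float32')
    print(lars_ratio(w, g, wd=1e-4))                      # without rescaling
    print(lars_ratio(w, g, wd=1e-4, rescale_grad=1/256))  # with the suggested rescaling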
@mxnet-label-bot
Contributor

Hey, this is the MXNet Label Bot.
Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it.
Here are my recommended labels: Bug

@frankfliu
Contributor

@mxnet-label-bot add [operator, bug]

@abhinavs95
Contributor

Hi @starimpact, could you provide some more info, such as a brief description of the problem and a minimal reproducible example?

@abhinavs95
Contributor

@mxnet-label-bot add [Pending Requester Info]

@lanking520
Member

lanking520 commented Jul 17, 2019

The user points to a valid location:

python/mxnet/optimizer/optimizer.py

Please track this file for further investigation.

@starimpact could you please provide more information about why this change is necessary?

@anirudhacharya
Member

@starimpact please try this optimizer https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/optimizer/optimizer.py#L788 and close this issue if your concern is addressed. lbsgd is likely to be deprecated.
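
For anyone landing here, a minimal sketch of switching to that optimizer, assuming the linked LARS class is registered under the name 'lars' and accepts the usual learning_rate/momentum keyword arguments (the tiny net below is a hypothetical placeholder; check the linked file for the exact signature on your MXNet version):

    import mxnet as mx
    from mxnet import gluon

    # Hypothetical placeholder model; any Gluon network would do.
    net = gluon.nn.Dense(10)
    net.initialize()

    # Assumption: the LARS optimizer linked above is registered as 'lars'
    # and accepts these keyword arguments.
    trainer = gluon.Trainer(net.collect_params(), 'lars',
                            {'learning_rate': 0.1, 'momentum': 0.9})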

@starimpact
Contributor Author

_l2norm is time-consuming for a large parameter, so I suggest changing it to:

    def _l2norm(self, v):
        "inner product implementation"
        # subsample very large parameters so the norm stays cheap to compute
        v = v.reshape(-1)
        if len(v) > 100000:
            step = len(v) // 100000 + 1   # integer step, otherwise slicing fails
            v = v[::step]
        norm = multiply(v, v).asnumpy().sum()
        norm = math.sqrt(norm)
        return norm
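
A rough way to see what the strided sampling trades away, using a hypothetical numpy stand-in for the NDArray code above: the sampled sum of squares is about step times smaller than the full one, so the sampled norm underestimates the full norm by roughly sqrt(step) unless it is rescaled.

    import math
    import numpy as np

    # Hypothetical numpy stand-in for the subsampled norm above.
    v = np.random.randn(1_000_000).astype('float32')

    full_norm = math.sqrt(float(np.sum(v * v)))

    step = len(v) // 100000 + 1
    sample = v[::step]
    sampled_norm = math.sqrt(float(np.sum(sample * sample)))

    print(full_norm)                        # exact l2 norm
    print(sampled_norm)                     # strided estimate, ~sqrt(step) smaller
    print(sampled_norm * math.sqrt(step))   # rescaled estimate, close to the exact norm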
