What is this value coming from #125

Open · rafikg opened this issue Sep 19, 2019 · 2 comments
rafikg commented Sep 19, 2019

stepsize = 136106 #68053 // after each stepsize iterations update learning rate: lr=lr*gamma

Where does this value come from?
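
The quoted comment describes Caffe's "step" learning-rate policy: every stepsize iterations, the learning rate is multiplied by gamma. A minimal sketch of that arithmetic, where base_lr and gamma are placeholder values (assumptions, not taken from this repo; only stepsize comes from the line in question):

# Hypothetical values; only `stepsize` is from the code being discussed.
base_lr = 4e-5
gamma = 0.333
stepsize = 136106

def step_decay_lr(iteration):
    # Closed form of "multiply by gamma every stepsize iterations":
    # lr = base_lr * gamma ** floor(iteration / stepsize)
    return base_lr * gamma ** (iteration // stepsize)

# e.g. step_decay_lr(0) == base_lr, step_decay_lr(200000) == base_lr * gamma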

michalfaber (Owner) commented
Yes, indeed, it looks strange. It was taken from the original Caffe version of this repo. In Caffe, it is a solver parameter (a number of iterations). I don't plan to refactor this code, as I am working on a new version for TensorFlow 2.0.
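
In tf.keras, the same iteration-based step policy could be approximated with a LearningRateScheduler callback. A hedged sketch, assuming the number of steps per epoch is known (all values below are illustrative assumptions):

import tensorflow as tf

base_lr, gamma, stepsize = 4e-5, 0.333, 136106  # illustrative values
steps_per_epoch = 1000                           # hypothetical

def schedule(epoch):
    # LearningRateScheduler works per epoch, so convert epochs to iterations.
    iteration = epoch * steps_per_epoch
    return base_lr * gamma ** (iteration // stepsize)

lr_callback = tf.keras.callbacks.LearningRateScheduler(schedule)
# model.fit(..., callbacks=[lr_callback])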

rafikg (Author) commented Sep 21, 2019

In the original paper, they did not say anything about the training scheme (optimizer, learning rate, epochs, ...).
I reimplemented your model with tf.keras, and everything worked except for MultiSGD, where I get the error shown after the code. Here is my code:

# Imports assumed from TF 1.x internals, matching how tf.keras optimizers
# were written at the time (an assumption; adjust paths to your TF version).
from tensorflow.python.keras import backend as K
from tensorflow.python.keras.optimizers import Optimizer
from tensorflow.python.ops import math_ops, state_ops
from tensorflow.python.util.tf_export import tf_export


@tf_export('keras.optimizers.MultiSGD')
class MultiSGD(Optimizer):
    """Stochastic gradient descent optimizer.

    Includes support for momentum,
    learning rate decay, and Nesterov momentum.

    Arguments:
        lr: float >= 0. Learning rate.
        momentum: float >= 0. Parameter that accelerates SGD
            in the relevant direction and dampens oscillations.
        decay: float >= 0. Learning rate decay over each update.
        nesterov: boolean. Whether to apply Nesterov momentum.
        lr_mult: dict mapping variable names to per-variable
            learning-rate multipliers (optional).
    """

    def __init__(self, lr=0.01, momentum=0., decay=0., nesterov=False, lr_mult=None, **kwargs):
        super(MultiSGD, self).__init__(**kwargs)
        with K.name_scope(self.__class__.__name__):
            self.iterations = K.variable(0, dtype='int64', name='iterations')
            self.lr = K.variable(lr, name='lr')
            self.momentum = K.variable(momentum, name='momentum')
            self.decay = K.variable(decay, name='decay')
        self.initial_decay = decay
        self.nesterov = nesterov
        # Guard against the default None so `p.name in self.lr_mult`
        # (in get_updates) does not raise a TypeError.
        self.lr_mult = lr_mult if lr_mult is not None else {}

    def get_updates(self, loss, params):
        grads = self.get_gradients(loss, params)
        self.updates = [state_ops.assign_add(self.iterations, 1)]

        lr = self.lr
        if self.initial_decay > 0:
            lr = lr * (  # pylint: disable=g-no-augmented-assignment
                    1. / (1. + self.decay * math_ops.cast(self.iterations,
                                                          K.dtype(self.decay))))
        # momentum
        shapes = [K.int_shape(p) for p in params]
        moments = [K.zeros(shape) for shape in shapes]
        self.weights = [self.iterations] + moments
        for p, g, m in zip(params, grads, moments):
            if p.name in self.lr_mult:
                multiplied_lr = lr * self.lr_mult[p.name]
            else:
                multiplied_lr = lr
            # v = self.momentum * m - lr * g  # velocity
            v = self.momentum * m - multiplied_lr * g  # velocity
            self.updates.append(state_ops.assign(m, v))

            if self.nesterov:
                # new_p = p + self.momentum * v - lr * g
                new_p = p + self.momentum * v - multiplied_lr * g
            else:
                new_p = p + v

            # Apply constraints.
            if getattr(p, 'constraint', None) is not None:
                new_p = p.constraint(new_p)

            self.updates.append(state_ops.assign(p, new_p))
        return self.updates

    def get_config(self):
        config = {
            'lr': float(K.get_value(self.lr)),
            'momentum': float(K.get_value(self.momentum)),
            'decay': float(K.get_value(self.decay)),
            'nesterov': self.nesterov
        }
        base_config = super(MultiSGD, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

tensorflow.python.framework.errors_impl.FailedPreconditionError: Error while reading resource variable training/MultiSGD/Variable_180 from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/training/MultiSGD/Variable_180/N10tensorflow3VarE does not exist.
[[{{node training/MultiSGD/mul_541/ReadVariableOp}} = ReadVariableOp[dtype=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]]
[[{{node metrics/acc_9/Mean/_791}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_8739_metrics/acc_9/Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]]
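
The "variable was uninitialized" message suggests the moment variables created lazily inside get_updates() were never initialized in the session that fit() uses. One possible workaround, assuming TF 1.x graph mode (a sketch under those assumptions, not a confirmed fix; _make_train_function is a private tf.keras method, and model is the compiled model from above):

import tensorflow as tf
from tensorflow.python.keras import backend as K

# Force Keras to build the training function, which runs get_updates()
# and creates the optimizer's moment variables...
model._make_train_function()
# ...then explicitly initialize those variables in the active session.
K.get_session().run(tf.variables_initializer(model.optimizer.weights))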
