latent weights performance measure #503

Open
nchiprut opened this issue Jun 6, 2020 · 2 comments

nchiprut commented Jun 6, 2020

Feature motivation

For a new optimizer I am trying to write, I need to test the model's performance with the learned real-valued weights. I think it would also be interesting to see how the latent weights behave for other optimizers.

Feature description

I would like to have something like:

with lq.context.quantized_scope(False):
    quant_model.evaluate(test_batches)

and in my case it's also very useful to have:

with lq.context.quantized_scope(False):
    trained_model = quant_model.fit(
        train_batches,
        epochs=n_epochs,
        validation_data=test_batches,
    )

Feature implementation

I think I solved this by changing the BaseLayer behaviour (in larq/layers_base.py):

    def call(self, inputs):
        if self.input_quantizer:
            inputs = self.input_quantizer(inputs)
        # Removed the forced quantized scope around the forward pass:
        # with context.quantized_scope(True):
        return super().call(inputs)

With this change, a low-level evaluation loop works as expected, but the high-level behaviour of fit and evaluate seems to be off.
Maybe there is a simpler way of achieving this?
Thanks a lot for the good work!

lgeiger self-assigned this Jun 11, 2020
lgeiger (Member) commented Jun 12, 2020

Sorry for the late response. Have you checked out this discussion about a very similar request? It also includes an example notebook and a workaround for your use case.

In general, the reason we explicitly set quantized_scope(True) is to ensure that the forward pass is always equivalent to the expected inference version of the model. If we didn't set quantized_scope(True) here, it could lead to issues that are very tricky to debug, where the actual accuracy of the model might not match the value reported by Keras. E.g. once the model is built, changing the scope might not change the actual inference graph, depending on whether you are using TensorFlow in eager or graph mode.

We set the quantized scope to False by default in order to enable saving of the latent model, so that training can be resumed correctly; otherwise, as far as I can tell, we don't have control over this without deeply customising the behaviour of tf.Variable and tf.keras.Model.
In theory we could make is_quantized a tf.Variable that could then be changed by the user via a context manager, but if we implemented this in a straightforward way I fear it could introduce performance bottlenecks, since every quantized variable would then be wrapped in a tf.cond, and I am not sure TensorFlow would correctly optimize that away during inference.
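
For illustration, here is a minimal sketch of that idea, not Larq's actual implementation: the helper name maybe_quantize is made up, and lq.quantizers.SteSign merely stands in for whichever kernel quantizer a layer uses.

import tensorflow as tf
import larq as lq

# Hypothetical global switch; flipping the tf.Variable would not require
# rebuilding the graph, which is the appeal of the idea.
is_quantized = tf.Variable(True, trainable=False, name="is_quantized")

def maybe_quantize(latent_kernel, quantizer=lq.quantizers.SteSign()):
    # Every quantized layer would pay for this tf.cond on each forward pass,
    # which is the performance concern mentioned above.
    return tf.cond(
        is_quantized,
        lambda: quantizer(latent_kernel),
        lambda: tf.identity(latent_kernel),
    )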

The original idea behind lq.context.quantized_scope was mainly to have a way to control reading and writing of model variables, not to actually change the behaviour of model.fit directly. So far I personally haven't found the need to change the behaviour of a kernel quantizer during training; it was enough for me to build a second model without any quantizers to evaluate the behaviour of the latent weights. However, I'd be happy to discuss possible changes to the behaviour of the API.
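
As a rough sketch of that workaround (build_float_model is a hypothetical helper that mirrors quant_model's architecture without input/kernel quantizers, and the loss/metrics are placeholders): since get_weights() returns the latent real-valued weights by default, they can simply be copied over.

# float_model has the same layer order and weight shapes as quant_model,
# just without quantizers, so the weight lists line up.
float_model = build_float_model()

# By default get_weights() returns the latent (real-valued) weights.
float_model.set_weights(quant_model.get_weights())

float_model.compile(loss="sparse_categorical_crossentropy", metrics=["accuracy"])
float_model.evaluate(test_batches)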

Another question where I am not sure about the desired behaviour is how to treat input quantizers, since to be consistent I guess they would also need a flag that turns them on or off. Looking at this from a higher level, it seems that you are looking for a global flag that changes the behaviour of the model during inference. Supporting this directly at the framework level seems quite tricky to get right, since it could become a source of hard-to-debug issues and confusion if not done carefully.

What is the reason that using a separate non-quantized model without kernel quantizers to evaluate the performance of your algorithm with latent weights doesn't work for you? Understanding this would help me get a better picture of your use case, so I can think about whether we can support it in a nice way without breaking existing code or reducing training performance.

nchiprut (Author) commented Jun 15, 2020

Thanks for the detailed answer. I looked at the notebook and I think it might be good enough for my purpose.
The real use case of the requested feature is the ability to evaluate performance during training. I did something like the following (I use TF2, so eager mode):

for x, y in ds:
    # Latent (real-valued) forward pass
    with lq.context.quantized_scope(False):
        y_hat = model(x)
    # Quantized forward pass
    with lq.context.quantized_scope(True):
        y_hat_quant = model(x)
    metric.update_state(y, y_hat)
    metric_quant.update_state(y, y_hat_quant)

With the previous change to BaseLayer this seems to work fine; I am just afraid I broke something.
Also, it would be great to implement this evaluation inside 'fit' and 'evaluate', i.e. override the 'test_step' and 'train_step' methods of keras.Model, but that does not seem to work for some reason, which I took as evidence that I broke something.
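
To make the intent concrete, here is a rough, untested sketch of what such an override could look like. DualEvalModel and the metric names are made up, it assumes the BaseLayer change above so the scope actually reaches the layers, and whether the scope survives tracing inside Keras' tf.function-wrapped test_step is exactly the subtlety that may be breaking here.

import tensorflow as tf
import larq as lq

class DualEvalModel(tf.keras.Model):
    # Sketch only: reports both quantized and latent accuracy from evaluate().

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.quant_acc = tf.keras.metrics.SparseCategoricalAccuracy(name="accuracy")
        self.latent_acc = tf.keras.metrics.SparseCategoricalAccuracy(name="latent_accuracy")

    def test_step(self, data):
        x, y = data
        # Quantized forward pass (the regular inference behaviour).
        with lq.context.quantized_scope(True):
            y_pred_quant = self(x, training=False)
        # Latent forward pass; relies on the BaseLayer change above.
        with lq.context.quantized_scope(False):
            y_pred_latent = self(x, training=False)
        self.quant_acc.update_state(y, y_pred_quant)
        self.latent_acc.update_state(y, y_pred_latent)
        return {m.name: m.result() for m in (self.quant_acc, self.latent_acc)}

Usage would then be the usual compile/evaluate flow on a DualEvalModel instance.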

I still need to test your solution; even though it's a bit more complicated, it might be more stable. Another issue, as you stated, is performance: that's why I tried to avoid copying the weights to another non-quantized model each time, and it's also my motivation for overriding the keras.Model methods, which, as far as I know, run faster.

Thanks again for the great help!
