In 'NAO/cnn/encoder/encoder.py', the gradient step of the predictor is computed as follows:

new_arch_outputs = self.encoder_outputs - self.params['predict_lambda'] * grads_on_outputs

However, I think the gradient step should be computed as described in the paper:

$h_{t}' = h_{t} + \eta \frac{\partial f}{\partial h_{t}}, \quad e_{x'} = \{h_{1}', \dots, h_{T}'\}$

Is there an error in how the gradient of the predictor is computed in the code?
@wenerhg In the paper, for generality, we denote f as the performance, which is the larger the better, so we move in the positive direction of the gradient, hence the '+'. In the code, we use the error rate as f, which is the lower the better, so we move in the negative direction, hence the '-'.
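A minimal sketch of why the two signs agree, using hypothetical names (update_embedding, f_perf, f_err are illustrative, not the repository's API): stepping along the negative gradient of an error rate is the same update as stepping along the positive gradient of the corresponding performance, since f_err = 1 - f_perf implies their gradients differ only by sign.

```python
import numpy as np

def update_embedding(h, grad_f, eta, maximize):
    """Move the architecture embedding h along the gradient of f.

    If f is a performance (larger is better), step in the positive
    direction; if f is an error rate (smaller is better), step in the
    negative direction. Hypothetical helper, not the repository's code.
    """
    sign = 1.0 if maximize else -1.0
    return h + sign * eta * grad_f

# Toy check: with f_err = 1 - f_perf, the gradients differ only by sign,
# so the paper's '+' update and the code's '-' update give the same result.
h = np.array([0.2, -0.5, 1.0])          # encoder output (embedding) h_t
grad_perf = np.array([0.1, 0.3, -0.2])  # d f_perf / d h_t
grad_err = -grad_perf                   # d f_err / d h_t
eta = 10.0  # predict_lambda in the code plays the role of eta in the paper

h_paper = update_embedding(h, grad_perf, eta, maximize=True)
h_code = update_embedding(h, grad_err, eta, maximize=False)
assert np.allclose(h_paper, h_code)
```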