Reporting and resetting step sizes #211

Open
pmelchior opened this issue Oct 8, 2020 · 0 comments

Right now there is no feedback on whether the step sizes are well-chosen. During optimization this can become apparent:

  • Steps that are too large lead to parameter bouncing, i.e. large changes in the overall gradient direction. This should be visible in the relation between m and v: m is the running average of the gradient g and v is the running average of g², so cancellations in g drive |m / sqrt(v)| ≈ 0.
  • Steps that are too small lead to very smooth trajectories with many iterations; the gradients stay aligned, so |m / sqrt(v)| ≈ 1.

We could re-introduce Parameter.converged with a Python enum of FAST, RIGHT, SLOW, so that the user can inspect which parameters' step sizes should be adjusted in case of trouble with convergence. An alternative would be a helper function to that effect.
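A minimal sketch of what such a helper could look like, assuming Parameter exposes the stored m and vhat arrays; the names StepSize and check_step_size as well as the thresholds are purely illustrative and not part of the current API:

```python
from enum import Enum

import numpy as np


class StepSize(Enum):
    FAST = "too large"   # gradient direction bounces, |m / sqrt(vhat)| ~ 0
    RIGHT = "about right"
    SLOW = "too small"   # smooth trajectory, |m / sqrt(vhat)| ~ 1


def check_step_size(parameter, low=0.1, high=0.9, eps=1e-8):
    # Mean of |m| / sqrt(vhat) over the parameter array; the thresholds
    # `low` and `high` are placeholders and would need to be tuned empirically.
    ratio = np.mean(np.abs(parameter.m) / (np.sqrt(parameter.vhat) + eps))
    if ratio < low:
        return StepSize.FAST
    if ratio > high:
        return StepSize.SLOW
    return StepSize.RIGHT
```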

Separate but related: we currently store m, v, and vhat in Parameter, each with the same shape as the parameter array. When we pickle the source, they get saved so that we can restart with them. Besides the storage requirement, it is unclear whether that's the best way to restart with more sources in a Blend, because the current sources have already converged: it's hard for a minor source (like a newly revealed detection) to fend for itself on equal footing. Empirically, this is still better than zeroing all of the stored gradient-related quantities.

But maybe there's a middle ground: set the new step sizes to c * step * m / sqrt(vhat), so that the first step of the next iteration has the size of the previous step (times some constant c, TBD), while the gradients are computed from scratch for all sources.
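A rough sketch of that reset, again only illustrative: the attribute names (step, m, v, vhat) follow the description in this issue, the magnitude of m is used so the step size stays positive, and the value of c as well as whether step is a scalar or per-element array are left open.

```python
import numpy as np


def reset_step_size(parameter, c=1.0, eps=1e-8):
    # Fold the last effective update into a new per-element step size
    # (using |m| so the step size stays positive), then drop the moments
    # so that gradients are accumulated from scratch for all sources.
    parameter.step = c * parameter.step * np.abs(parameter.m) / (np.sqrt(parameter.vhat) + eps)
    parameter.m = None
    parameter.v = None
    parameter.vhat = None
```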
