
FTML optimizer #48

Merged
merged 4 commits into mlpack:master on Dec 7, 2018

Conversation

@zoq (Member) commented Nov 8, 2018

Implementation of "Follow the Moving Leader in Deep Learning" by Shuai Zheng and James T. Kwok.
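
For readers coming to the paper cold, here is a minimal sketch of a single FTML step (Algorithm 3 in Zheng and Kwok); the function and variable names are illustrative and not taken from this PR's diff, and the caller is assumed to zero-initialize the state matrices.

#include <armadillo>
#include <cmath>

// Minimal sketch of one FTML update (Zheng & Kwok, "Follow the Moving
// Leader in Deep Learning"). All names are illustrative; v, d, and z
// must be zero-initialized to the size of `iterate` by the caller.
void FTMLStep(arma::mat& iterate,        // theta_{t-1} in, theta_t out
              const arma::mat& gradient, // g_t
              arma::mat& v,              // second-moment estimate
              arma::mat& d,              // per-element denominator
              arma::mat& z,              // dual accumulator
              const size_t t,            // 1-based iteration count
              const double stepSize,
              const double beta1,
              const double beta2,
              const double epsilon)
{
  // v_t = beta2 * v_{t-1} + (1 - beta2) * g_t^2 (elementwise).
  v = beta2 * v + (1.0 - beta2) * arma::square(gradient);

  const double bias1 = 1.0 - std::pow(beta1, (double) t);
  const double bias2 = 1.0 - std::pow(beta2, (double) t);

  // d_t = ((1 - beta1^t) / stepSize) * (sqrt(v_t / (1 - beta2^t)) + eps);
  // epsilon keeps every entry of d_t strictly positive.
  const arma::mat dPrev = d;
  d = (bias1 / stepSize) * (arma::sqrt(v / bias2) + epsilon);

  // sigma_t = d_t - beta1 * d_{t-1}.
  const arma::mat sigma = d - beta1 * dPrev;

  // z_t = beta1 * z_{t-1} + (1 - beta1) * g_t - sigma_t . theta_{t-1}.
  z = beta1 * z + (1.0 - beta1) * gradient - sigma % iterate;

  // theta_t = -z_t / d_t (elementwise).
  iterate = -z / d;
}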

@rcurtin (Member) left a comment

Looks great to me. The paper was good; I was not previously aware of the connection between Adam and FTL-type techniques.

I think we should add documentation before merging, but I am writing that on all the PRs, so maybe I sound like a broken record by now. :)

* @endcode
*
* For FTML to work, a DecomposableFunctionType template parameter is
* required. This class must implement the following function:

@rcurtin (Member) commented Nov 19, 2018

We have this documentation across a lot of optimizers, but the addition of function_types.md probably means we can reduce and centralize a lot of it. Do you think we could replace this with something like: "FTML requires a separable differentiable function to optimize (see <url>)"? The only problem is that it's not clear what URL to use there. Ideally we'd point the user at the website documentation, but that URL could change, so I am not sure what the best choice is.
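
For reference, here is a sketch of what such a separable (decomposable) function looks like, assuming the batched Evaluate()/Gradient() contract used by mlpack's SGD-family optimizers; the class and the objective are invented for illustration.

#include <armadillo>

// Illustrative separable objective: f(x) = sum_i 0.5 * ||x - a_i||^2,
// one term per column of `targets`. The interface (NumFunctions(),
// batched Evaluate()/Gradient(), Shuffle()) is the assumed contract;
// the class itself is invented for this example.
class ExampleFunction
{
 public:
  explicit ExampleFunction(const arma::mat& targets) : targets(targets) { }

  // Number of separable terms f_i.
  size_t NumFunctions() const { return targets.n_cols; }

  // Sum of f_i for i in [begin, begin + batchSize).
  double Evaluate(const arma::mat& x, const size_t begin,
                  const size_t batchSize)
  {
    double sum = 0.0;
    for (size_t i = begin; i < begin + batchSize; ++i)
      sum += 0.5 * arma::accu(arma::square(x - targets.col(i)));
    return sum;
  }

  // Gradient of the same batch: sum_i (x - a_i).
  void Gradient(const arma::mat& x, const size_t begin,
                arma::mat& gradient, const size_t batchSize)
  {
    gradient = arma::zeros<arma::mat>(x.n_rows, x.n_cols);
    for (size_t i = begin; i < begin + batchSize; ++i)
      gradient += x - targets.col(i);
  }

  // Reorder the terms; called between epochs when shuffling is enabled.
  void Shuffle() { targets = arma::shuffle(targets, 1); }

 private:
  arma::mat targets;
};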

@rcurtin (Member) commented Dec 5, 2018

Actually, for this one I wonder if it would be best to open another issue instead of handling it in this PR.

const size_t batchSize = 32,
const double beta1 = 0.9,
const double beta2 = 0.999,
const double eps = 1e-8,

@rcurtin (Member) commented Nov 19, 2018

As with Padam, do you think we should adjust this documentation to point out that it's there to avoid division by zero? (Or something like that.)

Also, for consistency, I guess we should call it epsilon, since there is an Epsilon() method.
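
For what it's worth, a hypothetical construction using the defaults quoted above; the leading stepSize parameter and anything after epsilon are assumed from mlpack's usual optimizer constructors and are not visible in the quoted hunk.

FTML optimizer(0.001,  // stepSize (assumed to precede the quoted parameters)
               32,     // batchSize
               0.9,    // beta1: first-moment decay rate
               0.999,  // beta2: second-moment decay rate
               1e-8);  // eps: keeps the update's denominator nonzero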

zoq added some commits Dec 4, 2018

@rcurtin (Member) approved these changes Dec 5, 2018

Looks great to me. Feel free to merge whenever you are happy with the code. If you want to add something to HISTORY.txt, go ahead; otherwise, I will do it as part of the next release process.

@zoq zoq merged commit 630803a into mlpack:master Dec 7, 2018

1 check passed

continuous-integration/travis-ci/pr: The Travis CI build passed
@zoq (Member, Author) commented Dec 7, 2018

I'll update HISTORY.txt once the other PR is merged.
