NCA does not find good solutions #245

Closed
rcurtin opened this Issue Dec 29, 2014 · 2 comments

@rcurtin
Member
rcurtin commented Dec 29, 2014

Reported by rcurtin on 18 Apr 42819596 23:25 UTC
Currently NCA uses the L-BFGS optimizer, but the NCA objective function is potentially nonconvex, so L-BFGS may not find a good solution.
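To make the nonconvexity concern concrete, here is a minimal sketch of the standard NCA objective (the expected number of correctly classified points under stochastic nearest-neighbor assignment). It is plain Python, not mlpack code; `nca_objective` and the toy data are hypothetical names chosen for illustration, and mlpack's own implementation may differ (for example in sign convention or data layout).

```python
import math

def nca_objective(A, X, labels):
    """Expected number of correctly classified points in the space A @ x.

    Hypothetical helper for illustration only; not part of mlpack.
    A is a list of rows, X is a list of points, labels is a list of classes.
    """
    # Project every point: dot each row of A with the point.
    Z = [[sum(a * v for a, v in zip(row, x)) for row in A] for x in X]
    n = len(Z)
    total = 0.0
    for i in range(n):
        # Squared distances from point i to every point in the projected space.
        d = [sum((zi - zj) ** 2 for zi, zj in zip(Z[i], Z[j])) for j in range(n)]
        # Shift by the smallest distance for numerical stability;
        # the shift cancels in the ratio below.
        dmin = min(d[j] for j in range(n) if j != i)
        # Softmax weights over the other points; a point never picks itself.
        w = [math.exp(-(d[j] - dmin)) if j != i else 0.0 for j in range(n)]
        same_class = sum(w[j] for j in range(n) if labels[j] == labels[i])
        total += same_class / sum(w)
    return total

# Toy data: two well-separated classes of two points each.
X = [[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]]
labels = [0, 0, 1, 1]

identity = [[1.0, 0.0], [0.0, 1.0]]
negated = [[-1.0, 0.0], [0.0, -1.0]]
zero = [[0.0, 0.0], [0.0, 0.0]]  # midpoint of identity and negated

f_pos = nca_objective(identity, X, labels)
f_neg = nca_objective(negated, X, labels)
f_mid = nca_objective(zero, X, labels)
print(f_pos, f_neg, f_mid)
```

On this toy data the identity and its negation give the same value (close to 4), while their midpoint, the zero matrix, gives only 4/3. A function that is lower at the midpoint than at both endpoints is not concave, so the negated objective that a minimizer would work with is not convex, and a local method such as L-BFGS has no guarantee of reaching the best transformation.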

Here is an excerpt of results comparing several methods:


cardiotoc       train mean      train variance        runtime (mean)      test mean      test variance

baseline        40.423%         0.015178%             -----               35.031%          0.065745%
lmnn            71.969%         0.085596%               15.228s           66.050%          0.097625%
nca (mlpack)    38.985%         0.019756%              186.37s            33.103%          0.050762%
nca (matlab)
rca
fdsa
m-h mcmc        66.176%         0.062989%             2892.6s             55.392%          0.10958%
m-h mcmc diag   74.637%         0.036214%              826.43s            66.176%          0.16475%
m-h diag (10k)  77.863%         0.032059%             3005.9s             69.843%          0.07890%
m-h diag (25k)  80.040%         0.018136%             7401.9s             72.038%          0.05013%
gibbs mcmc

Almost certainly it's getting stuck in local minima: nca (mlpack) comes in below the baseline on both train mean and test mean. Therefore, we need to use a different optimizer...

@rcurtin rcurtin self-assigned this Dec 29, 2014
@rcurtin rcurtin added this to the mlpack 1.0.4 milestone Dec 29, 2014
@rcurtin rcurtin closed this Dec 29, 2014
@rcurtin
Member
rcurtin commented Dec 30, 2014

Commented by rcurtin on 20 Jun 42833654 21:39 UTC
SGD (stochastic gradient descent) is now implemented, with a test, in r13793. It just needs to be applied to NCA.
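For readers unfamiliar with the technique, a minimal sketch of SGD on a function that decomposes into a sum of per-point terms follows. This is not the code from r13793; the function `sgd`, its parameters, and the quadratic test problem are all hypothetical and chosen only to illustrate the idea. The step size `alpha` and the iteration cap presumably play the same role as the `alpha` and `iterations` values quoted with the later NCA results.

```python
import random

def sgd(grad_i, num_functions, x0, alpha, max_iterations, seed=0):
    """Minimize sum_i f_i(x) using one f_i gradient per step.

    Hypothetical sketch for illustration; not mlpack's SGD.
    grad_i(x, i) must return the gradient of f_i at x as a list.
    """
    rng = random.Random(seed)
    x = list(x0)
    order = list(range(num_functions))
    for it in range(max_iterations):
        # Reshuffle the visiting order at the start of every pass.
        if it % num_functions == 0:
            rng.shuffle(order)
        i = order[it % num_functions]
        g = grad_i(x, i)
        # Fixed step size: move against the gradient of the chosen term.
        x = [xk - alpha * gk for xk, gk in zip(x, g)]
    return x

# Test problem: f_i(x) = (x - t_i)^2, so the sum is minimized at the mean of t.
targets = [1.0, 2.0, 3.0, 6.0]

def quadratic_grad(x, i):
    return [2.0 * (x[0] - targets[i])]

solution = sgd(quadratic_grad, len(targets), [0.0], alpha=0.001,
               max_iterations=20000)
print(solution[0])
```

With a fixed step size the iterate does not settle exactly on the minimizer (3.0 here) but hovers in a small neighborhood of it whose size shrinks with `alpha`. That is the usual trade-off: a smaller step needs more iterations, and the noise in the individual steps is also what can help on a nonconvex objective where a deterministic method stalls.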

@rcurtin
Member
rcurtin commented Dec 30, 2014

Commented by rcurtin on 13 Sep 42841981 08:02 UTC
Much better numbers:

cardiotoc       train mean      train variance        runtime (mean)      test mean      test variance

baseline        40.423%         0.015178%             -----               35.031%          0.065745%
lmnn            71.969%         0.085596%               15.228s           66.050%          0.097625%
nca (mlpack)    73.918%         0.026099%             1483.1s             66.066%          0.052424%  alpha = 0.000004, iterations = 200k
nca (matlab)
rca
fdsa
m-h mcmc        66.176%         0.062989%             2892.6s             55.392%          0.10958%
m-h mcmc diag   74.637%         0.036214%              826.43s            66.176%          0.16475%
m-h diag (10k)  77.863%         0.032059%             3005.9s             69.843%          0.07890%
m-h diag (25k)  80.040%         0.018136%             7401.9s             72.038%          0.05013%
gibbs mcmc

With that said, I'm going to go ahead and close this. The documentation for NCA is a lot better now too.
