
Adjust SPSA Logistic Regression test optimizer parameter #87

Merged
merged 3 commits into mlpack:master on Mar 5, 2019

Conversation


@zoq zoq commented Feb 28, 2019

Adjust SPSA Logistic Regression test optimizer parameter.

@favre49 favre49 mentioned this pull request Mar 3, 2019

rcurtin commented Mar 4, 2019

I found that this helps, but there are still occasional failures. I had better success running the test up to three times, like we do for some other tests, and requiring that it succeed at least once. Here's the code I used:

TEST_CASE("SPSALogisticRegressionTest", "[SPSATest]")
{
  bool success = false;
  // Training, test, and shuffled data, filled in by the test helper below.
  arma::mat data, testData, shuffledData;
  arma::Row<size_t> responses, testResponses, shuffledResponses;

  for (size_t trial = 0; trial < 3; ++trial)
  {
    LogisticRegressionTestData(data, testData, shuffledData,
        responses, testResponses, shuffledResponses);
    LogisticRegression<> lr(shuffledData, shuffledResponses, 0.5);

    SPSA optimizer(0.5, 1, 0.102, 0.16, 0.3, 1000, 1e-7);
    arma::mat coordinates = lr.GetInitialPoint();
    optimizer.Optimize(lr, coordinates);

    // Ensure that the error is close to zero.
    const double acc = lr.ComputeAccuracy(data, responses, coordinates);
    const double testAcc = lr.ComputeAccuracy(testData, testResponses,
        coordinates);
    if (acc == Approx(100.0).epsilon(0.003) &&
        testAcc == Approx(100.0).epsilon(0.006))
    {
      success = true;
      break;
    }
  }

  REQUIRE(success == true);
}
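For context (this is my reading of the SPSA header and of Catch2's Approx, so double-check me), the optimizer constructor arguments above should be:

    // SPSA(alpha, batchSize, gamma, stepSize, evaluationStepSize,
    //      maxIterations, tolerance) -- my reading of the header; verify.
    SPSA optimizer(0.5, 1, 0.102, 0.16, 0.3, 1000, 1e-7);

And Approx(100.0).epsilon(0.003) is a relative tolerance, so the first check passes when acc is within 0.3% of 100, i.e. roughly 99.7 or above (0.6% for the test accuracy).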

Note that I also found that each run of Optimize() took surprisingly long, so I reduced the number of iterations to only 1000. I think sometimes SPSA heads off in a bad direction and doesn't come back, so restarts are a reasonable solution.

However, I then dug into the SPSA implementation and found some other issues that I'm happy to fix but I want to get some confirmation on:

  1. SPSA is not meant for differentiable separable functions, despite what the documentation says. In fact it works on arbitrary functions, since it only ever calls Evaluate().
  2. SPSA isn't originally defined as something that works on separable functions, but the extension is straightforward: simply call Evaluate() on a small batch of points instead of all of them. Should we implement that extension? I'm okay with that if we can make it clear to users via, perhaps, two separate classes/typedefs like SPSA (full batch) and BatchSPSA (or some similar name).
  3. I'm confused by this block inside SPSA::Optimize():

    gradient.zeros();
    for (size_t b = 0; b < batchSize; b++)
    {
      // Stochastic directions.
      spVector = arma::conv_to<arma::mat>::from(
          arma::randi(iterate.n_rows, iterate.n_cols,
          arma::distr_param(0, 1))) * 2 - 1;

      iterate += ck * spVector;
      const double fPlus = function.Evaluate(iterate, 0, iterate.n_elem);

      iterate -= 2 * ck * spVector;
      const double fMinus = function.Evaluate(iterate, 0, iterate.n_elem);
      iterate += ck * spVector;

      gradient += (fPlus - fMinus) * (1 / (2 * ck * spVector));
    }

    gradient /= (double) batchSize;

That should be the approximation of the gradient. However, we're looping batchSize times to build it, and we're also calling Evaluate() very strangely, with a "batch size" of iterate.n_elem (the number of parameters, not the number of points). It seems to me that:

  • If we are doing full-batch SPSA (as defined earlier), we should compute the gradient as:

    spVector = arma::conv_to<arma::mat>::from(
        arma::randi(iterate.n_rows, iterate.n_cols,
        arma::distr_param(0, 1))) * 2 - 1;

    iterate += ck * spVector;
    const double fPlus = function.Evaluate(iterate);

    iterate -= 2 * ck * spVector;
    const double fMinus = function.Evaluate(iterate);
    iterate += ck * spVector;

    gradient = (fPlus - fMinus) * (1 / (2 * ck * spVector));

A key point there is that there is no loop over the batch size.

  • If we are doing small-batch SPSA, then we should call function.Evaluate(iterate, 0, batchSize) instead.
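For reference, Spall's simultaneous-perturbation estimator (if I have it right) computes each gradient component from exactly one pair of function evaluations, with no loop:

    g_i = (f(theta + ck * delta) - f(theta - ck * delta)) / (2 * ck * delta_i)

And since every entry of spVector is +/-1, each entry is its own inverse, so the Armadillo expression above could equivalently be written as:

    // Same as (fPlus - fMinus) * (1 / (2 * ck * spVector)): spVector's
    // entries are all +/-1, so element-wise 1 / spVector == spVector.
    gradient = (fPlus - fMinus) * spVector / (2 * ck);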

Let me know what you think. Like I said, I'm happy to make the changes, but I want to double-check that you agree first.


zoq commented Mar 4, 2019

Interesting, I tested the optimizer something like 500 times (using a new random seed each time) and saw only a single failure, but if you still see errors I'm fine with running the test multiple times; since you reduced the number of iterations, I guess it's still fast enough. About SPSA/BatchSPSA: I agree, switching to function.Evaluate(iterate) makes a lot more sense. I don't think we need to provide a batch variant at this point unless you see a use for one. Would you like to open a PR?

@rcurtin rcurtin left a comment

Right, I observed about the same thing: one failure every ~500-800 runs. But I generally shoot for ~1000 clean runs with no failures (maybe I'm too paranoid?). Mostly, I found that the change I proposed also made the test run a good bit faster. Anyway, I'm fine whichever way you want to go with this.


zoq commented Mar 4, 2019

We should definitely switch to function.Evaluate(iterate), and if it runs faster, I guess it makes sense to adjust the test as well. I can do this inside this PR or we can open a new one, but I don't think we should merge this one as it is. Let me know what you think.


rcurtin commented Mar 4, 2019

I'll work up a patch and get it to you shortly, and we can apply all the changes in this PR. 👍

rcurtin and others added 2 commits March 5, 2019 01:23
@rcurtin rcurtin left a comment

Ah, thanks, I forgot to remove the random seed. Feel free to merge whenever you're ready. 👍

@zoq zoq merged commit 64f15cf into mlpack:master Mar 5, 2019