Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CMAES inconsistency fix #88

Closed
wants to merge 10 commits into from
Closed

CMAES inconsistency fix #88

wants to merge 10 commits into from

Conversation

favre49
Copy link
Member

@favre49 favre49 commented Mar 1, 2019

This is a fix for issue #70 . The boundary handler has been adapted from the python code for CMA-ES
I had to change the bounds in the test functions since they did not align with the solution.
I will update the documentation for the code in a couple hours if there are no problems

@zoq zoq mentioned this pull request Mar 5, 2019
@zoq
Copy link
Member

zoq commented Mar 5, 2019

You can put the update right after the Fixes for SPSA (#87). line.

@zoq
Copy link
Member

zoq commented Mar 5, 2019

Ah wrong PR.

@favre49
Copy link
Member Author

favre49 commented Mar 8, 2019

weird error here, the sudo apt-get update failed?

Copy link
Member

@rcurtin rcurtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @favre49, sorry the response took so long here. It's looking good to me and it'll be nice to fix these issues, but I have a few questions still. Let me know what you think. 👍

@@ -129,7 +129,7 @@ double CMAES<SelectionPolicyType>::Optimize(
const size_t idx1 = i % 2;

// Perform Cholesky decomposition. If the matrix is not positive definite,
// add a small value and try again
// add a small value and try again.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding that. :)

@@ -146,6 +166,16 @@ class CMAES

//! The selection policy used to calculate the objective.
SelectionPolicyType selectionPolicy;

//! Step size damping.
double ds;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for consistency, it might be beter to name this stepSizeDamping since the method to access it is StepSizeDamping(). (Similar for CumulationConstant().) If we are exposing these parameters, should we also add constructor parameters for these also? I think maybe I don't fully follow what these parameters are for (or how a user would set them).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll change the name for sure. I included those two based on Marcus' suggestion in the issue discussion. These parameters are used in the update of the variables. It provides some extra flexibility for the user, but I wonder if including them in the constructor would become more confusing to the user, or how useful they would be (I use CMA-ES a lot for my work, and have never really toyed with these values). Maybe @zoq can shed some light on this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's fair. It doesn't matter to me if the variables are included or not, but if we do include them then I think we should make them accessible from the constructor. 👍

size_t PopulationSize() const { return lambda; }
//! Modify the step size.
//! Modify the population size.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, thanks! 👍

@@ -70,6 +71,7 @@ class CMAES
* objective.
*/
CMAES(const size_t lambda = 0,
const double initialSigma = 0.6,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned that this would break reverse compatibility and we'd have to bump the major version (I don't think that's justified by a relatively simple change like this). According to our versioning policy,

 * A MAJOR version bump implies a backwards-incompatible change to the
   user-facing optimizers.  Code using one major version of ensmallen may
   not work with a newer major version of ensmallen.

We could instead add initialSigma as the last parameter with a default value, and then we could optionally mark that constructor deprecated ("will be removed in 2.10.0") and add another constructor just like this one here. What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, didn't think of that. I'll change it :)

const double cs = (muEffective + 2) / (iterate.n_elem + muEffective + 5);
const double ds = 1 + cs + 2 * std::max(std::sqrt((muEffective - 1) /
ds = 1 + cs + 2 * std::max(std::sqrt((muEffective - 1) /
(iterate.n_elem + 1)) - 1, 0.0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess, then, that if the user has set ds or cc, they get overwritten here when Optimize() is called?

double x = iterate(it,ii);

// Shift x into [lowerBound-al, upperBound+au].
if (x < lowerBound - 2 * al - (upperBound - lowerBound) / 2 || x > upperBound + 2 * au + (upperBound - lowerBound) / 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style issue---this line is over 80 characters long. Can you wrap it please?

{
for (size_t it = 0; it < iterate.n_rows; it++)
{
double x = iterate(it,ii);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny style issue, this should be iterate(it, ii).

@@ -283,6 +287,79 @@ double CMAES<SelectionPolicyType>::Optimize(
return overallObjective;
}

// Transforms the candidate into the given bounds.
template<typename SelectionPolicyType>
void CMAES<SelectionPolicyType>::BoundaryTransform(arma::mat& iterate)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, if I'm understanding that this function, given iterate, will transform each element of iterate such that it falls between lowerBound and upperBound, you could just use arma::clamp(), I think. Correct me if I'm wrong and there's a bit that I missed. (It looks like maybe there is some handling for when upperBound and lowerBound are too close together? I have not figured out exactly what that does.)

pObjective(idx(j)) = selectionPolicy.Select(function, batchSize,
pPosition.slice(idx(j)));
BoundaryTransformInverse(pPosition.slice(idx(j)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I guess I'm a little confused here. I would imagine it would be easier to take pPosition.slice(idx(j)) and simply clamp it to be within the desired range, but I'm not sure what the reverse transformation is. Can you clarify the functionality here a bit? Maybe I misunderstood.

@mlpack-bot
Copy link

mlpack-bot bot commented Jul 19, 2019

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍

@gaurav-singh1998
Copy link
Contributor

Hi, @favre49, @zoq, @rcurtin since @favre49 has implemented and addressed the majority part regarding the issue #70 I want to take up this and implement the necessary changes that @rcurtin suggested above. Can I move forward with it?

@gaurav-singh1998
Copy link
Contributor

Also, @favre49 can you provide the link of the source code or the paper that you referred for applying box constraints for each parameter here?

@favre49
Copy link
Member Author

favre49 commented Mar 24, 2020

@gaurav-singh1998 That would be great. I'll try to find the sources I used and get back to you

@favre49
Copy link
Member Author

favre49 commented Mar 24, 2020

If I'm not wrong, I used pycma

@favre49
Copy link
Member Author

favre49 commented Mar 24, 2020

Also, what I failed to address in Ryan's comment pertained to how Nikolas Hansen, the author of the paper on CMA-ES advised handling bound constraints. I am unable to find the resources that I read this from, but the gist of it was that at every step, the result obtained would be "projected" into another space, where it would exist in the bounds (if I remember correctly). The way this has been done is in my code (though probably flawed) as well as in the pycma code. If I'm able to find it again, I'll let you know.

@gaurav-singh1998
Copy link
Contributor

gaurav-singh1998 commented Apr 2, 2020

Hi, @favre49 sorry for responding so late on this. After going through this implementation of boundary handling and an example which shows how to use it in the final CMAES implementation I found that they are not applying the inverse function after transforming them into the given box constraints like you have implemented here and here. I may be wrong somewhere or missing something but kindly help me with this before moving forward. Thanks.

@favre49
Copy link
Member Author

favre49 commented Apr 2, 2020

I think instead of calling inverse here they write the same thing as in the function here. Not sure why, though

@gaurav-singh1998
Copy link
Contributor

gaurav-singh1998 commented Apr 2, 2020

Hi, @favre49 in the libcmaes library, C++ version of CMAES by N Hansen, box constraints are handles in the same way i.e. they just provided a function for the inverse here. So, before moving further with this wanted to clarify should I also apply this concept for handling the box constraints? Thanks.

@favre49
Copy link
Member Author

favre49 commented Apr 3, 2020

Yeah, I think that would make sense

@gaurav-singh1998
Copy link
Contributor

Hi, @favre49 sorry to disturb you again and again but wanted to know can I open a new PR for this?

@favre49
Copy link
Member Author

favre49 commented Apr 4, 2020

No issues, I'll close this when you open your PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants