
MAINT: differential_evolution improve init_population_lhs comments closes #5101. #5132

Merged
merged 3 commits into from Oct 13, 2015

andyfaff (Contributor) commented Aug 10, 2015:

This PR:

  1. Modifies the init_population_lhs method to clarify what the code does.
  2. Adds population_size and parameter_count attributes to the DifferentialEvolutionSolver class.
  3. Adds extra comment lines in the optimizing loop of the solve method to clarify what is occurring.

This is to address some comments made in #5101

@Hvass-Labs, @dlax

@andyfaff andyfaff force-pushed the andyfaff:lhs_comment branch from 6d18cbf to 8fad595 Aug 10, 2015

@andyfaff andyfaff force-pushed the andyfaff:lhs_comment branch from 8fad595 to d573eed Aug 10, 2015

# Store the dimensionality of the problem.
self.parameter_count = np.size(self.limits, 1)

# We need a random number generator.
self.random_number_generator = _make_random_gen(seed)
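For context, a minimal sketch of why np.size(self.limits, 1) gives the dimensionality, assuming (as in _differentialevolution.py at the time) that limits is the transposed bounds array; the variable names here are illustrative:

```python
import numpy as np

# Hypothetical bounds for a two-parameter problem, one (min, max)
# pair per parameter, as passed to differential_evolution.
bounds = [(-5.0, 5.0), (0.0, 10.0)]

# The solver stores these transposed, so limits has shape (2, N):
# row 0 holds the lower bounds, row 1 the upper bounds.
limits = np.array(bounds, dtype=float).T

# Number of columns == number of parameters.
parameter_count = np.size(limits, 1)
print(parameter_count)  # 2
```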

larsmans (Contributor) Aug 11, 2015:

While I appreciate well-commented code, I think this is overdoing it.

for candidate in range(np.size(self.population, 0)):

# evolve and test each candidate solution in the population.
for candidate in range(self.population_size):

larsmans (Contributor) Aug 11, 2015:

Same here. This goes against DRY.

@andyfaff andyfaff changed the title MAINT: differential_evolution improve init_population_lhs comments MAINT: differential_evolution improve init_population_lhs comments closes #5101. Aug 13, 2015

rgommers (Member) commented Oct 11, 2015:

@Hvass-Labs, @dlax: a review of this would be nice. It looks like a clean improvement to me, so I'm inclined to merge it.

@rgommers rgommers added this to the 0.17.0 milestone Oct 11, 2015

self.population = np.zeros_like(rdrange)
# Offset each segment to cover the entire parameter range [0, 1)
samples += np.atleast_2d(
np.linspace(0., 1., self.population_size, endpoint=False)).T

dlax (Member) Oct 11, 2015:

Is there any reason for not building samples in just one step? To avoid samples += just after defining samples.

andyfaff (Author, Contributor) Oct 11, 2015:

It could undoubtedly be done in one line. However, one of the comments made in #5101 was that the source code was difficult to follow (commenting, etc.).

This line is split into two to try and make it clearer what the samples array is and how it is created. I actually felt that the code as it existed was clear already, but then again, I originally wrote the code.

I think that two lines is clearer, but I'm easy either way.

dlax (Member) Oct 12, 2015:


Hmm, ok... Not convinced ;) I still think a single statement, maybe split across two lines with a comment in between, would be clearer.

Also, I'd find np.linspace(0., 1., self.population_size, endpoint=False)[:, np.newaxis] clearer than the atleast_2d + .T combo.
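A minimal sketch of the segment-offset step being discussed, using the [:, np.newaxis] column-vector form dlax suggests in place of the atleast_2d + .T combo (the real init_population_lhs additionally permutes each column independently afterwards; names here are illustrative):

```python
import numpy as np

rng = np.random.RandomState(42)
population_size, parameter_count = 5, 2

# Divide each parameter's [0, 1) range into population_size equal
# segments and draw one uniform sample inside the first segment.
segsize = 1.0 / population_size
samples = segsize * rng.random_sample((population_size, parameter_count))

# Offset row i by i/population_size so the rows tile the whole range;
# np.linspace(...)[:, np.newaxis] broadcasts a column vector over rows.
samples += np.linspace(0., 1., population_size, endpoint=False)[:, np.newaxis]

# Row i of every column now falls in segment [i/5, (i+1)/5).
print(np.floor(samples * population_size).astype(int)[:, 0])  # [0 1 2 3 4]
```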

# default population initialization is a latin hypercube design, but
# there are other population initializations possible.
self.population = np.zeros((popsize * self.parameter_count,
self.parameter_count))

dlax (Member) Oct 11, 2015:

What's the point of initializing self.population here? Unless I'm mistaken, it gets overridden unconditionally below by the self.init_population_* calls.

andyfaff (Author, Contributor) Oct 11, 2015:

Once the DifferentialEvolutionSolver is created, I wanted to be able to reuse the solver. This means reinitialising the population with init_population_lhs or init_population_random, which is why those methods are public, not private. The two methods need to know the shape of the population in order to create it. It's not a good idea to have to pass a shape parameter to the methods, so I relied on a population of the right shape having already been created, which is why it's initialised here; it is then overwritten unconditionally below.

What I could do is add a self._population_shape = (popsize * self.parameter_count, self.parameter_count) attribute to the class. The init_population_xxx method could then use that shape. However, it's 50:50 either way as to what's best.
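A minimal sketch of the _population_shape alternative described above (the class and method bodies here are illustrative, not scipy's actual implementation):

```python
import numpy as np

class SolverSketch:
    """Hypothetical sketch: store the population shape once, so the
    init_population_* methods need no shape argument."""

    def __init__(self, bounds, popsize=15, seed=1):
        self.parameter_count = len(bounds)
        self.random_number_generator = np.random.RandomState(seed)
        self._population_shape = (popsize * self.parameter_count,
                                  self.parameter_count)
        self.init_population_random()

    def init_population_random(self):
        # Reinitialisation reuses the stored shape, so the solver can
        # be run again from scratch without passing any parameters.
        rng = self.random_number_generator
        self.population = rng.random_sample(self._population_shape)

solver = SolverSketch([(-5, 5), (0, 10)])
print(solver.population.shape)  # (30, 2)
```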

dlax (Member) Oct 12, 2015:


This makes me think that this population should perhaps not be an
attribute after all, but something passed to the solve() method. The
typical usage would then be:

solver = DifferentialEvolutionSolver(func, bounds)
initial_population = init_population_lhs(rng)
result = solver.solve(initial_population)

where the init_population_xx could be functions.

Though I'm not a user of this API, and this is certainly out of this PR's scope...

andyfaff (Author, Contributor) Oct 12, 2015:

There was a reason for having the current scheme. The solve method takes the current state of the population and evolves it by maxiter generations. One can check how the population evolves over time by setting maxiter to an arbitrary number (e.g. 1), and calling solve repeatedly, each time checking what the population is doing. For this reason you wouldn't want to initialise the population before each call to solve.
A second, equally important reason is that there are restrictions on the population shape (the number of columns needs to match the length of bounds).

A future change to this class could be to extract the for nit in range(1, self.maxiter + 1): loop into a separate method, and make that method into a generator. Each call to the generator would evolve the population by a single generation.
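The generator idea could look roughly like this; a much-simplified sketch using plain rand/1 mutation with greedy selection and no crossover (the function name, strategy, and parameters are illustrative, not scipy's actual best1bin default):

```python
import numpy as np

def evolve(func, population, mutation=0.7, rng=None):
    """Yield (population, energies) after each generation.

    Hypothetical sketch: each next() evolves the population by one
    generation, so callers can inspect it between steps.
    """
    if rng is None:
        rng = np.random.RandomState()
    pop = population.copy()
    energies = np.array([func(x) for x in pop])
    npop = len(pop)
    while True:
        for i in range(npop):
            # rand/1: a base vector plus a scaled difference of two
            # others, all chosen from distinct members (not i).
            a, b, c = rng.choice([j for j in range(npop) if j != i],
                                 3, replace=False)
            trial = np.clip(pop[a] + mutation * (pop[b] - pop[c]), 0, 1)
            e = func(trial)
            if e < energies[i]:       # greedy selection
                pop[i], energies[i] = trial, e
        yield pop.copy(), energies.copy()

# Evolve one generation at a time, watching the population between steps.
rng = np.random.RandomState(0)
gen = evolve(lambda x: np.sum((x - 0.5) ** 2),
             rng.random_sample((10, 2)), rng=rng)
for _ in range(20):
    pop, energies = next(gen)  # the best energy never increases
```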

dlax added a commit that referenced this pull request Oct 13, 2015

Merge pull request #5132 from andyfaff/lhs_comment
MAINT: differential_evolution improve init_population_lhs comments closes #5101.

@dlax dlax merged commit 8d7e701 into scipy:master Oct 13, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed

@andyfaff andyfaff deleted the andyfaff:lhs_comment branch Oct 13, 2015
