Memory issues #5

guyko81 · 2016-02-11T15:08:51Z

Hi Trevor,

It's a very nice implementation. I had been searching for a solution like this for a long time, so thank you!

I have only one issue: on long evolution runs (generations = some_huge_number, or population_size = some_huge_number combined with generations = some_number) the program runs out of memory. I checked the code and it saves every iteration's population. Do you think that's necessary? As I understand it, we only need the current population and, at the start of each iteration, the best of the previous one.

What do you think: could the code be changed to reset
self._programs = []
before every iteration and keep only the previous population in self._programs_prev (or something similar)?

jamartinh · 2016-02-11T15:24:50Z

Hello, I have also experienced the same issue, and can't run my experiments for many iterations.

I thought it was a problem with garbage collection.

I think this could be exposed as a parameter, e.g. num_generations_history.

Cheers,
Jose A.
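
A minimal sketch of the bounded-history idea proposed in the two comments above. It is not gplearn code: the `BoundedHistory` class is hypothetical, and `num_generations_history` is only the parameter name suggested here, not an existing option. A `collections.deque` with `maxlen` drops the oldest generation automatically once the limit is reached.

```python
from collections import deque


class BoundedHistory:
    """Keeps only the most recent generations of an evolving population."""

    def __init__(self, num_generations_history=2):
        # Hypothetical parameter: how many generations to retain.
        self._programs = deque(maxlen=num_generations_history)

    def add_generation(self, population):
        # Appending to a full deque silently discards the oldest generation.
        self._programs.append(list(population))

    @property
    def current(self):
        return self._programs[-1]

    @property
    def previous(self):
        # None until at least two generations have been stored.
        return self._programs[-2] if len(self._programs) > 1 else None

    def __len__(self):
        return len(self._programs)


history = BoundedHistory(num_generations_history=2)
for generation in range(5):
    # Placeholder strings stand in for real program objects.
    history.add_generation([f"program_{generation}_{i}" for i in range(3)])
```

With a limit of 2, only the current and previous populations stay alive, which matches the `self._programs_prev` suggestion. The cost is that the lineage of a final program can no longer be traced back past the retained window.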



trevorstephens · 2016-02-11T15:41:49Z

Thanks for the report! I'll look into your hypothesis @guyko81 but suspect the issue is more likely with numpy arrays being stored as the equations are recursively evaluated. These /should/ be garbage collected by Python as they are never stored in the object, but I'll check that out as well @jamartinh

I have seen this issue as well, and was thinking that an eval_size parameter might help by evaluating fewer samples at once, rather than the whole dataset. I've been meaning to work on a v0.2 for a while now. This should be top of the list.

For now, you might find using n_jobs=1 more stable (fewer evaluations at once) or ramping up the parsimony to keep the programs smaller.
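
The workaround above could be applied along these lines. This is a sketch, not a recommended configuration: the numeric values are illustrative, and it assumes the estimator is `gplearn.genetic.SymbolicRegressor` with the parsimony setting exposed as `parsimony_coefficient`. The import is guarded so the snippet still runs where gplearn is not installed.

```python
# Settings suggested in the comment above; the numeric values are illustrative.
params = {
    "generations": 50,
    "population_size": 1000,
    "n_jobs": 1,                    # single process: fewer evaluations in flight
    "parsimony_coefficient": 0.01,  # raise this to penalise long programs more
}

try:
    from gplearn.genetic import SymbolicRegressor

    est = SymbolicRegressor(**params)
except ImportError:
    # gplearn is not available here; the settings dict is still usable later.
    est = None
```

Calling `est.fit(X, y)` afterwards would run the evolution with these settings. Whether they are enough to avoid running out of memory depends on the dataset and the number of generations.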

guyko81 · 2016-02-11T16:10:53Z

Thanks Trevor! Nothing more to add from my side, so thank you :)

trevorstephens · 2016-09-03T18:33:22Z

I've located the main culprit. It is due almost entirely to saving the indices of X & y used for evaluating a program's fitness when max_samples is used. These indices are also retained even when there is no under-sampling. I am working on a fix now, and can still retain all prior populations for inspecting the lineage of a final program.
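
To illustrate why stored indices add up, here is one possible way to avoid keeping them: store only a seed per program and regenerate the indices on demand. This is a sketch of the general technique, not necessarily the fix that was implemented. `ProgramRecord` and `draw_indices` are hypothetical names, and the standard-library `random` module stands in for whatever random state the library uses.

```python
import random
import sys


def draw_indices(seed, n_samples, max_samples):
    """Deterministically draw the sample indices used to score one program."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_samples), max_samples))


class ProgramRecord:
    """Hypothetical stand-in for a stored program; not gplearn's own class."""

    def __init__(self, seed, n_samples, max_samples):
        # Three small integers are stored instead of a full index list.
        self.seed = seed
        self.n_samples = n_samples
        self.max_samples = max_samples

    def indices(self):
        # Recomputed on demand; the same seed always gives the same indices.
        return draw_indices(self.seed, self.n_samples, self.max_samples)


record = ProgramRecord(seed=42, n_samples=10_000, max_samples=5_000)
stored = draw_indices(42, 10_000, 5_000)
print(sys.getsizeof(stored), "bytes for one stored index list (container only)")
```

An index list of this size held for every program in every generation grows with population_size × generations × max_samples, whereas the seed-based record stays constant in size.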

trevorstephens · 2016-09-05T18:33:17Z

I have also added a check at each evolution to see whether older generations are still relevant, i.e. whether any of their "dna" exists in the current generation. Any irrelevant programs will be removed from the old generation's population by marking them as None. This results in a massive reduction in the number of programs stored and should help significantly with memory use.
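
The pruning described above can be sketched as a backward walk over parent links. The data layout is an assumption made for illustration: each program is a dict whose `"parents"` tuple holds indices into the previous generation. The actual bookkeeping in the library may differ.

```python
def prune_lineage(generations):
    """Mark programs with no descendants in the latest generation as None."""
    # Every program in the newest generation is kept.
    needed = set(range(len(generations[-1])))
    for gen in range(len(generations) - 1, 0, -1):
        # Collect the parents of every program still needed in this generation.
        parents_needed = set()
        for idx in needed:
            parents_needed.update(generations[gen][idx]["parents"])
        # Drop everything in the older generation that nobody descends from.
        older = generations[gen - 1]
        for idx in range(len(older)):
            if idx not in parents_needed:
                older[idx] = None
        needed = parents_needed
    return generations


generations = [
    # Generation 0: initial programs have no parents.
    [{"name": f"g0_{i}", "parents": ()} for i in range(4)],
    # Generation 1: indices refer to generation 0.
    [
        {"name": "g1_0", "parents": (0,)},
        {"name": "g1_1", "parents": (1, 2)},  # e.g. produced by crossover
        {"name": "g1_2", "parents": (3,)},
    ],
    # Generation 2 (current): both programs descend from g1_1 only.
    [
        {"name": "g2_0", "parents": (1,)},
        {"name": "g2_1", "parents": (1,)},
    ],
]

prune_lineage(generations)
```

Because list positions are preserved and only the entries are replaced with None, the surviving parent indices stay valid, so the lineage of any current program can still be followed back to generation 0.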

trevorstephens · 2016-09-10T19:12:22Z

Mostly fixed by #19 ... Please re-open if problems still persist in the master branch or the next release.

trevorstephens added the bug label Feb 11, 2016

trevorstephens mentioned this issue Mar 5, 2016

Broken Pipe Issue #8

Closed

trevorstephens mentioned this issue Apr 29, 2016

How to limit the memory required by gplearn? #13

Closed

trevorstephens mentioned this issue Sep 5, 2016

[MRG] Reduce memory consumption #19

Merged

trevorstephens closed this as completed Sep 10, 2016

trevorstephens added this to the 0.2.0 milestone Mar 26, 2017
