Question regarding periodic_checkpoint_folder #977

Open
tjiagoM opened this issue Dec 11, 2019 · 5 comments

Comments

@tjiagoM

tjiagoM commented Dec 11, 2019

Hi!
First of all, thanks again for the development of tpot!

When I create a TPOTClassifier, I specify folders for both the periodic_checkpoint_folder and memory arguments.

I thought that by making these checkpoints, I would be able to resume the search in case my script was suddenly stopped, given that the information was saved both by the checkpoints and the memory cache. However, from what I can see in the documentation (and from my trials) this is not possible.

Can you confirm whether this is indeed not possible? If that's the case, I'd like to suggest that as a new feature!

@weixuanfu
Contributor

This is not possible yet, but I think it would be a nice new feature. Contributions are welcome.

@Midhilesh29

Hi!
I would like to work on this feature.

@weixuanfu
Contributor

@Midhilesh29 Cool, please submit a PR based on the dev branch for this feature.

@Midhilesh29

I have a few questions. Could you please help me with them?

  1. Is it enough to serialize the individuals in the population by converting them to strings with the clean_pipeline_string function and storing them with pickle?
  2. Can I convert the stored string back into a deap.gp.primitives object and then run a function with logic similar to the eaMuPlusLambda function in TPOT?
  3. For checkpointing, I plan to write a class that takes the checkpoint directory and how often the individuals should be saved. Is that a good idea? If so, where should I add the class?
  4. Is it necessary to store all the individuals of the population, or only the most recent ones?

Please help me with these questions. I am new to contributing to open-source projects.

@weixuanfu
Contributor

weixuanfu commented Mar 13, 2020

Thank you for working on this! Below are my answers.

1. Is it enough to serialize the individuals in the population by converting them to strings with the clean_pipeline_string function and storing them with pickle?

Yes, the string can be pickled.
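A minimal sketch of that round trip, using only the standard library. The pipeline strings, the file name, and the helper names `save_population` and `load_population` are illustrative assumptions, not part of TPOT.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical pipeline strings standing in for one string per individual.
population_strings = [
    "GaussianNB(input_matrix)",
    "BernoulliNB(input_matrix)",
]


def save_population(strings, path):
    """Pickle a list of pipeline strings to `path`."""
    with open(path, "wb") as fh:
        pickle.dump(list(strings), fh)


def load_population(path):
    """Load the list of pipeline strings written by save_population."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Write to a temporary directory and read the strings back.
checkpoint_path = Path(tempfile.mkdtemp()) / "population.pkl"
save_population(population_strings, checkpoint_path)
restored_strings = load_population(checkpoint_path)
```

Plain strings pickle without any custom reducers, so the stored file does not depend on DEAP or TPOT classes being importable at load time.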

2. Can I convert the stored string back into a deap.gp.primitives object and then run a function with logic similar to the eaMuPlusLambda function in TPOT?

This is possible as long as you use the exact same tpot_obj._pset. Please check this example in a unit test.
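A rough sketch of the restore step, assuming the individual class exposes a `from_string(string, pset)` classmethod the way DEAP's `gp.PrimitiveTree` does. The helper `restore_population` and the stand-in class `FakeIndividual` are hypothetical and exist only so the sketch runs without DEAP installed.

```python
def restore_population(pipeline_strings, pset, individual_cls):
    """Rebuild individuals from strings, reusing one shared primitive set."""
    return [individual_cls.from_string(text, pset) for text in pipeline_strings]


class FakeIndividual:
    """Stand-in for a DEAP individual class, for demonstration only."""

    def __init__(self, text, pset):
        self.text = text
        self.pset = pset

    @classmethod
    def from_string(cls, text, pset):
        # A real implementation would parse `text` against `pset`.
        return cls(text, pset)


# Placeholder for the primitive set that was used in the original run.
shared_pset = object()
restored = restore_population(
    ["GaussianNB(input_matrix)"], shared_pset, FakeIndividual
)
```

In a real run, the class and primitive set would presumably be TPOT's own individual class and `tpot_obj._pset`. Whether a given stored string parses depends on it having been produced against that same primitive set.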

3. For checkpointing, I plan to write a class that takes the checkpoint directory and how often the individuals should be saved. Is that a good idea? If so, where should I add the class?

Do you mean a logger class? If so, I think it is a good idea. You can add this class to a new Python script file and then import it into base.
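One possible shape for such a class, as a sketch only. The name `PopulationCheckpointer`, its parameters, and the file naming scheme are assumptions and do not exist in TPOT.

```python
import pickle
import tempfile
from pathlib import Path


class PopulationCheckpointer:
    """Hypothetical helper that saves pipeline strings every `every` generations."""

    def __init__(self, checkpoint_dir, every=1):
        if every < 1:
            raise ValueError("every must be >= 1")
        self.checkpoint_dir = Path(checkpoint_dir)
        self.checkpoint_dir.mkdir(parents=True, exist_ok=True)
        self.every = every

    def maybe_save(self, generation, pipeline_strings):
        """Save if `generation` is a multiple of `every`; return the path or None."""
        if generation % self.every != 0:
            return None
        path = self.checkpoint_dir / f"generation_{generation:05d}.pkl"
        with open(path, "wb") as fh:
            pickle.dump(list(pipeline_strings), fh)
        return path


# Save every second generation into a temporary directory.
checkpointer = PopulationCheckpointer(tempfile.mkdtemp(), every=2)
skipped = checkpointer.maybe_save(1, ["GaussianNB(input_matrix)"])
saved = checkpointer.maybe_save(2, ["GaussianNB(input_matrix)"])
```

Keeping the class free of TPOT imports means it can live in its own module and be called once per generation from the evolutionary loop.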

4. Is it necessary to store all the individuals of the population, or only the most recent ones?

Maybe we can provide both options: a low-memory option that stores only the individuals from the most recent generation, and a default option that stores all individuals across all generations.
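The two options could be sketched with a single flag. The function `write_checkpoint`, the `keep_all` parameter, and the file naming are hypothetical. The low-memory mode here simply deletes older checkpoint files after writing the newest one.

```python
import pickle
import tempfile
from pathlib import Path


def write_checkpoint(checkpoint_dir, generation, pipeline_strings, keep_all=True):
    """Write one generation's pipeline strings; optionally drop older files."""
    checkpoint_dir = Path(checkpoint_dir)
    checkpoint_dir.mkdir(parents=True, exist_ok=True)
    path = checkpoint_dir / f"generation_{generation:05d}.pkl"
    with open(path, "wb") as fh:
        pickle.dump(list(pipeline_strings), fh)
    if not keep_all:
        # Low-memory mode: remove every checkpoint except the one just written.
        for old in checkpoint_dir.glob("generation_*.pkl"):
            if old != path:
                old.unlink()
    return path


# Low-memory mode: only the latest generation survives.
low_memory_dir = Path(tempfile.mkdtemp())
for gen in (1, 2, 3):
    write_checkpoint(low_memory_dir, gen, [f"pipeline_{gen}"], keep_all=False)
remaining = sorted(p.name for p in low_memory_dir.glob("generation_*.pkl"))

# Default mode: every generation is kept.
full_dir = Path(tempfile.mkdtemp())
for gen in (1, 2, 3):
    write_checkpoint(full_dir, gen, [f"pipeline_{gen}"])
kept = sorted(p.name for p in full_dir.glob("generation_*.pkl"))
```

Writing the new file before deleting the old ones means an interruption in between leaves at least one usable checkpoint on disk.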
