TPOTClassifier error for large data #1332

Open
kiranellur opened this issue Nov 30, 2023 · 1 comment

Comments

kiranellur commented Nov 30, 2023

I am getting the following error:

RuntimeError: There was an error in the TPOT optimization process. This could be because the data was not formatted properly, or because data for a regression problem was provided to the TPOTClassifier object. Please make sure you passed the data to TPOT correctly.

The current best internal CV score is -inf, even though the optimization progress bar is displaying 75%.

TPOT works for smaller datasets, but I get this error for datasets with 200000 rows and 20 columns. I am currently using TPOT version 12.0. Is there any specific reason I am getting this?

Can you please help me resolve this error? Thank you.

Contributor

perib commented Nov 30, 2023

I would recommend trying out TPOT2, the next version of TPOT. You can find it here: https://github.com/EpistasisLab/tpot2
This version is more stable with larger datasets compared to TPOT1. There is also a memory_limit parameter that you can use to set the maximum amount of RAM a single pipeline can take up.
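As a rough sketch of what using memory_limit might look like: the snippet below only collects keyword arguments and imports tpot2 lazily, so it can be read and run without the package installed. The value format "4GB", the helper names, and the assumption that tpot2.TPOTClassifier accepts these keywords directly are illustrative guesses, so please check them against the TPOT2 documentation for your installed version.

```python
def tpot2_kwargs(memory_limit="4GB", n_jobs=1):
    """Collect settings for a memory-conscious TPOT2 run.

    Hypothetical helper: the "4GB" string format is an assumption,
    not something confirmed in this thread.
    """
    return {
        "memory_limit": memory_limit,  # cap on RAM per pipeline, as described above
        "n_jobs": n_jobs,              # fewer parallel workers means fewer pipelines in RAM at once
    }


def make_tpot2_classifier(**overrides):
    """Build a TPOT2 classifier; requires tpot2 to be installed."""
    import tpot2  # imported lazily so the rest of the sketch runs without it

    kwargs = tpot2_kwargs()
    kwargs.update(overrides)
    return tpot2.TPOTClassifier(**kwargs)


tpot2_settings = tpot2_kwargs()
print(tpot2_settings)
```

With tpot2 installed, something like make_tpot2_classifier(memory_limit="2GB").fit(X_train, y_train) would be the intended usage, where X_train and y_train are your own data.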

For TPOT1:
Perhaps it is simply running out of RAM and crashing?

Some suggestions:
You could try to reduce RAM usage by lowering n_jobs.
You could try editing the configuration dictionary to use smaller/faster models.
Another possibility is that fitting a pipeline is taking too long and timing out. You can increase the timeout by setting the max_eval_time_mins parameter.
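The three TPOT1 suggestions above could be combined roughly as follows. This is a minimal sketch, not a tested fix for this issue: the specific values (one job, 15 minutes) and the helper names are illustrative assumptions, and 'TPOT light' is TPOT's built-in configuration of simpler, faster operators, used here as one way to get smaller models without hand-editing a dictionary. TPOT is imported lazily so the settings can be inspected without it installed.

```python
def tpot1_low_memory_kwargs(n_jobs=1, max_eval_time_mins=15, config_dict="TPOT light"):
    """Collect TPOTClassifier settings that follow the suggestions above.

    Hypothetical helper; the default values here are illustrative only.
    """
    return {
        "n_jobs": n_jobs,                          # fewer parallel workers, lower peak RAM
        "max_eval_time_mins": max_eval_time_mins,  # more time before a pipeline evaluation times out
        "config_dict": config_dict,                # restrict the search to smaller/faster models
        "verbosity": 2,                            # show progress while the search runs
    }


def make_tpot1_classifier(**overrides):
    """Build a TPOT1 classifier; requires the tpot package to be installed."""
    from tpot import TPOTClassifier  # imported lazily so the rest of the sketch runs without it

    kwargs = tpot1_low_memory_kwargs()
    kwargs.update(overrides)
    return TPOTClassifier(**kwargs)


tpot1_settings = tpot1_low_memory_kwargs()
print(tpot1_settings)
```

With TPOT installed, make_tpot1_classifier(generations=5, population_size=20).fit(X_train, y_train) would start a smaller search, where X_train and y_train are your own data. If the error disappears with these settings, that points to memory or timeouts rather than the data format.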
