colab notebook runtime disconnects when numdata_points = 20000 #95

Open
ln0v opened this issue Jun 5, 2020 · 1 comment

Comments

@ln0v

ln0v commented Jun 5, 2020

I started from the income classification demo, using the linear regressor, and adapted it to my own dataset to predict insurance payments from about 12 features. My dataset has over 30,000 points. The What-If Tool disconnects the Colab runtime when I set numdata_points to 20000; it works up to about 15,000.
This is not a timeout problem: the code runs for under 5 minutes. Is there a limit on the number of data points that WitConfigBuilder can handle in Colab?
A response would be much appreciated.

@jameswex
Collaborator

jameswex commented Jun 5, 2020

The limit seems to depend on the number of examples, the size of each example, and possibly also on your browser/computer. We're currently doing a deep dive into what causes WIT failures (including colab runtime failures when running in colab), investigating how all of those variables interact. At the end of that, we should be able to provide more documentation/guidance, and possibly also find some fixes we can make to raise the limit and/or provide warnings.

In general, there are failures caused by the colab kernel crashing due to its own memory issues, and also failures caused by the browser tab's allowed memory being exceeded.

Your experience of having issues at around 15k examples matches up with our experience.

For now I would recommend sampling your data to keep the datapoint count around 10k, and we should have more data/information on WIT memory/runtime issues in the upcoming weeks.
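
A minimal sketch of the sampling suggested above, assuming the examples are held in an ordinary Python list (or any iterable) before being handed to the tool. The helper name `sample_examples`, its parameters, and the default of 10,000 are illustrative and not part of WIT. It draws a uniform random subset without replacement, so the reduced set should stay roughly representative of the full dataset, unlike taking the first N rows.

```python
import random


def sample_examples(examples, max_points=10000, seed=0):
    """Return at most max_points examples, drawn uniformly without replacement.

    Hypothetical helper: the name and defaults are assumptions for illustration.
    """
    examples = list(examples)
    if len(examples) <= max_points:
        # Already under the cap, so keep everything in its original order.
        return examples
    # A seeded generator makes the subset reproducible across notebook runs.
    rng = random.Random(seed)
    return rng.sample(examples, max_points)
```

The returned list can then be passed to WitConfigBuilder in place of the full example list. If the examples live in a pandas DataFrame instead, `df.sample(n=10000, random_state=0)` does the equivalent job.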
