Unresponsive on "Quick Search" stage with simple dataset #187

Open
garrettjoecox opened this issue Nov 13, 2021 · 5 comments
@garrettjoecox

Hey there, I have a dataset that I've stripped down to be pretty bare while trying to get this library working:

```
df.dtypes
TXNS               int64
VOLUME           float64
ANNUAL_VOLUME    float64
```
The dataframe has 350,000 rows. I figured maybe the size was causing it to be slow, but it's been sitting like this for about 15 minutes now with "kernel busy":
[Screenshot: notebook stuck on the "Quick Search" stage with the kernel busy]
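For reference, the call being described is presumably along these lines (a minimal sketch, assuming the standard blobcity alias, DataFrame input via the df parameter, and a hypothetical file and target column; the issue doesn't name either):

```python
import blobcity as bc
import pandas as pd

df = pd.read_csv("transactions.csv")  # hypothetical file name

# ANNUAL_VOLUME as the target is an assumption -- the issue does not
# say which of the three columns is being predicted.
model = bc.train(df=df, target="ANNUAL_VOLUME", features=["TXNS", "VOLUME"])
```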

I'm sort of new to this tech, so I'm not even sure how I would go about further debugging. Any ideas?

@Hetarth02 commented Nov 13, 2021

Can you try it as mentioned in the repo? I think the "features" part should not be in the same code line. Sample
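Presumably meaning something like the following (a sketch; splitting the feature list onto its own line shouldn't change behavior, but it rules out a syntax issue):

```python
# Define the feature list as its own variable, then pass it to train()
features = ["TXNS", "VOLUME"]
model = bc.train(file="data.csv", target="Y_value", features=features)
```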

@sanketsarang (Contributor)

@garrettjoecox, considering the size of the data, you might actually need GPU acceleration. To ensure that it is not stuck, can you try with just the first 1000 rows of the DataFrame?
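A quick way to do that (sketch; assumes DataFrame input via the df parameter and reuses the hypothetical column names from above):

```python
# Keep only the first 1000 rows to distinguish "stuck" from "slow"
sample = df.head(1000)
model = bc.train(df=sample, target="ANNUAL_VOLUME", features=["TXNS", "VOLUME"])
```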

@garrettjoecox (Author)

> Can you try it as mentioned in the repo? I think the "features" part should not be in the same code line. Sample

According to the docs this is valid, as I only want to train on those two columns:

> There might be scenarios where you want to explicitly exclude some columns, or only use a subset of columns in the training. Manually specify the features to be used. AutoAI will still perform a feature selection within the list of features provided to improve effective model accuracy.
>
> model = bc.train(file="data.csv", target="Y_value", features=["col1", "col2", "col3"])

However, since I have already stripped everything else from my dataset, those are the only columns remaining and I don't need to specify them. So I tried removing the features argument and got the same result.
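In other words, both forms were tried (sketch; file and target names are placeholders):

```python
# With the feature columns listed explicitly
model = bc.train(file="data.csv", target="Y_value", features=["TXNS", "VOLUME"])

# And without -- AutoAI should then use all remaining columns as features
model = bc.train(file="data.csv", target="Y_value")
```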

@garrettjoecox (Author) commented Nov 13, 2021

> @garrettjoecox, considering the size of the data, you might actually need GPU acceleration. To ensure that it is not stuck, can you try with just the first 1000 rows of the DataFrame?

I tried with less data (1000, 100, 50, and 10 rows) and it seems to have gotten a bit further, but the kernel dies every time around 30-40% of the way through this step:

[Screenshots: notebook output at the step where the kernel dies]

@sanketsarang (Contributor)

Thank you for the update @garrettjoecox. My best guess is that it is crashing on one of the models, and the problem is most likely data-specific. Is it possible for you to email the first 1000 rows of the data to support@blobcity.com? Trying it on your data will allow us to diagnose it better.
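One way to produce that sample (sketch; the output file name is arbitrary):

```python
# Export the first 1000 rows to share for debugging
df.head(1000).to_csv("sample_1000.csv", index=False)
```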
