ValueError: Found array with 0 sample(s) (shape=(0, 174)) while a minimum of 1 is required. #79
Comments
Hi @torivor, I can't say for sure without looking at the dataset, but have a look at my answer in #65 and then the thread in #68. I won't go into the details since they're already described in those threads, but in short, this error can occur when the missing values are spread across the columns in such a way that no row is fully observed, even if no single column is completely null. Right now this is expected behavior given the algorithm design, although the error is not handled well. In #68 you'll see a better solution, one in which we use placeholders instead of listwise deletion. This is the right way to go; I just haven't written the code for it yet. In the meantime, I'd play with your feature set, maybe reducing the column space or sampling more rows if possible. |
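To make the failure mode described above concrete, here is a minimal sketch using a hypothetical toy DataFrame (the column names and values are made up and are not taken from the reporter's dataset). It assumes, following the explanation above, that rows are filtered by listwise deletion before fitting; pandas' `dropna` stands in for that step and is not the package's actual code.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: every column has observed values,
# yet every row is missing at least one value.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, 2.0, 3.0],
    "c": [1.0, 2.0, np.nan],
})

# Columns that are entirely null (there are none in this example).
all_null_columns = [col for col in df.columns if df[col].isna().all()]

# Listwise deletion (complete-case analysis) keeps only fully observed rows.
complete_cases = df.dropna()

print(all_null_columns)      # []
print(complete_cases.shape)  # (0, 3)
```

An estimator fitted on an empty frame like `complete_cases` has no samples to work with, which is consistent with the "0 sample(s)" message in the issue title.
|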
Hello Kearney,
I beg your pardon for the late reply. The dataset that I mentioned before
in the GitHub issue can be found at the following link:
https://1drv.ms/u/s!AuNCY1udObSUgeVoN1EVZLGPvMGLrg?e=z4XdYb
Should you have any alternative solutions regarding the issue, please let
me know. Thank you for the response!
Sincerely,
Andreas Parasian
(In reply to Joe Kearney's message of Wed, Aug 3, 2022 at 1:38 PM, quoted in full above.)
|
Correction, please use this link to download the CSV file:
https://onedrive.live.com/download?resid=94B4399D5B6342E3!29416&authkey=!ADdRFWSxj7zBi64
The link I sent earlier only shows a preview of the file without any option to
download it (since it's larger than 25 megabytes).
(In reply to Andreas Parasian's message of Thu, Aug 4, 2022 at 12:34 AM, quoted in full above.)
|
Apologies for the late reply. I looked into it more, and this is caused by the issues I linked to. Again, see #65 and #68. The recommended solution for now is to experiment with your data. As a start, try using fewer columns for the imputation instead of all the features. I'm planning to move from complete-case analysis to mean placeholders at some point soon, but I don't have a timeline for that yet. |
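One way to approach the "fewer columns" suggestion above is to check how many complete rows a candidate column subset leaves before passing it to the imputer. The sketch below is only an illustration in plain pandas: `complete_case_count` and `drop_sparsest_until_usable` are hypothetical helpers, not part of the package, and the greedy rule (drop the column with the most missing values first) is an assumption rather than the maintainer's recommendation.

```python
import numpy as np
import pandas as pd


def complete_case_count(df, columns):
    """Return the number of rows with no missing value in `columns`."""
    return int(df[columns].notna().all(axis=1).sum())


def drop_sparsest_until_usable(df, min_rows=1):
    """Hypothetical greedy heuristic: repeatedly drop the column with the
    most missing values until at least `min_rows` complete rows remain."""
    columns = list(df.columns)
    while columns and complete_case_count(df, columns) < min_rows:
        worst = df[columns].isna().sum().idxmax()
        columns.remove(worst)
    return columns


# Hypothetical toy data: no complete rows when all columns are used.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, 4.0],
    "b": [1.0, 2.0, np.nan, 4.0],
    "c": [np.nan, np.nan, 3.0, np.nan],
})

kept = drop_sparsest_until_usable(df)
print(complete_case_count(df, list(df.columns)))  # 0
print(kept)                                       # ['a', 'b']
print(complete_case_count(df, kept))              # 2
```

Columns dropped this way are simply left out of the imputation, so this trades coverage for a non-empty set of complete cases; whether that trade is acceptable depends on the dataset.
|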
Hello, this package has been a lifesaver for my imputation needs. However, I recently encountered an error that I can't seem to solve: "ValueError: Found array with 0 sample(s) (shape=(0, 174)) while a minimum of 1 is required." I encountered it while calling fit_transform on my pandas DataFrame. Every column of the DataFrame has some values; none of them is completely filled with missing values, yet the package still throws this error, so I don't know what triggered it. I assume this is due to a bug within the package.
I can share the data, as it's publicly available from a Kaggle competition, but the file is too big. Should anyone need it, I can email it; just send me a request at andreasparasian@gmail.com.