Fixed Large memory requirements for SimpleImputer strategy median #4794 #4817
rerun tests
Codecov Report
| Coverage Diff | branch-22.08 | #4817 | +/- |
|---|---|---|---|
| Coverage | 77.62% | 78.30% | +0.67% |
| Files | 180 | 180 | |
| Lines | 11384 | 11526 | +142 |
| Hits | 8837 | 9025 | +188 |
| Misses | 2547 | 2501 | -46 |
Co-authored-by: William Hicks <wphicks@users.noreply.github.com>
rerun tests
@gpucibot merge
…idsai#4794 (rapidsai#4817)

I have implemented a fix for [BUG] Large memory requirements for SimpleImputer strategy median rapidsai#4794. I narrowed the issue down to _masked_column_median. As expected, the extra memory comes from an unnecessary copy of the array in the case where NaN is the masked value. In the other case, where NaN isn't the masked value, the copy is necessary. To fix this, I used in-place sorting.

However, in both cases peak memory usage still goes from 3000 MiB (the size of the original array) to 13000 MiB. From my understanding, sorting should only take up an additional 3000 MiB. Is it possible to reduce memory usage further? Even so, this fix reduces the memory used by over 5000 MiB.

Authors:
- https://github.com/erikrene

Approvers:
- William Hicks (https://github.com/wphicks)

URL: rapidsai#4817
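To illustrate the copy versus in-place trade-off described in this PR, here is a minimal sketch. It uses NumPy on the CPU as a stand-in and is not the actual cuML code: the function name `column_median_nan_masked` and its `copy` parameter are hypothetical, and only the NaN-masked case is covered.

```python
import numpy as np


def column_median_nan_masked(X, copy=True):
    """Per-column median that ignores NaN entries.

    Hypothetical helper for illustration only; this is not cuML's
    _masked_column_median. With copy=False the caller's array is
    sorted in place, which avoids allocating a second full-size array.
    """
    work = X.copy() if copy else X
    # ndarray.sort works in place and orders NaNs last in each column.
    work.sort(axis=0)
    n_valid = (~np.isnan(work)).sum(axis=0)
    medians = np.full(work.shape[1], np.nan)
    for j, n in enumerate(n_valid):
        if n == 0:
            continue  # column is entirely NaN, leave the median as NaN
        lo, hi = (n - 1) // 2, n // 2
        # For odd n, lo == hi; for even n, average the two middle values.
        medians[j] = 0.5 * (work[lo, j] + work[hi, j])
    return medians


X = np.array([[1.0, 4.0],
              [np.nan, 2.0],
              [3.0, np.nan],
              [5.0, 8.0]])
medians = column_median_nan_masked(X, copy=False)  # X is now sorted per column
```

With `copy=False` the input is modified, so this variant is only appropriate when the caller no longer needs the original row order. The temporary boolean mask and any scratch space used by the sort itself still add to peak memory, which may be part of why the measured overhead is larger than one extra array.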