WIP: sparse dataset #84

Closed
wants to merge 11 commits into from

@michcio1234 michcio1234 commented Jul 29, 2019

No description provided.

)
if prefixes:
    cols = list(map(lambda x: '{}_{}'.format(column, x), cols))
if column_cat is False:

Hm, @michcio1234, do you think we should add a check here that the column is numerical?

Author


Probably yes 👍
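
A check like the one discussed above could be sketched as follows. This is only an illustration, not the code from this PR: it assumes that `column_cat is False` marks a column to be left untouched during one-hot encoding, and the helper name `check_numeric_columns` is hypothetical. The only library call used is `pandas.api.types.is_numeric_dtype`.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def check_numeric_columns(df, columns):
    """Raise ValueError if any of the given columns is not numerical.

    Hypothetical helper for illustration; it is not part of the PR.
    """
    non_numeric = [c for c in columns if not is_numeric_dtype(df[c])]
    if non_numeric:
        raise ValueError(
            "Columns left untouched must be numerical, got: {}".format(non_numeric)
        )


df = pd.DataFrame({"a": [1, 2], "b": [0.5, 1.5], "c": ["x", "y"]})
check_numeric_columns(df, ["a", "b"])  # passes silently
```

Calling `check_numeric_columns(df, ["c"])` raises a `ValueError`, because column `c` holds strings. Failing early this way would give a clearer message than an error raised later, when the values are written into a sparse matrix.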


codecov bot commented Aug 1, 2019

Codecov Report

Merging #84 into ohe-include-untouched will increase coverage by 0.28%.
The diff coverage is 91.13%.

@@                    Coverage Diff                    @@
##           ohe-include-untouched      #84      +/-   ##
=========================================================
+ Coverage                  89.79%   90.07%   +0.28%     
=========================================================
  Files                          7        7              
  Lines                       1215     1290      +75     
=========================================================
+ Hits                        1091     1162      +71     
- Misses                       124      128       +4
Impacted Files              Coverage Δ
sparsity/dask/core.py       85.86% <90.47%> (+0.55%) ⬆️
sparsity/sparse_frame.py    92.23% <93.75%> (+0.43%) ⬆️
sparsity/dask/io_.py        93.4% <0%> (+1.09%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 3e6adca...06a1907.

And change sf_arange fixture to have values from 1 to 10 rather than
from 0 to 9, because 0 is treated as missing in sparse frames.
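
The reason for the fixture change can be illustrated with a plain SciPy CSR matrix. This is a sketch under the assumption that sparse frames store their data in a CSR-like structure; it does not use the project's own API. A stored-values-only format keeps no entry for a zero, so a zero cannot be told apart from a missing value.

```python
import numpy as np
from scipy import sparse

# One column with values 0..9: the zero is not stored at all.
zero_based = sparse.csr_matrix(np.arange(10).reshape(-1, 1))

# One column with values 1..10: every value is stored.
one_based = sparse.csr_matrix(np.arange(1, 11).reshape(-1, 1))

print(zero_based.nnz)  # 9 stored entries for 10 rows
print(one_based.nnz)   # 10 stored entries for 10 rows
```

With values from 1 to 10, every row of the fixture carries an explicitly stored value, so tests do not depend on how the zero is handled.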
When the same dsf is computed twice, the results may differ.
If an argument is a generator, we will call next() on it before passing
it to a function for each partition.
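
The generator handling described in this commit message can be sketched in plain Python. The names `apply_to_partitions` and `seeds` are hypothetical, and the sketch uses an eager loop rather than a Dask task graph, so it only shows the idea: a generator argument is advanced once per partition, while any other argument is passed through unchanged.

```python
import types


def apply_to_partitions(func, partitions, *args):
    """Call func on every partition, advancing generator arguments once each.

    Hypothetical illustration; the PR applies the idea to Dask partitions.
    """
    results = []
    for part in partitions:
        call_args = [
            next(a) if isinstance(a, types.GeneratorType) else a
            for a in args
        ]
        results.append(func(part, *call_args))
    return results


def seeds():
    """Yield a different value for every partition."""
    seed = 0
    while True:
        yield seed
        seed += 1


parts = [[1, 2], [3, 4], [5, 6]]
out = apply_to_partitions(
    lambda part, seed, scale: (sum(part) * scale, seed), parts, seeds(), 10
)
print(out)  # [(30, 0), (70, 1), (110, 2)]
```

Each partition receives its own seed from the generator, while the constant `10` is shared by all calls.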
@michcio1234 michcio1234 closed this Aug 2, 2019
@michcio1234 michcio1234 changed the title WIP: Leave some columns untouched when one-hot-encoding WIP: sparse dataset Aug 2, 2019
@michcio1234 michcio1234 reopened this Aug 2, 2019
@michcio1234 michcio1234 changed the base branch from master to ohe-include-untouched August 2, 2019 07:23
@michcio1234
Author

Superseded by 4 smaller PRs.

@michcio1234 michcio1234 closed this Aug 6, 2019