Add DataFrame.sparse accessor #25681

Closed
TomAugspurger opened this issue Mar 12, 2019 · 0 comments

@TomAugspurger (Contributor) commented Mar 12, 2019

I'd like to add a .sparse accessor to DataFrame, to assist with deprecating SparseDataFrame.

It'll contain

  • from_spmatrix (part of the SparseDataFrame constructor)
  • to_dense (SparseDataFrame.to_dense)
  • to_coo (SparseDataFrame.to_coo)
  • density
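As a rough illustration of the four members listed above, here is a minimal sketch using the `DataFrame.sparse` accessor as it exists in released pandas (0.25 and later). The example matrix and the column names are made up for illustration, and the behavior shown is that of the released accessor, not necessarily what was settled at the time of this issue.

```python
import numpy as np
import pandas as pd
from scipy import sparse

# A small 4x3 matrix with three non-zero entries (illustrative data only).
mat = sparse.csr_matrix(
    np.array([[0, 1, 0],
              [0, 0, 2],
              [3, 0, 0],
              [0, 0, 0]])
)

# from_spmatrix: build a DataFrame whose columns all hold sparse values.
df = pd.DataFrame.sparse.from_spmatrix(mat, columns=["a", "b", "c"])

# to_dense: the same data with ordinary (dense) columns.
dense = df.sparse.to_dense()

# to_coo: back to a scipy.sparse COO matrix.
coo = df.sparse.to_coo()

# density: fraction of stored (non-fill) values.
density = df.sparse.density
```

Round-tripping through `from_spmatrix` and `to_coo` is the path that replaces the scipy-matrix support of the `SparseDataFrame` constructor.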

A few design questions:

  1. When should the _validate raise?
    a. When there are no sparse columns
    b. When there are any non-sparse columns
    c. Never.

It's slightly easier to implement if we assume everything is sparse.
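To make the three validation options concrete, here is a minimal sketch. The helper `validate_sparse_frame` and its `policy` strings are hypothetical names invented for this illustration; they are not part of pandas. Only `pd.SparseDtype` and `pd.arrays.SparseArray` are real pandas objects.

```python
import pandas as pd


def validate_sparse_frame(df, policy):
    """Hypothetical validator illustrating options (a), (b), and (c).

    policy="no_sparse": raise only when no column is sparse      (option a)
    policy="any_dense": raise when any column is not sparse      (option b)
    policy="never":     never raise                              (option c)
    """
    is_sparse = [isinstance(dtype, pd.SparseDtype) for dtype in df.dtypes]

    if policy == "no_sparse" and not any(is_sparse):
        raise AttributeError("no sparse columns found")
    if policy == "any_dense" and not all(is_sparse):
        raise AttributeError("all columns must be sparse")
    # policy == "never": nothing to check.


# A frame with one sparse and one dense column (illustrative data only).
mixed = pd.DataFrame({
    "s": pd.arrays.SparseArray([0, 0, 1]),
    "d": [1, 2, 3],
})

validate_sparse_frame(mixed, "no_sparse")  # passes: one column is sparse
validate_sparse_frame(mixed, "never")      # passes: never raises
```

Under option (b) the same `mixed` frame would be rejected, which is what lets the other accessor methods assume that every column is sparse.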

  2. Return value of DataFrame.sparse.density. If we mirror SparseDataFrame.density, this returns a float. Would it be more useful to return a Series with the density of each column? (Users can call .mean() if they want the average density.)
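The two candidate return values can be compared with a short sketch. The per-column Series is built by hand here from the existing `Series.sparse.density` attribute; it is an assumption about what a Series-returning variant might look like, not an actual pandas return value. The data is made up for illustration.

```python
import pandas as pd

# Two sparse integer columns; the default fill value for integers is 0.
df = pd.DataFrame({
    "a": pd.arrays.SparseArray([0, 0, 0, 1]),  # 1 of 4 values stored
    "b": pd.arrays.SparseArray([0, 2, 0, 3]),  # 2 of 4 values stored
})

# Per-column variant: one density per column, indexed by column name.
per_column = pd.Series({name: df[name].sparse.density for name in df.columns})

# Scalar variant: a single float, recovered here by averaging the columns.
overall = per_column.mean()
```

The Series carries strictly more information, since the scalar can always be derived from it but not the other way around.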

I believe that with these methods, essentially all the functionality of SparseDataFrame will be replicable with a DataFrame of sparse values. The main exception is an expanding __setitem__ creating a sparse column by default, but it's OK not to provide that functionality.

@TomAugspurger TomAugspurger added this to the 0.25.0 milestone Mar 12, 2019

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Mar 12, 2019

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Mar 12, 2019

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Mar 15, 2019

Squashed commit of the following:
commit 8b136bf
Merge: 3005aed 01d3dc2
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date:   Fri Mar 15 16:03:23 2019 -0500

    Merge remote-tracking branch 'upstream/master' into sparse-frame-accessor

commit 3005aed
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date:   Thu Mar 14 06:26:32 2019 -0500

    isort?

commit 318c06f
Merge: 0922296 79205ea
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date:   Thu Mar 14 06:25:45 2019 -0500

    Merge remote-tracking branch 'upstream/master' into sparse-frame-accessor

commit 0922296
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date:   Wed Mar 13 21:35:51 2019 -0500

    updates

commit f433be8
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date:   Wed Mar 13 20:54:07 2019 -0500

    lint

commit 6696f28
Merge: 534a379 1017382
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date:   Wed Mar 13 20:53:13 2019 -0500

    Merge remote-tracking branch 'upstream/master' into sparse-frame-accessor

commit 534a379
Merge: 94a7baf 5c341dc
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date:   Tue Mar 12 14:37:27 2019 -0500

    Merge remote-tracking branch 'upstream/master' into sparse-frame-accessor

commit 94a7baf
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date:   Tue Mar 12 14:22:48 2019 -0500

    fixups

commit 6f619b5
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date:   Tue Mar 12 13:38:48 2019 -0500

    32-bit compat

commit 24f48c3
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date:   Mon Mar 11 22:05:46 2019 -0500

    API: DataFrame.sparse accessor

    Closes pandas-dev#25681

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Apr 7, 2019

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Apr 11, 2019

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Apr 18, 2019
