Set-oriented Operations in Pandas

Pandas Sets: Set-oriented Operations in Pandas

If you store standard Python sets in your Series or DataFrame objects, you'll find this useful.

The pandas_sets package adds a .set accessor to any pandas Series object; it's like .dt for datetimes or .str for strings, but for sets.

It exposes all public methods available on the standard Python set type.
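To give a feel for how an accessor like this can work, here is a minimal sketch built on pandas' public `register_series_accessor` extension hook. The accessor name `toyset` and the class `ToySetAccessor` are made up for illustration; this is not the pandas_sets implementation, only one way element-wise forwarding to set methods could be done.

```python
import pandas as pd


# Hypothetical accessor name "toyset", chosen so it cannot clash with .set.
@pd.api.extensions.register_series_accessor("toyset")
class ToySetAccessor:
    def __init__(self, series):
        self._series = series

    def __getattr__(self, name):
        # Forward only public methods that the built-in set type defines.
        if name.startswith("_") or not hasattr(set, name):
            raise AttributeError(name)

        def method(*args, **kwargs):
            # Call the set method on each element and collect the results
            # into a new Series with the same index.
            return self._series.map(lambda s: getattr(s, name)(*args, **kwargs))

        return method


tags = pd.Series([{"python", "pandas"}, {"philosophy"}])
is_sub = tags.toyset.issubset({"python", "pandas", "numpy"})
merged = tags.toyset.union({"data"})
```

Here `is_sub` is a boolean Series and `merged` is a Series of new sets. The sketch assumes every element is a set; missing values are not handled.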


pip install pandas-sets

Just import the pandas_sets package and it will register a .set accessor to any Series object.

import pandas_sets


import pandas_sets
import pandas as pd
df = pd.DataFrame({'post': [1, 2, 3, 4],
                   'tags': [{'python', 'pandas'}, {'philosophy', 'strategy'}, {'scikit-learn'}, {'pandas'}]})
# Keep only the posts whose tag set contains 'pandas'
pandas_posts = df[df.tags.set.contains('pandas')]
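For comparison, the same filter can be written with plain pandas, without the accessor. This sketch assumes that .set.contains performs an element-wise membership test, which is how the example above reads; the variable name `mask` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "post": [1, 2, 3, 4],
    "tags": [{"python", "pandas"}, {"philosophy", "strategy"}, {"scikit-learn"}, {"pandas"}],
})

# Build a boolean mask by testing membership in each row's set.
mask = df["tags"].map(lambda tags: "pandas" in tags)
pandas_posts = df[mask]
```

With this data, the filter keeps posts 1 and 4.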


pandas_posts.tags.set.update({'data', 'analysis'})



  • The implementation is primitive for now. It's based heavily on pandas' core StringMethods implementation.
  • The public API has been tested for most expected scenarios.
  • The API will need to be extended to handle NA values appropriately.
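As an illustration of the NA point, here is one possible way to keep missing values from raising during a membership test, written with plain pandas. The helper `contains_or_na` is hypothetical and is not part of pandas_sets; propagating `pd.NA` for non-set elements is an assumption about what "appropriate" handling might mean.

```python
import pandas as pd

tags = pd.Series([{"python", "pandas"}, None, {"pandas"}])


def contains_or_na(value, item):
    # Hypothetical helper: test membership for real sets and
    # propagate a missing marker for anything else.
    if isinstance(value, (set, frozenset)):
        return item in value
    return pd.NA


result = tags.map(lambda v: contains_or_na(v, "pandas"))

# When filtering, treat a missing result as "no match".
mask = result.map(lambda v: False if pd.isna(v) else bool(v))
```

Keeping `result` and `mask` separate lets the caller decide whether a missing value should count as a non-match or stay missing.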