saving on memory can cost both memory and performance #9073

Closed
behzadnouri opened this issue Dec 13, 2014 · 2 comments
Labels: Performance (Memory or execution speed performance)

Comments

@behzadnouri
Contributor

xref: #8676 (comment)

memory cost:

with int16 labels:

$ python -m memory_profiler mem-profile.py 
Filename: mem-profile.py

Line #    Mem usage    Increment   Line Contents
================================================
     4   80.156 MiB    0.000 MiB   @profile
     5                             def ix(obj):
     6   87.809 MiB    7.652 MiB       obj.ix[999]

with int64 labels:

$ python -m memory_profiler mem-profile.py 
Filename: mem-profile.py

Line #    Mem usage    Increment   Line Contents
================================================
     4   79.387 MiB    0.000 MiB   @profile
     5                             def ix(obj):
     6   79.387 MiB    0.000 MiB       obj.ix[999]
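
The 7.652 MiB increment in the int16 run is close to the size of a 1,000,000-element int64 array (8,000,000 bytes, about 7.63 MiB). One plausible reading, which this issue does not confirm, is that the lookup upcasts the compact int16 labels to int64 on demand. The sketch below only illustrates that arithmetic with NumPy. It does not call into pandas internals, and the variable names are illustrative.

```python
import numpy as np

n = 1000 * 1000

# Compact labels: values 0..999 fit comfortably in int16.
labels16 = (np.arange(n) % 1000).astype(np.int16)

# Upcasting allocates a brand-new int64 buffer (assumed to mirror what a
# lookup needing int64 labels would have to do; not confirmed by the issue).
labels64 = labels16.astype(np.int64)

print("int16 labels: %.3f MiB" % (labels16.nbytes / 2**20))
print("int64 copy:   %.3f MiB" % (labels64.nbytes / 2**20))
```

If that reading is right, the 6 MB saved by storing int16 labels is paid back, and more, the first time an int64 copy is materialised.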

performance cost:

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
frame_xs_mi_ix                               |   8.5303 |   0.6206 |  13.7452 |
series_xs_mi_ix                              |   8.0659 |   0.5600 |  14.4023 |
-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------

Ratio < 1.0 means the target commit is faster than the baseline.
Seed used: 1234

Target [c11e75c] : PERF: set multiindex labels with coerced dtype (GH8456)
Base   [6bbb39e] : Merge pull request #8675 from pydata/setitem

the mem-profile.py used for memory profiling:

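# Run with: python -m memory_profiler mem-profile.py
# (memory_profiler injects the @profile decorator, so no import is needed.)
@profile
def ix(obj):
    # Partial-key lookup on the first level of the MultiIndex.
    obj.ix[999]

if __name__ == '__main__':
    import numpy as np
    from pandas import MultiIndex, Series
    # Two-level index, 1000 x 1000 = 1,000,000 rows.
    mi = MultiIndex.from_tuples([(x, y) for x in range(1000) for y in range(1000)])
    ts = Series(np.random.randn(1000000), index=mi)
    ix(ts)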

behzadnouri changed the title from "saving on memory can costs both memory and performance" to "saving on memory can cost both memory and performance" on Dec 13, 2014
jreback added the Performance (Memory or execution speed performance) label on Dec 13, 2014

@jreback
Contributor

jreback commented Dec 13, 2014

Please include other performance numbers for actual 'real' indexing. Getting a single scalar is useful, but other benchmarks are as well.
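
As a rough sketch of what such additional benchmarks might look like, the snippet below times a few common MultiIndex lookups on the same 1000 x 1000 series. The choice of cases is an assumption, not something specified in this thread. It uses `.loc` rather than `.ix` so that it runs on current pandas, and `MultiIndex.from_product` as a faster way to build the same index.

```python
import timeit

import numpy as np
import pandas as pd

# Same shape as the series in mem-profile.py: 1,000,000 rows, two levels.
mi = pd.MultiIndex.from_product([range(1000), range(1000)])
ts = pd.Series(np.random.randn(len(mi)), index=mi)

# Hypothetical benchmark cases; the names are illustrative only.
cases = {
    "partial key": lambda: ts.loc[999],
    "full key": lambda: ts.loc[(999, 999)],
    "level-0 slice": lambda: ts.loc[10:20],
    "level-0 list": lambda: ts.loc[[1, 5, 9]],
}

timings = {}
for name, fn in cases.items():
    fn()  # warm-up call, so one-time index setup is not included in the timing
    timings[name] = timeit.timeit(fn, number=5) / 5 * 1e3  # mean ms per call
    print("%-15s %10.4f ms" % (name, timings[name]))
```

Timing the first call separately from the warmed-up calls would also show whether the cost is a one-time setup or is paid on every lookup.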

@mroeschke
Member

This looks related to ix which is deprecated.
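
For anyone wanting to re-check this on a pandas version where `.ix` is no longer available, a minimal sketch of the same lookup using `.loc` follows. It swaps `memory_profiler` for the standard-library `tracemalloc` module, which is a different measurement technique from the one used in the original report, so the numbers are not directly comparable. It also assumes a pandas version that exposes `MultiIndex.codes` (the later name for labels) and a NumPy version that reports its buffers to `tracemalloc`.

```python
import tracemalloc

import numpy as np
import pandas as pd

# Same data as mem-profile.py, built with from_product for speed.
mi = pd.MultiIndex.from_product([range(1000), range(1000)])
ts = pd.Series(np.random.randn(len(mi)), index=mi)

tracemalloc.start()
before, _ = tracemalloc.get_traced_memory()
result = ts.loc[999]  # .loc stands in for the original obj.ix[999]
after, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print("level-0 codes dtype:", mi.codes[0].dtype)
print("retained after lookup: %.3f MiB" % ((after - before) / 2**20))
print("peak during lookup:    %.3f MiB" % ((peak - before) / 2**20))
```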
