Improve the performance of instantiating a Series object with dictionary data and a datetimeindex #14894

Closed
nateyoder opened this Issue Dec 16, 2016 · 0 comments

@nateyoder
Contributor

nateyoder commented Dec 16, 2016

The current code path always results in an exception on:

data = lib.fast_multiget(data, index.astype('O'),

which is then caught. Fixing the call gives only a slight performance advantage, but hopefully the change also makes the code less confusing for newcomers like me.
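To illustrate why this kind of code path is confusing, here is a minimal sketch of a fast path that always raises and is silently replaced by a fallback. It is not the pandas implementation: `fast_get`, `get_values`, and the `stats` counter are hypothetical names invented for this example.

```python
def fast_get(mapping, keys):
    # Hypothetical fast path: it only accepts a list of keys.
    if not isinstance(keys, list):
        raise TypeError("fast_get expects a list of keys")
    return [mapping.get(k) for k in keys]


def get_values(mapping, keys, stats):
    # Try the fast path first; on failure, silently use the slow path.
    try:
        values = fast_get(mapping, keys)
        stats["fast"] += 1
    except TypeError:
        values = [mapping.get(k) for k in keys]
        stats["slow"] += 1
    return values


stats = {"fast": 0, "slow": 0}
mapping = {"a": 1, "b": 2}

# Passing a tuple means the fast path raises every time and is never used.
before = get_values(mapping, ("a", "b"), stats)
# Passing the argument in the form the fast path expects avoids the exception.
after = get_values(mapping, ["a", "b"], stats)

print(before, after, stats)
```

Both calls return the same values, so the wasted exception is invisible to callers; only the counter shows that the first call never took the fast path.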

Code Sample, a copy-pastable example if possible

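from datetime import datetime
import pandas as pd
from pandas import Series

# Index with one timestamp every 10 seconds
dr = pd.date_range(
    start=datetime(2015, 10, 26),
    end=datetime(2016, 1, 1),
    freq='10s'
)
# Dictionary keyed by the same timestamps that make up the index
data = {d: v for d, v in zip(dr, range(len(dr)))}
s = Series(data=data, index=dr)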
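For anyone who wants to try this locally, the following is a rough sketch that times the same construction on a shorter, one-day range and checks the result. The helper name `build_from_dict` and the shortened end date are assumptions made for this example; they are not part of the issue or of pandas.

```python
from datetime import datetime
import time

import pandas as pd


def build_from_dict(start, end, freq="10s"):
    # Build the index and a dict keyed by the same timestamps.
    dr = pd.date_range(start=start, end=end, freq=freq)
    data = {d: v for d, v in zip(dr, range(len(dr)))}
    # Time only the Series construction, not the setup.
    t0 = time.perf_counter()
    s = pd.Series(data=data, index=dr)
    elapsed = time.perf_counter() - t0
    return s, dr, elapsed


# One day instead of the full range, to keep the run short.
s, dr, elapsed = build_from_dict(datetime(2015, 10, 26), datetime(2015, 10, 27))
print(len(s), f"{elapsed:.4f}s")
```

A single wall-clock measurement like this is noisy; the ASV benchmark is the more reliable comparison.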


ASV output of new benchmark
Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For pandas commit hash 5f05fdc:
[ 0.00%] ·· Building for conda-py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt.................................
[ 0.00%] ·· Benchmarking conda-py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 50.00%] ··· Running ...x.time_series_constructor_no_data_datetime_index 3.26s
[ 50.00%] · For pandas commit hash 3ba2cff:
[ 50.00%] ·· Building for conda-py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt...
[ 50.00%] ·· Benchmarking conda-py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[100.00%] ··· Running ...x.time_series_constructor_no_data_datetime_index 3.77s
    before     after       ratio
  [3ba2cff]  [5f05fdc]
-    3.77s      3.26s      0.87  series_methods.series_constructor_dict_data_datetime_index.time_series_constructor_no_data_datetime_index
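The benchmark source is not shown in this issue. As a hedged sketch, an ASV-style benchmark with the class and method names from the output above might look like the following; the body is an assumption based on the code sample in this issue, and the `start`, `end`, and `freq` class attributes are hypothetical conveniences.

```python
from datetime import datetime

import pandas as pd


class series_constructor_dict_data_datetime_index:
    # Assumed parameters, taken from the code sample in the issue.
    start = datetime(2015, 10, 26)
    end = datetime(2016, 1, 1)
    freq = "10s"

    def setup(self):
        # ASV calls setup() before timing, so this cost is excluded.
        self.dr = pd.date_range(start=self.start, end=self.end, freq=self.freq)
        self.data = {d: v for d, v in zip(self.dr, range(len(self.dr)))}

    def time_series_constructor_no_data_datetime_index(self):
        # ASV times methods whose names start with `time_`.
        pd.Series(data=self.data, index=self.dr)
```

Because ASV benchmarks are plain classes, the method can also be called by hand after `setup()` for a quick check outside of ASV.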
    

@jreback jreback added this to the 0.19.2 milestone Dec 16, 2016

@jreback jreback closed this in e503d40 Dec 17, 2016

ischurov added a commit to ischurov/pandas that referenced this issue Dec 19, 2016

Clean up construction of Series with dictionary and datetime index
closes #14894
Fix usage of fast_multiget with index which was always throwing an
exception that was then caught; add ASV that show slight improvement

Author: Nate Yoder <nate@whistle.com>

Closes #14895 from nateyoder/series_dict_index and squashes the following commits:

56be091 [Nate Yoder] Update whatsnew and fix pep8 issue
5f05fdc [Nate Yoder] Fix usage of fast_multiget with index which was always throwing an exception that was then caught; add ASV that show slight improvement

jorisvandenbossche added a commit to jorisvandenbossche/pandas that referenced this issue Dec 24, 2016


(cherry picked from commit e503d40)

ShaharBental added a commit to ShaharBental/pandas that referenced this issue Dec 26, 2016
