This repository was archived by the owner on Feb 2, 2024. It is now read-only.

Conversation

@kozlov-alexey
Contributor

This PR:

  1. Modifies boxing/unboxing of Series and DFs to handle pd.RangeIndex,
  2. Adds fix_df_index to transform values of the index argument of Series and
    DF ctor calls, which fixes the RewriteDataFrame ctor so that it now handles index=None as an argument,
  3. Adds iteration, operators (is, eq, ne) support for RangeIndexType,
  4. Renames and refactors sdc_check_indexes_equal to numpy_like.array_equal,
  5. Adds specializations for RangeIndexType in all Series/DF methods,
    such as operators, getitem, setitem and indexing related functions
    (sdc_join_series_indexes, sdc_reindex_series, etc).
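For readers unfamiliar with pd.RangeIndex, here is a minimal sketch of the pandas behaviour this list refers to. It uses plain pandas and NumPy only, not SDC, so it illustrates the semantics being targeted rather than the compiled implementation; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# With index=None (the default), pandas builds a positional pd.RangeIndex.
s_default = pd.Series([1.0, 2.0, 3.0])
s_explicit = pd.Series([1.0, 2.0, 3.0],
                       index=pd.RangeIndex(start=0, stop=3, step=1))
is_range_index = isinstance(s_default.index, pd.RangeIndex)

# Iterating over a RangeIndex yields its integer labels.
index_values = list(s_default.index)

# == and != on indexes are elementwise and return boolean arrays.
eq_mask = s_default.index == s_explicit.index
ne_mask = s_default.index != pd.RangeIndex(start=1, stop=4, step=1)

# Whole-index equality can be checked with np.array_equal on the values.
indexes_match = np.array_equal(s_default.index.values,
                               s_explicit.index.values)
```

Because a RangeIndex is fully described by start, stop and step, it does not need to materialize its labels, which is why it is worth treating separately from array-backed indexes.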

@pep8speaks

pep8speaks commented May 26, 2020

Hello @kozlov-alexey! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-06-24 12:21:45 UTC

@kozlov-alexey kozlov-alexey force-pushed the feature/range_index_support branch 2 times, most recently from a26b951 to 120f7ac Compare May 27, 2020 18:27
@kozlov-alexey kozlov-alexey force-pushed the feature/range_index_support branch from 120f7ac to 80f92bc Compare May 27, 2020 18:59
@kozlov-alexey
Contributor Author

kozlov-alexey commented May 29, 2020

It seems performance has degraded with these changes; this needs to be fixed before merging. For example, for
operator.add:

with #862:

nthreads type size median min max compile boxing
1 Python 10000000 0.031289 0.03108 0.033158    
1 SDC 10000000 0.188526 0.188433 0.188893 2.371142 0.000554
2 SDC 10000000 0.152458 0.152053 0.153302 2.43293 0.000682
4 SDC 10000000 0.086953 0.086793 0.087348 2.425093 0.000674
8 SDC 10000000 0.053097 0.053085 0.053173 2.430201 0.000825
16 SDC 10000000 0.036783 0.036191 0.037241 2.418686 0.000837
28 SDC 10000000 0.03135 0.031309 0.031361 2.425914 0.000833
56 SDC 10000000 0.029505 0.02948 0.030095 2.413604 0.000696

on master:

nthreads type size median min max compile boxing
1 Python 10000000 0.03126 0.031004 0.031958    
1 SDC 10000000 0.047348 0.047187 0.047402 0.393299 0.00041
2 SDC 10000000 0.033103 0.033065 0.033186 0.446369 0.000514
4 SDC 10000000 0.017455 0.017369 0.017512 0.44836 0.000512
8 SDC 10000000 0.008907 0.008868 0.010109 0.444994 0.000514
16 SDC 10000000 0.005434 0.005059 0.007267 0.443173 0.000626
28 SDC 10000000 0.004042 0.003909 0.005736 0.444903 0.000519
56 SDC 10000000 0.004228 0.004043 0.006067 0.445996 0.000637
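
To put a number on the regression, the SDC medians from the two tables above can be divided row by row. This is only arithmetic on the posted figures, and it assumes the fourth column is the median run time and the first is the thread count, matching the column header used in the later tables of this conversation.

```python
# Median run times copied from the two tables above (fourth column),
# keyed by thread count (first column). Units are as reported by the benchmark.
median_with_862 = {1: 0.188526, 2: 0.152458, 4: 0.086953, 8: 0.053097,
                   16: 0.036783, 28: 0.03135, 56: 0.029505}
median_on_master = {1: 0.047348, 2: 0.033103, 4: 0.017455, 8: 0.008907,
                    16: 0.005434, 28: 0.004042, 56: 0.004228}

# Ratio > 1 means the PR branch is slower than master for that thread count.
slowdown = {n: median_with_862[n] / median_on_master[n]
            for n in median_with_862}

for nthreads, ratio in slowdown.items():
    print(f"{nthreads:>2} threads: {ratio:.2f}x slower than master")
```

The ratio is roughly 4x on one thread and larger at higher thread counts, so the regression affects scaling as well as single-thread time.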

@kozlov-alexey kozlov-alexey force-pushed the feature/range_index_support branch from d3bbd8e to a453d79 Compare June 4, 2020 18:12
@kozlov-alexey
Contributor Author

kozlov-alexey commented Jun 4, 2020

After reverting to a separate implementation for None (positional) indexes, performance with this PR matches master; see below for Series.operator.add:
with this PR:

nthreads type size median min max compile boxing
1 Python 10000000 0.031252 0.031215 0.031294    
1 SDC 10000000 0.049261 0.049189 0.049536 0.4135 0.000481
2 SDC 10000000 0.033909 0.033767 0.033971 0.448139 0.000529
4 SDC 10000000 0.017572 0.017527 0.017576 0.453382 0.000608
8 SDC 10000000 0.00911 0.009097 0.009185 0.446261 0.000554
16 SDC 10000000 0.005 0.00499 0.005736 0.465148 0.000619
28 SDC 10000000 0.003969 0.003954 0.005549 0.454575 0.000649
56 SDC 10000000 0.004146 0.00411 0.004553 0.451319 0.000679

on master (e061965):

nthreads type size median min max compile boxing
1 Python 10000000 0.031025 0.030967 0.031119    
1 SDC 10000000 0.049254 0.049175 0.049279 0.392598 0.000421
2 SDC 10000000 0.034642 0.03444 0.034732 0.450111 0.000578
4 SDC 10000000 0.017848 0.017798 0.018241 0.45714 0.000616
8 SDC 10000000 0.009329 0.009308 0.010389 0.452543 0.000596
16 SDC 10000000 0.005169 0.005132 0.007395 0.448971 0.000621
28 SDC 10000000 0.004017 0.003881 0.005692 0.440998 0.000525
56 SDC 10000000 0.004092 0.004059 0.006165 0.462535 0.000664
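
The idea of keeping a separate implementation for positional indexes can be pictured with the sketch below. It is a pure-Python illustration of the general technique, not SDC's code: add_series is a hypothetical helper, and the general path assumes unique, mutually sortable labels.

```python
import math


def add_series(left, right, left_index=None, right_index=None):
    """Hypothetical helper: add two value lists, aligning them on index labels."""
    if left_index is None and right_index is None and len(left) == len(right):
        # Fast path: both operands are positional and the same length,
        # so elements pair up directly and no label lookup is needed.
        return list(range(len(left))), [a + b for a, b in zip(left, right)]

    # General path: align on the union of labels; labels present on
    # only one side produce NaN, as in pandas.
    left_index = range(len(left)) if left_index is None else left_index
    right_index = range(len(right)) if right_index is None else right_index
    left_map = dict(zip(left_index, left))
    right_map = dict(zip(right_index, right))
    labels = sorted(set(left_map) | set(right_map))
    values = [left_map[k] + right_map[k]
              if k in left_map and k in right_map else math.nan
              for k in labels]
    return labels, values


fast_labels, fast_values = add_series([1, 2, 3], [10, 20, 30])
slow_labels, slow_values = add_series([1, 2], [10, 20],
                                      left_index=["a", "b"],
                                      right_index=["b", "c"])
```

The fast path skips building and probing the label maps, which is the kind of overhead a unified code path would otherwise add to the common no-index case.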

@AlexanderKalistratov AlexanderKalistratov merged commit 4c85598 into IntelPython:master Jun 24, 2020