Add NdPPoly to cupyx.scipy.interpolate #7357

Merged
merged 11 commits into from
Apr 3, 2023
Conversation

andfoy
Contributor

@andfoy andfoy commented Jan 30, 2023

See #7186

@andfoy andfoy mentioned this pull request Jan 30, 2023
51 tasks
@kmaehashi kmaehashi added cat:feature New features/APIs prio:medium labels Jan 31, 2023
@andfoy andfoy force-pushed the add_ndppoly branch 3 times, most recently from 68c7db0 to 09c3536 Compare February 7, 2023 18:48
@andfoy andfoy marked this pull request as ready for review February 10, 2023 20:35
@takagi
Member

takagi commented Feb 13, 2023

/test mini

@andfoy
Contributor Author

andfoy commented Feb 13, 2023

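These are the benchmark results comparing the current NdPPoly implementation (as of ceb1778) against the SciPy one. For larger inputs, and depending on the number of dimensions and the extrapolation mode, the CuPy implementation runs from about 5x up to ~40x faster than SciPy (over 50x in the best case). For small inputs SciPy is faster, since the CuPy timings stay at roughly 1 to 2 ms there regardless of input size.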
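
For readers unfamiliar with the class being timed, the following is a minimal pure-Python sketch of what a call on a two-dimensional piecewise polynomial computes. It assumes the coefficient layout documented for SciPy's `NdPPoly` (index 0 along each order axis holds the highest power, and powers are taken relative to the left breakpoint of the containing interval). The names `find_interval` and `eval_ndppoly_2d` are illustrative only; this is not the CuPy kernel, and it clamps out-of-range points to the end intervals, which corresponds to `extrapolate=True`.

```python
from bisect import bisect_right


def find_interval(breaks, t):
    # Index i such that breaks[i] <= t < breaks[i + 1], clamped to the first
    # or last interval so that out-of-range points are extrapolated.
    i = bisect_right(breaks, t) - 1
    return min(max(i, 0), len(breaks) - 2)


def eval_ndppoly_2d(c, x, point):
    # c[a][b][i][j] is the coefficient for interval (i, j); a and b run from
    # the highest power (index 0) down to the constant term.
    bx, by = x
    px, py = point
    i = find_interval(bx, px)
    j = find_interval(by, py)
    kx = len(c) - 1      # polynomial degree along the first axis
    ky = len(c[0]) - 1   # polynomial degree along the second axis
    dx = px - bx[i]
    dy = py - by[j]
    total = 0.0
    for a in range(kx + 1):
        for b in range(ky + 1):
            total += c[a][b][i][j] * dx ** (kx - a) * dy ** (ky - b)
    return total


# One interval per axis on [0, 1] x [0, 1], degree 1 in each variable.
# The coefficients encode p(x, y) = 2*x*y + 1.
coeffs = [[[[2.0]], [[0.0]]], [[[0.0]], [[1.0]]]]
breaks = ([0.0, 1.0], [0.0, 1.0])
value = eval_ndppoly_2d(coeffs, breaks, (0.5, 0.5))  # 2 * 0.5 * 0.5 + 1 = 1.5
```

The benchmark below evaluates this kind of sum for every query point, which is why the work grows with both the number of points and the number of dimensions.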

Benchmark script
from itertools import product

import cupy as cp
import numpy as np
from cupy import testing
from cupyx.profiler import benchmark

from cupyx.scipy.interpolate import NdPPoly as CuNdPPoly
from scipy.interpolate import NdPPoly as CPUNdPPoly

from tqdm import tqdm


input_length = [5, 10, 100, 1000, 5000, 10000, 50000, 100000, 500000]
dims = [1, 2, 3, 4]
input_sizes = sorted(list(product(input_length, dims)), key=lambda x: x[1])

modules = [(cp, CuNdPPoly, 'CuPy'), (np, CPUNdPPoly, 'SciPy')]
kwargs = [('extrapolate', [True, False, 'periodic'])]
# kwargs = [('extrapolate', [True])]
kwargs = list(product(*[list(product([x[0]], x[1])) for x in kwargs]))
measurement = {}


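def gather_time(prof):
    # cupyx.profiler.benchmark reports times in seconds; convert the means
    # to milliseconds and keep the larger of the CPU and GPU figures.
    cpu_time = prof.cpu_times.mean() * 1000
    gpu_time = prof.gpu_times.mean() * 1000
    return max(cpu_time, gpu_time)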


def input_creation(xp, cls, size, **kwargs):
    length, dims = size
    orders = [i + 1 for i in range(dims)]
    dim_elements = [i + len(orders) for i in range(dims)]
    c = testing.shaped_random(
        tuple(orders) + tuple(dim_elements), xp)
    x = [xp.linspace(0, 1, order + 1) ** (i + 1)
         for i, order in enumerate(dim_elements)]
    b = cls(c, x, **kwargs)
    x = tuple([testing.shaped_random((length,), xp) for _ in range(dims)])
    return b, x


def closure(b, x):
    return b(x)


for kwarg_comb in kwargs:
    kwarg_measurement = measurement.get(kwarg_comb, {})
    measurement[kwarg_comb] = kwarg_measurement
    kwarg_comb = dict(kwarg_comb)
    for size in tqdm(input_sizes):
        size_measurement = kwarg_measurement.get(size, {})
        kwarg_measurement[size] = size_measurement
        for xp, cls, _id in modules:
            b, x = input_creation(xp, cls, size, **kwarg_comb)
            prof = benchmark(closure, (b, x), n_repeat=100)
            size_measurement[_id] = gather_time(prof)


lines = []
for kwarg_comb in kwargs:
    kwarg_line = ', '.join([f'{x}={y}' for x, y in kwarg_comb])
    lines.append(f'### `{kwarg_line}`\n')
    lines.append('| (Size, Dimensions) | CuPy (ms) | SciPy (ms) | Speedup |')
    lines.append('|:------------------:|:---------:|:----------:|:-------:|')
    kwarg_measurement = measurement[kwarg_comb]
    for size in input_sizes:
        comp = f'{size}'
        size_measurement = kwarg_measurement[size]
        times = []
        for _, _, _id in modules:
            time = size_measurement[_id]
            times.append(time)
            time_info = f'{time:3f}'
            comp = f'{comp} | {time_info}'
        # Speedup = SciPy time / CuPy time; values above 1 mean CuPy is faster.
        speedup = times[1] / times[0]
        comp = f'{comp} | {speedup:3f}'
        lines.append(f'| {comp} |')
    lines.append('\n')


print('\n'.join(lines))
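
The nested `product` expression that builds `kwargs` in the script above is compact but easy to misread. The sketch below reproduces the same expansion in isolation, using only the standard library. The helper name `expand_options` and the second option `hypothetical_flag` are invented for illustration and are not part of the benchmark or of `NdPPoly`.

```python
from itertools import product


def expand_options(options):
    # Pair each option name with each of its values, then take the Cartesian
    # product across options, as the benchmark script does.
    per_option = [list(product([name], values)) for name, values in options]
    return list(product(*per_option))


# 'hypothetical_flag' is an invented second option used only for illustration.
options = [
    ('extrapolate', [True, False, 'periodic']),
    ('hypothetical_flag', [0, 1]),
]
combos = expand_options(options)

# Each combination is a hashable tuple of (name, value) pairs, so it can be a
# dictionary key and can also be turned into keyword arguments with dict().
first_as_kwargs = dict(combos[0])
```

With the single `extrapolate` option used in the script, the same expansion yields three one-element combinations, one per results section below.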

extrapolate=True

| (Size, Dimensions) | CuPy (ms) | SciPy (ms) | Speedup |
|:------------------:|:---------:|:----------:|:-------:|
| (5, 1) | 1.061472 | 0.016712 | 0.015744 |
| (10, 1) | 1.073111 | 0.017012 | 0.015853 |
| (100, 1) | 1.098182 | 0.025943 | 0.023624 |
| (1000, 1) | 1.059656 | 0.115920 | 0.109394 |
| (5000, 1) | 1.077135 | 0.508636 | 0.472212 |
| (10000, 1) | 1.079761 | 1.001260 | 0.927298 |
| (50000, 1) | 1.171351 | 4.959986 | 4.234416 |
| (100000, 1) | 1.087441 | 9.837268 | 9.046252 |
| (500000, 1) | 1.182588 | 49.092133 | 41.512451 |
| (5, 2) | 1.209332 | 0.025440 | 0.021036 |
| (10, 2) | 1.219569 | 0.026109 | 0.021408 |
| (100, 2) | 1.196465 | 0.049667 | 0.041511 |
| (1000, 2) | 1.224257 | 0.162295 | 0.132566 |
| (5000, 2) | 1.218356 | 0.707828 | 0.580970 |
| (10000, 2) | 1.234574 | 1.384079 | 1.121098 |
| (50000, 2) | 1.226536 | 6.816401 | 5.557441 |
| (100000, 2) | 1.257903 | 13.580752 | 10.796345 |
| (500000, 2) | 1.495180 | 67.964886 | 45.455984 |
| (5, 3) | 1.240464 | 0.026539 | 0.021395 |
| (10, 3) | 1.227077 | 0.027872 | 0.022714 |
| (100, 3) | 1.257783 | 0.047134 | 0.037474 |
| (1000, 3) | 1.248640 | 0.237099 | 0.189886 |
| (5000, 3) | 1.227908 | 1.077741 | 0.877705 |
| (10000, 3) | 1.262046 | 2.120201 | 1.679972 |
| (50000, 3) | 1.413180 | 10.530038 | 7.451310 |
| (100000, 3) | 1.526846 | 20.949679 | 13.720882 |
| (500000, 3) | 2.814731 | 105.659627 | 37.538096 |
| (5, 4) | 1.289951 | 0.029534 | 0.022896 |
| (10, 4) | 1.268823 | 0.031991 | 0.025213 |
| (100, 4) | 1.298377 | 0.072859 | 0.056115 |
| (1000, 4) | 1.292815 | 0.477708 | 0.369510 |
| (5000, 4) | 1.345453 | 2.268993 | 1.686416 |
| (10000, 4) | 1.716305 | 4.516486 | 2.631517 |
| (50000, 4) | 3.990015 | 22.446983 | 5.625789 |
| (100000, 4) | 6.862765 | 47.686336 | 6.948560 |
| (500000, 4) | 30.620509 | 226.275137 | 7.389660 |

extrapolate=False

| (Size, Dimensions) | CuPy (ms) | SciPy (ms) | Speedup |
|:------------------:|:---------:|:----------:|:-------:|
| (5, 1) | 1.114796 | 0.015639 | 0.014028 |
| (10, 1) | 1.076003 | 0.015790 | 0.014674 |
| (100, 1) | 1.063596 | 0.017685 | 0.016628 |
| (1000, 1) | 1.076735 | 0.032529 | 0.030211 |
| (5000, 1) | 1.074946 | 0.094950 | 0.088330 |
| (10000, 1) | 1.078243 | 0.174603 | 0.161933 |
| (50000, 1) | 1.058833 | 0.799011 | 0.754614 |
| (100000, 1) | 1.088087 | 1.573249 | 1.445885 |
| (500000, 1) | 1.161016 | 7.740995 | 6.667430 |
| (5, 2) | 1.212215 | 0.023830 | 0.019658 |
| (10, 2) | 1.200085 | 0.023692 | 0.019742 |
| (100, 2) | 1.212668 | 0.026961 | 0.022233 |
| (1000, 2) | 1.210323 | 0.046441 | 0.038370 |
| (5000, 2) | 1.217838 | 0.130870 | 0.107461 |
| (10000, 2) | 1.242048 | 0.240994 | 0.194030 |
| (50000, 2) | 1.205388 | 1.098933 | 0.911684 |
| (100000, 2) | 1.251601 | 2.148627 | 1.716703 |
| (500000, 2) | 1.456235 | 10.635589 | 7.303486 |
| (5, 3) | 1.252573 | 0.025251 | 0.020159 |
| (10, 3) | 1.247108 | 0.025290 | 0.020279 |
| (100, 3) | 1.272618 | 0.029487 | 0.023170 |
| (1000, 3) | 1.242533 | 0.057168 | 0.046009 |
| (5000, 3) | 1.240854 | 0.179386 | 0.144566 |
| (10000, 3) | 1.247457 | 0.355299 | 0.284818 |
| (50000, 3) | 1.299791 | 1.595225 | 1.227293 |
| (100000, 3) | 1.350468 | 3.125960 | 2.314723 |
| (500000, 3) | 1.989153 | 15.693632 | 7.889607 |
| (5, 4) | 1.285822 | 0.026878 | 0.020904 |
| (10, 4) | 1.282683 | 0.027020 | 0.021066 |
| (100, 4) | 1.294885 | 0.034074 | 0.026314 |
| (1000, 4) | 1.285076 | 0.083719 | 0.065147 |
| (5000, 4) | 1.302391 | 0.316569 | 0.243068 |
| (10000, 4) | 1.322594 | 0.609823 | 0.461081 |
| (50000, 4) | 1.499530 | 2.951569 | 1.968329 |
| (100000, 4) | 1.666262 | 5.790240 | 3.474987 |
| (500000, 4) | 3.554421 | 30.001381 | 8.440581 |

extrapolate=periodic

| (Size, Dimensions) | CuPy (ms) | SciPy (ms) | Speedup |
|:------------------:|:---------:|:----------:|:-------:|
| (5, 1) | 1.646383 | 0.016061 | 0.009756 |
| (10, 1) | 1.038089 | 0.016697 | 0.016085 |
| (100, 1) | 1.036977 | 0.025701 | 0.024784 |
| (1000, 1) | 1.104012 | 0.115440 | 0.104564 |
| (5000, 1) | 1.480375 | 0.509033 | 0.343854 |
| (10000, 1) | 1.047596 | 1.000034 | 0.954599 |
| (50000, 1) | 1.041125 | 5.279316 | 5.070778 |
| (100000, 1) | 1.057183 | 10.574448 | 10.002479 |
| (500000, 1) | 1.261907 | 65.633859 | 52.011650 |
| (5, 2) | 1.982352 | 0.031177 | 0.015727 |
| (10, 2) | 1.735244 | 0.025890 | 0.014920 |
| (100, 2) | 1.253148 | 0.038064 | 0.030375 |
| (1000, 2) | 1.188874 | 0.174252 | 0.146569 |
| (5000, 2) | 1.182464 | 0.706712 | 0.597661 |
| (10000, 2) | 1.651375 | 1.518329 | 0.919434 |
| (50000, 2) | 1.223043 | 6.903531 | 5.644554 |
| (100000, 2) | 1.266542 | 14.370590 | 11.346321 |
| (500000, 2) | 1.450015 | 72.430039 | 49.951222 |
| (5, 3) | 1.226005 | 0.026627 | 0.021718 |
| (10, 3) | 1.225443 | 0.028017 | 0.022863 |
| (100, 3) | 1.224346 | 0.046833 | 0.038251 |
| (1000, 3) | 1.230594 | 0.237129 | 0.192695 |
| (5000, 3) | 1.230251 | 1.113619 | 0.905197 |
| (10000, 3) | 1.233347 | 2.123393 | 1.721650 |
| (50000, 3) | 1.381370 | 10.530108 | 7.622948 |
| (100000, 3) | 1.509370 | 21.021070 | 13.927053 |
| (500000, 3) | 2.749411 | 97.764697 | 35.558416 |
| (5, 4) | 1.198635 | 0.028313 | 0.023621 |
| (10, 4) | 1.205476 | 0.030266 | 0.025107 |
| (100, 4) | 1.202627 | 0.070360 | 0.058505 |
| (1000, 4) | 1.199256 | 0.457655 | 0.381616 |
| (5000, 4) | 1.248479 | 2.112796 | 1.692295 |
| (10000, 4) | 1.631273 | 4.259612 | 2.611220 |
| (50000, 4) | 3.896652 | 22.652035 | 5.813205 |
| (100000, 4) | 6.947956 | 46.308289 | 6.665023 |
| (500000, 4) | 30.622329 | 234.312540 | 7.651689 |
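
The Speedup column is the SciPy time divided by the CuPy time, so values below 1 mean SciPy was faster. As a rough illustration of how to read the tables, the sketch below recomputes that ratio for a handful of the one-dimensional rows from the `extrapolate=True` results above and finds the first listed size at which CuPy wins. The helper names `speedup` and `first_size_where_cupy_wins` are illustrative and not part of the benchmark script.

```python
# (size, CuPy ms, SciPy ms) for dims=1, extrapolate=True, copied from the
# results above. Only a subset of the rows is listed.
rows = [
    (5, 1.061472, 0.016712),
    (1000, 1.059656, 0.115920),
    (10000, 1.079761, 1.001260),
    (50000, 1.171351, 4.959986),
    (500000, 1.182588, 49.092133),
]


def speedup(cupy_ms, scipy_ms):
    # Same ratio as the script: SciPy time over CuPy time.
    return scipy_ms / cupy_ms


def first_size_where_cupy_wins(rows):
    # Rows are ordered by size; return the first size with a ratio above 1.
    for size, cupy_ms, scipy_ms in rows:
        if speedup(cupy_ms, scipy_ms) > 1.0:
            return size
    return None


crossover = first_size_where_cupy_wins(rows)
largest_ratio = speedup(rows[-1][1], rows[-1][2])
```

For these rows the CuPy time is nearly constant while the SciPy time grows with input size, so the ratio crosses 1 somewhere between 10,000 and 50,000 points.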

@takagi
Member

takagi commented Feb 14, 2023

/test mini

takagi
takagi previously approved these changes Feb 14, 2023
Member

@takagi takagi left a comment

LGTM! I will merge the PR after the release this week.

@takagi
Member

takagi commented Feb 16, 2023

Jenkins CI fails in the ROCm environment because of the same problem as in #5843 (comment). I'm going to look at the changes in this PR to see what causes it here.

@takagi
Member

takagi commented Mar 29, 2023

/test mini

@takagi takagi added this to the v13.0.0a1 milestone Apr 3, 2023
Member

@takagi takagi left a comment

LGTM!

@takagi takagi merged commit 9de07a4 into cupy:main Apr 3, 2023
4 checks passed
@takagi
Member

takagi commented Apr 3, 2023

This PR was tested locally on ROCm environments.

@andfoy andfoy deleted the add_ndppoly branch April 4, 2023 21:12

3 participants