@ZzEeKkAa ZzEeKkAa commented May 1, 2023

  • Move the initialization logic into the related method.
  • Add numba_dpex_n implementations based on the numba_np implementations.

Legend
======
        postfix                            description
0  numba_dpex_k  Numba-dpex kernel
1  numba_dpex_p  Numba-dpex prange
2  numba_dpex_n  Numba-dpex NumPy API
3  python        Python
4  numba_n       Numba nopython
5  numba_np      Numba nopython, Parallel=True
6  numba_npr     Numba nopython, Parallel=True, prange
7  numpy         NumPy
8  dpnp          dpnp
9  sycl          DPC++ native ext.

Summary of current implementation
=================================
   input_size                 benchmark problem_preset    numpy           numba_np       numba_dpex_n
0        78KB                       adi              S  Success            Success   Failed Execution
1         3MB              arc_distance              S  Success            Success            Success
2       152MB                      atax              S  Success            Success            Success
3         6MB              azimint_hist              S  Success            Success   Failed Execution
4         6MB             azimint_naive              S  Success            Success   Failed Execution
5       152MB                      bicg              S  Success            Success            Success
6          0B               cavity_flow              S  Success      Unimplemented      Unimplemented
7        87KB              channel_flow              S  Success            Success   Failed Execution
8          0B                  cholesky              S  Success      Unimplemented      Unimplemented
9         7MB                 cholesky2              S  Success            Success   Failed Execution
10       61MB                   compute              S  Success            Success            Success
11         0B          contour_integral              S  Success   Failed Execution      Unimplemented
12       96KB               conv2d_bias              S  Success            Success   Failed Execution
13         0B               correlation              S  Success   Failed Execution      Unimplemented
14         0B                covariance              S  Success   Failed Execution      Unimplemented
15        1KB                     crc16              S  Success            Success            Success
16      625KB                   deriche              S  Success            Success   Failed Execution
17         0B                   doitgen              S  Success      Unimplemented      Unimplemented
18         0B                    durbin              S  Success  Failed Validation      Unimplemented
19        1MB                   fdtd_2d              S  Success            Success   Failed Execution
20         0B            floyd_warshall              S  Success   Failed Execution      Unimplemented
21       27MB                      gemm              S  Success            Success   Failed Execution
22        7MB                    gemver              S  Success            Success   Failed Execution
23       61MB                   gesummv              S  Success            Success   Failed Execution
24       30MB                   go_fast              S  Success            Success            Success
25       32KB               gramschmidt              S  Success            Success   Failed Execution
26        5MB                     hdiff              S  Success            Success   Failed Execution
27      244KB                   heat_3d              S  Success            Success  Failed Validation
28         0B                 jacobi_1d              S  Success            Success  Execution Timeout
29      351KB                 jacobi_2d              S  Success            Success  Failed Validation
30       23MB                      k2mm              S  Success            Success   Failed Execution
31      135MB                      k3mm              S  Success            Success            Success
32       28KB                        lu              S  Success            Success   Failed Execution
33       28KB                    ludcmp              S  Success            Success   Failed Execution
34         0B               mandelbrot1              S  Success            Success   Failed Execution
35         0B               mandelbrot2              S  Success      Unimplemented      Unimplemented
36      244MB                       mlp              S  Success            Success   Failed Execution
37      230MB                       mvt              S  Success            Success   Failed Execution
38         0B                     nbody              S  Success      Unimplemented      Unimplemented
39       160B                  nussinov              S  Success            Success            Success
40         0B  scattering_self_energies              S  Success      Unimplemented      Unimplemented
41       19KB                 seidel_2d              S  Success  Failed Validation  Failed Validation
42       16MB                   softmax              S  Success            Success   Failed Execution
43      144KB                      spmv              S  Success            Success   Failed Execution
44         0B              stockham_fft              S  Success   Failed Execution      Unimplemented
45         0B                      symm              S  Success      Unimplemented      Unimplemented
46         0B                     syr2k              S  Success      Unimplemented      Unimplemented
47         0B                      syrk              S  Success      Unimplemented      Unimplemented
48       30MB                   trisolv              S  Success            Success            Success
49         0B                      trmm              S  Success      Unimplemented      Unimplemented
50        5MB                      vadv              S  Success            Success  Failed Validation
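
To compare frameworks at a glance, the status columns of the table above can be tallied. The following is a minimal sketch, assuming the table is available as plain text in the layout shown, with columns separated by two or more spaces. The names `parse_row` and `tally` are illustrative helpers, not part of the benchmark suite, and only a few rows are copied here as an excerpt.

```python
import re
from collections import Counter

# Excerpt: a few rows copied verbatim from the summary table (not all 51).
ROWS = """\
0        78KB                       adi              S  Success            Success   Failed Execution
1         3MB              arc_distance              S  Success            Success            Success
6          0B               cavity_flow              S  Success      Unimplemented      Unimplemented
27      244KB                   heat_3d              S  Success            Success  Failed Validation
28         0B                 jacobi_1d              S  Success            Success  Execution Timeout
41       19KB                 seidel_2d              S  Success  Failed Validation  Failed Validation
"""

# Column names as they appear in the table header, plus the unnamed row index.
COLUMNS = ("index", "input_size", "benchmark", "problem_preset",
           "numpy", "numba_np", "numba_dpex_n")


def parse_row(line):
    """Split one table row into a dict keyed by column name."""
    # Status values such as "Failed Execution" contain a single space,
    # so split only on runs of two or more spaces.
    fields = re.split(r"\s{2,}", line.strip())
    if len(fields) != len(COLUMNS):
        raise ValueError(f"unexpected row layout: {line!r}")
    return dict(zip(COLUMNS, fields))


def tally(rows_text, column):
    """Count how often each status appears in the given column."""
    rows = (parse_row(line) for line in rows_text.splitlines() if line.strip())
    return Counter(row[column] for row in rows)


if __name__ == "__main__":
    # Print the status counts for the numba_dpex_n column of the excerpt.
    for status, count in tally(ROWS, "numba_dpex_n").most_common():
        print(f"{status:<18} {count}")
```

Pasting the full table into `ROWS` would give the counts for all benchmarks; passing `"numba_np"` as the column gives the same breakdown for the implementations that the numba_dpex_n versions are based on.
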
  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to an issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • If this PR is a work in progress, are you filing the PR as a draft?

@ZzEeKkAa ZzEeKkAa requested a review from adarshyoga May 1, 2023 19:18
@ZzEeKkAa ZzEeKkAa self-assigned this May 1, 2023
@ZzEeKkAa ZzEeKkAa force-pushed the feature/add_numba_dpex_n_implementations branch 2 times, most recently from 402ea1b to 002b9c0 Compare May 1, 2023 20:53
@adarshyoga adarshyoga left a comment

Looks good. One change: you missed converting one workload.

@ZzEeKkAa ZzEeKkAa force-pushed the feature/add_numba_dpex_n_implementations branch 2 times, most recently from efbd388 to 18772d7 Compare May 1, 2023 22:07
@ZzEeKkAa ZzEeKkAa force-pushed the feature/add_numba_dpex_n_implementations branch from 18772d7 to 49ee54a Compare May 1, 2023 22:09
@adarshyoga adarshyoga enabled auto-merge May 1, 2023 22:12
@adarshyoga adarshyoga merged commit c5ff998 into main May 1, 2023
@adarshyoga adarshyoga deleted the feature/add_numba_dpex_n_implementations branch May 1, 2023 22:25