adds extended unit tests for memory layout analysis #692

Open
lukastruemper wants to merge 3 commits into main from memory-layout-analysis-tests

@lukastruemper
Contributor

No description provided.

@daisytuner

daisytuner Bot commented Apr 29, 2026

Daisytuner Report - mlir_torch_models (chamomile)

@@                                            Benchmarks                                            @@
======================================================================================================
  Benchmark                               Time        ΔTime       Thr         Energy      ΔEnergy
======================================================================================================
# bn_conv_bn_relu_maxpool_torch           18.63 s     -1.98%      N/A         3615.52 J   -1.91%
# bn_conv_bn_relu_maxpool_run_none        3.25 s      +0.95%      N/A         652.84 J    +1.00%
# bn_conv_bn_relu_maxpool_run_sequential  3.21 s      -0.20%      N/A         643.36 J    -0.38%
# bn_conv_bn_relu_maxpool_run_openmp      3.36 s      +2.53%      N/A         676.39 J    +1.42%
# bn_conv_bn_relu_maxpool_run_cuda        3.58 s      -0.44%      N/A         701.27 J    -0.61%

@daisytuner

daisytuner Bot commented Apr 29, 2026

Daisytuner Report - python_npbench (zinnia)

@@                                   Benchmarks                                   @@
=====================================================================================
  Benchmark              Time        ΔTime       Thr         Energy      ΔEnergy     
=====================================================================================
# adi_numpy              1.32 s      +0.26%      N/A         131.66 J    +0.29%      
# adi_omp                4.68 s      +0.57%      N/A         532.50 J    +0.17%      
# adi_cuda               4.34 s      -0.46%      N/A         423.69 J    -0.52%      
# adi_seq_tuning         4.53 s      +1.05%      N/A         430.06 J    +0.85%      
# atax_numpy             2.16 s      -0.04%      N/A         223.43 J    -0.17%      
# atax_omp               3.07 s      -2.03%      N/A         388.75 J    -2.44%      
# atax_cuda              4.13 s      +0.02%      N/A         425.70 J    +0.08%      
# atax_seq_tuning        4.11 s      +0.39%      N/A         399.63 J    +0.38%      
# gemm_numpy             1.23 s      +1.74%      N/A         196.54 J    +1.72%      
# gemm_omp               1.13 s      -0.07%      N/A         163.33 J    -0.31%      
# gemm_cuda              10.67 s     +0.19%      N/A         1016.01 J   +0.27%      
# gemm_seq_tuning        1.13 s      +0.14%      N/A         163.71 J    +0.43%      
# gesummv_numpy          1.75 s      -0.01%      N/A         250.12 J    +0.12%      
# gesummv_omp            2.12 s      +0.62%      N/A         343.49 J    +0.28%      
# gesummv_cuda           5.15 s      +0.50%      N/A         709.66 J    +0.34%      
# gesummv_seq_tuning     6.12 s      -1.75%      N/A         746.62 J    -0.65%      
# gemver_numpy           1.09 s      +1.80%      N/A         168.25 J    +2.21%      
# gemver_omp             923.57 ms   -0.82%      N/A         123.09 J    -0.51%      
# gemver_cuda            2.38 s      -0.74%      N/A         253.12 J    -0.50%      
# gemver_seq_tuning      3.06 s      +3.07%      N/A         270.93 J    +0.86%      
# k2mm_numpy             1.20 s      +0.02%      N/A         196.08 J    +0.18%      
# k2mm_omp               3.56 s      -0.01%      N/A         669.37 J    +0.33%      
# k2mm_cuda              13.40 s     -0.04%      N/A         1272.15 J   -0.01%      
# k2mm_seq_tuning        2.98 s      +2.74%      N/A         394.31 J    +0.95%      
# k3mm_numpy             1.03 s      -0.19%      N/A         181.70 J    -0.11%      
# k3mm_omp               5.59 s      +0.84%      N/A         951.46 J    +0.28%      
# k3mm_cuda              19.77 s     -0.34%      N/A         1867.83 J   -0.14%      
# k3mm_seq_tuning        5.09 s      +4.70%      N/A         693.25 J    +1.40%      
# mvt_numpy              2.42 s      +0.10%      N/A         246.82 J    0.00%       
# mvt_omp                2.75 s      +0.12%      N/A         286.06 J    +0.16%      
# mvt_cuda               3.36 s      +0.11%      N/A         343.70 J    +0.06%      
# mvt_seq_tuning         2.76 s      +0.10%      N/A         286.16 J    +0.19%      
# symm_numpy             780.14 ms   +0.55%      N/A         80.19 J     +0.39%      
# symm_omp               2.36 s      +4.41%      N/A         253.85 J    +4.26%      
# symm_seq_tuning        3.12 s      +0.34%      N/A         279.37 J    +0.01%      
# syr2k_numpy            875.34 ms   +1.06%      N/A         89.30 J     +1.14%      
# syr2k_omp              2.25 s      +0.01%      N/A         241.62 J    -0.02%      
# syr2k_cuda             1.58 s      -0.52%      N/A         164.76 J    -0.45%      
# syr2k_seq_tuning       2.25 s      +0.36%      N/A         241.83 J    +0.23%      
# syrk_numpy             766.25 ms   +0.69%      N/A         78.95 J     +0.56%      
# syrk_omp               2.07 s      +0.01%      N/A         217.36 J    +0.00%      
# syrk_cuda              1.41 s      -0.01%      N/A         148.69 J    -0.33%      
# syrk_seq_tuning        2.06 s      -0.79%      N/A         216.49 J    -0.67%      
# trmm_numpy             865.68 ms   -0.01%      N/A         88.25 J     -0.35%      
# trmm_omp               744.45 ms   +0.52%      N/A         95.06 J     +0.93%      
# trmm_seq_tuning        3.41 s      +1.37%      N/A         279.42 J    +0.69%      

@lukastruemper force-pushed the memory-layout-analysis-tests branch from 4d307d1 to e9cd732 on April 29, 2026 10:55