
Makes fasterrcnn backbone run #666

Merged
NoraHagmeyer merged 4 commits into main from fastcnn
Apr 10, 2026

Conversation

@NoraHagmeyer
Contributor

  • Adds tensor.insert_slice() support to the MLIR frontend.
  • Adds a fasterrcnn_resnet50 benchmark to the repository. Only the backbone can be lowered to torch-mlir, so the harness and benchmarking infrastructure are not used for this benchmark.
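For context on the first bullet: `tensor.insert_slice` is the upstream MLIR tensor-dialect op that inserts a source tensor into a (possibly strided) window of a destination tensor, yielding a new value rather than mutating the destination. A minimal pure-Python sketch of the 2-D semantics follows; the helper name and list-of-lists representation are illustrative only and not taken from this PR's frontend code.

```python
def insert_slice(dest, src, offsets, strides):
    """Emulate 2-D tensor.insert_slice semantics: write `src` into a
    strided window of `dest`. The MLIR op is value-based, so we copy
    the destination and leave the original untouched."""
    out = [row[:] for row in dest]          # insert_slice produces a new tensor value
    (off0, off1), (st0, st1) = offsets, strides
    for i, row in enumerate(src):
        for j, v in enumerate(row):
            out[off0 + i * st0][off1 + j * st1] = v
    return out

dest = [[0] * 4 for _ in range(4)]
src = [[1, 2], [3, 4]]
# Insert src starting at (1, 1), with stride 2 along the second axis.
result = insert_slice(dest, src, offsets=(1, 1), strides=(1, 2))
```

In NumPy terms this corresponds to strided slice assignment (`out[1:3:1, 1:5:2] = src`) on a copy of the destination; supporting the op in the frontend matters for models like Faster R-CNN whose graphs scatter sub-tensors into larger buffers.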

@NoraHagmeyer NoraHagmeyer requested review from Atrisan and lukastruemper and removed request for Atrisan and lukastruemper April 9, 2026 17:56
@daisytuner

daisytuner bot commented Apr 9, 2026

Daisytuner Report - mlir_torch_models (chamomile)

@@                                   Benchmarks                                   @@
=====================================================================================
  Benchmark              Time        ΔTime       Thr         Energy      ΔEnergy     
=====================================================================================
# bn_conv_bn_relu_maxpool_torch          18.62 s   +0.51%   N/A   3487.65 J   -3.87%
# bn_conv_bn_relu_maxpool_run_none       3.26 s    +2.77%   N/A   632.98 J    -0.97%
# bn_conv_bn_relu_maxpool_run_sequential 3.29 s    +3.37%   N/A   642.20 J    -0.02%
# bn_conv_bn_relu_maxpool_run_openmp     3.37 s    +3.81%   N/A   660.00 J    -0.89%
# bn_conv_bn_relu_maxpool_run_cuda       3.71 s    +1.05%   N/A   697.14 J    -4.00%

@daisytuner

daisytuner bot commented Apr 9, 2026

Daisytuner Report - mlir_torch_layers (chamomile)

@@                                   Benchmarks                                   @@
=====================================================================================
  Benchmark              Time        ΔTime       Thr         Energy      ΔEnergy     
=====================================================================================
# batchnorm_torch        19.01 s     -1.23%      N/A         3740.63 J   +0.32%      
# batchnorm_run_none     3.87 s      -0.17%      N/A         761.27 J    +0.58%      
# batchnorm_run_sequential 3.98 s     +0.35%      N/A         784.66 J    +1.49%      
# batchnorm_run_openmp   3.51 s      -1.55%      N/A         729.36 J    -0.13%      
# batchnorm_run_cuda     5.47 s      -1.94%      N/A         1078.82 J   -0.47%      
# conv2d_torch           18.59 s     +0.11%      N/A         3652.26 J   +1.16%      
# conv2d_run_openmp      4.34 s      +0.04%      N/A         1039.37 J   +1.88%      
# conv2d_run_cuda        7.54 s      +1.82%      N/A         1474.85 J   +3.11%      
# linear_torch           6.20 s      -0.85%      N/A         1489.29 J   +0.67%      
# linear_run_none        10.52 s     -0.09%      N/A         2890.40 J   +0.38%      
# linear_run_sequential  8.89 s      -0.75%      N/A         2569.07 J   +0.10%      
# linear_run_openmp      8.46 s      +1.81%      N/A         2482.34 J   +2.67%      
# linear_run_cuda        8.38 s      -0.25%      N/A         1635.44 J   +0.58%      
# matmul_torch           6.08 s      +0.21%      N/A         1463.57 J   +1.05%      
# matmul_run_none        10.65 s     +1.25%      N/A         2921.20 J   +1.33%      
# matmul_run_sequential  8.89 s      +0.69%      N/A         2578.28 J   +1.56%      
# matmul_run_openmp      8.28 s      -0.58%      N/A         2432.14 J   -0.02%      
# matmul_run_cuda        8.23 s      +0.32%      N/A         1615.04 J   +1.59%      
# pooling_torch          26.16 s     +1.48%      N/A         5225.36 J   +2.81%      
# pooling_run_none       15.27 s     -0.73%      N/A         2946.75 J   +0.53%      
# pooling_run_sequential 15.19 s     -1.28%      N/A         2935.33 J   +0.11%      
# pooling_run_openmp     8.99 s      -1.87%      N/A         1833.85 J   -0.80%      
# pooling_run_cuda       20.03 s     -0.83%      N/A         3908.24 J   +0.46%      
# relu_torch             19.26 s     +1.68%      N/A         3784.56 J   +2.88%      
# relu_run_none          3.80 s      +0.01%      N/A         750.72 J    +1.30%      
# relu_run_sequential    3.81 s      -0.28%      N/A         751.30 J    +0.77%      
# relu_run_openmp        3.53 s      +1.94%      N/A         725.23 J    +3.28%      
# relu_run_cuda          5.48 s      -0.79%      N/A         1081.98 J   +0.60%      

@daisytuner

daisytuner bot commented Apr 9, 2026

Daisytuner Report - python_npbench (zinnia)

@@                                   Benchmarks                                   @@
=====================================================================================
  Benchmark              Time        ΔTime       Thr         Energy      ΔEnergy     
=====================================================================================
# adi_numpy              1.31 s      -0.43%      N/A         130.82 J    -0.51%      
- adi_omp                14.61 s     +82.84%     N/A         1422.99 J   +79.08%     
- adi_cuda               4.69 s      +10.03%     N/A         454.05 J    +9.55%      
- adi_seq_tuning         15.13 s     +83.89%     N/A         1401.41 J   +83.29%     
# atax_numpy             2.15 s      -0.08%      N/A         222.84 J    -0.14%      
# atax_omp               2.96 s      -1.22%      N/A         370.55 J    -1.82%      
# atax_cuda              4.14 s      +0.74%      N/A         424.87 J    +0.63%      
# atax_seq_tuning        4.10 s      -0.44%      N/A         397.79 J    +0.08%      
# gemm_numpy             1.22 s      +0.95%      N/A         195.16 J    +0.80%      
# gemm_omp               1.12 s      +0.56%      N/A         162.94 J    +0.15%      
# gemm_cuda              10.63 s     +0.40%      N/A         1011.22 J   +0.48%      
# gemm_seq_tuning        1.11 s      -0.10%      N/A         161.69 J    -0.28%      
# gesummv_numpy          1.75 s      -0.49%      N/A         249.64 J    -0.51%      
# gesummv_omp            1.96 s      -1.39%      N/A         308.00 J    -1.87%      
# gesummv_cuda           8.34 s      +0.13%      N/A         1001.69 J   +0.08%      
# gesummv_seq_tuning     8.58 s      -0.51%      N/A         974.64 J    -0.03%      
# gemver_numpy           1.10 s      +0.98%      N/A         169.15 J    +1.11%      
# gemver_omp             843.10 ms   +0.02%      N/A         107.36 J    -0.37%      
# gemver_cuda            3.90 s      +0.95%      N/A         390.24 J    +0.91%      
# gemver_seq_tuning      5.51 s      -0.49%      N/A         496.62 J    +0.59%      
# k2mm_numpy             1.20 s      -0.15%      N/A         196.40 J    -0.11%      
# k2mm_omp               3.57 s      -0.43%      N/A         662.22 J    -0.62%      
# k2mm_cuda              13.59 s     -0.13%      N/A         1290.27 J   +0.17%      
# k2mm_seq_tuning        2.91 s      -3.09%      N/A         390.51 J    -1.50%      
# k3mm_numpy             1.03 s      -0.20%      N/A         181.82 J    -0.24%      
# k3mm_omp               5.55 s      +0.03%      N/A         947.43 J    -0.85%      
# k3mm_cuda              19.79 s     -0.07%      N/A         1864.91 J   -0.14%      
# k3mm_seq_tuning        4.90 s      -0.90%      N/A         686.17 J    -0.17%      
# mvt_numpy              2.43 s      -0.09%      N/A         247.61 J    -0.17%      
# mvt_omp                2.74 s      -0.23%      N/A         284.54 J    -0.19%      
# mvt_cuda               3.36 s      -0.32%      N/A         342.50 J    -0.23%      
# mvt_seq_tuning         2.74 s      +0.01%      N/A         284.26 J    +0.02%      
# symm_numpy             794.75 ms   -0.57%      N/A         81.74 J     -0.44%      
- symm_omp               6.06 s      +29.35%     N/A         595.37 J    +28.66%     
- symm_seq_tuning        8.31 s      +20.20%     N/A         749.91 J    +20.80%     
# syr2k_numpy            894.33 ms   -0.11%      N/A         90.93 J     -0.02%      
- syr2k_omp              9.85 s      +35.42%     N/A         935.82 J    +34.66%     
# syr2k_cuda             1.64 s      +7.62%      N/A         170.26 J    +6.89%      
- syr2k_seq_tuning       9.84 s      +36.06%     N/A         935.45 J    +35.17%     
# syrk_numpy             789.79 ms   +0.70%      N/A         80.95 J     +0.57%      
- syrk_omp               5.97 s      +31.05%     N/A         573.47 J    +29.89%     
- syrk_cuda              1.53 s      +12.48%     N/A         159.62 J    +10.93%     
- syrk_seq_tuning        5.98 s      +31.38%     N/A         574.51 J    +30.14%     
# trmm_numpy             880.51 ms   +0.39%      N/A         89.76 J     +0.50%      
# trmm_omp               711.70 ms   +0.01%      N/A         89.98 J     -0.63%      
# trmm_seq_tuning        3.38 s      +0.11%      N/A         276.75 J    -0.20%      

This benchmark does not use the harness, as it
only compiles the backbone; the RoI (region of
interest) stage is not supported by Torch-MLIR.
@NoraHagmeyer NoraHagmeyer merged commit 6e2e054 into main Apr 10, 2026
25 checks passed
@NoraHagmeyer NoraHagmeyer deleted the fastcnn branch April 10, 2026 12:39