v.0.5.1.0
Pre-release
What's Changed
- Improve the c++ driver build-run-onnx-lib.sh by @chentong319 in #3108
- Followups for torch model driver by @chentong319 in #3106
- [NNPA] Fix an error in ZHighConstantPropagation for QuantizedStick by @tungld in #3112
- Add z17 for -march by @chentong319 in #3113
- Decompose Hardswish into simpler ONNX ops by @kumarappan-cmyk in #3107
- Reorder relu to maxpool optimization pass in ONNX dialect by @Arkar-Hema in #3109
- Fix "operand #0 does not dominate this use" when fusing onnx.ConstantOp by @tungld in #3119
- Support QLinearMatMul on CPU by @tungld in #3117
- Update black-format-check.yml by @andife in #3118
- Merge nested concat Ops optimization pass in ONNX dialect by @Arkar-Hema in #3111
- Enhance shape inference for ONNX Reshape by @tungld in #3122
- update zdnn-v1.1.2 by @Sunny-Anand in #3130
- Updating supported ops on NNPA md for z17. by @christopherlmunoz in #3120
- fix CVE-2025-32434 by @Sunny-Anand in #3135
- Fuse consecutive clips pattern by @kumarappan-cmyk in #3132
- Fix types in RandomNormal(Like) test inputs to match the dtype by @jorickert in #3144
- Conv add const where the constant is a scalar by @AlexandreEichenberger in #3145
- added support for Celu op by @logeshwaranmcw in #3139
- Fix some warnings related to stickification for NNPA by @tungld in #3147
- Removing duplicate file docs/SupportedONNXOps-NNPA-supplement.md by @christopherlmunoz in #3146
- Instance and Group norm needs shape inference by @AlexandreEichenberger in #3148
- Fusion of Matmul add covering the stacked/unstacked/bcast1/bcast23 patterns by @AlexandreEichenberger in #3140
- Support --march=native by @chentong319 in #3134
- Do not fuse locations in Maxpool-Relu canonicalization by @jorickert in #3152
- Support for arch native on mac by @AlexandreEichenberger in #3153
- Fix messages about setting ONNX_MLIR_HOME by @tungld in #3157
- Combine Parallel Convolution optimization pass in ONNX Dialect by @Arkar-Hema in #3116
- Fuse locations when merging concats by @jorickert in #3162
- added support for bitwise_not op by @LekkalaSravya3 in #3156
- Adding Lowering for shrink op from onnx to krnl dialect by @jagadeeshvx in #3165
- LLVM update 43d71ba by @brnorris03 in #3086
- Workaround broken ubi8 amd64 image not creating pip3 symlink by @gongsu832 in #3173
- Add MemoryEffectOpInterface to krnl.InstrumentOp by @chentong319 in #3172
- Add Lowering of ThresholdedRelu op from ONNX to Krnl Dialect by @VishaliSenthilkumar in #3154
- Prefetch prevents the OMP pass from succeeding by @AlexandreEichenberger in #3170
- Adding Lowering for Mish Op from Onnx to krnl dialect by @jagadeeshvx in #3167
- Fuse location for max and min in clip fusing and add a test for it by @jorickert in #3177
- Fix to add default value to ConstantOfShape by @AlexandreEichenberger in #3174
- add side effect to krnl and ZLow ops by @chentong319 in #3180
- Adding Lowering for LpNormalization op from onnx to krnl by @jagadeeshvx in #3176
- Remove most warnings (includes building for NNPA) by @AlexandreEichenberger in #3179
- Added lowering for MeanVarianceNorm From ONNX-To-Krnl by @LekkalaSravya3 in #3178
- Adding Lowering for Binarizer op from onnx to krnl by @jagadeeshvx in #3184
- Adding Lowering for BitShift op from onnx to krnl by @jagadeeshvx in #3186
- Added support for negative permute values in transform by @AlexandreEichenberger in #3191
- Parallel fixes v1 by @AlexandreEichenberger in #3192
- added lowering for hammingwindow to krnl dialect by @LekkalaSravya3 in #3189
- RunONNXModel test that saved and loaded models use the same compiler options by @AlexandreEichenberger in #3195
- Allows POW with different types (convert 2nd input to 1st input type) by @AlexandreEichenberger in #3175
- Omitted change from prior PR on Pow handling by @AlexandreEichenberger in #3200
- Changes in onnxmlir package by @chentong319 in #3198
- Added lowering for Blackman Window to Krnl Dialect by @LekkalaSravya3 in #3197
- added support for uniform_random op with shape handling by @LekkalaSravya3 in #3194
- Making sure that the instrumentation gets the same node name by @AlexandreEichenberger in #3201
- Avoid using onnx_proto in build code by @cyyever in #3182
- Enhance --shapeInformation for more cases by @tungld in #3208
- Added support for Hann Window to krnl Dialect by @jagadeeshvx in #3205
- [NNPA] Support multiple NNPAs by @tungld in #3199
- update llvm and stablehlo by @Sunny-Anand in #3206
- Add an optimization/canonicalization pattern to move Relu and LeakyRelu before Split operations. by @jorickert in #3211
- Fuse MatMul and Mul if one of the operands of Mul is a scalar constant by @tungld in #3214
- Fix the issue where deallocation is put at the end of a function by @tungld in #3219
- fix for s390x-z/OS zdnnx missing member by @Sunny-Anand in #3224
- Explore the first two outermost dimensions for parallelism for Concat by @tungld in #3226
- Use greedy rewriter instead of partial conversion in recomposition pass and fuse locations when fusing convolutions. Also fix bugs in the parallel conv recomposition by @jorickert in #3207
- Compute static shape when collapsing innermost or outermost dims by @tungld in #3228
- Krnl iterate org indices by @tungld in #3218
- Add Support for RandomUniformLike in Krnl Dialect by @jagadeeshvx in #3223
- update llvm and stablehlo by @jagadeeshvx in #3221
- Generate parallel code for onnx.expand by @tungld in #3231
- Add two patterns to remove unused zlow.stick and zlow.unstick operations by @tungld in #3230
- [RunONNXModel.py] Add an option to store profiling info to a file by @tungld in #3234
- Generate parallel code for onnx.Slice by @tungld in #3237
- update llvm and stablehlo by @jagadeeshvx in #3232
- [zdnnx] Use OMP_NUM_THREADS to control the number of threads for NNPA by @tungld in #3227
- Remove unused function argument attributes to avoid llvm warnings by @tungld in #3241
- [NFC] Create a common helper function for emitting krnl.parallel by @tungld in #3239
- Fix dimensional operand cannot be used as a symbol in LayoutTransform by @tungld in #3240
- Edit an error message when opening the .constants.bin file by @tungld in #3245
- Fuse MatMul and Div if the 2nd operand of Div is a scalar constant by @tungld in #3242
- Enable debug listing of operations that vanished by @AlexandreEichenberger in #3249
- [zdnnx] Parallelize quantized matmul for multiple NNPAs by @tungld in #3256
- Recompose a Layer/RMSNorm even if the scale multiplication has multiple usesp… by @jorickert in #3247
- Added z17 CPU vs NNPA performance model by @AlexandreEichenberger in #3252
- Bump stablehlo to e07debd5e257ec1e118f18c54068977b89f03b2f by @jorickert in #3246
- Use onnxmlir package in RunONNXModel.py by @chentong319 in #3248
- Fix the bug in new RunONNXModel.py by @chentong319 in #3259
- fix an error in func.FuncOp lowering to krnl by @chentong319 in #3264
- Special case for shape inference for onnx.Range operator by @tungld in #3268
- Fixed index expr in safe-code-gen mode by @AlexandreEichenberger in #3270
- Fusing elementwise computations with Stick/Unstick by @AlexandreEichenberger in #3267
- [NNPA] Rewrite zlow.unstick -> view -> zlow.stick into zlow.reshape by @tungld in #3272
- [NNPA] JSON configuration file for device_placement and quantization by @tungld in #3266
- Added missing vector model for elementwise operations by @AlexandreEichenberger in #3276
- fix some bugs in onnxmlir package by @chentong319 in #3283
- Llvm bump attempt5 by @christopherlmunoz in #3282
- Fix eraseOp in ZLowRewrite.cpp by @tungld in #3284
- A rule to remove operands in ConcatOp whose dimension size at axis is zero by @tungld in #3286
- Modifying build to find openmp's runtime build files by @christopherlmunoz in #3290
- OpenMP redundant build command flag by @christopherlmunoz in #3293
- add Python decoding algorithms by @adrian-selk in #3288
- Fix creating view in zdnnx by @tungld in #3297
- Add sum and total option to report time in make-report.py by @AlexandreEichenberger in #3295
- fix pytorch CVE by @Sunny-Anand in #3299
- Fix two issues to make omp work again with --parallel and zdnnx by @tungld in #3296
- Normalization with stick by @AlexandreEichenberger in #3298
- Building OpenMP on MacOS by @AlexandreEichenberger in #3301
- Fix for benchmark breakage by @AlexandreEichenberger in #3303
- Upgrade to protobuf 6.31.1 by @gongsu832 in #3275
- Enhance shape inferences for MatMul and Gemm to allow updating input dimensions by @tungld in #3289
- Adding listing helper tools by @AlexandreEichenberger in #3304
- Get dim from the input of reshape and transpose by @tungld in #3306
- Added a bit of functionality to IsolatePass, fixed a small bug by @AlexandreEichenberger in #3307
- fixing protobuf jenkins issue by @christopherlmunoz in #3318
- Support merged decoder onnx model in python script by @tungld in #3310
- Fix a bug for PyRuntimeC by @tungld in #3313
- Add a compile option --replace-op-with-its-operand by @tungld in #3300
- start of llvm/stablehlo bump by @christopherlmunoz in #3302
- Test RUST v1.89 for jenkins by @tungld in #3320
- New algorithm for stick/unstick decomposition by @tungld in #3309
- Fix some bugs in onnxmlirtorch package by @tungld in #3319
- Add ShapeInference for RandomUniformLike. by @jorickert in #3292
- Starting example of using table gen for pass by @chentong319 in #3322
- Add a new compile option --printONNXBasicIR by @tungld in #3323
- Extended Layout Transform by @AlexandreEichenberger in #3324
- Do not use onnx model types for the model output when options_.useOnnxModelTypes=false by @tungld in #3327
- Optimization of the ONNX TopK operator by @adrian-selk in #3329
- Hoist alloc/dealloc for OpenMP wsloop by @chentong319 in #3331
- Only include perf tests if LLVM_INCLUDE_BENCHMARKS is enabled, as it … by @jorickert in #3317
- Fix topk decl by @Sunny-Anand in #3334
- Bump LLVM to 42a8ff877d47131ecb1280a1cc7e5e3c3bca6952 ; StableHLO to 3acda594d678e285ed0c8b4179bc1e90348e5997 by @jorickert in #3333
- Enable zhigh stick-unstick decomposition by default by @tungld in #3337
- update absl and openmp in release mode by @Sunny-Anand in #3338
- Update onnxmlirtorch package with various changes by @tungld in #3332
- Support empty inputs to the compiled model and remove input arguments in check-onnx-backend-constant by @tungld in #3344
- Make dynamic dimension analysis for Reshape more general by @tungld in #3341
- Added comment to script and doc by @AlexandreEichenberger in #3312
- Fix a bug in saving a json file and a bug in onnx to zhigh for quantization by @tungld in #3340
- Make verifyInputTensor true by default by @chentong319 in #3079
- fix osx build for deprecation of git image by @Sunny-Anand in #3348
- Add ONNXToLinalg conversion pass with MatMul support by @kimm240 in #3343
- ONNX To LLVM Pass Backend using linalg by @kimm240 in #3349
- Remove mlir usePropertiesForAttributes option as it is deprecated in upcoming LLVM by @tungld in #3357
- [onnxmlirtorch] Export/import cache from disk and add config.py by @tungld in #3350
- upgrade llvm by @Sunny-Anand in #3345
- revert absl update by @Sunny-Anand in #3361
- Fix protobuf build error due to undefined reference to absl functions by @gongsu832 in #3363
- Add PyRuntime-light build into Jenkins build by @chentong319 in #3359
- Extend the verifyInputTensors to shape information provided with compilation option by @chentong319 in #3346
- Fix errors caused by the new version of python formatter: black by @chentong319 in #3366
- Enable One-Shot Bufferization for Mixed Linalg and ONNX Operations by @kimm240 in #3358
- Simplify the source code for light weight PyRuntime on Linux by @chentong319 in #3364
- Refactor ONNXToLinalg pass to use TableGen by @kimm240 in #3362
- Handle onnx.Concat where some inputs are tensor<0xdtype> by @tungld in #3353
- Implement Selective Linalg Conversion with --linalg-ops Option by @kimm240 in #3356
- Fix a bug in importing dim_param by @tungld in #3377
- CMake of absl for light-weight PyRuntime by @chentong319 in #3375
- Relax protobuf requirements by @tungld in #3381
- Fix Lambda capture issue(#3378) in ElementsAttrBuilder::allEqual by @2876225417 in #3379
- Add new pattern of LT -> reshape-merge(squeeze) -> LT by @AlexandreEichenberger in #3380
- Set OM_CONSTANT_PATH in onnxmlirdocker.py by @tungld in #3372
- Profiling into a buffer by @AlexandreEichenberger in #3342
- [onnxmlirtorch] Support decoder models by @tungld in #3382
- Modernize pass by @juhyunbae17 in #3373
- Implement ONNX Basic Conv to Linalg lowering by @kimm240 in #3376
- Stick/unstick fp32 data alloc at 4k pages by @AlexandreEichenberger in #3250
- Remove duplicated onnx.Dim referring to the same dynamic dimensions by @tungld in #3389
- [NFC] Add useful information to passes printed by IsolatePass.py: def use count and constant scalars. by @AlexandreEichenberger in #3390
- Split the cmake file for build target by @chentong319 in #3385
- Detection and removal of specific Slice of a Concat pattern. by @AlexandreEichenberger in #3391
- [NFC] Rename onnxmlirtorch to torch_onnxmlir by @tungld in #3386
- Upgrade base images by @gongsu832 in #3384
- Move zdlc_pyrt into onnx-mlir by @chentong319 in #3397
- [NNPA] Add global NNPA configuration object and GenerateConfigFile pass and cleanup JSON config handling by @tungld in #3398
- Removal of the C++ Runtime files that are not needed by @AlexandreEichenberger in #3392
- New JSON config file with compile_options and --config-file compile flag by @tungld in #3403
- upgrade onnx to 1.20.1(Opset 25) by @Sunny-Anand in #3367
- Change the order of options to options_from_config_file then options_cli by @tungld in #3409
- Decouple compilation from the onnx-mlir/mlir/llvm code base by @AlexandreEichenberger in #3402
- Use of python Compile session in RunONNXModel.py by @AlexandreEichenberger in #3407
- Print out a message when there is a file created to store constant tensors by @tungld in #3413
- LLVM Bump by @christopherlmunoz in #3410
- Fix building warnings by @tungld in #3404
- [NNPA] Support matching tensor information in the JSON config file for NNPA by @tungld in #3412
- Added the ability to provide a path to onnx-mlir in the C++ and Python interface by @AlexandreEichenberger in #3411
- Safer exec session run by @AlexandreEichenberger in #3414
- CVE upgrades by @christopherlmunoz in #3415
- Fix multiarch manifest by @gongsu832 in #3418
- Removing standalone absl build by @christopherlmunoz in #3419
- protobuf@33 on macos breaks MLIR cache due to protoc not found by @gongsu832 in #3421
- Add an option to append decoding strategies into the input model by @tungld in #3417
- Added minor check on relations between alloc and data ptr in OMTensors by @AlexandreEichenberger in #3420
- Fix the assertion about axis in ONNX Flatten op by @tungld in #3424
- For MVS Systems, move to malloc instead of MMAP for Constant Files by @JacobEngelbrecht in #3400
- Grid sample support by @AlexandreEichenberger in #3426
- Force AveragePool to work the same way as torch and onnxruntime with a flag by @chentong319 in #3429
- Add Global Constructors/Destructors for Constant Loading by @tungld in #3428
- Grid sample Optimization by @AlexandreEichenberger in #3442
- Add some run utility to OMPyInfer by @chentong319 in #3425
- Include dtype in lightweight hash for placeholder nodes by @3em0 in #3427
- [NNPA] Call ShapeInference before lowering zhigh by @tungld in #3440
- Use malloc on z/OS and mmap on others for loading an external constant file by @tungld in #3444
- Fix shape inference for some ops whose outputs/inputs may have encoding by @tungld in #3445
- Simplify PyRuntime by @chentong319 in #3446
- [NNPA] Fix several bugs in ProcessStickData by @tungld in #3447
- Implementation of ConvTranspose supporting dynamic Input Shape by @AlexandreEichenberger in #3443
- Stand-alone onnx-mlir by @AlexandreEichenberger in #3416
- Cleanup of ONNX Tensor data by @AlexandreEichenberger in #3448
- Update device control for ONNXToZHigh by @chentong319 in #3450
- Create dependabot.yml by @andife in #3423
- Hardlink Check by @christopherlmunoz in #3456
- small update to OMPyInfer by @chentong319 in #3449
- Load pytorch model in f32 type by @tungld in #3458
- Do not create two capsules for the same pointer by @tungld in #3462
- Updated scripts and instructions for standalone by @AlexandreEichenberger in #3457
- Execution session debug by @AlexandreEichenberger in #3461
- Fix the lowering of GatherND if dataRank - indicesRank > 1 by @tungld in #3463
- disable hsperfdata by @gongsu832 in #3464
- Upgrade to ONNX v1.21.0 for CVE-2026-34445 by @gongsu832 in #3469
- Disable dynamic size for NNPA Pool Operations by @AlexandreEichenberger in #3467
- Update of pyruntime doc to reflect the new interface (no functional change) by @AlexandreEichenberger in #3472
- OMPyCompile package by @chentong319 in #3460
- Support debug info of the compiled model for gdb/lldb by @chentong319 in #3422
- Beefing up runtime for invalid inputs by @AlexandreEichenberger in #3471
- [NNPA, zdnnx] Fix issues related to handling very large tensors by @tungld in #3475
- Fix compile options in JsonConfigFile.md by @tungld in #3476
- Fix GCC warnings in ONNX dialect build on Linux by @compilersutra in #3478
- [NNPA, zdnnx] Only show warning in debug mode by @tungld in #3480
- Remove redundant include by @primenumber in #3479
- A lightweight DimAnalysis that can be called from ShapeHelper by @tungld in #3473
- Add conversion of SinOp/CosOp ONNX to TOSA by @primenumber in #3466
- Embed compile info into the .so file by @tungld in #3482
- fix cve's for pillow, update torch and torchvision by @Sunny-Anand in #3488
- PY Bind signatures, and simplification of PyExecutionSession by @AlexandreEichenberger in #3494
- Helper script updated (fixLitTests and RunONNXModel) by @AlexandreEichenberger in #3495
- NNPA: Add the missed check for pool ops by @chentong319 in #3497
- Rename the package OMPyInfer to om_pyrt by @chentong319 in #3496
- Fix a bug in bound check for input by @chentong319 in #3498
- Update devcontainer example by @jorickert in #3470
- Container in C++ implementation by @AlexandreEichenberger in #3502
- Convolution optimizations for 1x1 (moved) and 2D conv (using im2col) by @AlexandreEichenberger in #3499
- update main to upcoming release branch by @Sunny-Anand in #3510
- Removed the conditions on the output shape for Conv for NNPA by @AlexandreEichenberger in #3512
- Support more onnx operators for the pattern unstick -> onnx op -> stick by @tungld in #3508
- Fixing lit tests for the right separators between tests by @AlexandreEichenberger in #3513
- Add two rewrite patterns: expand-slice and expand-stick by @tungld in #3514
- Add a new pass --onnx-cse-with-node-name to do common subexpression elimination for onnx ops by @tungld in #3516
- Add conversion of Clip/Erf/Tanh/GeluOp ONNX To TOSA by @primenumber in #3506
- [RFC] Implement reifyResultShapes for ONNXAddOp by @kimm240 in #3483
- Added support for optional outputPath to compiler interfaces by @AlexandreEichenberger in #3518
- Fix a lib dependency in torch_onnxmlir and the constant path by @tungld in #3517
- [NNPA] Rewrite N-D Transpose-MatMul pattern into 3D Transpose-MatMul for IBM z17 by @tungld in #3519
- [NNPA] Enable transpose-matmul in zdnnx and introduce an environment variable ZDNNX_MAX_TILE_SIZE_IN_MB by @tungld in #3521
New Contributors
- @kumarappan-cmyk made their first contribution in #3107
- @Arkar-Hema made their first contribution in #3109
- @andife made their first contribution in #3118
- @logeshwaranmcw made their first contribution in #3139
- @LekkalaSravya3 made their first contribution in #3156
- @jagadeeshvx made their first contribution in #3165
- @VishaliSenthilkumar made their first contribution in #3154
- @cyyever made their first contribution in #3182
- @adrian-selk made their first contribution in #3288
- @kimm240 made their first contribution in #3343
- @2876225417 made their first contribution in #3379
- @juhyunbae17 made their first contribution in #3373
- @JacobEngelbrecht made their first contribution in #3400
- @3em0 made their first contribution in #3427
- @compilersutra made their first contribution in #3478
- @primenumber made their first contribution in #3479
Full Changelog: v0.5.0.0...v.0.5.1.0