Releases: AmbiqAI/ns-cmsis-nn

v7.27.0

22 Jun 21:48
13eeb3b

7.27.0 (2026-06-22)

Features

  • add arm_where_s16 kernel + unit test (634aa2d)
  • add tensor shaping kernels (tile, broadcast_to, scatter_nd, mirror_pad, select_v2, where, reverse_sequence, dynamic_update_slice) (8335274)
  • add tensor shaping kernels (tile, broadcast_to, scatter_nd, mirror_pad, select_v2, where, reverse_sequence, dynamic_update_slice) (dd04ea4)
  • Update docs with MVE vs DSP vs Scalar C kernel-benchmarks.md (8246c60)
  • Update docs with MVE vs DSP vs Scalar C kernel-benchmarks.md (f33a188)

Bug Fixes

  • add new operator groups to CMake build options (3a3c1d7)
  • avoid single-rounding requantize overflow (#197) (d6735fd)
  • broadcast_to: decouple stride computation from broadcast mask (0b14668)
  • Correct formatting in arm_tile_s8.c (6266e9e)
  • Correct kernel-benchmarks.md cycle counts (9e75bea)
  • pass CI contracts (formatting, PDSC, SSoT, Zephyr/NSX wiring) (1cd2050)
  • Update benchmark with LP mode results (8c1eebc)
  • Update conv benchmark paragraph and add extra prj.conf setting to Zephyr doc (c2fa2c5)
  • zephyr: align Kconfig with renamed-knob contract (e2b1a61)

Refactoring

  • nsx: make ns-cmsis-nn SoC compatibility wildcard (#200) (3dc5bf2)
  • use arm_memcpy_s8/s16 instead of raw memcpy (ea005a6)

v7.26.0

19 May 19:38
2bb8195

7.26.0 (2026-05-19)

Features

  • ci: publish multi-toolchain staticlibs (0c33b87)

Bug Fixes

  • Add missing endif to zephyr/Kconfig (#192) (46b6b39)
  • release: harden pack changelog and CI image tagging (#183) (fa8e5ea)

v7.25.0

15 May 18:39
52710bd

7.25.0 (2026-05-15)

Features

  • release: complete static releases — SDK tarball, find_package, Zephyr and CMSIS-Pack prebuilt (#182) (aa304cb)

Bug Fixes

  • repair devcontainer build failures (#136) (47a075a)
  • toolchain: propagate NS_CMSIS_NN_TARGET_CPU to try_compile (#180) (c46b443)

v7.24.1

14 May 20:03
968d178

7.24.1 (2026-05-08)

Bug Fixes

  • Change vld1q_s32() to vldrwq_s32() in arm_rsqrt_s16.c (45e120e)
  • Change vld1q_s32() to vldrwq_s32() in arm_rsqrt_s16.c (95ffa0d)
  • correct pointer increment in per-row scalar (0c94945)
  • correct pointer increment in per-row scalar broadcast path for arm_squared_difference_s16 and s8 (055edf9)
  • correct pointer increments in elementwise functions to ensure proper data handling (70d893d)
  • Ensure helia-core-tester.yml fetches submodule tag (996f0f4)
  • force submodule tag fetch in coverage-merge-summary job (817b9d2)
  • update helia tag retrieval to use abbrev=0 and handle missing tags (853fa42)
  • update subproject commit reference in helia-core-tester (9951017)

v7.24.0

24 Apr 23:30
8d62c8c

7.24.0 (2026-04-24)

Features

  • add NSX module support (source + prebuilt modes) (#128) (9d8badc)
  • add pattern to dependabot to group bot PRs (#127) (56722a1)
  • route ATfE/Clang to ACLE intrinsics path (#131) (99d4435)

v7.23.0

10 Apr 19:15
295b105

7.23.0 (2026-04-10)

Features

  • add arm_sqrt_s16 kernel with MVE-optimized path (af1429c)
  • add arm_sqrt_s16 kernel with MVE-optimized path (3c97bcd)
  • add depthwise fast path to arm_convolve_wrapper_s16 (45d68a3)
  • add int16 rsqrt kernels and generated tests (5b4e8ac)

Bug Fixes

  • add arm_convolve_s16_depthwise.c to PDSC (c78ce6b)
  • add arm_sqrt_s16.c to PDSC source file listing (2e87b7e)
  • depthwise scalar path uses arm_nn_requantize for int32 bias, add uint16 offset overflow guard (932650a)
  • remove unnecessary casts discarding const from LUT pointers (576da80)

Refactoring

  • remove duplicate static helper, call arm_convolve_s16_group_ch_mult_1 from wrapper (73679b3)
  • rename arm_convolve_s16_depthwise to arm_convolve_s16_group_ch_mult_1 (03b2f86)

v7.22.0

24 Mar 18:29
60e75a9

7.22.0 (2026-03-24)

Features

  • Add int8 SQRT kernel. (2aa4bc5)
  • Add int8/int16 Squared Difference (fa6ba21)
  • Add MVE path to squared difference, add int16 tests (75e7d06)
  • Update ARM.CMSIS-NN.pdsc with squared_difference (f69896a)
  • Vectorize arm_sqrt_s8.c and place LUT into TCM (6d38e45)

Bug Fixes

  • Default RefactoredTestGen to legacy Keras and fall back to modern Keras only when tf_keras is unavailable (9344acf)
  • Scope Keras/Tensorflow upgrade requirements to SQRT test generation only by restoring baseline UnitTest deps and reverting non-SQRT generators to tf_keras (010a198)

v7.21.0

19 Feb 18:51
c8a2336

7.21.0 (2026-02-18)

Features

  • Add s32 variant of concatenation (7e95d01)
  • Add s32 variant of concatenation (825d406)
  • Add support for s32 strided slice (1df6b34)
  • Add support for s32 strided slice. Add int32 io support to test.py (4ba775d)
  • Changes arm_strided_slice_s32() to use arm_memcpy_s32() (5afc9df)

Bug Fixes

  • Add shape checks, correct loop counter type to prevent overflow, move copy size out of loop, and add unit tests (02217c5)
  • Move validate_s32() to Utils/validate.h and update zephyr/CMakeLists.txt to include s32 variant of strided slice (049714d)
  • Remove duplicate output_tensor.h test data (aa9f026)
  • Update ARM.CMSIS-NN.pdsc file with arm_concatenation_s32.c (46dccd6)

v7.20.0

06 Feb 23:26
0d0e9db

7.20.0 (2026-02-06)

Features

  • Add int8/int16 absolute value (09e1b8d)

v7.19.0

15 Jan 20:08
ce244a3

7.19.0 (2026-01-14)

Features

  • Add Resize Nearest Neighbor Operator (3c49c3f)
  • Move nearest neighbor coordinate mapping into precomputed x/y arrays stored in ctx->buf (feeb063)
  • Move scale and offset computation out of GetNearestNeighbor (b37d6e1)

Bug Fixes

  • Add tflite-micro back to requirements.txt (16b2195)
  • Correct resize functions to pass in correct size to arm_memcpy invocation. (8723c80)
  • Correct whitespace inconsistencies (d6861ec)
  • Include ARM_CMSIS_NN_ARG_ERROR as a possible return code (51d77d1)
  • Update pdsc file (04c8839)
  • Update repository URL for Ethos-U core platform in test setup script (c160efe)
  • Update URL for Ethos-U core platform (7e0fd73)