Release v0.9.7
Pre-release
📦 What's Included
C++ Build Artifacts
Contains: Compiled C++ libraries, public headers, CMake config files, and dependencies.
System Packages (.deb / .rpm)
- Ubuntu 22.04: DEB packages are attached under Assets
- Ubuntu 24.04: DEB packages are attached under Assets
- Ubuntu 24.04 (arm64): DEB packages are attached under Assets
- Fedora 39: RPM packages are attached under Assets
- Includes: umd-runtime, umd-dev and umd-python packages
Python Wheels
- manylinux_2_28 wheels for Python 3.9-3.13 (x86_64 and aarch64)
- Compatible with Ubuntu 20.04+, RHEL 8+, and other modern Linux distributions
- Includes Python bindings for tt-umd
- PyPI: https://pypi.org/project/tt-umd/
🚀 Release Notes for v0.9.7
Changed
- Release notes for version 0.9.7 not found in CHANGELOG file.
📋 Pull Requests Summary
- Publishing new version 0.9.7 (#2930)
- Introduce EthFirmwareHeartbeatError, UnexpectedRoutingFirmwareConfigError and EthFirmwareMismatchError (#2904)
- Fix BH NOC1 ETH harvested mapping for translated space (#2923)
- Simulation: per-chip SimulationSocket (#2800)
- Add CMFWMismatchError and UnsupportedCMFWError to TopologyDiscovery health errors (#2900)
- Add generic OpTimeoutGuard timing primitive (#2842)
- Update ClusterOptions in MicrobenchmarkOpenCluster test (#2921)
- Silence dead-store warning in warm reset stale-FD test (#2909)
- Return device health related errors in TopologyDiscovery (#2866)
- Reduce fetch-depth for metal in ttsim CI (#2905)
- wormhole: fix ARC telemetry address rebase (#2857)
- Disable flaky test (#2898)
- Use GetSocDescAbsPath for ClusterWH.OneDramOneTensixNoEthSocDesc test (#2867)
- ci: fix ttsim cluster-desc paths after tt-metal submodule move (#2903)
- Add missing chip_to_bus_id to 6U yaml example (#2897)
- Add stale file descriptor Cluster recovery test for warm reset (#2809)
- Move noc_multicast_write into TTDevice (#2704)
- Use tt_kmd_lib for setting power state and KMD API version (#2854)
- Introduce UMD debug-level logs in exalens tests (#2850)
- Add register transactions in DeviceProtocol/TTDevice (#2803)
- Add reset function to tt-kmd-lib (#2844)
- Increase latest supported CMFW version to 19.7.1 (#2855)
- Add reconfigure version for registers (#2818)
- Fix virtual call to set_arc_coordinate during WormholeTTDevice construction (#2847)
- Expose DeviceTimeoutError to Python (#2848)
- Fix RobustMutex's robustness (#2821)
- Add Blackhole galaxy (UBB) cluster descriptor example (#2846)
- Add DeviceTimeoutError exception type (#2840)
- Fix for 6U failing (#2839)
- Display ARC SMC Init status in ArcStartupError (#2829)
- Clarify BlackholeTTDevice constructor dispatch (#2838)
- write_to_core_range for PCIProtocol (#2573)
- SWEmuleChip: Skip RISC-reset Broadcast + Fix Duplicate DRAM Backing (#2791)
- More CoreCoord methods in TTDevice (#2817)
- Quasar TRISC reset condition fix (#2814)
- noc_multicast over EthernetBroadcast for Remote (#2572)
- Make P150 multi-chip cluster descs 2-col harvested (#2828)
- Modify bug template (#2827)
- Rename SiliconDriver tests to Cluster (#2831)
- Move SocArchDescriptor arg from init to TTDevice ctor (#2801)
- Temp disable 6u (#2830)
- Make HangDetector NOC register read injectable (#2799)
- device/soc_descriptor: fix off-by-one boundary check in serialize_dram_cores (#2624)
- Remove redundant build-image yml (#2766)
- Remove unused write_regs (#2527)
- Implement RemoteProtocol's write to core range (#2571)
- Guard methods that will segfault on uninitialized TTDevice (#2781)
- Default num channels in tests (#2805)
- Expose ETH heartbeat and link status (#2789)
- Fix truncation of telemetry_table_addr to uint32_t (#2790)
- Remove p100 from PR and merge queue runs (#2796)
- Retrain DRAM cores on BIST failure (#2768)
- EthernetBroadcast implementation for Single Remote (#2570)
- Reduce some log levels to log_debug (#2777)
- Temporarily disable TestFirmwareInfoProvider.FanSpeed (#2780)
- pcie: add portable SIMD memcpy paths for non-x86 (GCC/Clang) (#2710)
- Skip warning if local only topo discovery (#2385)
- Fix CMake targets (#2187)
- Fix exalens build and test (#2784)
- Add SMN read/write round-trip test on RTL simulator (#2769)
- Clarify sysmem NOC address mismatch errors (likely stale process) (#2776)
- Refactor Chip/TTDevice constructors (part 2) (#2585)
- Create dedicated EthernetBroadcast (#2569)
- feat: add Cluster factory methods and fix ClusterOptions defaults (#2731)
📝 Detailed Pull Request Information
Publishing new version 0.9.7 (#2930)
Author: Aleksandar Djordjevic | Date: 2026-06-26 | Commit: 90ce6d9
Publishing new version 0.9.7
Introduce EthFirmwareHeartbeatError, UnexpectedRoutingFirmwareConfigError and EthFirmwareMismatchError (#2904)
Author: Aleksa Marković | Date: 2026-06-26 | Commit: bb50720
Introduce EthFirmwareHeartbeatError,
UnexpectedRoutingFirmwareConfigError and EthFirmwareMismatchError.
Fix BH NOC1 ETH harvested mapping for translated space (#2923)
Author: Nenad Buncic | Date: 2026-06-25 | Commit: d15915d
This pull request introduces a Blackhole-architecture-specific
workaround for coordinate translation of harvested ETH cores in the
SocDescriptor logic. It ensures that harvested ETH cores are mapped
correctly when using the NOC1 network, addressing previously known
translation issues. Additionally, the test suite is updated to reflect
these fixes by removing the ETH core skip condition for harvested cores
on Blackhole.
Simulation: per-chip SimulationSocket (#2800)
Author: Pavle Janevski | Date: 2026-06-25 | Commit: cd33a7c
Adds SimulationSocket — a per-chip UNIX domain socket that surfaces a
simulated device on disk ("the card"). Standalone and independently
tested; nothing consumes it yet — the discovery that drives the
host-vs-client decision is the stacked PR.
- Binds a per-chip socket; reclaims a stale socket left by a dead owner.
- try_create is the non-throwing host-vs-client arbiter (bind/reclaim → host; a live owner → nullptr).
- Accept-and-drop (liveness only); request serving lands in a later PR.
- Closes and removes the socket file on destruction.
Add CMFWMismatchError and UnsupportedCMFWError to TopologyDiscovery health errors (#2900)
Author: Aleksa Marković | Date: 2026-06-25 | Commit: 873ab40
Add CMFWMismatchError and UnsupportedCMFWError to TopologyDiscovery
health errors.
Add generic OpTimeoutGuard timing primitive (#2842)
Author: Pavle Janevski | Date: 2026-06-25 | Commit: b3ed06b
Adds one small, standalone, header-only timing primitive in utils,
generic and unit-testable without hardware:
OpTimeoutGuard times a single operation against a fixed per-op
wall-clock budget and reacts to overruns. record_and_check() takes the
timestamp sampled immediately before the op began; if the elapsed time
exceeds the budget the guard reacts, otherwise it does nothing. A budget
of 0 disables the check entirely (matching the utils::check_timeout
convention).
Two caller-supplied behaviors layer on top of the bare budget check:
- an optional false-alarm veto (is_false_alarm), consulted only on an
apparent overrun: returning true means "false positive, the op is
healthy" and the overrun is ignored; returning false (or having no
callback) confirms it.
- an overrun action (on_overrun), invoked on a confirmed overrun as
on_overrun(delta, args...): the measured delta followed by whatever
extra arguments the caller forwarded to record_and_check().
The action and the per-call arguments are deliberately generic so the
guard stays reusable: the domain-specific reaction (e.g. throwing a
typed timeout error with operation metadata) lives at the call site, not
in the guard. The elapsed delta is captured the moment the overrun is
detected, before the (possibly slow) false-alarm probe runs, so the
reported delta reflects the op's time and not the probe's.
This is the timing building block for the per-op MMIO timeout work
(#2629), which consumes it from its own memcpy adapter — split into its
own PR so it can be reviewed and tested in isolation.
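The guard semantics described above can be sketched roughly as follows. This is an illustrative reconstruction from the PR text only, not the UMD header: the class name OpTimeoutGuardSketch, the constructor shape, and the boolean return value are all assumptions.

```cpp
#include <chrono>
#include <functional>
#include <utility>

// Hypothetical sketch of the described behavior; not the actual UMD class.
template <typename... Args>
class OpTimeoutGuardSketch {
public:
    using Clock = std::chrono::steady_clock;
    using Duration = Clock::duration;
    using OverrunAction = std::function<void(Duration, Args...)>;
    using FalseAlarmProbe = std::function<bool()>;

    OpTimeoutGuardSketch(Duration budget, OverrunAction on_overrun, FalseAlarmProbe is_false_alarm = nullptr)
        : budget_(budget), on_overrun_(std::move(on_overrun)), is_false_alarm_(std::move(is_false_alarm)) {}

    // `start` is the timestamp sampled immediately before the op began.
    // Returns true only when a confirmed overrun was reported (assumed return type).
    bool record_and_check(Clock::time_point start, Args... args) const {
        if (budget_ == Duration::zero()) {
            return false;  // A budget of 0 disables the check entirely.
        }
        // Capture the delta before the (possibly slow) false-alarm probe runs.
        const Duration delta = Clock::now() - start;
        if (delta <= budget_) {
            return false;  // Within budget: do nothing.
        }
        if (is_false_alarm_ && is_false_alarm_()) {
            return false;  // Veto: the op is healthy, ignore the overrun.
        }
        if (on_overrun_) {
            on_overrun_(delta, std::forward<Args>(args)...);
        }
        return true;
    }

private:
    Duration budget_;
    OverrunAction on_overrun_;
    FalseAlarmProbe is_false_alarm_;
};
```

In this sketch the domain-specific reaction (for example, throwing a typed timeout error) would live in the on_overrun callable supplied at the call site, which mirrors the split the PR describes.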
Update ClusterOptions in MicrobenchmarkOpenCluster test (#2921)
Author: Aleksa Jovanovic | Date: 2026-06-25 | Commit: 394f314
Since #2731 changed the ClusterOptions default for
num_host_mem_ch_per_mmio_device from 0 to std::nullopt, the
MicrobenchmarkOpenCluster.ClusterConstructor benchmark started
auto-allocating host memory channels (hugepage mapping) in the
constructor. This dominated the measured time and regressed the
benchmark.
This restores the benchmark to measuring pure Cluster construction cost
by setting the field explicitly to 0.
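The difference between an unset field and an explicit 0 can be illustrated with a small sketch. The struct ClusterOptionsSketch, the helper resolve_host_mem_channels, and the auto_count parameter are hypothetical stand-ins; only the field name and its std::nullopt default come from the description above.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the relevant ClusterOptions field.
struct ClusterOptionsSketch {
    std::optional<uint32_t> num_host_mem_ch_per_mmio_device = std::nullopt;
};

// Assumed policy for illustration: an unset field lets the constructor pick
// `auto_count`, while an explicit value (including 0) is honored as-is.
inline uint32_t resolve_host_mem_channels(const ClusterOptionsSketch& options, uint32_t auto_count) {
    return options.num_host_mem_ch_per_mmio_device.value_or(auto_count);
}
```

Under this reading, a benchmark that wants no hugepage mapping has to set the field to 0 explicitly, because leaving it unset now means "allocate automatically".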
Silence dead-store warning in warm reset stale-FD test (#2909)
Author: Copilot | Date: 2026-06-25 | Commit: 73d9ccd
The warning is real: P1 intentionally keeps a Cluster alive to hold
stale FDs, but the variable had no explicit read. This change preserves
behavior and makes intent explicit with a no-op use.
auto cluster = std::make_unique<Cluster>();
(void)cluster;
Return device health related errors in TopologyDiscovery (#2866)
Author: Aleksa Marković | Date: 2026-06-24 | Commit: 7b53909
Return device health related errors that occurred in TopologyDiscovery.
The errors are accessible through
ClusterDescriptor::get_health_errors().
Reduce fetch-depth for metal in ttsim CI (#2905)
Author: Aleksa Marković | Date: 2026-06-24 | Commit: 8d5c30f
Reduce fetch-depth for metal in ttsim CI. Fetching tt-metal can take up
to 4 minutes. This PR aims to reduce the time spent fetching tt-metal in
ttsim CI. Raise ttsim test job timeout to 60 minutes.
wormhole: fix ARC telemetry address rebase (#2857)
Author: Matt Craighead | Date: 2026-06-24 | Commit: 18330df
NOC_NODEID_X_0/Y_0 in the reset unit hold the telemetry table/data
addresses as seen by the ARC CPU (full addresses within its CSM region),
not CSM-relative offsets. Rebasing them with ARC_CSM_OFFSET_NOC
(0x810000000) double-counted the CSM base, landing 0x10000000 past the
real data.
Because Wormhole firmware still supports legacy telemetry, UMD would
fall back to that instead of erroring, masking the error.
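The double-counting can be written out as arithmetic. In the sketch below, ARC_CSM_OFFSET_NOC is the value quoted above; ARC_CSM_BASE_ARC_VIEW is an assumption inferred from the stated 0x10000000 discrepancy, and both function names are hypothetical. The actual fix in UMD may be expressed differently.

```cpp
#include <cstdint>

constexpr uint64_t ARC_CSM_OFFSET_NOC = 0x810000000ULL;    // Value quoted in the PR description.
constexpr uint64_t ARC_CSM_BASE_ARC_VIEW = 0x10000000ULL;  // Assumption inferred from the stated error.

// Buggy rebase: treats a full ARC-view address as if it were a CSM-relative offset.
constexpr uint64_t rebase_double_counted(uint64_t arc_view_addr) {
    return ARC_CSM_OFFSET_NOC + arc_view_addr;
}

// One way to avoid the double count: strip the ARC-view CSM base before rebasing.
constexpr uint64_t rebase_once(uint64_t arc_view_addr) {
    return ARC_CSM_OFFSET_NOC + (arc_view_addr - ARC_CSM_BASE_ARC_VIEW);
}

// The buggy result lands exactly one CSM base past the intended address.
static_assert(rebase_double_counted(0x10001234ULL) - rebase_once(0x10001234ULL) == 0x10000000ULL,
              "double-counted CSM base");
```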
Disable flaky test (#2898)
Author: Nenad Buncic | Date: 2026-06-24 | Commit: f6ff26b
No description section found.
Use GetSocDescAbsPath for ClusterWH.OneDramOneTensixNoEthSocDesc test (#2867)
Author: William Ly | Date: 2026-06-23 | Commit: 52d2c06
ClusterWH.OneDramOneTensixNoEthSocDesc fails when running in the upstream
container, where compile and runtime paths differ. Other tests do not see
this issue because they use test_utils::GetSocDescAbsPath.
[ RUN ] ClusterWH.OneDramOneTensixNoEthSocDesc
2026-06-22 13:15:35.543 | info | UMD | Cluster constructor started. (cluster.cpp:348)
unknown file: Failure
C++ exception with description "SocDescriptor file does not exist at path: tests/soc_descs/wormhole_b0_one_dram_one_tensix_no_eth.yaml
Location: /work/tt_metal/third_party/umd/device/soc_arch_descriptor.cpp:89
ci: fix ttsim cluster-desc paths after tt-metal submodule move (#2903)
Author: Ridvan Song | Date: 2026-06-23 | Commit: 7d6f1aa
No description section found.
Add missing chip_to_bus_id to 6U yaml example (#2897)
Author: Nenad Buncic | Date: 2026-06-23 | Commit: 4d90b75
This pull request adds a new mapping for chip IDs to bus IDs in the
cluster descriptor YAML file. This provides an explicit association
between each chip and its corresponding bus ID.
Add stale file descriptor Cluster recovery test for warm reset (#2809)
Author: Nenad Buncic | Date: 2026-06-23 | Commit: 663ad8e
Reproduces the scenario from issue #2463 where UMD cannot reinitialize
in Docker on galaxy machines when stale file descriptors keep old cdevs
alive after PCI remove/re-probe, blocking device access for other
processes in the container.
Requires KMD >= 2.8.0, which disables PCIe hot-plug on galaxies to avoid
this.
Move noc_multicast_write into TTDevice (#2704)
Author: Pavle Janevski | Date: 2026-06-22 | Commit: a95e3cc
Push noc_multicast_write variance from the chip layer into TTDevice,
so LocalChip and SimulationChip both reduce to thin delegations.
This PR is part of the effort to push everything into the TTDevice layer
so that the Chip class stays unchanged for every backend.
Use tt_kmd_lib for setting power state and KMD API version (#2854)
Author: Aleksa Marković | Date: 2026-06-22 | Commit: 573c51f
Use tt_kmd_lib for setting power state and KMD API version
Introduce UMD debug-level logs in exalens tests (#2850)
Author: Aleksa Marković | Date: 2026-06-22 | Commit: 3534341
No description section found.
Add register transactions in DeviceProtocol/TTDevice (#2803)
Author: Nenad Buncic | Date: 2026-06-21 | Commit: ed944bc
This pull request adds support for explicit register access methods in
the device protocol layer, ensuring register accesses are properly
validated for alignment and size. It introduces new methods for reading
from and writing to device registers, implements these methods for all
supported protocols (JTAG, PCIe, Remote), and enforces 4-byte alignment
and size constraints for register operations. Additionally, it refactors
the TTDevice interface and implementation to expose these new register
access methods to users.
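A validation step of the kind described might look like the sketch below. The function names are hypothetical, and the exact size rule is an assumption: the PR text states only that 4-byte alignment and size constraints are enforced, so here a non-zero multiple of 4 bytes is assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

constexpr std::size_t REGISTER_WIDTH_BYTES = 4;

// Assumed rule: the address is 4-byte aligned and the size is a non-zero multiple of 4.
constexpr bool is_valid_register_access(uint64_t addr, std::size_t size) {
    return addr % REGISTER_WIDTH_BYTES == 0 && size != 0 && size % REGISTER_WIDTH_BYTES == 0;
}

// Throwing wrapper, as a protocol-level entry point might use before touching the device.
inline void validate_register_access(uint64_t addr, std::size_t size) {
    if (!is_valid_register_access(addr, size)) {
        throw std::invalid_argument("register access must be 4-byte aligned and sized");
    }
}
```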
Add reset function to tt-kmd-lib (#2844)
Author: Aleksa Marković | Date: 2026-06-20 | Commit: e9772f5
Add reset functionality to tt-kmd-lib, removing the need for a direct
ioctl call in PciDevice.
Increase latest supported CMFW version to 19.7.1 (#2855)
Author: Aleksa Marković | Date: 2026-06-19 | Commit: e09a6b6
Increase the latest supported CMFW version to 19.7.1.
Add reconfigure version for registers (#2818)
Author: Nenad Buncic | Date: 2026-06-19 | Commit: 44455fa
This pull request refactors and extends the TlbWindow and
SiliconTlbWindow classes to add new register-level reconfiguration
methods, and significantly simplifies the logic for reconfiguring and
transferring memory blocks. The changes introduce reusable helper
functions and improve maintainability by reducing code duplication.
Fix virtual call to set_arc_coordinate during WormholeTTDevice construction (#2847)
Author: Copilot | Date: 2026-06-18 | Commit: 7cd8aba
The RemoteCommunication-based constructor was calling
set_arc_coordinate() without qualification, unlike the other two
WormholeTTDevice constructors which already used
WormholeTTDevice::set_arc_coordinate(). Using the fully-qualified name
makes the non-virtual dispatch explicit and consistent.
// Before
TTDevice(std::move(remote_communication), ...) {
    set_arc_coordinate();  // unqualified virtual call: analyzer warning
// After
TTDevice(std::move(remote_communication), ...) {
    WormholeTTDevice::set_arc_coordinate();  // explicit, matches other constructors
Expose DeviceTimeoutError to Python (#2848)
Author: Pavle Janevski | Date: 2026-06-18 | Commit: 3d6c490
DeviceTimeoutError is the host-side MMIO per-op timeout exception. It
was not exposed to the Python bindings, so a Python caller could not
catch it specifically — a timeout would surface only as the generic base
exception (or crash the host). This exposes it as a catchable Python
exception so callers can recover gracefully.
It is thrown wrapped as UmdException<DeviceTimeoutError>. Registering
it as a Python exception after the base error.UmdBaseException (and as
a subclass of it) makes its translator take precedence, so both except error.DeviceTimeoutError and except error.UmdBaseException catch it.
This mirrors the existing SigbusError binding.
Fix RobustMutex's robustness (#2821)
Author: Bojan Roško | Date: 2026-06-17 | Commit: d164615
I've verified manually (and there is one test in this PR) that the
robust mutex still works correctly on abort or if a C++ program receives
a signal. The only way it can be broken, which is actually easily
reproducible, is to unmap the pthread mutex while it is locked; the
kernel then loses track of the relevant structure which is used to
recover the lock if the process terminates. This uncovered a
use-after-free bug in tt-telemetry which led to this scenario.
Nevertheless, I think this is a good approach for RobustMutex, as
leaking mmaps this way is very unlikely to lead to further issues (you
would have to use >65k locks and let them leak).
Add Blackhole galaxy (UBB) cluster descriptor example (#2846)
Author: Pavle Janevski | Date: 2026-06-17 | Commit: 5fa07eb
Adds a 32-chip Blackhole galaxy cluster descriptor (board_type: ubb_blackhole, all chips MMIO-attached) to the example set. This is the
first UBB-Blackhole descriptor in the repo and lets mock/offline tooling
bring up a 32-chip Blackhole device. The cluster descriptor schema
previously only allowed ubb as a UBB board type, so the schema enum is
extended with the canonical ubb_blackhole and ubb_wormhole names
that UMD already parses and emits via board_type_canonical_name_map.
Add DeviceTimeoutError exception type (#2840)
Author: Pavle Janevski | Date: 2026-06-17 | Commit: 489c3fc
Adds DeviceTimeoutError, a structured exception for host-side MMIO
operations that overrun a wall-clock budget. It is the dependency leaf
of the per-op MMIO timeout work (#2629) — split out into its own PR so
the timeout mechanism can stack on top of it.
DeviceTimeoutError is distinct from SigbusError: SigbusError
signals the platform PCIe completion timeout (a hardware-detected
fault), whereas DeviceTimeoutError signals that a software per-op
budget elapsed — typically because writes were piling up against a slow
or dead NOC before SIGBUS could fire. Keeping them as separate types
lets callers branch on retry policy.
There are no throwers yet; this PR only introduces the type and its
metadata.
Fix for 6U failing (#2839)
Author: Nenad Buncic | Date: 2026-06-17 | Commit: 746b945
This PR fixes the way the ARC interrupt is read: previously it was read
once; now it is read repeatedly until either the bit is cleared or the
timeout expires.
Note: Issue #2834 gives more detail on why this fix is suitable.
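The read-until-cleared-or-timeout pattern can be sketched generically as below. The function name wait_until_cleared, the callback-based register read, and the poll interval are all hypothetical; the real code reads the ARC interrupt bit through UMD's own device access path.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Polls `is_bit_set` until it reports the bit cleared or `timeout` elapses.
// Returns true if the bit cleared in time, false on timeout.
inline bool wait_until_cleared(const std::function<bool()>& is_bit_set,
                               std::chrono::milliseconds timeout,
                               std::chrono::milliseconds poll_interval = std::chrono::milliseconds(1)) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (true) {
        if (!is_bit_set()) {
            return true;  // Bit cleared before the timeout.
        }
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;  // Still set when the timeout expired.
        }
        std::this_thread::sleep_for(poll_interval);
    }
}
```

A single read, as in the earlier behavior, corresponds to giving up after the first is_bit_set() call, which reports failure even when the bit is about to clear.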
Display ARC SMC Init status in ArcStartupError (#2829)
Author: Aleksa Marković | Date: 2026-06-17 | Commit: b88517a
Display ARC SMC Init status in ArcStartupError. As of CMFW 19.11, ARC
firmware provides flags that give more detail when ARC startup fails.
These are now surfaced in ArcStartupError.
Clarify BlackholeTTDevice constructor dispatch (#2838)
Author: Copilot | Date: 2026-06-17 | Commit: af95d45
Clang Static Analyzer flagged BlackholeTTDevice::set_arc_coordinate
being called during construction, where virtual dispatch is
intentionally bypassed. This updates the constructor code to make that
non-virtual dispatch explicit and remove the ambiguity.
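The underlying C++ rule is that a virtual call made from a constructor resolves to the version of the class currently being constructed, never to a more-derived override. The types below are hypothetical and unrelated to the real TTDevice hierarchy; they only demonstrate the rule and the explicit qualified-call style.

```cpp
#include <string>

struct BaseDevice {
    // During BaseDevice construction the dynamic type is BaseDevice,
    // so this call resolves to BaseDevice::name().
    BaseDevice() { name_seen_in_base_ctor = name(); }
    virtual ~BaseDevice() = default;
    virtual std::string name() const { return "base"; }
    std::string name_seen_in_base_ctor;
};

struct DerivedDevice : BaseDevice {
    // The qualified call makes the non-virtual dispatch explicit to readers and analyzers.
    DerivedDevice() { name_seen_in_derived_ctor = DerivedDevice::name(); }
    std::string name() const override { return "derived"; }
    std::string name_seen_in_derived_ctor;
};
```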
write_to_core_range for PCIProtocol (#2573)
Author: Bojan Roško | Date: 2026-06-17 | Commit: 8a8703d
Unifies multicast so that it simply calls write_to_core_range for both
local and remote chips.
SWEmuleChip: Skip RISC-reset Broadcast + Fix Duplicate DRAM Backing (#2791)
Author: Armin Ale | Date: 2026-06-16 | Commit: 96e5f93
Two SWEmuleChip fixes for tt-emule's software-emulation backend, both
gated on ChipType::SWEMULE (no-ops for silicon/simulation).
More CoreCoord methods in TTDevice (#2817)
Author: Aleksa Marković | Date: 2026-06-16 | Commit: 38dc723
Adapt more methods in TTDevice to use CoreCoord. Fix potential NOC
mismatches when accessing ARC core inside TTDevice.
Quasar TRISC reset condition fix (#2814)
Author: Bojan Roško | Date: 2026-06-16 | Commit: b77bd55
The intention of the condition was clearly to call reset_tensix if any
of the trisc cores are selected on grendel.
This change has no effect on non-grendel architectures.
noc_multicast over EthernetBroadcast for Remote (#2572)
Author: Bojan Roško | Date: 2026-06-16 | Commit: 6f53d24
As a crown jewel of the whole stack of changes, enable ethernet
broadcast when calling multicast for remote chip.
Make P150 multi-chip cluster descs 2-col harvested (#2828)
Author: Pavle Janevski | Date: 2026-06-16 | Commit: 19bdbd3
Commit 3c73908 updated the single-chip blackhole_P150.yaml example to
be 2-column harvested (harvest_mask: 192), since P150 now requires two
columns harvested. The multi-chip P150 cluster descriptor examples were
not updated and still had harvest_mask: 0.
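As a quick check of the numbers, 192 is 0b11000000, which has two bits set. The helper below counts set bits under the assumption that each set bit in harvest_mask marks one harvested column; the function name is hypothetical.

```cpp
#include <cstdint>

// Counts set bits in the mask (assumption: one bit per harvested column).
constexpr int count_harvested(uint32_t harvest_mask) {
    int count = 0;
    for (; harvest_mask != 0; harvest_mask &= harvest_mask - 1) {
        ++count;  // Each iteration clears the lowest set bit.
    }
    return count;
}

static_assert(count_harvested(192) == 2, "192 == 0b11000000 has two bits set");
static_assert(count_harvested(0) == 0, "a zero mask means nothing is harvested");
```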
Modify bug template (#2827)
Author: Nenad Buncic | Date: 2026-06-16 | Commit: 2084340
Add UMD commit, KMD and FW versions and type of machine
Rename SiliconDriver tests to Cluster (#2831)
Author: Aleksa Marković | Date: 2026-06-15 | Commit: 07f5161
Rename SiliconDriver{WH,BH} test suite to Cluster{WH,BH}.
Move SocArchDescriptor arg from init to TTDevice ctor (#2801)
Author: Aleksa Marković | Date: 2026-06-15 | Commit: dc4792a
Move SocArchDescriptor argument from init_tt_device() to TTDevice
ctor.
Temp disable 6u (#2830)
Author: Nenad Buncic | Date: 2026-06-15 | Commit: 7cbf334
6U fails consistently in CI, but the failure is not reproducible locally.
Make HangDetector NOC register read injectable (#2799)
Author: Pavle Janevski | Date: 2026-06-15 | Commit: 65060be
Structural prep extracted from #2629, addressing review feedback that
HangDetector should stay unaware of anything below the protocol level
(no TlbWindow, no locks).
HangDetector now reads its NOC liveness register through an injectable
NocRegReader, defaulting to the current protocol-based read. A higher
layer can override it via set_noc_reg_reader() with a reader that owns
its own probe window and lock, keeping that knowledge out of
HangDetector. The override consumer lands later with the per-op MMIO
timeout work. Behavior is unchanged.
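The injection pattern can be sketched with a std::function member. Everything here is illustrative: the class name HangDetectorSketch, the reader signature, and the all-ones liveness rule are assumptions, and only the names NocRegReader and set_noc_reg_reader() come from the description above.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

class HangDetectorSketch {
public:
    // Assumed signature: returns the raw value of the NOC liveness register.
    using NocRegReader = std::function<uint32_t()>;

    explicit HangDetectorSketch(NocRegReader default_reader) : reader_(std::move(default_reader)) {}

    // A higher layer can swap in a reader that owns its own probe window and lock.
    void set_noc_reg_reader(NocRegReader reader) { reader_ = std::move(reader); }

    // Assumed liveness rule for illustration only: an all-ones read means the device hung.
    bool is_hung() const { return reader_() == 0xFFFFFFFFu; }

private:
    NocRegReader reader_;
};
```

The detector only ever calls reader_(), so it needs no knowledge of TLB windows or locks; those stay inside whichever callable the caller supplies.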
device/soc_descriptor: fix off-by-one boundary check in serialize_dram_cores (#2624)
Author: Sayak Mondal | Date: 2026-06-12 | Commit: a21de68
serialize_dram_cores uses > instead of >= when checking whether a
DRAM core coordinate falls outside the grid, causing a coordinate equal
to exactly grid_size.x or grid_size.y to pass the outer filter while
being silently dropped by the inner < guard. This results in an empty
YAML sequence [] being emitted for that DRAM bank — a structurally
valid but semantically wrong entry in the serialized descriptor.
Every other boundary comparison in the same function (lines 61 and 93)
and across the file correctly uses < / >=. This was a plain > vs
>= slip.
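The mismatch between the two checks can be reproduced in isolation. The names below are hypothetical and the functions are reduced to the comparisons alone; they are not the serialize_dram_cores code.

```cpp
#include <cstddef>

struct GridSize {
    std::size_t x;
    std::size_t y;
};

// Outer filter as it was: a coordinate equal to the grid size is not rejected.
constexpr bool outside_grid_buggy(std::size_t x, std::size_t y, GridSize grid) {
    return x > grid.x || y > grid.y;
}

// Outer filter after the fix: the exact complement of the inner guard.
constexpr bool outside_grid_fixed(std::size_t x, std::size_t y, GridSize grid) {
    return x >= grid.x || y >= grid.y;
}

// Inner guard: only coordinates strictly inside the grid are emitted.
constexpr bool inside_grid(std::size_t x, std::size_t y, GridSize grid) {
    return x < grid.x && y < grid.y;
}
```

With a 10x12 grid, the coordinate (10, 0) is neither rejected by the buggy outer filter nor accepted by the inner guard, which is the gap that produced the empty sequence.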
Remove redundant build-image yml (#2766)
Author: Bojan Roško | Date: 2026-06-10 | Commit: c962197
This workflow became redundant with the addition of docker-image.
We don't need to manually run docker builds anymore.
Remove unused write_regs (#2527)
Author: Bojan Roško | Date: 2026-06-10 | Commit: 90c6fa7
Remove unused function.
Note that SiliconTlbWindow::write_regs is still there.
Implement RemoteProtocol's write to core range (#2571)
Author: Bojan Roško | Date: 2026-06-10 | Commit: 696e12b
Further prepare usage of EthernetBroadcast by adding it to remote
protocol
Guard methods that will segfault on uninitialized TTDevice (#2781)
Author: Aleksa Marković | Date: 2026-06-10 | Commit: 3cabdc3
Guard methods that will segfault on uninitialized TTDevice
Default num channels in tests (#2805)
Author: Bojan Roško | Date: 2026-06-10 | Commit: 3090ea0
After the default was changed in ClusterOptions, this changes the
default used in tests.
This PR should show a significant speedup.
Expose ETH heartbeat and link status (#2789)
Author: Nenad Buncic | Date: 2026-06-09 | Commit: c029cb3
This pull request adds support for reporting per-link Ethernet link and
heartbeat status, while deprecating retrain status for newer FW
versions.
Fix truncation of telemetry_table_addr to uint32_t (#2790)
Author: Matt Craighead | Date: 2026-06-09 | Commit: 2873e80
Address math was truncating a uint64_t down to a uint32_t.
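The effect of such a truncation is easy to show with a generic example. The function names and the sample address are illustrative and are not taken from the UMD source.

```cpp
#include <cstdint>

// Narrowing the sum to 32 bits silently drops the upper bits of the address.
constexpr uint64_t add_offset_truncating(uint64_t base, uint64_t offset) {
    const uint32_t narrowed = static_cast<uint32_t>(base + offset);
    return narrowed;
}

// Keeping the arithmetic in 64 bits preserves the full address.
constexpr uint64_t add_offset_wide(uint64_t base, uint64_t offset) {
    return base + offset;
}

static_assert(add_offset_truncating(0x810000000ULL, 0x1234) == 0x10001234ULL, "upper bits lost");
static_assert(add_offset_wide(0x810000000ULL, 0x1234) == 0x810001234ULL, "full address kept");
```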
Remove p100 from PR and merge queue runs (#2796)
Author: Bojan Roško | Date: 2026-06-09 | Commit: 97ed5c9
Our merge queue is bogged down on p100 runners
Retrain DRAM cores on BIST failure (#2768)
Author: Pavle Janevski | Date: 2026-06-08 | Commit: ecf539a
As of FW 19.7.0 the GDDR BIST status is reported in the upper 16 bits
of the GDDR_STATUS telemetry tag ([31:16]), while the GDDR training
status stays in the lower 16 bits ([15:0]). A channel can train
successfully yet fail BIST, in which case it should be retrained
rather than reported as healthy. This change folds a BIST failure into
the per-channel DramTrainingStatus, so the existing
TTDevice::wait_dram_channel_training retrain loop issues a directed
retrain on BIST failure as well, not just on a training failure.
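The bit layout described above splits the 32-bit tag into two 16-bit halves. The helper names below are hypothetical, and how the bits inside each half map to channels or pass/fail states is not stated here, so the sketch stops at extracting the two fields.

```cpp
#include <cstdint>

// Lower 16 bits ([15:0]): GDDR training status.
constexpr uint16_t gddr_training_bits(uint32_t gddr_status) {
    return static_cast<uint16_t>(gddr_status & 0xFFFFu);
}

// Upper 16 bits ([31:16]): GDDR BIST status, as of FW 19.7.0 per the description.
constexpr uint16_t gddr_bist_bits(uint32_t gddr_status) {
    return static_cast<uint16_t>(gddr_status >> 16);
}

static_assert(gddr_training_bits(0xABCD1234u) == 0x1234, "lower half");
static_assert(gddr_bist_bits(0xABCD1234u) == 0xABCD, "upper half");
```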
EthernetBroadcast implementation for Single Remote (#2570)
Author: Bojan Roško | Date: 2026-06-08 | Commit: 89ad090
This PR modifies extracted EthernetBroadcast functionality, so that it
is adapted for use towards a single remote chip
Reduce some log levels to log_debug (#2777)
Author: Pavle Janevski | Date: 2026-06-07 | Commit: b4a5999
Continue trimming per-run log noise by demoting non-actionable lines to
log_debug.
Temporarily disable TestFirmwareInfoProvider.FanSpeed (#2780)
Author: Aleksa Marković | Date: 2026-06-06 | Commit: cff6b78
Temporarily disable TestFirmwareInfoProvider.FanSpeed until fan speed
updates from CMFW 19.10 are implemented.
pcie: add portable SIMD memcpy paths for non-x86 (GCC/Clang) (#2710)
Author: Luca Barbato | Date: 2026-06-06 | Commit: 2c93260
No description section found.
Skip warning if local only topo discovery (#2385)
Author: Bojan Roško | Date: 2026-06-06 | Commit: 5a722db
We get a warning that not enough chips were found per board, but this is
expected because remote discovery was not performed.
Fix CMake targets (#2187)
Author: Tristan Ross | Date: 2026-06-06 | Commit: bbf0285
Fixes various CMake issues which prevent tt-umd from configuring and
building correctly for Nix.
Fix exalens build and test (#2784)
Author: Bojan Roško | Date: 2026-06-06 | Commit: 54de42f
I've let claude figure out how to fix this workflow.
Failure can be seen in
https://github.com/tenstorrent/tt-umd/actions/runs/27025349217/job/79803878705
Add SMN read/write round-trip test on RTL simulator (#2769)
Author: Pavle Janevski | Date: 2026-06-04 | Commit: e85ef95
Add test coverage for the SMN (System Management Network) read/write API
on the RTL simulator. The SMN path is implemented on
RtlSimCommunicator / RtlSimulationTTDevice and is reached from
read_from_device / write_to_device when NocId::SYSTEM_NOC is
selected; previously it was only exercised manually.
Clarify sysmem NOC address mismatch errors (likely stale process) (#2776)
Author: Pavle Janevski | Date: 2026-06-04 | Commit: a94092a
When sysmem is mapped at an unexpected NOC address, the cause is almost
always another process on the machine — often a stale or crashed run
from a previous job — still holding the sysmem NOC address space UMD
requires, so the current process gets bumped to a different address. The
previous log messages just said "Proceeding could lead to undefined behavior" without naming the likely cause or the remedy (kill the
offending processes), which made these failures hard to diagnose in the
field.
This PR rewords both the hard error in pin_or_map_iommu() and the
warning in pin_or_map_hugepages() to explain the likely cause and tell
the user to find and kill the other processes using the Tenstorrent
device(s), then retry. No behavior change beyond the log/exception text
(and the extra context now included in the hugepage warning).
Refactor Chip/TTDevice constructors (part 2) (#2585)
Author: Aleksa Marković | Date: 2026-06-04 | Commit: ae6beb2
Simplified constructors for Chip and other improvements. Remove
SocDescriptor from Chip.
Create dedicated EthernetBroadcast (#2569)
Author: Bojan Roško | Date: 2026-06-04 | Commit: feec84d
Extract the EthernetBroadcast class from the Cluster class, with a clear
interface.
feat: add Cluster factory methods and fix ClusterOptions defaults (#2731)
Author: Ridvan Song | Date: 2026-06-04 | Commit: 9bd70c7
No description section found.
This release was automatically generated when the VERSION file was updated.
Release notes were extracted from CHANGELOG file.