Description
The HVX conv2d kernel introduced in #12204 only works in simulation, even with #13002 applied.
Zeroing out allocated VTCM memory in VTCMAllocation::VTCMAllocation() (a memset inserted at line 79 in hexagon_buffer.cc) makes the kernel run correctly on real hardware. Conversely, filling a VTCM allocation with junk values (e.g., 0xCC) causes the kernel to fail in simulation as well.
It seems like either some buffer isn't being zeroed before use, or the kernel is reading VTCM that it shouldn't. Unfortunately, I haven't been able to nail down where this happens yet.
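To illustrate why the fill pattern changes the outcome, here is a minimal, self-contained sketch of the same diagnostic idea in plain C++. It is not TVM code: the scratch buffer is ordinary heap memory standing in for a VTCM allocation, and `BuggyDotProduct` and `RunWithFill` are hypothetical names with a bug planted on purpose (an accumulator that is never cleared). The actual cause in the conv2d kernel may well be different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy kernel with a deliberate bug: it seeds its accumulator from the
// scratch buffer instead of clearing it, so the result depends on whatever
// the buffer held when it was allocated.
int32_t BuggyDotProduct(const std::vector<int32_t>& a,
                        const std::vector<int32_t>& b,
                        const unsigned char* scratch) {
  int32_t acc;
  std::memcpy(&acc, scratch, sizeof(acc));  // reads stale scratch contents
  for (std::size_t i = 0; i < a.size(); ++i) {
    acc += a[i] * b[i];
  }
  return acc;
}

// Allocate a stand-in scratch buffer, fill every byte with `fill`
// (0x00 mimics the memset workaround, 0xCC mimics the junk fill),
// and run the toy kernel on fixed inputs.
int32_t RunWithFill(unsigned char fill) {
  std::vector<unsigned char> scratch(64);
  std::memset(scratch.data(), fill, scratch.size());

  const std::vector<int32_t> a = {1, 2, 3};
  const std::vector<int32_t> b = {4, 5, 6};
  return BuggyDotProduct(a, b, scratch.data());
}
```

Calling `RunWithFill(0x00)` gives the correct dot product of 32, while `RunWithFill(0xCC)` gives a different value because the accumulator starts from the poison bytes. That matches the pattern reported above: a zero fill hides the dependency on uninitialized memory, and a non-zero fill exposes it regardless of where the code runs.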
Expected behavior
The tests in tests/python/contrib/test_hexagon/topi/test_conv2d_fp16_intrin.py should pass on real hardware.
Actual behavior
Tests fail with result mismatches.
Environment
Ubuntu 20.04 with clang++ 14, TVM with PR #13002 applied, Hexagon SDK 5.1.0.0 with Hexagon Tools 8.5.13, using a development version of LLVM 16.
Steps to reproduce
- Set ANDROID_SERIAL_NUMBER to a valid Hexagon device serial number
- Run pytest tests/python/contrib/test_hexagon/topi/test_conv2d_fp16_intrin.py
Triage
- needs-triage
cc: @kparzysz-quic @quic-sanirudh
cc @mehrdadh