test(python): add comprehensive RecursiveHash test suite by junrushao · Pull Request #485 · apache/tvm-ffi

junrushao · 2026-02-28T00:36:19Z

Summary

Expose RecursiveHash to the Python FFI API (_ffi_api.py stub + __all__)
Add TestHash and TestCustomHash reflected test fixture classes to tvm_ffi.testing
Add comprehensive test_dataclass_hash.py covering the full RecursiveHash contract

Architecture

Two new reflected test fixture classes registered via C++ reflection:
- TestHash (testing.TestHash): exercises Hash(false) field exclusion on hash_ignored
- TestCustomHash (testing.TestCustomHash): exercises __ffi_hash__ custom hook (hashes only key, ignores label)
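As a rough pure-Python analogy (this is not the tvm_ffi API), the two fixture behaviors resemble the `hash` and `compare` flags on `dataclasses.field`. The class names below are hypothetical, and whether `hash_ignored` still takes part in equality in the real TestHash fixture is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class HashIgnoredExample:
    """Analogue of Hash(false): `hash_ignored` is compared but never hashed."""

    key: int
    hash_ignored: int = field(default=0, hash=False)


@dataclass(frozen=True)
class KeyOnlyExample:
    """Analogue of a key-only hook: `label` is skipped by both eq and hash."""

    key: int
    label: str = field(default="", compare=False)


# Objects that differ only in the hash-excluded field share a hash.
a, b = HashIgnoredExample(1, 10), HashIgnoredExample(1, 20)
same_hash_despite_ignored_field = hash(a) == hash(b)

# Objects that differ only in `label` are equal and hash equal.
c, d = KeyOnlyExample(7, "x"), KeyOnlyExample(7, "y")
key_only_equal = (c == d) and (hash(c) == hash(d))
```

In `dataclasses`, `compare=False` also removes the field from the generated `__hash__`, which mirrors the "Compare(false) implies hash-off" rule listed in the coverage table.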

Test Coverage

| Category | What's tested |
| --- | --- |
| Primitives | int, float, bool, str, bytes, None, DataType, Device |
| NaN handling | All NaN payloads hash equal; canonicalization in nested containers |
| Signed zero | `+0.0` and `-0.0` hash identically |
| Containers | Array, List, Shape, Map, Dict: equal/different/empty/nested |
| Reflected objects | TestIntPair, inherited fields (3-level), objects with container fields |
| Field exclusion | `Hash(false)` via TestHash; `Compare(false)` implies hash-off |
| Custom hooks | `__ffi_hash__` via TestCustomHash and TestCustomCompare |
| Cycle detection | Self-referential List/Dict hashing succeeds gracefully |
| Consistency law | `RecursiveEq(a, b) ⟹ RecursiveHash(a) == RecursiveHash(b)` for primitives, containers, reflected objects, custom hooks |
| Aliasing invariants | Shared vs duplicated references produce identical hashes |
| Recursion depth | 127 and 1000 levels of nesting (iterative heap-based stack) |
| DAG scaling | Shared binary DAG hashing is linear, not exponential (warm-up + averaged) |
| Guard | `__ffi_eq__` without `__ffi_hash__` raises ValueError |
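The NaN and signed-zero rows describe a canonicalization step applied before a float is hashed. A minimal sketch of that idea in plain Python follows; the helper name `canonical_float_bits` and the chosen canonical NaN pattern are assumptions for illustration, not what RecursiveHash does internally.

```python
import math
import struct

# Assumed canonical pattern: the standard quiet NaN for IEEE 754 binary64.
CANONICAL_NAN_BITS = 0x7FF8000000000000


def canonical_float_bits(x: float) -> int:
    """Return the 64-bit pattern to hash, after collapsing NaNs and signed zero."""
    if math.isnan(x):
        return CANONICAL_NAN_BITS  # every NaN payload maps to one pattern
    if x == 0.0:
        x = 0.0  # -0.0 compares equal to 0.0, so replace it with +0.0
    return struct.unpack("<Q", struct.pack("<d", x))[0]


# A NaN carrying a non-default payload, built directly from its bit pattern.
payload_nan = struct.unpack("<d", struct.pack("<Q", 0x7FF8000000000001))[0]
```

Hashing the returned integer rather than the raw bits makes `+0.0`/`-0.0` and all NaN payloads collide on purpose, while distinct ordinary values keep distinct bit patterns.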

Test Plan

uv run pytest -vvs tests/python/test_dataclass_hash.py

gemini-code-assist · 2026-02-28T00:36:38Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the Python FFI by integrating recursive hashing capabilities. This feature allows for consistent and reliable hashing of complex, nested data structures, including those with cyclic references and custom hashing logic. The changes are thoroughly validated through a new, comprehensive test suite that ensures correctness across various data types and scenarios, thereby improving the robustness and predictability of object comparisons and hashing within the system.

Highlights

RecursiveHash Exposure: The RecursiveHash function has been exposed to the Python FFI, allowing Python code to utilize this deep hashing functionality.
New Test Classes: Two new test classes, TestHash and TestCustomHash, were introduced to facilitate comprehensive testing of hashing behaviors, including ignored fields and custom hashing hooks.
Comprehensive Test Suite: A new, extensive test file (test_dataclass_hash.py) was added, covering primitives, containers, reflected objects, HashOff flags, custom hooks, cycle detection, consistency laws (RecursiveEq => RecursiveHash), and aliasing invariants.

Changelog

python/tvm_ffi/_ffi_api.py
- Added RecursiveHash function to the FFI API definition.
- Included RecursiveHash in the list of exported FFI functions.
python/tvm_ffi/testing/__init__.py
- Imported the newly defined TestCustomHash and TestHash classes into the testing module's __init__.py.
python/tvm_ffi/testing/testing.py
- Defined the TestHash class, an Object with a field (hash_ignored) marked to be excluded from hashing.
- Defined the TestCustomHash class, an Object with a custom __ffi_hash__ hook that hashes only its key field.
tests/python/test_dataclass_hash.py
- Added a new test file dedicated to ffi.RecursiveHash.
- Implemented tests for primitive types (int, float, bool, string, bytes, None), including edge cases like NaN and signed zero.
- Included tests for various container types (Array, List, Map, Dict) and their nested forms.
- Added tests for reflected objects, inherited fields, and objects with container fields.
- Verified behavior with HashOff and CompareOff flags.
- Ensured cycle safety and handling of shared references/aliasing invariants.
- Validated the consistency law: RecursiveEq(a, b) implies RecursiveHash(a) == RecursiveHash(b).
- Included adversarial quality checks and tests for custom __ffi_hash__ hooks.
- Added regression tests for the Eq=>Hash invariant and guards for types defining __ffi_eq__ but not __ffi_hash__.
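The consistency law in the changelog can be checked mechanically over a sample of values. The sketch below uses Python's built-in `==` and `hash` as stand-ins for RecursiveEq and RecursiveHash; the helper name `consistency_violations` and the deliberately broken hash are hypothetical and only illustrate the shape of such a test.

```python
import operator


def consistency_violations(values, eq, hash_fn):
    """Return every ordered pair that is equal under `eq` but hashes differently."""
    return [
        (a, b)
        for a in values
        for b in values
        if eq(a, b) and hash_fn(a) != hash_fn(b)
    ]


def type_sensitive_hash(x):
    """A deliberately broken hash: floats and non-floats land in different buckets."""
    return 0 if isinstance(x, float) else 1


sample = [1, 1.0, True, 0.0, -0.0, "a", (1, 2), (1.0, 2.0)]

# Built-in equality and hashing satisfy the law on this sample.
builtin_violations = consistency_violations(sample, operator.eq, hash)

# The broken hash separates 1 and 1.0 even though they compare equal.
broken_violations = consistency_violations(sample, operator.eq, type_sensitive_hash)
```

An empty list means the law holds on the sample; any returned pair is a concrete counterexample that can be reported in the assertion message.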

Activity

The author has implemented and verified the changes with Python tests in test_dataclass_hash.py.
The pull request was generated using Claude Code.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

Architecture:
- Adds two new reflected test fixture classes (TestHash, TestCustomHash) to the testing module, registered via the C++ reflection system. TestHash exercises Hash(false) field exclusion; TestCustomHash exercises the __ffi_hash__ custom hook (hashes only `key`, ignores `label`).
- Exposes RecursiveHash in the Python FFI API stub (_ffi_api.py) so tests can call it directly without private imports.

Public Interfaces:
- `tvm_ffi._ffi_api.RecursiveHash` added to TYPE_CHECKING stub and __all__.
- `tvm_ffi.testing.TestHash` and `tvm_ffi.testing.TestCustomHash` exported as public test fixtures.

UI/UX:
- none

Behavioral Changes:
- No runtime behavioral changes; this is a test-only addition.

Docs:
- Test docstrings serve as specification documentation for hash semantics.

Tests:
- Executed: N/A (test-only commit; no build validation in this cherry-pick)
- Result: 1033-line test file covering:
  - Primitives (int, float, bool, str, bytes, None, DataType, Device)
  - NaN canonicalization (all NaN payloads hash equal)
  - Signed-zero normalization (+0.0 == -0.0 for hashing)
  - Containers (Array, List, Shape, Map, Dict) including nesting
  - Reflected objects (TestIntPair, inherited fields, container fields)
  - Hash(false) / Compare(false) field exclusion
  - Custom __ffi_hash__ hook via TestCustomHash
  - Cycle detection (self-referential List/Dict)
  - Consistency law: RecursiveEq(a,b) => RecursiveHash(a)==RecursiveHash(b)
  - Aliasing invariants (shared vs duplicated references)
  - Recursion depth (127 and 1000 levels)
  - Shared DAG scaling (linear, not exponential)
  - Guard: __ffi_eq__ without __ffi_hash__ raises ValueError
  - Parametrized cyclic-structure mismatch tests

Untested Edge Cases:
- Cross-process hash stability (hashes may differ across builds/platforms)
- Thread-safety of RecursiveHash under concurrent mutation
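To make the "shared DAG scaling (linear, not exponential)" and aliasing claims concrete, here is a small pure-Python sketch. It assumes identity-based memoization, which is one way to get linear behavior and is not confirmed to be RecursiveHash's mechanism. All helper names are hypothetical, plain lists stand in for FFI containers, and the sketch recurses for brevity (the PR describes an iterative heap-based stack) and does not handle cycles.

```python
def make_shared_dag(depth):
    """Build a binary DAG in which both children of each node are the same object."""
    node = [0]
    for _ in range(depth):
        node = [node, node]
    return node


def make_unshared_tree(depth):
    """Build the structurally equal tree with every child duplicated."""
    if depth == 0:
        return [0]
    return [make_unshared_tree(depth - 1), make_unshared_tree(depth - 1)]


def memo_hash(node, memo, visits):
    """Structural hash that caches each list by object identity."""
    if not isinstance(node, list):
        return hash(node)
    if id(node) in memo:
        return memo[id(node)]  # shared child: reuse the cached result
    visits[0] += 1
    result = hash(tuple(memo_hash(child, memo, visits) for child in node))
    memo[id(node)] = result
    return result


def hash_and_count(root):
    """Return (hash, number of list nodes actually hashed)."""
    visits = [0]
    return memo_hash(root, {}, visits), visits[0]
```

With this scheme a shared DAG of depth `d` hashes `d + 1` list nodes, while the duplicated tree hashes `2**(d + 1) - 1`, yet both produce the same hash value because the result depends only on structure.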

gemini-code-assist

Code Review

This pull request introduces RecursiveHash to the Python FFI and adds a comprehensive test suite for it. The changes to expose the new function and test classes are correct. The new test file test_dataclass_hash.py is very thorough, covering a wide range of types, edge cases like cycles and aliasing, and consistency with RecursiveEq. I have a couple of suggestions to improve the maintainability and reliability of the new test suite.

gemini-code-assist · 2026-02-28T00:38:12Z

tests/python/test_dataclass_hash.py

+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""Tests for ffi.RecursiveHash."""


This is a very comprehensive test suite, which is great. However, at over 1000 lines, this file is becoming quite large and difficult to navigate. For better long-term maintainability, consider splitting it into smaller, more focused files based on the type of data being tested. For example, you could have test_hash_primitives.py, test_hash_containers.py, test_hash_objects.py, and test_hash_edge_cases.py. The existing sections in the file provide a good structure for such a split.

gemini-code-assist · 2026-02-28T00:38:12Z

tests/python/test_dataclass_hash.py

+    t0 = time.perf_counter()
+    RecursiveHash(d18)
+    t18 = time.perf_counter() - t0
+
+    t0 = time.perf_counter()
+    RecursiveHash(d19)
+    t19 = time.perf_counter() - t0


This performance test can be flaky because the execution time of a single RecursiveHash call can be very short and subject to system noise. To get a more stable and reliable measurement, it's better to run the function in a loop and average the time.

A warm-up call before starting the measurements can also help reduce noise from one-time setup costs (e.g., JIT compilation if applicable, cache warming).

Suggested change

-    t0 = time.perf_counter()
-    RecursiveHash(d18)
-    t18 = time.perf_counter() - t0
-
-    t0 = time.perf_counter()
-    RecursiveHash(d19)
-    t19 = time.perf_counter() - t0
+    # Warm-up run to mitigate one-time setup costs
+    RecursiveHash(_make_shared_binary_dag(10))
+
+    repeats = 10
+    t0 = time.perf_counter()
+    for _ in range(repeats):
+        RecursiveHash(d18)
+    t18 = (time.perf_counter() - t0) / repeats
+
+    t0 = time.perf_counter()
+    for _ in range(repeats):
+        RecursiveHash(d19)
+    t19 = (time.perf_counter() - t0) / repeats
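The warm-up plus averaging pattern from this review comment can be factored into a small helper so both measurements share one code path. This is a standalone sketch; `average_seconds` is a hypothetical name and is not part of the test file.

```python
import time


def average_seconds(fn, repeats=10, warmup=1):
    """Call `fn` `warmup` times untimed, then return the mean time of `repeats` calls."""
    if repeats <= 0:
        raise ValueError("repeats must be positive")
    for _ in range(warmup):
        fn()  # absorb one-time setup costs before timing starts
    t0 = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - t0) / repeats
```

In the scaling test this might be used as `t18 = average_seconds(lambda: RecursiveHash(d18))`, assuming `RecursiveHash` and `d18` are in scope as in the snippet under review.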

- Remove 24 tests that incorrectly assumed RecursiveEq raises ValueError on distinct cyclic structures (it handles them gracefully instead). These also tested RecursiveEq behavior, not RecursiveHash.
- Add warm-up + averaging (10 repeats) to the DAG scaling perf test to reduce flakiness from system noise.

junrushao force-pushed the 2026-02-27/recursive-hash branch from 05d52b8 to 0368686 on February 28, 2026 00:37

junrushao changed the title ~~feat: support recursive hash (Python integration + tests)~~ test(python): add comprehensive RecursiveHash test suite Feb 28, 2026

gemini-code-assist bot reviewed Feb 28, 2026

junrushao mentioned this pull request Feb 28, 2026

Staged Commits: Reintroduce @c_class #477

Closed

10 tasks

tqchen approved these changes Feb 28, 2026

junrushao merged commit 5796ff4 into apache:main Feb 28, 2026
8 checks passed

junrushao merged 2 commits into apache:main from junrushao:2026-02-27/recursive-hash
