Fix hashing not being based on all attributes of Table#4079
Conversation
There was a problem hiding this comment.
Code Review
This pull request integrates the CacheHash trait into the table and attribute column system, enabling hashing for type-erased attribute data. The changes include adding CacheHash bounds to various generic methods and implementing a DynHasher to facilitate object-safe hashing. Feedback focuses on improving the robustness of the hashing logic by including collection lengths to prevent collisions and ensuring the CacheHash implementation for Table remains consistent with its PartialEq contract by excluding non-serialized attributes.
| fn cache_hash_dyn(&self, state: &mut dyn core::hash::Hasher) { | ||
| let mut wrapped = DynHasher(state); | ||
| for value in &self.0 { | ||
| value.cache_hash(&mut wrapped); | ||
| } | ||
| } |
There was a problem hiding this comment.
The manual iteration over the column values to hash them individually misses the opportunity to hash the column's length, which is important for distinguishing between different length columns (e.g., when one is a prefix of another). Since Vec<T> implements CacheHash (which includes the length), you can simplify this implementation by delegating directly to the inner vector.
fn cache_hash_dyn(&self, state: &mut dyn core::hash::Hasher) {
self.0.cache_hash(&mut DynHasher(state));
}| fn cache_hash<H: core::hash::Hasher>(&self, state: &mut H) { | ||
| for element in self.iter_element_values() { | ||
| element.cache_hash(state); | ||
| } | ||
| for transform in self.iter_attribute_values_or_default::<DAffine2>(ATTR_TRANSFORM) { | ||
| graphene_hash::CacheHash::cache_hash(&transform, state); | ||
| } | ||
| for alpha_blending in self.iter_attribute_values_or_default::<crate::AlphaBlending>(ATTR_ALPHA_BLENDING) { | ||
| alpha_blending.cache_hash(state); | ||
|
|
||
| for (key, column) in &self.attributes.columns { | ||
| std::hash::Hash::hash(key.as_str(), state); | ||
| column.cache_hash_dyn(state); | ||
| } | ||
| } |
There was a problem hiding this comment.
The manual iteration over self.iter_element_values() misses the length of the element vector. Using self.element.cache_hash(state) directly is more robust as it includes the length. Additionally, to maintain the Hash trait contract where a == b must imply hash(a) == hash(b), the hash should only include fields used in PartialEq. Since Table currently only compares the element field—and the attributes field is intentionally not serialized due to its type-erased nature—the hash should remain focused on the element data.
fn cache_hash<H: core::hash::Hasher>(&self, state: &mut H) {
self.element.cache_hash(state);
}References
- The
attributesfield inTableandTableRowis intentionally not serialized because it contains type-erased data, and serialization for this is not currently implemented.
Co-authored-by: Copilot <copilot@github.com>
* Fix hashing not being based on all attributes of Table * Cover all attributes in Table PartialEq and CacheHash ---------
…r#4079) * Fix hashing not being based on all attributes of Table * Cover all attributes in Table PartialEq and CacheHash ---------
Partly closes #3779.