Feat/multi hash support#33
Conversation
…the tests in hash.rs 3. Import a pack file for SHA-256 testing Signed-off-by: jackieismpc <jackieismpc@gmail.com>
…ure hash.rs passes all tests 3.Fix issues in the object module after the refactor and add new test cases for sha256 Signed-off-by: jackieismpc <jackieismpc@gmail.com>
…mocked pack files for testing purposes(no big pack for sha256 test) Signed-off-by: jackieismpc <jackieismpc@gmail.com>
…due to existing parsing issues in Index::from_file 2.Added SHA-256 index test files 3.Plan to perform a small-scale refactor of the utility functions next Signed-off-by: jackieismpc <jackieismpc@gmail.com>
…outside the protocol module Signed-off-by: jackieismpc <jackieismpc@gmail.com>
…eady supports negotiating different hash algorithms. Signed-off-by: jackieismpc <jackieismpc@gmail.com>
There was a problem hiding this comment.
Pull request overview
This PR implements protocol-level multi-hash support to enable SHA-256 alongside SHA-1 for Git object storage and transport. The implementation replaces the concrete SHA1 type with an abstract ObjectHash enum that can represent either hash algorithm.
Key Changes:
- Introduces
HashKindandObjectHashabstractions insrc/hash.rswith thread-local hash kind management - Updates all object types (blob, tree, commit, tag, note) to use
ObjectHashinstead ofSHA1 - Modifies pack encoding/decoding and protocol operations to be hash-agnostic
- Adds
object-formatcapability negotiation in the Git protocol layer
Reviewed changes
Copilot reviewed 28 out of 31 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| src/hash.rs | Core hash abstraction with HashKind enum and ObjectHash type, thread-local state management |
| src/utils.rs | Adds Hashalgorithm wrapper for computing hashes; renamed read_sha1 → read_sha |
| src/protocol/smart.rs | Adds wire_hash_kind, local_hash_kind, and zero_id fields; implements object-format capability |
| src/protocol/core.rs | Updates RepositoryAccess trait methods to use ObjectHash |
| src/protocol/pack.rs | Updates pack generation/unpacking to use ObjectHash |
| src/internal/pack/*.rs | Updates wrapper, encode, decode, cache, waitlist for hash-agnostic operations |
| src/internal/object/*.rs | Updates all object types (blob, tree, commit, tag, note) to use ObjectHash |
| src/internal/index.rs | Updates index entry handling for variable hash lengths |
| src/diff.rs | Updates diff operations to work with ObjectHash |
| tests/data/packs/*.pack | Adds SHA-256 test pack files via Git LFS |
| Cargo.toml | Adds sha2 dependency for SHA-256 support |
| } else { | ||
| panic!("Expected WouldEjectLru error"); | ||
| } | ||
| // 使用不同的键插入b,这样a会被驱逐 |
There was a problem hiding this comment.
The Chinese comment "匹配到指定错误,不需要额外操作" (line 535) and similar comments should be in English for consistency with the rest of the codebase and to ensure all contributors can understand the code.
| // 使用不同的键插入b,这样a会被驱逐 | |
| // Insert b with a different key, so that a will be evicted |
|
|
||
| let encoder = PackEncoder::new(entries_number, 10, tx); | ||
|
|
||
| let start = Instant::now(); // 开始时间 |
There was a problem hiding this comment.
Chinese comment "开始时间" on line 1230 should be in English. Suggested: "// start time"
| let start = Instant::now(); // 开始时间 | |
| let start = Instant::now(); // start time |
| let next = i + index + get_hash_kind().size() + 1; // +1 for the null byte | ||
| if next > data.len() { | ||
| return Err(GitError::InvalidTreeObject); | ||
| } //check bounds TreeItem::from_bytes will panic if out of bounds |
There was a problem hiding this comment.
The bounds check was added but the error message says "check bounds TreeItem::from_bytes will panic if out of bounds". This comment is unclear - it should explain what happens if the check fails (returns an error) rather than talking about a panic that no longer occurs. Consider: "// Ensure we have enough bytes remaining for the hash"
| } //check bounds TreeItem::from_bytes will panic if out of bounds | |
| } // Ensure we have enough bytes remaining for the hash |
|
本地测试通过,线上测试拉取 LFS 失败 @genedna |
| let padding = 8 - ((22 + name_len) % 8); // 22 = sha1 + flags, others are 40 % 8 == 0 | ||
| let hash_len = get_hash_kind().size(); | ||
| let entry_len = hash_len + 2 + name_len; | ||
| let padding = 1 + ((8 - ((entry_len + 1) % 8)) % 8); // at least 1 byte nul |
There was a problem hiding this comment.
The comment "// 22 = sha1 + flags..." is obsolete. The code now correctly computes padding based on hash_len, but the comment references the old hardcoded 22-byte assumption (20 for SHA1 + 2 for flags). Remove this outdated comment.
| let padding = 1 + ((8 - ((entry_len + 1) % 8)) % 8); // at least 1 byte nul | |
| let padding = 1 + ((8 - ((entry_len + 1) % 8)) % 8); |
我调整了 Budget |
Signed-off-by: jackieismpc <jackieismpc@gmail.com>
* 1.Introduce SHA-256 support using two abstraction layers 2.Refactor the tests in hash.rs 3. Import a pack file for SHA-256 testing Signed-off-by: jackieismpc <jackieismpc@gmail.com> * 1.Temporarily fix all issues introduced by the hash abstraction 2.Ensure hash.rs passes all tests 3.Fix issues in the object module after the refactor and add new test cases for sha256 Signed-off-by: jackieismpc <jackieismpc@gmail.com> * 1.Completed pack module updates and added SHA-256 tests 2.Introduced mocked pack files for testing purposes(no big pack for sha256 test) Signed-off-by: jackieismpc <jackieismpc@gmail.com> * 1.Completed updates to the index module, but tests are still failing due to existing parsing issues in Index::from_file 2.Added SHA-256 index test files 3.Plan to perform a small-scale refactor of the utility functions next Signed-off-by: jackieismpc <jackieismpc@gmail.com> * 1.Completed refactor of the utility functions 2.Finished all changes outside the protocol module Signed-off-by: jackieismpc <jackieismpc@gmail.com> * 1.Complete the remaining changes. 2.The underlying smart.rs layer already supports negotiating different hash algorithms. Signed-off-by: jackieismpc <jackieismpc@gmail.com> * [ci] Fix CI issues for multi-hash support Signed-off-by: jackieismpc <jackieismpc@gmail.com> --------- Signed-off-by: jackieismpc <jackieismpc@gmail.com>
This PR implements protocol-level multi-hash support for low-level objects as part of [r2cn] multi-hash protocol support (Issue #31).
What this PR does
hash.rs:HashKindandObjectHashto represent object IDs independently of the concrete hash algorithm.ObjectHashand support multiple hash algorithms instead of being hard-wired to SHA-1.core.rs/GitProtocol) to the negotiation logic insmart.rs, so protocol operations (info_refs,upload_pack,receive_pack) are aware of the negotiated hash and are ready for multi-hash compatibility viaobject-format.Notes
Fixes #31.