@czyt czyt commented May 21, 2025

This commit introduces several enhancements to the mdx library:

  1. fs.FS Interface Implementation:

    • Added mdict_fs.go to provide an fs.FS compatible wrapper for MDX/MDD files.
    • Allows treating dictionary files as read-only file systems.
    • Supports Open, Stat, Read, Close, Seek for individual entries (keywords/resources).
    • Supports ReadDir on the root to list all keywords/resources.
    • Known limitation: MDD resource lookup in Open() is currently a linear scan; optimization is recommended.
  2. Checksum Verification:

    • Implemented Adler-32 checksum verification for:
      • Dictionary header info (readMDictFileHeader)
      • Decompressed key block info (decodeKeyBlockInfo)
      • Each decompressed key block (decodeKeyEntries)
      • Decompressed record blocks (_fetchAndDecodeRecordBlock)
    • Errors are returned upon checksum mismatch.
  3. Encrypted File Support:

    • Added handling for encrypted key block metadata in readKeyBlockMeta.
    • Added handling for encrypted key block info in decodeKeyBlockInfo.
    • Utilizes mdxDecrypt for decryption.
  4. Compression/Decompression Logic:

    • Corrected and completed LZO decompression logic in decodeKeyEntries and record block handling.
    • Corrected and completed ZLIB decompression logic in decodeKeyBlockInfo and record block handling.
    • Improved error handling for decompression failures and size mismatches.
  5. Refactoring and Best Practices:

    • Error Handling: Replaced errors.New with fmt.Errorf for more contextual error messages and consistently wrapped errors. Removed a panic in splitKeyBlock.
    • Code Duplication:
      • Merged keywordEntryToIndex and keywordEntryToIndex1 into a single function.
      • Refactored record block data retrieval logic from locateByKeywordEntry and locateDefByKWIndex into common helper functions (_fetchAndDecodeRecordBlock, _locateDefByKWIndexInternal).
    • Logging: Enhanced logging with more contextual information and consistent levels.
  6. Testing:

    • Initiated testing for the fs.FS implementation with mdict_fs_test.go.
    • Due to the complexity of generating specific binary test files for all scenarios (checksums, encryption, varied compression), comprehensive unit testing for these aspects was not fully completed and is recommended as a follow-up.

These changes aim to make the library more robust, feature-rich, and maintainable.

@czyt czyt merged commit 262f5b0 into main May 21, 2025
@czyt czyt deleted the feature/mdict-enhancements branch May 21, 2025 04:26