
[RFC 33] Add RotorQuant/IsoQuant comparison #45

Merged
lwwmanning merged 1 commit into develop from claude/review-quant-discussion-UPr2n on Apr 9, 2026


@lwwmanning
Contributor

Summary

This PR adds a comprehensive comparison section between TurboQuant's full-dimension SORF approach and alternative block-diagonal rotation strategies (RotorQuant, IsoQuant, PlanarQuant), along with empirical evidence and additional experimental validation plans.

Key Changes

  • New comparison section: Added detailed analysis of RotorQuant/IsoQuant approaches that use small-block rotations (SO(2)/SO(3)/SO(4)) vs. TurboQuant's full-dimension SORF

    • Includes quantitative comparison table showing MSE regressions (10.8× worse for RotorQuant at 3-bit)
    • Documents the fundamental decorrelation limitation of block-diagonal rotations with small blocks
    • Explains why RFC 0033's B≥64 constraint is validated by this empirical evidence
  • Strengthened decorrelation analysis: Added section in the "Coordinate distribution" discussion that:

    • References RotorQuant/IsoQuant experimental failures as direct evidence of decorrelation failure modes
    • Contextualizes Stage 2's B=256 blocks (24 butterfly mixing stages) against RotorQuant's 3-4 raw coordinates
    • Clarifies the scale difference between the approaches
  • Enhanced experimental plan: Extended the block-size testing section with:

    • New measurement of cross-block coordinate correlation on real embeddings (Contriever, OpenAI)
    • Explicit comparison of block-decomposed (B=256) vs. single-block (B=d) SORF at d=768
    • Quantification methodology using average absolute Pearson correlation between coordinates in different blocks
    • Clear connection to the RotorQuant/IsoQuant findings, to determine the block size at which the decorrelation gap becomes negligible
  • Added reference [13]: Documented RotorQuant/IsoQuant work with full citation and rejection rationale
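
The cross-block correlation measurement in the experimental plan can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the RFC's actual harness: the function names `pearson` and `mean_abs_cross_block_corr` are hypothetical, the input is assumed to be a list of already-rotated vectors of equal length, and treating a constant coordinate as uncorrelated is an assumption made here to avoid dividing by zero.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        # Assumption: a constant coordinate counts as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def mean_abs_cross_block_corr(vectors, block_size):
    """Average |Pearson r| over coordinate pairs that lie in different blocks.

    `vectors` is a list of d-dimensional rows; coordinate i belongs to
    block i // block_size. Returns 0.0 when there are no cross-block pairs
    (for example, a single block with block_size == d).
    """
    d = len(vectors[0])
    # Transpose once so each coordinate is a column of samples.
    columns = [[v[i] for v in vectors] for i in range(d)]
    total = 0.0
    pairs = 0
    for i in range(d):
        for j in range(i + 1, d):
            if i // block_size != j // block_size:
                total += abs(pearson(columns[i], columns[j]))
                pairs += 1
    return total / pairs if pairs else 0.0
```

Under these assumptions, the statistic would be computed on coordinates of real embeddings after the B=256 block-decomposed rotation. A single-block rotation (B=d) has no cross-block pairs, so a like-for-like baseline would need to apply the same 256-coordinate partition to its output. The pairwise loop is quadratic in d, so a real run at d=768 would likely use a vectorized correlation matrix instead.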

Notable Details

The comparison demonstrates that although RFC 0033 uses block-diagonal decomposition (as RotorQuant does), the critical differences are block size and internal structure: B=256 with 3-round SORF provides substantially more mixing than RotorQuant's 3- to 4-dimensional groups. The new experimental plan tests, through empirical measurement on real embeddings, whether B=256 is large enough to avoid meaningful decorrelation loss.
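
The figure of 24 butterfly mixing stages for B=256 is consistent with three rounds of a fast Walsh-Hadamard transform, each with log2(256) = 8 stages. The sketch below illustrates that arithmetic under the assumption that each SORF round is a random sign flip followed by a normalized Hadamard transform; `fwht`, `random_signs`, and `sorf_block` are hypothetical names, not RFC 0033's implementation.

```python
import math
import random


def fwht(x):
    """In-place unnormalized fast Walsh-Hadamard transform.

    Returns the number of butterfly stages executed, which is log2(len(x)).
    """
    n = len(x)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    stages = 0
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = x[i], x[i + h]
                x[i], x[i + h] = a + b, a - b
        h *= 2
        stages += 1
    return stages


def random_signs(n, rounds, seed=0):
    """One list of +/-1 signs per round (illustrative helper)."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(n)] for _ in range(rounds)]


def sorf_block(x, signs_per_round):
    """Apply sign-flip + normalized Hadamard rounds to one block, in place.

    Returns the total number of butterfly stages across all rounds.
    """
    scale = 1.0 / math.sqrt(len(x))
    total_stages = 0
    for signs in signs_per_round:
        for i, s in enumerate(signs):
            x[i] *= s
        total_stages += fwht(x)
        for i in range(len(x)):
            x[i] *= scale  # keeps each round orthonormal
    return total_stages
```

Calling `sorf_block` on a 256-element block with `random_signs(256, 3)` runs 3 × 8 = 24 stages and preserves the vector norm up to floating-point error, since each round is an orthogonal transform.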

https://claude.ai/code/session_016qKqZ579LA83p7ThoAdqut

@lwwmanning lwwmanning changed the title from "Add RotorQuant/IsoQuant comparison and cross-block correlation analysis" to "[RFC 33] Add RotorQuant/IsoQuant comparison" Apr 9, 2026
…0033

Incorporate findings from TheTom/turboquant_plus#34, where small block-diagonal
rotations (SO(2)/SO(3)/SO(4)) caused 10x+ MSE regressions on real KV-cache
data. This empirical evidence strengthens the case for large block sizes (B=256+)
in Stage 2 and motivates a new experimental plan item measuring cross-block
correlation on real embeddings.

https://claude.ai/code/session_016qKqZ579LA83p7ThoAdqut
Signed-off-by: Will Manning <will@willmanning.io>
@lwwmanning lwwmanning force-pushed the claude/review-quant-discussion-UPr2n branch from b5a171d to 9ea08f1 Compare April 9, 2026 16:18
@lwwmanning lwwmanning merged commit ee2c78a into develop Apr 9, 2026
3 checks passed
@lwwmanning lwwmanning deleted the claude/review-quant-discussion-UPr2n branch April 9, 2026 16:19