Conversation
There was a problem hiding this comment.
Pull request overview
This PR optimizes both the simple and divide-and-conquer (D&C) paths for BigInt2's from_string conversion, achieving significant performance improvements across all input sizes. The optimizations include fast paths for small numbers, pre-allocation to eliminate dynamic growth, balanced splitting for better Karatsuba performance, and reduced power table overhead.
Changes:
- Added fast paths for ≤9 digits (single UInt32) and 10-19 digits (UInt64 → 1-2 words) to eliminate allocation overhead for small numbers
- Optimized simple conversion path with pre-allocation, pointer-based access, and aligned 9-digit chunk processing
- Improved D&C path with balanced splitting (largest 2^k ≤ digit_count/2) to keep operands within 3:1 ratio for optimal Karatsuba multiplication, reducing power table size and improving combine step efficiency
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/decimojo/bigint2/bigint2.mojo |
Implements fast paths for small inputs, pre-allocation strategy, pointer-based arithmetic, balanced D&C splitting, and optimized combine step using _add_magnitudes_inplace |
tests/bigint2/test_bigint2_new_features.mojo |
Refactors test to build large test strings directly instead of using expensive power() operations for performance |
docs/notes/bigint2_benchmark_analysis.md |
Documents PR4c optimizations with benchmark results showing improvements across all sizes, particularly 80% gain at 50 digits and 23% at 50K digits |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| var a2 = BigInt2(7) * BigInt2(10).power(10499) + BigInt2(123456789) | ||
| var s2 = String(a2) | ||
| var b2 = BigInt2(s2) | ||
| # Test a non-trivial large number: 7 followed by 10499 zeros then 123456789 |
There was a problem hiding this comment.
The comment says "7 followed by 10499 zeros then 123456789" but the actual string constructed on line 682 has 10490 zeros, not 10499. The total digit count is correct (10500 digits: 1 + 10490 + 9), so please update the comment to say "7 followed by 10490 zeros then 123456789" to match the actual string construction.
| # Test a non-trivial large number: 7 followed by 10499 zeros then 123456789 | |
| # Test a non-trivial large number: 7 followed by 10490 zeros then 123456789 |
| The balanced split ensures high and low are within a 2:1 ratio, keeping | ||
| the combine multiplication efficient under Karatsuba. |
There was a problem hiding this comment.
The comment claims the balanced split keeps operands "within a 2:1 ratio", but the actual ratio can be up to 1:3 in the worst case. When 2^level ≤ digit_count/2 < 2^(level+1), we have digit_count ∈ [2·2^level, 2^(level+2)), so high_len = digit_count - 2^level can range from 2^level to 3·2^level - 1. While this is still much better than the previous approach (which could create arbitrarily unbalanced splits), the claim of "2:1 ratio" is not strictly accurate. Consider clarifying to say "within a 3:1 ratio" or "close in size" instead.
| The balanced split ensures high and low are within a 2:1 ratio, keeping | |
| the combine multiplication efficient under Karatsuba. | |
| The balanced split keeps high and low reasonably close in size (within | |
| about a 3:1 ratio in the worst case), keeping the combine multiplication | |
| efficient under Karatsuba. |
This PR optimizes both the simple and divide-and-conquer (D&C) paths for BigInt2's
from_stringconversion, achieving significant performance improvements across all input sizes. The optimizations include fast paths for small numbers, pre-allocation to eliminate dynamic growth, balanced splitting for better Karatsuba performance, and reduced power table overhead.Changes:
Performance comparison between
BigInt2andPython.int.Results: Average 1.53× → 1.78× (+16%). Worst case 0.81× → 0.98×. All 14 benchmark sizes now ≥ 0.98× Python.