Skip to content

[integer][BigInt2] Optimize both the simple and D&C paths for from_string#148

Merged
forfudan merged 1 commit intomainfrom
dev
Feb 21, 2026
Merged

[integer][BigInt2] Optimize both the simple and D&C paths for from_string#148
forfudan merged 1 commit intomainfrom
dev

Conversation

@forfudan
Copy link
Copy Markdown
Owner

@forfudan forfudan commented Feb 21, 2026

This PR optimizes both the simple and divide-and-conquer (D&C) paths for BigInt2's from_string conversion, achieving significant performance improvements across all input sizes. The optimizations include fast paths for small numbers, pre-allocation to eliminate dynamic growth, balanced splitting for better Karatsuba performance, and reduced power table overhead.

Changes:

  • Added fast paths for ≤9 digits (single UInt32) and 10-19 digits (UInt64 → 1-2 words) to eliminate allocation overhead for small numbers
  • Optimized simple conversion path with pre-allocation, pointer-based access, and aligned 9-digit chunk processing
  • Improved D&C path with balanced splitting (largest 2^k ≤ digit_count/2) to keep operands within 3:1 ratio for optimal Karatsuba multiplication, reducing power table size and improving combine step efficiency

Performance comparison between BigInt2 and Python.int.

Digits Before After Change
2 3.67× 3.83× ~same
9 3.60× 3.50× ~same
20 2.38× 2.63× +10%
50 1.33× 2.40× +80%
100 1.19× 1.93× +62%
200 0.94× 1.10× FIXED
500 1.18× 1.24× +5%
1000 1.06× 1.10× +4%
2000 1.04× 1.11× +7%
5000 1.13× 1.12× ~same
10000 1.16× 1.06× ~same
20000 0.83× 0.98× +18%
50000 0.81× 1.00× +23%

Results: Average 1.53× → 1.78× (+16%). Worst case 0.81× → 0.98×. All 14 benchmark sizes now ≥ 0.98× Python.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes both the simple and divide-and-conquer (D&C) paths for BigInt2's from_string conversion, achieving significant performance improvements across all input sizes. The optimizations include fast paths for small numbers, pre-allocation to eliminate dynamic growth, balanced splitting for better Karatsuba performance, and reduced power table overhead.

Changes:

  • Added fast paths for ≤9 digits (single UInt32) and 10-19 digits (UInt64 → 1-2 words) to eliminate allocation overhead for small numbers
  • Optimized simple conversion path with pre-allocation, pointer-based access, and aligned 9-digit chunk processing
  • Improved D&C path with balanced splitting (largest 2^k ≤ digit_count/2) to keep operands within 3:1 ratio for optimal Karatsuba multiplication, reducing power table size and improving combine step efficiency

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/decimojo/bigint2/bigint2.mojo Implements fast paths for small inputs, pre-allocation strategy, pointer-based arithmetic, balanced D&C splitting, and optimized combine step using _add_magnitudes_inplace
tests/bigint2/test_bigint2_new_features.mojo Refactors test to build large test strings directly instead of using expensive power() operations for performance
docs/notes/bigint2_benchmark_analysis.md Documents PR4c optimizations with benchmark results showing improvements across all sizes, particularly 80% gain at 50 digits and 23% at 50K digits

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

var a2 = BigInt2(7) * BigInt2(10).power(10499) + BigInt2(123456789)
var s2 = String(a2)
var b2 = BigInt2(s2)
# Test a non-trivial large number: 7 followed by 10499 zeros then 123456789
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says "7 followed by 10499 zeros then 123456789" but the actual string constructed on line 682 has 10490 zeros, not 10499. The total digit count is correct (10500 digits: 1 + 10490 + 9), so please update the comment to say "7 followed by 10490 zeros then 123456789" to match the actual string construction.

Suggested change
# Test a non-trivial large number: 7 followed by 10499 zeros then 123456789
# Test a non-trivial large number: 7 followed by 10490 zeros then 123456789

Copilot uses AI. Check for mistakes.
Comment on lines +1444 to +1445
The balanced split ensures high and low are within a 2:1 ratio, keeping
the combine multiplication efficient under Karatsuba.
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment claims the balanced split keeps operands "within a 2:1 ratio", but the actual ratio can be up to 1:3 in the worst case. When 2^level ≤ digit_count/2 < 2^(level+1), we have digit_count ∈ [2·2^level, 2^(level+2)), so high_len = digit_count - 2^level can range from 2^level to 3·2^level - 1. While this is still much better than the previous approach (which could create arbitrarily unbalanced splits), the claim of "2:1 ratio" is not strictly accurate. Consider clarifying to say "within a 3:1 ratio" or "close in size" instead.

Suggested change
The balanced split ensures high and low are within a 2:1 ratio, keeping
the combine multiplication efficient under Karatsuba.
The balanced split keeps high and low reasonably close in size (within
about a 3:1 ratio in the worst case), keeping the combine multiplication
efficient under Karatsuba.

Copilot uses AI. Check for mistakes.
@forfudan forfudan merged commit 6dfd09a into main Feb 21, 2026
5 checks passed
@forfudan forfudan deleted the dev branch February 21, 2026 15:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants