⏺ All done! Here's the summary: by navicore · Pull Request #191 · navicore/patch-seq

navicore · 2026-01-06T03:40:24Z

Leibniz Pi Benchmark Added (Issue #190)

Files Created

benchmarks/compute/leibniz_pi.seq - Seq implementation using Leibniz formula
benchmarks/compute/leibniz_pi.rs - Rust implementation
benchmarks/compute/leibniz_pi.go - Go implementation

Results

Benchmark	Seq	Rust	Go	Seq/Rust
leibniz_pi (100M iterations)	2977ms	80ms	79ms	37x

What it Tests

Floating-point arithmetic (f.+, f./)
Loop iteration with recursion
Conditional sign alternation (k % 2)
Deep stack manipulation with pick and roll

Algorithm

π/4 = 1 - 1/3 + 1/5 - 1/7 + 1/9 - ...
π = 4 × Σ(k=0 to n) (-1)^k / (2k + 1)

The 37x slowdown for floating-point operations is consistent with the other compute benchmarks (13-32x range), confirming the overhead is systematic rather than specific to any operation type.

Leibniz Pi Benchmark Added (Issue #190) Files Created - benchmarks/compute/leibniz_pi.seq - Seq implementation using Leibniz formula - benchmarks/compute/leibniz_pi.rs - Rust implementation - benchmarks/compute/leibniz_pi.go - Go implementation Results | Benchmark | Seq | Rust | Go | Seq/Rust | |------------------------------|--------|------|------|----------| | leibniz_pi (100M iterations) | 2977ms | 80ms | 79ms | 37x | What it Tests - Floating-point arithmetic (f.+, f./) - Loop iteration with recursion - Conditional sign alternation (k % 2) - Deep stack manipulation with pick and roll Algorithm π/4 = 1 - 1/3 + 1/5 - 1/7 + 1/9 - ... π = 4 × Σ(k=0 to n) (-1)^k / (2k + 1) The 37x slowdown for floating-point operations is consistent with the other compute benchmarks (13-32x range), confirming the overhead is systematic rather than specific to any operation type.

claude · 2026-01-06T03:42:22Z

Pull Request Review - Leibniz Pi Benchmark

Overview

This PR adds a Leibniz formula π calculation benchmark to compare Seq, Rust, and Go floating-point performance. The implementation is solid and follows the repository's established patterns well.

✅ Code Quality & Best Practices

Strengths:

Consistent style: All three implementations (Seq, Rust, Go) follow the established patterns from existing benchmarks (fib.seq, primes.seq)
Clear documentation: Comments explain the algorithm, stack effects, and expected results
Proper stack effect annotations: leibniz-step and leibniz-loop have correct stack effect signatures
Good function decomposition: Split into logical units (leibniz-step, leibniz-loop, leibniz)
Integration complete: Updated run.sh, README.md, and LATEST_RUN.txt appropriately

Minor Style Observations:

The Seq implementation correctly uses deep stack manipulation (pick, roll) which is idiomatic for the language
Comment style matches existing benchmarks (inline stack state comments)

🐛 Potential Issues

1. Precision Inconsistency in Seq Implementation (leibniz_pi.seq:56)

3.14159265 f.- dup 0.0 f.< if -1.0 f.* then  # abs
0.0000001 f.< if 0 else 1 then

Issue: The hardcoded π value 3.14159265 only has 8 decimal places, while the Rust/Go implementations use std::f64::consts::PI (15+ digits). This creates an artificial precision floor.

Impact: The test might pass even if the Seq implementation has reduced precision, masking potential floating-point bugs.

Recommendation:

# Use higher precision constant
3.141592653589793 f.- dup 0.0 f.< if -1.0 f.* then  # abs

Or add a comment explaining why the lower precision is acceptable for this benchmark.

2. Abs Value Implementation Uses Branching (leibniz_pi.seq:56)

The Seq version computes absolute value using dup 0.0 f.< if -1.0 f.* then, while Rust/Go use .abs() method. This is correct but adds extra branches. Consider if Seq has a native f.abs function (checking stdlib would confirm).

⚡ Performance Considerations

1. Algorithm Correctness

All three implementations correctly compute the Leibniz series. The 37x Seq/Rust slowdown is consistent with other benchmarks (13-32x range), suggesting no algorithmic inefficiency specific to this benchmark.

2. Sign Alternation Strategy

Rust/Go: Use sign = -sign (one multiplication per iteration)
Seq: Uses k % 2 with conditional (leibniz_pi.seq:13) - this is more expensive than sign flipping

Analysis: The modulo operation 2 i.% 0 i.= happens every iteration, which is slower than a floating-point negation. The Rust/Go approach of maintaining a sign variable that flips is more efficient.

Suggested Optimization (if you want to explore):

: leibniz-step ( Float Float Int Int -- Float Float Int Int )
  # Stack: sum sign k n
  # ... use sign directly instead of computing (-1)^k

However, this would change the function signature and might not be the intent of this benchmark (testing modulo performance).

3. Stack Manipulation Overhead

The Seq implementation uses 3 roll, rot rot, and pick for stack shuffling (leibniz_pi.seq:19-26). This is unavoidable given Seq's stack-based nature and correctly demonstrates the language's overhead for maintaining multiple local values.

🔒 Security Concerns

✅ No security issues detected

No external input parsing
No memory unsafe operations
All implementations use safe arithmetic
Exit codes properly validate results

🧪 Test Coverage

Strengths:

✅ All three implementations include result validation
✅ Exit codes indicate pass/fail (0 for success, 1 for failure)
✅ Error tolerance is appropriate (1e-7) for 100M iterations
✅ Expected result documented in comments

Verification:

The Leibniz series converges slowly: error ≈ 1/n. With n=100M:

Expected error: ~1e-8 ✅
Test threshold: 1e-7 ✅ (appropriate safety margin)

Missing:

No CI/automated test that runs benchmarks to verify they still produce correct results (but this may be by design since benchmarks are expensive)
The benchmarks/run.sh updates look correct for integration

📊 Documentation Review

Updates to README.md:

✅ Added leibniz_pi to benchmark table (benchmarks/README.md:138)
✅ Updated performance numbers for other benchmarks (slight improvements)
✅ Added entry to compute README (benchmarks/compute/README.md:38-45)

Minor Issue:

The benchmark results table shows rounded times (e.g., "2900ms" for Seq), but it's unclear if these are averages, medians, or single runs. Consider documenting the measurement methodology (though this may exist elsewhere in the docs).

Summary & Recommendations

Must Fix:

None - the implementation is correct and functional

Should Consider:

Precision constant: Use full-precision π value in Seq test (leibniz_pi.seq:56)
Document modulo choice: Add comment explaining why k % 2 is used instead of sign flipping (if intentional for testing modulo performance)

Nice to Have:

Check if Seq has f.abs builtin to simplify absolute value calculation
Consider adding a comment about convergence rate (1/n) to explain why 100M iterations is chosen

Rating: Approve with Minor Suggestions ✅

This is a well-implemented benchmark that correctly follows the repository's conventions. The identified issues are minor and don't affect correctness. The 37x slowdown aligns with expected Seq performance characteristics and provides valuable data for the project.

Files Reviewed:

benchmarks/compute/leibniz_pi.seq:1-58
benchmarks/compute/leibniz_pi.rs:1-33
benchmarks/compute/leibniz_pi.go:1-39
benchmarks/README.md (updates)
benchmarks/compute/README.md (updates)
benchmarks/compute/run.sh (integration)

The issue documents: - The performance inversion between Seq and compiled languages - Evidence from the leibniz_pi benchmark - Four potential optimization approaches: a. Register allocation for loop variables b. Peephole optimization for stack patterns c. Local variable support (named locals) d. Specialization for small N values in roll/pick - Links to related optimization issues (#186, #189) This gives good context for future optimization work on stack operations.

claude · 2026-01-06T04:02:41Z

Code Review: Leibniz Pi Benchmark

Thank you for this contribution! This is a well-structured PR that adds a valuable floating-point benchmark. Below is my detailed review:

✅ Strengths

1. Excellent Consistency

The implementation follows existing benchmark patterns perfectly (comparing fib.seq, primes.seq, sum_squares.seq)
All three languages (Seq, Rust, Go) implement the same algorithm with the same iteration count
Documentation updates are comprehensive and consistent across all README files

2. Good Code Quality

Clear, readable implementations in all three languages
Proper error checking with exit codes
Accurate expected results with appropriate tolerances

3. Comprehensive Integration

Properly integrated into the benchmark suite (run.sh, compute/run.sh)
Documentation updated in multiple locations
LATEST_RUN.txt properly updated

🔍 Issues Found

1. Critical: Incorrect Seq Implementation ⚠️

Location: benchmarks/compute/leibniz_pi.seq:37-40

The Seq implementation has a logic error in the loop termination condition:

: leibniz-loop ( Float Float Int Int -- Float )
  # Stack: sum sign k n
  # Loop while k < n
  over over i.< if           # ← WRONG: This checks n < k
    leibniz-step
    leibniz-loop

Problem: over over i.< with stack sum sign k n produces sum sign k n k n, then i.< checks if n < k, which is backwards.

Expected: Should check k < n to continue the loop.

Fix: Use 2dup swap i.< or restructure to get k n in correct order:

  2dup swap i.< if  # k n -> k n k n -> k n n k -> k n (n < k is false, k < n is true)

Impact: This bug means the loop condition is inverted. The benchmark may still produce output, but likely exits early or has undefined behavior.

2. Minor: Sign Flipping Logic is Overly Complex

Location: benchmarks/compute/leibniz_pi.seq:34-37

  # Flip sign: sum' sign k n -> sum' sign' k n
  rot                           # sum' k n sign
  -1.0 f.*                      # sum' k n sign'
  rot rot                       # sum' sign' k n

Suggestion: This could be simplified with better stack manipulation or a comment explaining why the rotations are necessary. Consider:

Adding a helper word like : negate ( Float -- Float ) -1.0 f.* ;
Or documenting why three rotations are needed here

3. Minor: Magic Number in Verification

Location: benchmarks/compute/leibniz_pi.seq:68

  3.141592653589793 f.- dup 0.0 f.< if -1.0 f.* then  # abs

Suggestion: The hardcoded Pi value should either:

Be defined as a constant with a name
Have a comment explaining it's the expected Pi value for comparison
Reference where this value comes from (std::f64::consts::PI equivalent)

4. Documentation: Performance Analysis Unclear

The PR description states:

"The 37x slowdown for floating-point operations is consistent with the other compute benchmarks (13-32x range)"

Observation: 37x is actually at the higher end of this range, not "consistent" with 13-32x. It's worth investigating whether:

Float operations in Seq have additional overhead compared to integer ops
The recursive implementation adds overhead compared to the iterative Rust/Go versions
The complex stack manipulations contribute to the slowdown

Suggestion: Add a note in the documentation about why this benchmark is at the higher end of the performance range.

🧪 Testing Concerns

Missing Test Coverage

No unit tests for the Seq implementation
The verification only checks the final result, not intermediate steps
Existing benchmarks like fib.seq and primes.seq also only verify final output

Recommendation: While this matches the existing pattern, consider adding:

A test that verifies the calculation with a small iteration count (e.g., 10 iterations)
Verification that the algorithm converges in the expected direction

🔒 Security & Performance

Security: ✅ No Concerns

No external input handling
No unsafe operations
Fixed iteration count prevents DoS

Performance: ✅ Appropriate

100M iterations is reasonable for a benchmark
Converges to ~8 decimal places as documented
Matches Rust/Go iteration counts for fair comparison

📝 Recommendations

High Priority

Fix the loop termination condition in leibniz_pi.seq:37 - This is a critical bug
Verify the Seq implementation actually produces correct output - Test manually or add validation

Medium Priority

Simplify the sign-flipping logic or add explanatory comments
Add a constant for the Pi reference value instead of magic number

Low Priority

Document why this benchmark is slower (37x vs 13-32x range)
Consider adding a small-iteration test for easier verification

📊 Summary

Category	Rating	Notes
Code Quality	⚠️ Good with issues	Logic bug in Seq implementation
Documentation	✅ Excellent	Comprehensive and consistent
Testing	⚠️ Minimal	Matches existing pattern but limited
Integration	✅ Excellent	Properly integrated into suite
Security	✅ Good	No concerns

Overall: This is a valuable addition to the benchmark suite, but requires fixing the critical logic bug in the Seq implementation before merging. Once fixed, this will be a solid contribution.

Would you like me to provide a patch for the loop condition fix?

claude · 2026-01-06T04:11:59Z

Code Review: Leibniz Pi Benchmark

Thank you for this contribution! This is a well-implemented benchmark that adds valuable floating-point testing to the suite.

Strengths

Excellent Documentation: All three implementations include clear comments explaining the algorithm and build instructions.
Consistent Implementation: Follows established patterns from other compute benchmarks with proper structure and error checking.
Comprehensive Integration: Properly integrated into benchmarks/compute/run.sh, benchmarks/run.sh, and all documentation.
Accurate Results: All implementations verify results within acceptable tolerance (1e-7).
Performance Insights: The 32-37x slowdown provides valuable data on Seq floating-point performance.

Code Quality Issues

Seq Implementation (leibniz_pi.seq:30-37)

Complex Stack Manipulation - The multiple 3 roll operations could benefit from clearer comments or simplification.
Sign Flipping - Uses multiplication (-1.0 f.*) which works but could use a comment explaining the approach.
Magic Numbers - Line 57 uses hardcoded constants (0.0 1.0 0) without named definitions.

Rust Implementation - Minor: uses mutable variables (acceptable for benchmarks).

Go Implementation - No issues found, clean idiomatic code.

Potential Bugs

None identified. All implementations handle edge cases correctly and validate results properly.

Performance Considerations

Algorithm choice (Leibniz formula) is appropriate for CPU benchmarking despite slow convergence.
100M iterations well-chosen for measurable runtime differences.

Security Concerns

None. Pure computational benchmark with no external input, file I/O, or unsafe operations.

Test Coverage

Present: All implementations validate output, proper exit codes, CI integration.

Recommendation: Consider adding a quick smoke test with fewer iterations (e.g., 1000) for faster verification.

Documentation

Excellent overall. Minor suggestion: Note in benchmarks/compute/README.md:45 that ~8 decimal place accuracy is specific to 100M iterations.

Recommendations Summary

Required before merge: None - code is functional and follows established patterns.

Optional improvements:

Add clearer comments for stack manipulation in leibniz_pi.seq:30-37
Add explanatory comments for magic numbers
Consider fast smoke test variant

Overall Assessment

This is a high-quality contribution that:

Follows repository conventions
Includes comprehensive documentation
Works correctly across all languages
Integrates properly with benchmark infrastructure
Provides valuable performance data

Recommendation: Approve with optional suggestions for future refinement.

Review generated by Claude Code

navicore mentioned this pull request Jan 6, 2026

pi benchmark #190

Closed

use fmath:abs

c3c2bc9

navicore merged commit 99b0cc2 into main Jan 6, 2026
2 checks passed

navicore deleted the i-179 branch January 6, 2026 04:20

navicore mentioned this pull request Jan 7, 2026

Optimize dup after literal push (Int, Float, Bool) #195

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

⏺ All done! Here's the summary:#191

⏺ All done! Here's the summary:#191
navicore merged 3 commits intomainfrom
i-179

navicore commented Jan 6, 2026

Uh oh!

claude Bot commented Jan 6, 2026

Uh oh!

claude Bot commented Jan 6, 2026

Uh oh!

claude Bot commented Jan 6, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

navicore commented Jan 6, 2026

Uh oh!

claude Bot commented Jan 6, 2026

Pull Request Review - Leibniz Pi Benchmark

Overview

✅ Code Quality & Best Practices

🐛 Potential Issues

1. Precision Inconsistency in Seq Implementation (leibniz_pi.seq:56)

2. Abs Value Implementation Uses Branching (leibniz_pi.seq:56)

⚡ Performance Considerations

1. Algorithm Correctness

2. Sign Alternation Strategy

3. Stack Manipulation Overhead

🔒 Security Concerns

🧪 Test Coverage

Strengths:

Verification:

Missing:

📊 Documentation Review

Updates to README.md:

Minor Issue:

Summary & Recommendations

Must Fix:

Should Consider:

Nice to Have:

Rating: Approve with Minor Suggestions ✅

Uh oh!

claude Bot commented Jan 6, 2026

Code Review: Leibniz Pi Benchmark

✅ Strengths

1. Excellent Consistency

2. Good Code Quality

3. Comprehensive Integration

🔍 Issues Found

1. Critical: Incorrect Seq Implementation ⚠️

2. Minor: Sign Flipping Logic is Overly Complex

3. Minor: Magic Number in Verification

4. Documentation: Performance Analysis Unclear

🧪 Testing Concerns

Missing Test Coverage

🔒 Security & Performance

Security: ✅ No Concerns

Performance: ✅ Appropriate

📝 Recommendations

High Priority

Medium Priority

Low Priority

📊 Summary

Uh oh!

claude Bot commented Jan 6, 2026

Code Review: Leibniz Pi Benchmark

Strengths

Code Quality Issues

Potential Bugs

Performance Considerations

Security Concerns

Test Coverage

Documentation

Recommendations Summary

Overall Assessment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant