Integrate WebAssembly counters and enhanced COCOMO estimates#79

Merged: simonw merged 1 commit into main from wasm-sloccount-integration on Oct 21, 2025
simonw (Owner) commented Oct 21, 2025

Summary

This PR integrates the original SLOCCount tool to run entirely in the browser via WebAssembly, completing the implementation with both Perl and C language counters.

Changes

WebAssembly Integration

  • Compiled SLOCCount C programs to WebAssembly using Emscripten v4.0.17
    • c_count.wasm: Main C/C++/JavaScript/Java counter
    • php_count.wasm, ml_count.wasm, pig_count.wasm, lexcount1.wasm
  • Implemented hybrid routing between Perl scripts and WASM binaries
    • Perl scripts handle: Python, Ruby, Perl, SQL
    • WASM binaries handle: JavaScript, C, C++, Java, PHP, Go, Rust
  • Uses original SLOCCount algorithms without modification

Enhanced COCOMO Cost Estimation

  • Year preset toggle: Switch between 2000 (original) and 2025 (updated) estimates
    • 2000: Effort 2.4, Salary $56,286, Overhead 2.4x
    • 2025: Effort 2.94, Salary $133,080, Overhead 1.85x
  • All parameters editable: Effort coefficient, salary, overhead multiplier
  • Prominent warnings: Yellow banner emphasizing estimates are extremely rough
  • Footnote links: Asterisks on cost fields link to detailed explanation section
  • Transparency: Notes that 2025 values are rough guesses with link to source
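
A minimal sketch of how the two presets could feed a Basic COCOMO calculation. Only the preset numbers come from this PR. The names `PRESETS` and `estimateCost` are hypothetical, and the 1.05 exponent is the organic-mode default of the original SLOCCount; applying it unchanged to the 2025 preset is an assumption.

```javascript
// Hypothetical names; only the preset values are taken from the PR description.
const PRESETS = {
  2000: { effortCoefficient: 2.4, salary: 56286, overhead: 2.4 },
  2025: { effortCoefficient: 2.94, salary: 133080, overhead: 1.85 },
};

// Basic COCOMO, organic mode.
// Assumption: the 1.05 exponent is used for both presets.
function estimateCost(sloc, { effortCoefficient, salary, overhead }) {
  const ksloc = sloc / 1000;
  const personMonths = effortCoefficient * Math.pow(ksloc, 1.05);
  const personYears = personMonths / 12;
  return {
    personMonths,
    personYears,
    // Cost = effort in person-years x annual salary x overhead multiplier.
    cost: personYears * salary * overhead,
  };
}

for (const year of [2000, 2025]) {
  const { personMonths, cost } = estimateCost(10000, PRESETS[year]);
  console.log(
    `${year}: ${personMonths.toFixed(1)} person-months, ` +
      `$${Math.round(cost).toLocaleString("en-US")}`
  );
}
```

Because every input is a plain parameter, an editable form only has to rebuild the preset object and call the function again.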

Documentation

  • Updated lib/README.md with:
    • WASM compilation process and commands
    • Emscripten configuration options
    • Explanation of hybrid Perl + WASM architecture
  • Added credit to David A. Wheeler with link to original SLOCCount

Testing

  • Added 3 new comprehensive tests:
    • test_cocomo_year_presets: Verifies year preset switching
    • test_cost_estimate_footnote_link: Tests footnote navigation
    • test_wasm_javascript_counter: Confirms WASM integration works
  • All 18 tests passing

Technical Details

The application now uses a hybrid approach:

  1. Detects file extension
  2. Routes to appropriate counter (Perl script or WASM binary)
  3. Writes files to virtual filesystem
  4. Executes counter and captures output
  5. Aggregates results across all languages

Both Perl (via WebPerl) and C programs (via Emscripten WASM) run entirely in the browser with no server required.
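
The five steps above can be sketched roughly as follows. This is not the code in sloccount.html: the routing table, `fakeCounter`, and `countAll` are hypothetical stand-ins, and the virtual-filesystem write and the real Perl/WASM execution are replaced by a function that counts non-blank lines.

```javascript
// Hypothetical routing table; the entries are illustrative only.
const EXT_TO_COUNTER = new Map([
  ["py", { type: "perl", lang: "Python" }],
  ["rb", { type: "perl", lang: "Ruby" }],
  ["js", { type: "wasm", lang: "JavaScript" }],
  ["c", { type: "wasm", lang: "C" }],
]);

// Stand-in for a real counter: counts non-blank lines only.
function fakeCounter(content) {
  return content.split("\n").filter((line) => line.trim() !== "").length;
}

function countAll(files, counters = { perl: fakeCounter, wasm: fakeCounter }) {
  const totals = {};
  for (const { path, content } of files) {
    // 1. Detect the file extension.
    const dot = path.lastIndexOf(".");
    const ext = dot === -1 ? "" : path.slice(dot + 1).toLowerCase();
    // 2. Route to a counter; files with unknown extensions are skipped.
    const route = EXT_TO_COUNTER.get(ext);
    if (!route) continue;
    // 3-4. The real tool writes the file to a virtual filesystem and runs
    // the counter there; this sketch passes the content directly.
    const sloc = counters[route.type](content);
    // 5. Aggregate results per language.
    totals[route.lang] = (totals[route.lang] || 0) + sloc;
  }
  return totals;
}

console.log(
  countAll([
    { path: "app.py", content: "x = 1\n\ny = 2\n" },
    { path: "main.js", content: "let a;\n" },
    { path: "README", content: "not counted" },
  ])
);
```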

Testing

Run tests with:

uv-test

All 18 tests pass, including existing functionality and new WASM/COCOMO features.


Claude Code transcript: https://gistpreview.github.io/?79ca231e801fe1188268a54d30aa67ed

cloudflare-workers-and-pages bot commented Oct 21, 2025

Deploying tools with Cloudflare Pages

Latest commit: 38c973d
Status: ✅  Deploy successful!
Preview URL: https://a2557e32.tools-b1q.pages.dev
Branch Preview URL: https://wasm-sloccount-integration.tools-b1q.pages.dev

claude bot (Contributor) commented Oct 21, 2025

Claude finished @simonw's task


PR Review: WebAssembly SLOCCount Integration

I've completed a comprehensive review of this pull request. Here's my detailed feedback:

Review Tasks

  • Read repository guidelines (CLAUDE.md)
  • Analyze main implementation (sloccount.html)
  • Review WebAssembly integration approach
  • Check COCOMO estimation logic
  • Assess security implications
  • Review test coverage
  • Evaluate performance considerations
  • Provide comprehensive feedback

Overall Assessment

This is an excellent implementation that successfully integrates WebAssembly to provide client-side source code analysis. The approach is well-architected and maintains the integrity of the original SLOCCount algorithms while providing a modern web interface.

Strengths

WebAssembly Integration

  • Excellent hybrid approach: Perl scripts via WebPerl for some languages (Python, Ruby, Perl, SQL) and WASM binaries for others (JavaScript, C, C++, Java, PHP, Go, Rust)
  • Proper Emscripten configuration: Uses appropriate flags (EXPORTED_RUNTIME_METHODS, ALLOW_MEMORY_GROWTH, FORCE_FILESYSTEM)
  • Clean module loading: Properly structured module factory pattern with error handling
  • File system abstraction: Correctly writes files to virtual filesystem at /tmp/ before analysis

COCOMO Implementation

  • Year presets are valuable: 2000 vs 2025 presets with realistic parameter updates (sloccount.html:1141-1147)
  • Transparent estimation: Clear warnings about rough estimates and prominent disclaimers
  • Editable parameters: All COCOMO factors (effort coefficient, salary, overhead multiplier) are user-adjustable
  • Real-time recalculation: Updates estimates when parameters change
  • Proper linking: Footnote asterisks correctly link to detailed explanations

Code Quality

  • Good separation of concerns: Language detection, file routing, and counter execution are well-separated
  • Error handling: Comprehensive try-catch blocks throughout async operations
  • User feedback: Clear status messages during processing phases
  • Mobile responsive: CSS breakpoints and mobile-friendly design

Test Coverage

  • Comprehensive test suite: 18 tests covering multiple scenarios
  • New functionality tested: 3 new tests specifically for COCOMO presets, footnote links, and WASM integration
  • Real-world scenarios: Tests include GitHub repository analysis and various file types
  • Edge cases covered: Validation, error states, and mobile responsiveness

Areas for Improvement

⚠️ Performance Considerations

  1. Memory Usage (sloccount.html:848-850):

    // Limit files to avoid rate limiting and browser memory issues
    const maxFiles = 500;
    • Good mitigation, but consider dynamic limits based on available memory
    • Large repositories could still cause browser slowdowns
  2. Batch Processing (sloccount.html:860-887):

    • Current batch size of 10 files is conservative
    • Could implement adaptive batching based on file sizes
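
One possible shape for adaptive batching, as a sketch only. It assumes each file entry carries a `size` in bytes; `makeBatches` and its default limits are hypothetical and not part of this PR.

```javascript
// Hypothetical helper: groups files so that each batch stays under a byte
// budget and a maximum file count. Both defaults are illustrative.
function makeBatches(files, { maxBytes = 512 * 1024, maxCount = 10 } = {}) {
  const batches = [];
  let current = [];
  let currentBytes = 0;
  for (const file of files) {
    // Close the current batch if adding this file would exceed either limit.
    const wouldOverflow =
      current.length > 0 &&
      (currentBytes + file.size > maxBytes || current.length >= maxCount);
    if (wouldOverflow) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    // A single file larger than maxBytes still gets a batch of its own.
    current.push(file);
    currentBytes += file.size;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}
```

With this approach many small files share a batch while large files are fetched in smaller groups, so the fixed batch size of 10 becomes an upper bound rather than a constant.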

⚠️ Error Handling

  1. WASM Loading Errors (sloccount.html:504-512):

    window.CCountModule = await createCCountModule({
        print: (text) => { window.wasmOutput.push(text); },
        printErr: (text) => { console.error('c_count WASM:', text); },
        // ... (remaining options not shown in this excerpt)
    });

    • Could benefit from more specific error handling for WASM instantiation failures
    • Should handle cases where WASM isn't supported
  2. Network Timeouts: GitHub API calls lack explicit timeout handling
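
Both points could be addressed along these lines. This is a sketch under assumptions: `isWasmSupported` and `withTimeout` are hypothetical helpers that do not exist in sloccount.html, and the timeout value in the usage note is arbitrary.

```javascript
// Feature check: true when the runtime exposes the WebAssembly API.
function isWasmSupported() {
  return (
    typeof WebAssembly === "object" &&
    typeof WebAssembly.instantiate === "function"
  );
}

// Hypothetical wrapper: rejects if `promise` has not settled within `ms`.
function withTimeout(promise, ms, label = "operation") {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  // Whichever settles first wins; the timer is always cleared afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

A call such as `await withTimeout(fetch(url), 15000, "GitHub API request")` would surface a clear error instead of hanging. Note that this wrapper does not cancel the underlying request; passing an `AbortSignal` to `fetch` would be needed for that.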

⚠️ Security Considerations

  1. Input Sanitization:

    • File contents are processed directly without sanitization
    • While this is expected for code analysis, consider limits on file sizes
  2. CORS Dependency:

    • Relies on raw.githubusercontent.com having permissive CORS (sloccount.html:867)
    • This is correct but worth documenting as a dependency
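
A per-file size limit could look something like the sketch below. The 1 MiB cap and the name `partitionBySize` are illustrative assumptions; the PR does not define a size limit.

```javascript
// Illustrative cap; the PR does not specify a per-file size limit.
const MAX_FILE_BYTES = 1024 * 1024;

// Splits files into those safe to count and those skipped for being too large.
// Each file is expected to look like { path, content } with string content.
function partitionBySize(files, maxBytes = MAX_FILE_BYTES) {
  const encoder = new TextEncoder();
  const accepted = [];
  const skipped = [];
  for (const file of files) {
    // Measure UTF-8 bytes rather than string length, since they can differ.
    const bytes = encoder.encode(file.content).length;
    (bytes <= maxBytes ? accepted : skipped).push({ ...file, bytes });
  }
  return { accepted, skipped };
}
```

Returning the skipped files separately makes it possible to tell the user which files were left out of the count.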

⚠️ Code Organization

  1. Large Function (sloccount.html:536-753):

    • runSloccount() function is quite large (217 lines)
    • Could be broken into smaller, more focused functions
  2. Global State:

    • Some reliance on window objects for data passing between Perl and JavaScript
    • Consider more structured approach for data exchange

Minor Issues

🔧 Documentation

  • Excellent README: The lib/README.md is very comprehensive and well-structured
  • Clear licensing: Proper attribution and license compliance
  • Build instructions: Complete WASM compilation process documented

🔧 User Experience

  • Helpful warnings: Cost estimate disclaimers are prominent and appropriate
  • Progressive disclosure: Technical details are available but not overwhelming
  • Accessibility: Good use of semantic HTML and ARIA considerations

Technical Deep Dive

WebAssembly Architecture

The hybrid Perl + WASM approach is particularly clever:

// Language routing logic (sloccount.html:539-559)
const extToCounter = {
    'py': { counter: 'python_count', type: 'perl', lang: 'Python' },
    'js': { counter: 'javascript_count', type: 'wasm', lang: 'JavaScript' },
    // ...
};

This maintains the original SLOCCount algorithm fidelity while leveraging the strengths of both execution environments.

Security Assessment

  • ✅ No arbitrary code execution risks
  • ✅ Files processed in sandboxed environments (WebPerl VM + WASM)
  • ✅ No server-side components that could be compromised
  • ✅ Appropriate CORS usage for GitHub API access

Recommendations

  1. Performance: Consider implementing memory monitoring and dynamic batching
  2. Error Handling: Add more specific error messages for WASM loading failures
  3. Code Organization: Refactor runSloccount() into smaller functions
  4. Documentation: Add JSDoc comments for complex functions
  5. Testing: Consider adding integration tests for memory limits and large files

Conclusion

This is a high-quality implementation that successfully achieves its goals. The WebAssembly integration is well-executed, the COCOMO enhancements are valuable, and the test coverage is comprehensive. The code is production-ready with only minor optimization opportunities.

Recommendation: ✅ APPROVE

The implementation demonstrates strong technical competence and thoughtful engineering decisions. The hybrid approach preserving original SLOCCount algorithms while enabling browser execution is particularly commendable.


- Compile SLOCCount C programs to WebAssembly using Emscripten v4.0.17
  - c_count.wasm: Main C/C++/JavaScript/Java counter
  - php_count.wasm, ml_count.wasm, pig_count.wasm, lexcount1.wasm

- Implement hybrid Perl + WASM counter routing
  - Perl scripts for Python, Ruby, Perl, SQL
  - WASM binaries for JavaScript, C, C++, Java, PHP
  - Uses original SLOCCount algorithms without modification

- Add enhanced COCOMO cost estimation features
  - Year preset toggle: 2000 (original) vs 2025 (updated)
  - All COCOMO parameters now editable (effort coefficient, salary, overhead)
  - Prominent warning about estimate accuracy
  - Footnote links from cost displays to detailed explanation

- Update documentation
  - Add WASM compilation process to lib/README.md
  - Include Emscripten build commands and configuration

- Add comprehensive tests
  - Test COCOMO year preset switching
  - Test footnote link navigation
  - Test WASM JavaScript counter integration
  - All 18 tests passing

This completes the integration of the original SLOCCount tool to run
entirely in the browser via WebAssembly, supporting both Perl and C
language counters.

simonw force-pushed the wasm-sloccount-integration branch from 10635e9 to 38c973d on October 21, 2025 19:47

simonw merged commit 473e89e into main on Oct 21, 2025
4 checks passed
simonw deleted the wasm-sloccount-integration branch on October 21, 2025 19:48