This repository was archived by the owner on Apr 29, 2026. It is now read-only.

Modules #24

Merged
navicore merged 6 commits into main from modules
Nov 26, 2025

@navicore
Owner

No description provided.

…ted:

Include System Implementation

Files Changed/Created:

- compiler/src/ast.rs: Added SourceLocation, Include enum, updated Program and WordDef
- compiler/src/parser.rs: Added parse_include() to handle include std:name and include "path"
- compiler/src/resolver.rs: New module with include resolution and collision detection
- compiler/src/lib.rs: Integrated resolver into compilation pipeline
- docs/INCLUDE_SYSTEM.md: Design document

Features:

1. Standard library includes: include std:http loads from stdlib/http.seq
2. Relative includes: include "my-utils" loads ./my-utils.seq relative to current file
3. Collision detection: Clear error when same word is defined in multiple files
4. Double-include prevention: Files included multiple times are silently deduplicated
5. Recursive includes: Included files can include other files

Tested:

- include std:stack-utils - used 2dup from stdlib, output: 6
- include "my-utils" - used double and triple from local file, output: 10, 15
- Collision detection - clear error message showing both file locations

Known Issues (Pre-existing):

- stdlib/math.seq has parameter names (min, max) that the parser interprets as type names
- stdlib/http.seq has some type checking issues when functions are combined

All 144+ tests pass and there are no clippy warnings.
⏺ Write(examples/include_demo/http_simple.seq)
  ⎿  Wrote 56 lines to examples/include_demo/http_simple.seq
     # Simple HTTP Server using stdlib
     #
     # This example demonstrates using the std:http library to build
     # a minimal HTTP server with less boilerplate.
     #
     # Usage:
     #   seqc --output http_simple http_simple.seq
     #   ./http_simple
     #   curl http://localhost:8080/
     … +46 lines (ctrl+o to expand)

⏺ Bash(./target/release/seqc --output /tmp/http_simple /Users/navicore/git/navicore/patch-seq/examples/include_demo/http_simple.seq 2>&1)
  ⎿  Compiled /Users/navicore/git/navicore/patch-seq/examples/include_demo/http_simple.seq -> /tmp/http_simple

⏺ Both examples compile. Created examples/include_demo/ with:

  - main.seq - Demonstrates include std:math and include std:stack-utils with various function calls
  - http_simple.seq - A minimal HTTP server using include std:http, showing http-ok, http-not-found, and http-response

  The HTTP example also uses the fixed http-response function to show custom status codes (201 Created).
@claude

claude Bot commented Nov 25, 2025

Pull Request Review: Modules System

Overview

This PR implements a module/include system for Seq, adding the ability to import code from the standard library and relative file paths. The implementation is solid and follows the design documented in docs/INCLUDE_SYSTEM.md.

✅ Strengths

1. Well-Designed Architecture

  • Clean separation of concerns with a dedicated resolver.rs module
  • Proper handling of circular dependencies via canonical path tracking
  • Good error messages with source location tracking for collision detection
  • Follows the documented design spec closely

2. Good Code Quality

  • Clear module documentation and helpful comments
  • Sensible fallback strategy for finding stdlib (env var → exe path → local)
  • Proper use of Rust idioms (HashSet for deduplication, Result for error handling)
  • Tests cover key scenarios (collision detection, no-collision cases)

3. Thoughtful Features

  • Include-once semantics prevent diamond dependency issues
  • Both std: and relative path includes supported
  • Automatic .seq extension addition is user-friendly
  • Word collision detection prevents hard-to-debug runtime issues

🐛 Potential Issues

1. Line Number Tracking Not Implemented (Critical for UX)

Location: compiler/src/resolver.rs:53

Without line numbers, collision errors will show file.seq:0 instead of the actual line, making debugging difficult. The TODO comment acknowledges this, but it significantly impacts error message quality.

Recommendation: Track line numbers during parsing or document this limitation prominently.

2. Test Coverage: Missing Integration Tests

Current State: Only unit tests in resolver.rs for collision detection logic

Missing Coverage:

  • No end-to-end tests that actually compile files with includes
  • No tests for recursive includes (A includes B, B includes C)
  • No tests for diamond dependencies
  • No tests for error cases (missing files, invalid syntax in includes)
  • No tests verifying the examples actually compile and run

Recommendation: Add integration tests that create temp files and run the compiler.

3. Parser Edge Cases

Location: compiler/src/parser.rs:60-94

The parse_include method could be more robust:

  • Empty paths like include "" are not validated
  • Escaped quotes in strings not handled
  • Could benefit from more specific error messages

4. Security: Path Traversal

Location: compiler/src/resolver.rs:106-117

No validation against path traversal attacks like include "../../../etc/passwd". While the compiler typically runs on trusted code, adding validation would provide defense in depth:

if rel_path.contains("..") {
    return Err("Include paths cannot contain ..".to_string());
}
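
A substring test such as contains("..") also rejects harmless names like "my..utils". As a minimal sketch of a stricter alternative, the check can be done per path component with the standard library. The helper name reject_parent_components is hypothetical and is not taken from resolver.rs. Note that std::path splits on the host platform's separators, so on Unix a backslash is not treated as a separator.

```rust
use std::path::{Component, Path};

/// Returns an error if `rel_path` contains a parent-directory (`..`) component.
/// Hypothetical helper for illustration; not the PR's actual code.
fn reject_parent_components(rel_path: &str) -> Result<(), String> {
    let has_parent = Path::new(rel_path)
        .components()
        .any(|c| matches!(c, Component::ParentDir));
    if has_parent {
        return Err(format!(
            "Invalid include path '{}': '..' components are not allowed",
            rel_path
        ));
    }
    Ok(())
}
```

With this shape, "sub/../other" is rejected while "my..utils" is accepted, because only whole components are compared.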

5. Error Messages

Location: Multiple places using canonicalize()

Error messages like "Failed to canonicalize" may be unclear to users. Consider user-friendly wording like "Include file not found or inaccessible".
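
One way to get friendlier wording is to branch on the I/O error kind before formatting. The following is only a sketch: the function name describe_include_error and the exact messages are assumptions, not text from the compiler.

```rust
use std::io;
use std::path::Path;

/// Maps an I/O error raised while resolving an include to user-facing wording.
/// Hypothetical helper; the message strings are illustrative.
fn describe_include_error(path: &Path, err: &io::Error) -> String {
    match err.kind() {
        io::ErrorKind::NotFound => {
            format!("Include file not found: {}", path.display())
        }
        io::ErrorKind::PermissionDenied => {
            format!("Include file is not readable: {}", path.display())
        }
        // Fall back to the raw error for anything less common.
        _ => format!("Could not resolve include {}: {}", path.display(), err),
    }
}
```

It could be used as path.canonicalize().map_err(|e| describe_include_error(&path, &e)).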

6. Test Name Mismatch

Location: compiler/src/resolver.rs:268

test_collision_detection_same_file_allowed has a misleading name - it actually tests that same-file collisions ARE detected. Consider renaming to test_collision_detection_same_file.

🎯 Performance Considerations

  • Files are canonicalized multiple times (minor inefficiency)
  • Sequential file reading (fine for now, could be parallelized if needed)
  • Overall performance should be good for typical use cases

🔒 Security Assessment

Overall: Low Risk - This is a compiler feature on trusted code. Main concern is path traversal (see item 4 under Potential Issues).

📝 Documentation & Examples

Strengths:

  • Excellent docs/INCLUDE_SYSTEM.md documentation
  • Good example programs
  • stdlib files have helpful comments

Suggestions:

  • Add brief mention in README.md
  • Add troubleshooting section for common errors
  • Update "cem3" references to "Seq" in stdlib comments (stdlib/math.seq:1, stdlib/stack-utils.seq:1)

🧪 Testing Recommendations

High Priority:

  1. Add integration tests that compile and run programs with includes
  2. Test error cases (missing files, collisions)
  3. Test diamond dependencies
  4. Verify examples work

Medium Priority:

  1. Test SEQ_STDLIB environment variable
  2. Test stdlib discovery in different locations
  3. Performance test with many includes

✅ What Works Well

  1. Overall architecture is sound and extensible
  2. Error handling is comprehensive
  3. Design allows for future features (selective imports, packages)
  4. Backwards compatible
  5. stdlib modules are well-documented

🎯 Recommendation

Conditional Approval - This is a solid implementation of an important feature. Main gaps:

Before Merge (High Priority):

  1. Add at least basic integration tests
  2. Add path traversal validation
  3. Update stdlib comments from "cem3" to "Seq"
  4. Document line number limitation or implement tracking

After Merge (Can be follow-up PRs):

  1. Expand test coverage
  2. Enhance error messages
  3. Add troubleshooting docs

🎉 Conclusion

This is a well-designed and thoughtfully implemented feature. The code quality is good, error handling is solid, and it enables important functionality. With integration tests and a few security improvements, this is ready to merge.

Great work! The include system makes the standard library much more practical. 🚀


Review by Claude Code

@claude

claude Bot commented Nov 25, 2025

Pull Request Review: Module/Include System

Summary

This PR introduces a module system for Seq with include statements supporting both standard library (std:) and relative path imports. The implementation is well-structured and follows the design documented in docs/INCLUDE_SYSTEM.md.


✅ Strengths

1. Clean Architecture

  • Resolver module: Well-separated concerns with clear responsibilities (path resolution, cycle prevention, collision detection)
  • AST extensions: Minimal, non-invasive changes to existing structures
  • Integration: Smoothly integrates into the compilation pipeline without disrupting existing functionality

2. Good Error Handling

  • Collision detection: Excellent error messages with file locations (e.g., compiler/src/resolver.rs:127-150)
  • Helpful hints: User-friendly suggestions when errors occur
  • Path resolution errors: Clear messages when includes can't be found

3. Documentation

  • Comprehensive design document (docs/INCLUDE_SYSTEM.md)
  • Good inline comments explaining complex logic
  • Working examples demonstrating real usage

4. Test Coverage

  • Unit tests for collision detection in resolver.rs
  • Example programs that serve as integration tests
  • Tests updated throughout the codebase to account for new AST fields

🔍 Issues & Recommendations

CRITICAL: Path Traversal Security Risk

Location: compiler/src/resolver.rs:109-119

Include::Relative(rel_path) => {
    let path = source_dir.join(format!("{}.seq", rel_path));
    if !path.exists() {
        return Err(format!(
            "Include file '{}' not found at {}",
            rel_path,
            path.display()
        ));
    }
    Ok(path)
}

Issue: The code doesn't prevent path traversal attacks. A malicious .seq file could include:

include "../../../etc/passwd"
include "../../../../home/user/.ssh/id_rsa"

While these specific files won't parse as Seq code, this could:

  1. Leak information about file existence through error messages
  2. Potentially expose source code from other projects
  3. Allow reading arbitrary .seq files outside the project

Recommendation: Add path validation before resolving:

Include::Relative(rel_path) => {
    // Prevent path traversal
    if rel_path.contains("..") || rel_path.starts_with('/') {
        return Err(format!(
            "Invalid include path '{}': relative paths cannot contain '..' or start with '/'",
            rel_path
        ));
    }
    
    let path = source_dir.join(format!("{}.seq", rel_path));
    
    // Ensure the resolved path is still within source_dir or a subdirectory
    let canonical = path.canonicalize().map_err(|e| {
        format!("Failed to resolve include path {}: {}", path.display(), e)
    })?;
    
    let canonical_source = source_dir.canonicalize().map_err(|e| {
        format!("Failed to canonicalize source directory: {}", e)
    })?;
    
    if !canonical.starts_with(&canonical_source) {
        return Err(format!(
            "Include path '{}' is outside the source directory",
            rel_path
        ));
    }
    
    if !canonical.exists() {
        return Err(format!(
            "Include file '{}' not found at {}",
            rel_path,
            canonical.display()
        ));
    }
    
    Ok(canonical)
}

Bug: Missing Line Number Tracking

Location: compiler/src/resolver.rs:52

word.source = Some(SourceLocation {
    file: source_path.clone(),
    line: 0, // TODO: Track actual line numbers
});

Issue: The TODO indicates line numbers aren't being tracked, which defeats much of the purpose of the SourceLocation type. Users will see collision errors like:

Word 'http-ok' is defined multiple times:
  - stdlib/http.seq:0
  - my-http.seq:0

Impact: Makes debugging collisions significantly harder.

Recommendation:

  1. Short-term: Update the Parser to track line numbers for word definitions
  2. Add a line field to WordDef during parsing
  3. Pass line numbers through to the resolver

Alternatively, if this is a known limitation for the initial release, document it clearly in the PR description.
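
To illustrate what tracking line numbers during parsing could look like, here is a minimal sketch of a whitespace tokenizer that keeps a 1-based line number on every token. It assumes `#` starts a comment and that a definition has the form `: name ... ;`, as in the examples in this PR. String literals are ignored for brevity, and the names Token, tokenize_with_lines, and definition_lines are hypothetical.

```rust
/// A token paired with the 1-based line it came from.
#[derive(Debug, Clone, PartialEq)]
struct Token {
    text: String,
    line: usize,
}

/// Whitespace tokenizer that remembers line numbers and skips `#` comments.
/// Assumption: string literals are not handled in this sketch.
fn tokenize_with_lines(source: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    for (idx, line) in source.lines().enumerate() {
        // Drop everything after a `#` comment marker.
        let code = line.split('#').next().unwrap_or("");
        for word in code.split_whitespace() {
            tokens.push(Token { text: word.to_string(), line: idx + 1 });
        }
    }
    tokens
}

/// Returns (word name, line of the opening `:`) for every definition.
fn definition_lines(tokens: &[Token]) -> Vec<(String, usize)> {
    let mut defs = Vec::new();
    let mut iter = tokens.iter().peekable();
    while let Some(tok) = iter.next() {
        if tok.text == ":" {
            if let Some(name) = iter.peek() {
                defs.push((name.text.clone(), tok.line));
            }
        }
    }
    defs
}
```

The recorded line could then replace the hard-coded 0 when a SourceLocation is built.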


Code Quality Issues

1. Unstable Let-Chain Syntax

Location: compiler/src/resolver.rs:175-179

The function uses let-chain syntax which requires unstable Rust features:

if let Ok(exe_path) = std::env::current_exe()
    && let Some(exe_dir) = exe_path.parent()
{

Issue: This will fail to compile on stable Rust. The let_chains feature is still unstable (as of Rust 1.83).

Recommendation: Rewrite using nested if let or and_then:

if let Ok(exe_path) = std::env::current_exe() {
    if let Some(exe_dir) = exe_path.parent() {
        // ...
    }
}

2. Misleading Test Name and Comments

Location: compiler/src/resolver.rs:268-295

The test test_collision_detection_same_file_allowed has a misleading name and comment:

#[test]
fn test_collision_detection_same_file_allowed() {
    // Same word in same file is fine (parser should catch true duplicates)
    // ...
    // This IS a collision - same name in same file on different lines
    // (though typically parser would catch this as a parse error)
    let result = check_collisions(&words);
    assert!(result.is_err());
}

Issue: The test name suggests same-file collisions are allowed, but the test expects an error. The comments are contradictory.

Recommendation: Either:

  • Rename to test_collision_detection_same_file_different_lines
  • Or clarify the expected behavior in the design doc

Performance Considerations

1. Redundant Canonicalization

Location: compiler/src/resolver.rs:36-39, 64-66

The code canonicalizes paths multiple times:

let source_path = source_path.canonicalize()?;  // Line 36
// ...
let canonical = included_path.canonicalize()?;  // Line 64

Impact: File system calls on every include. For deep include trees, this could add up.

Recommendation: Consider caching canonicalized paths in the Resolver struct.
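
A rough sketch of such a cache follows. The type CanonicalCache and its lookups counter are hypothetical; in practice the map could be a field on the Resolver struct. The counter exists only to make the saved filesystem calls observable.

```rust
use std::collections::HashMap;
use std::io;
use std::path::{Path, PathBuf};

/// Memoizes `Path::canonicalize` results. Hypothetical type for illustration.
#[derive(Default)]
struct CanonicalCache {
    resolved: HashMap<PathBuf, PathBuf>,
    /// Number of real filesystem lookups performed so far.
    lookups: usize,
}

impl CanonicalCache {
    fn canonicalize(&mut self, path: &Path) -> io::Result<PathBuf> {
        if let Some(hit) = self.resolved.get(path) {
            return Ok(hit.clone());
        }
        // Cache miss: ask the filesystem once and remember the answer.
        let canonical = path.canonicalize()?;
        self.lookups += 1;
        self.resolved.insert(path.to_path_buf(), canonical.clone());
        Ok(canonical)
    }
}
```

The cache is keyed by the path as written, so two different spellings of the same file still cost one lookup each.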

2. Multiple String Allocations in Error Messages

Location: compiler/src/resolver.rs:137-143

for (name, locations) in definitions {
    if locations.len() > 1 {
        let mut msg = format!("Word '{}' is defined multiple times:\n", name);
        for loc in &locations {
            msg.push_str(&format!("  - {}\n", loc));
        }
        msg.push_str("\nHint: Rename one of the definitions to avoid collision.");
        errors.push(msg);
    }
}

Recommendation: Pre-allocate string capacity or use a more efficient string builder approach for large numbers of collisions.
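
As a sketch of that suggestion, the message can be written directly into one pre-sized buffer with std::fmt::Write, which avoids the temporary String created by each format! call. The function name collision_errors and the shape of definitions (word name mapped to "file:line" strings) are assumptions made for this example.

```rust
use std::collections::BTreeMap;
use std::fmt::Write;

/// Builds one message per colliding word, writing into a single buffer.
/// `definitions` maps a word name to its "file:line" locations (assumed shape).
fn collision_errors(definitions: &BTreeMap<String, Vec<String>>) -> Vec<String> {
    let mut errors = Vec::new();
    for (name, locations) in definitions {
        if locations.len() > 1 {
            // Rough capacity guess; avoids repeated reallocation while appending.
            let mut msg = String::with_capacity(64 + locations.len() * 32);
            // Writing to a String cannot fail, so the results are ignored.
            let _ = writeln!(msg, "Word '{}' is defined multiple times:", name);
            for loc in locations {
                let _ = writeln!(msg, "  - {}", loc);
            }
            msg.push_str("\nHint: Rename one of the definitions to avoid collision.");
            errors.push(msg);
        }
    }
    errors
}
```

The output text is the same as in the excerpt above; only the way the string is assembled changes.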


Test Coverage Gaps

Missing Test Cases:

  1. No integration test for the actual include resolution

    • Tests only check collision detection logic
    • No test that actually loads and merges included files
    • No test for recursive includes (A includes B, B includes C)
  2. No test for diamond dependencies

    • A includes B and C, both B and C include D
    • Should verify D is only included once
  3. No test for circular include detection

    • A includes B, B includes A
    • Currently would likely infinite loop or stack overflow
  4. No test for find_stdlib() function

    • Different environment configurations
    • Missing stdlib directory scenarios
  5. No test for error messages

    • Verify error message format and content
    • Check that helpful hints are actually included

Recommendation: Add integration tests in a tests/ directory that:

#[test]
fn test_include_stdlib_module() {
    // Create temp stdlib, compile file that includes it
}

#[test]
fn test_include_relative_path() {
    // Create temp project with multiple files
}

#[test]
fn test_circular_include_error() {
    // Should detect and report circular dependencies
}

Documentation Issues

1. Outdated Comments in stdlib

Location: stdlib/math.seq:1, stdlib/http.seq:5

Both files still reference "cem3" instead of "Seq":

# Math Standard Library for cem3
# Stack Manipulation Utilities for cem3

And say:

# Since cem3 does not yet have a module system, these functions must be
# copied directly into your program file.

Issue: This PR literally adds the module system, making these comments incorrect.

Recommendation: Update to:

# Math Standard Library for Seq
#
# ## Usage
#
# include std:math

2. Missing Documentation in AST

Location: compiler/src/ast.rs:6-15

The new SourceLocation type lacks documentation about its purpose in collision detection.

Recommendation: Add doc comments:

/// Source location for error reporting and collision detection.
/// Used to track where word definitions originate when merging
/// multiple files via the include system.
#[derive(Debug, Clone, PartialEq)]
pub struct SourceLocation {
    /// Source file path
    pub file: PathBuf,
    /// Line number in source file (0-indexed)
    pub line: usize,
}
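
The collision messages quoted earlier print locations as file:line, which suggests a Display implementation along these lines. This is a guess at its shape, repeated with the struct so the sketch is self-contained; the real implementation in ast.rs may differ.

```rust
use std::fmt;
use std::path::PathBuf;

/// Source location for error reporting and collision detection.
#[derive(Debug, Clone, PartialEq)]
pub struct SourceLocation {
    pub file: PathBuf,
    pub line: usize,
}

impl fmt::Display for SourceLocation {
    /// Formats as `path:line`, e.g. `stdlib/http.seq:0`.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}:{}", self.file.display(), self.line)
    }
}
```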

🎯 Additional Observations

Positive Design Choices:

  1. Include-once semantics: Prevents diamond dependency issues (resolver.rs:67-69)
  2. Extension inference: User writes include "foo", compiler adds .seq automatically
  3. Recursive resolution: Includes in included files work correctly
  4. Conservative approach: No namespaces yet, keeping it simple

Future Considerations:

Per docs/INCLUDE_SYSTEM.md, the design allows for future extensions:

  • User library paths (--lib flag)
  • Package system (include pkg:json)
  • Selective imports (include std:http { http-ok, http-not-found })

These are wisely deferred for now. Get the basics right first.


📋 Action Items

Must Fix (Blocking):

  1. ⚠️ Path traversal vulnerability - Add validation to prevent .. in relative paths
  2. 🐛 Unstable Rust feature - Remove let_chains syntax for stable compatibility

Should Fix (Pre-merge):

  1. 📝 Update stdlib documentation to reference Seq instead of cem3
  2. 🧪 Add integration tests for include resolution
  3. 📊 Implement line number tracking or document the limitation

Nice to Have (Post-merge):

  1. 🏎️ Performance: Cache canonicalized paths
  2. 🧪 Add circular include detection/test
  3. 📚 Add doc comments to public types

🎉 Conclusion

This is a solid implementation of a module system that follows good software engineering practices. The architecture is clean, the design is well-documented, and the feature works as advertised.

However, the path traversal security issue is critical and must be fixed before merge. The unstable Rust feature should also be addressed for compatibility.

With those fixes, this PR will be a valuable addition to Seq. Nice work! 🚀

Overall Assessment: Approve with required changes (security fix + stable Rust compatibility)

From the review, here's what I addressed:

| Issue | Status | Notes |
|---|---|---|
| Path traversal security | ✅ Fixed | Added validation to reject .. and absolute paths starting with / |
| let_chains unstable syntax | ❌ Not an issue | let_chains was stabilized in Rust 1.88, we're on 1.91.1 |
| stdlib comments "cem3" → "Seq" | ✅ Fixed | Updated stdlib/math.seq and stdlib/stack-utils.seq |
| Misleading test name | ✅ Fixed | Renamed test_collision_detection_same_file_allowed → test_collision_detection_same_file_different_lines |
| Line number tracking | Known limitation | Already has TODO in code, not blocking |
| Integration tests | Future work | Good suggestion but not blocking for merge |

Files Modified:

- compiler/src/resolver.rs - Path traversal validation + test name fix
- stdlib/math.seq - Updated comments to reference Seq and include syntax
- stdlib/stack-utils.seq - Updated comments to reference Seq and include syntax

Verified:

- All tests pass
- Path traversal blocked: include "../etc/passwd" → "paths cannot contain '..'"
- Examples still compile and run correctly
@claude

claude Bot commented Nov 26, 2025

PR Review: Module System Implementation

This PR implements a module/include system for Seq, adding support for include std:name and include "path". The implementation adds 774 lines.

Strengths

  1. Well-Designed Security - Prevents path traversal attacks (resolver.rs:112), blocks absolute paths (resolver.rs:118), uses canonical paths for deduplication (resolver.rs:64)

  2. Collision Detection - Excellent error messages for word name conflicts with file:line format and helpful hints

  3. Clean Parser Integration - The parse_include() method (parser.rs:57-94) cleanly handles both syntaxes with good error messages

  4. Diamond Dependency Prevention - Uses HashSet to track included files (resolver.rs:63-69)

  5. Thorough Test Coverage - Good unit tests for collision detection

Critical Issues

1. SECURITY: Path Traversal Bypass (HIGH PRIORITY)

Location: resolver.rs:112

The current check only looks for literal ".." substring which can be bypassed:

  • Windows: backslash vs forward slash mixing
  • Multiple slashes: "subdir//../../file.seq"

Recommendation: After constructing the path, verify it is still within allowed directory using canonical paths and starts_with check.

2. Windows Path Handling (MEDIUM PRIORITY)

Location: resolver.rs:118

The check for absolute paths only catches Unix-style paths starting with /. On Windows, absolute paths can start with C:\ or \.

Recommendation: Use Path::is_absolute() instead of manual string checking.

3. Missing Line Number Tracking

Location: ast.rs:51-53, resolver.rs:50-52

Line numbers are set to 0 (TODO comment). Collision errors will show file.seq:0 instead of actual line numbers.

4. Missing Tests

No tests for recursive includes, circular includes, invalid stdlib modules, file I/O errors, Windows paths, or deduplication.

Security Summary

  • Path traversal: HIGH severity - Needs fix
  • Absolute path blocking: MEDIUM severity - Incomplete (Windows)
  • Symlink handling: LOW severity - Good (via canonicalize)

Checklist Before Merge

  • Fix path traversal security issue
  • Fix Windows absolute path detection
  • Add recursive include test
  • Add circular include test

Overall Assessment

This is solid foundational work with good architecture and error handling. The primary concern is the path traversal security check needs hardening before merge.

Recommendation: Approve with required changes - fix the security issues, then merge.

Great work on collision detection and stdlib location logic!

The updated resolver.rs now has three layers of protection:

1. Early rejection of .. (line 112-116)

if rel_path.contains("..") {
    return Err("paths cannot contain '..'");
}

2. Cross-platform absolute path detection (line 119-126)

let rel_as_path = std::path::Path::new(rel_path);
if rel_as_path.is_absolute() {  // Works on Unix AND Windows
    return Err("paths cannot be absolute");
}

3. Canonical path verification (line 137-151)

let canonical_path = path.canonicalize()?;
let canonical_source = source_dir.canonicalize()?;

if !canonical_path.starts_with(&canonical_source) {
    return Err("resolves outside the source directory");
}

This third layer is the defense-in-depth the review asked for - it catches any bypass attempts (symlinks, encoded paths, multiple slashes, etc.) by verifying the final resolved path is actually within the allowed directory.
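
A self-contained sketch of the canonical-path layer, with the error mapping spelled out, might look like the following. The function name resolve_within and the messages are assumptions; the snippets above abbreviate the real code in resolver.rs.

```rust
use std::path::{Path, PathBuf};

/// Resolves `rel_path` (written without the `.seq` extension) under `source_dir`
/// and rejects any result that ends up outside it. Hypothetical helper.
fn resolve_within(source_dir: &Path, rel_path: &str) -> Result<PathBuf, String> {
    let candidate = source_dir.join(format!("{}.seq", rel_path));
    // canonicalize() resolves `..` and symlinks, and fails if the file is missing.
    let canonical = candidate.canonicalize().map_err(|e| {
        format!("Include file '{}' not found or inaccessible: {}", rel_path, e)
    })?;
    let root = source_dir
        .canonicalize()
        .map_err(|e| format!("Cannot resolve source directory: {}", e))?;
    // Component-wise prefix check on the fully resolved paths.
    if !canonical.starts_with(&root) {
        return Err(format!(
            "Include path '{}' resolves outside the source directory",
            rel_path
        ));
    }
    Ok(canonical)
}
```

Because both paths are canonical before the comparison, a sibling file reached through "../secret" fails the prefix check even if the earlier string checks were skipped.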

- All tests pass
- Include demo works
- Path traversal blocked
@claude

claude Bot commented Nov 26, 2025

Code Review - PR #24: Module System

Summary

This PR implements a module/include system for the Seq language, allowing code reuse via include std:name for standard library modules and include "path" for relative imports. The implementation is well-designed and follows the documented specification in docs/INCLUDE_SYSTEM.md.

Strengths

Architecture & Design

  • Clean separation of concerns: The new resolver.rs module handles include resolution independently
  • Security-first approach: Path traversal prevention with multiple layers of defense
  • Collision detection: Excellent error messages showing exact file locations when word names collide
  • Include-once semantics: Prevents diamond dependency issues by tracking included files

Code Quality

  • Good error messages: Detailed, actionable error messages throughout
  • Consistent with existing patterns: Follows established AST structure and error handling patterns
  • Documentation: Comprehensive comments in stdlib files and the design doc

Issues & Concerns

1. Missing Line Number Tracking (Medium Priority)

Issue: Source locations are tracked but always set to line 0 (resolver.rs:53), making collision error messages less helpful.
Recommendation: Track line numbers during parsing and attach them to each WordDef.

2. Test Coverage Gap (High Priority)

Issue: No integration tests for the include system. Missing test cases for include resolution, error handling, and parser edge cases.
Recommendation: Add comprehensive integration tests before merge.

3. Include Cycle Prevention (Verified Working)

The code DOES handle cycles correctly by marking files as included BEFORE processing their dependencies.
Recommendation: Add a test to verify cycle prevention works and add explanatory comment.
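
The mark-before-recurse idea can be shown with a small in-memory sketch. Here graph stands in for the include lists a real resolver would obtain by reading and parsing files, and the function name resolve_includes is hypothetical.

```rust
use std::collections::{HashMap, HashSet};

/// Visits `file` and its includes depth-first, pushing files in load order.
/// `graph` is an in-memory stand-in for parsed include lists (assumption).
fn resolve_includes(
    file: &str,
    graph: &HashMap<&str, Vec<&str>>,
    included: &mut HashSet<String>,
    order: &mut Vec<String>,
) {
    // Mark BEFORE recursing: a cycle back to `file` then hits this early return.
    if !included.insert(file.to_string()) {
        return;
    }
    if let Some(deps) = graph.get(file) {
        for dep in deps {
            resolve_includes(dep, graph, included, order);
        }
    }
    order.push(file.to_string());
}
```

The same set also gives include-once behavior for diamond dependencies, since a file reached by a second route is already marked.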

4. Parser String Handling (Low Priority)

The parser doesn't handle escaped quotes within include paths (parser.rs:86-88).
Impact: Very low - file paths with quotes are unusual.

5. Stdlib Location Search (Low Priority)

Using current directory as fallback could be surprising if a random directory has a stdlib/ folder.
Recommendation: Consider making this an error in release builds or log a warning.

Best Practices Observed

  1. Defense in depth for security
  2. Clear, actionable error messages
  3. Idiomatic Rust patterns
  4. Proper use of canonical paths
  5. Excellent documentation

Recommendations Summary

Must Fix (before merge)

  • Add integration tests for include resolution
  • Add tests for error cases

Should Fix (near-term)

  • Implement line number tracking
  • Add cycle prevention test with explanatory comment

Nice to Have (future)

  • Handle escaped quotes in paths
  • Make stdlib fallback more explicit
  • Add security boundary comments

Suggested Test Cases

Add to compiler/tests/include_tests.rs:

  • test_std_include_loads_stdlib
  • test_relative_include
  • test_diamond_dependency
  • test_include_cycle_prevented
  • test_path_traversal_rejected
  • test_absolute_path_rejected
  • test_collision_detection
  • test_missing_file_error

Conclusion

This is a solid implementation that follows the design spec well. The architecture is clean, security is good, and code quality is high. Main gaps:

  1. Testing - needs comprehensive integration tests
  2. Line numbers - would improve error messages

Once tests are added, this should be safe to merge. Great work on security and error handling!

@claude

claude Bot commented Nov 26, 2025

PR Review: Modules/Include System

Overview

This PR implements a solid foundation for an include/module system for the Seq language. The implementation is well-structured and follows good design principles. Here's my detailed review:

✅ Strengths

Code Quality & Design

  • Clean separation of concerns: The Resolver module is well-isolated and has a clear single responsibility
  • Good error messages: Collision detection provides helpful multi-line error messages with source locations
  • Security-conscious: Path validation prevents directory traversal attacks with multiple layers of defense (..-checking, absolute path rejection, canonicalization verification)
  • Prevents diamond dependency issues: The included HashSet properly prevents double-inclusion
  • Follows Rust idioms: Good use of Result<T, String>, proper error propagation, and clean module structure

Documentation

  • Excellent documentation: docs/INCLUDE_SYSTEM.md clearly explains the design, syntax, and future directions
  • Good inline comments: Code is well-commented, especially complex logic in resolver.rs
  • Working examples: The include_demo examples demonstrate both std: and relative includes

🔍 Issues & Suggestions

1. Missing Line Number Tracking (Minor)

Location: compiler/src/resolver.rs:53

line: 0, // TODO: Track actual line numbers

Impact: Error messages won't show exact line numbers for collision detection
Suggestion: Consider tracking line numbers during parsing for better error reporting. This could be a follow-up PR.

2. Parser State Management (Minor)

Location: compiler/src/parser.rs:57-94

The parse_include() method manually tokenizes the std:name syntax by expecting three separate tokens (std, :, name). This is fragile if the tokenizer changes.

Current code:

if token == "std" {
    if !self.consume(":") { ... }
    let name = self.advance()...
}

Consideration: This works but couples the parser to tokenizer behavior. Document this assumption or consider handling "std:http" as a single token.
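
If the tokenizer delivered the include target as a single token, parsing could look roughly like this sketch. The variant Include::Relative appears in the PR; Include::Std, the function name parse_include_target, and the error strings are assumptions.

```rust
/// Mirrors the two include forms described in the PR; the enum shape is assumed.
#[derive(Debug, PartialEq)]
enum Include {
    Std(String),
    Relative(String),
}

/// Parses the text following the `include` keyword,
/// e.g. `std:http` or `"my-utils"` (quotes included).
fn parse_include_target(target: &str) -> Result<Include, String> {
    if let Some(name) = target.strip_prefix("std:") {
        if name.is_empty() {
            return Err("Expected module name after 'std:'".to_string());
        }
        return Ok(Include::Std(name.to_string()));
    }
    // A relative include must be wrapped in double quotes.
    let quoted = target
        .strip_prefix('"')
        .and_then(|rest| rest.strip_suffix('"'));
    match quoted {
        Some("") => Err("Include path cannot be empty".to_string()),
        Some(path) => Ok(Include::Relative(path.to_string())),
        None => Err(format!(
            "Expected 'std:name' or a quoted path, found '{}'",
            target
        )),
    }
}
```

This version also rejects the empty path include "" that an earlier review flagged.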

3. Incomplete Test Coverage (Moderate)

Location: compiler/src/resolver.rs:238-331

The unit tests cover:

  • ✅ No collision case
  • ✅ Collision detection
  • ✅ Same file, different lines

Missing test cases:

  • ❌ Recursive include resolution (A includes B, B includes C)
  • ❌ Diamond dependencies (A includes B and C, both B and C include D)
  • ❌ Path traversal attack attempts (e.g., include "../../../etc/passwd")
  • ❌ Symlink handling
  • ❌ std: module not found error
  • ❌ Relative include not found error
  • ❌ Malformed include syntax errors

Suggestion: Add integration tests that actually exercise the file system operations. The current tests only verify collision detection logic.

4. Windows Path Handling (Minor Security Consideration)

Location: compiler/src/resolver.rs:119-121

let rel_as_path = std::path::Path::new(rel_path);
if rel_as_path.is_absolute() {
    return Err(...);
}

Issue: On Windows, paths like C:\evil.seq or \\network\share\evil.seq need to be blocked. The current is_absolute() should handle this, but it's not explicitly tested.

Suggestion: Add a test case for Windows absolute paths (even if running on Linux, you can construct PathBuf instances to test the logic).
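
One caveat: Path::is_absolute() applies the host platform's rules, so on Linux a string such as C:\evil.seq is treated as an ordinary relative file name. A platform-independent string check is one way to test both styles on any host. The sketch below is an assumption about how that could be done, and the name looks_absolute is hypothetical.

```rust
/// Platform-independent check for paths that look absolute on Unix or Windows.
/// Hypothetical helper; it complements rather than replaces `Path::is_absolute()`.
fn looks_absolute(path: &str) -> bool {
    // Unix root, Windows root-relative (`\foo`), and UNC (`\\server\share`).
    if path.starts_with('/') || path.starts_with('\\') {
        return true;
    }
    // Windows drive prefix such as `C:` or `c:\`.
    let bytes = path.as_bytes();
    bytes.len() >= 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':'
}
```

A drive-relative path like C:foo is also flagged, which seems acceptable for include paths since module names are unlikely to contain a colon in that position.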

5. Stdlib Discovery Robustness (Moderate)

Location: compiler/src/resolver.rs:195-236

The find_stdlib() function tries multiple strategies, which is good. However:

Issue: If SEQ_STDLIB is set to a non-existent path, it errors immediately without trying fallbacks.

if let Ok(path) = std::env::var("SEQ_STDLIB") {
    let path = PathBuf::from(path);
    if path.is_dir() {
        return Ok(path);
    }
    return Err(...); // ❌ Should this be a hard error or warning?
}

Consideration: Should an invalid SEQ_STDLIB be a hard error, or should it fall through to other discovery methods? Current behavior is probably correct (explicit env var should be respected), but document this choice.
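
To make that choice explicit and testable, the lookup order can be separated from the environment. In this sketch env_value stands in for the value of SEQ_STDLIB and candidates for the exe-relative and local fallbacks; the function name choose_stdlib and the messages are assumptions.

```rust
use std::path::{Path, PathBuf};

/// Picks a stdlib directory. `env_value` stands in for `SEQ_STDLIB`, and
/// `candidates` for the fallback locations (both are assumptions).
fn choose_stdlib(env_value: Option<&str>, candidates: &[PathBuf]) -> Result<PathBuf, String> {
    if let Some(value) = env_value {
        let path = Path::new(value);
        // An explicit setting is authoritative: fail instead of silently falling back.
        return if path.is_dir() {
            Ok(path.to_path_buf())
        } else {
            Err(format!(
                "SEQ_STDLIB is set to '{}' but that directory does not exist",
                value
            ))
        };
    }
    candidates
        .iter()
        .find(|dir| dir.is_dir())
        .cloned()
        .ok_or_else(|| "Could not find the stdlib directory; set SEQ_STDLIB".to_string())
}
```

A caller would pass std::env::var("SEQ_STDLIB").ok().as_deref() as the first argument.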

6. Parser Include Placement (Minor Design)

Location: compiler/src/parser.rs:46-51

Includes can appear anywhere in the source file (between word definitions). This might lead to confusing code.

Current behavior: Allows:

: foo 42 ;
include std:http
: bar 10 ;

Consideration: Consider requiring all includes at the top of the file for consistency. The spec in INCLUDE_SYSTEM.md doesn't specify this, but it's a common convention.
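
If that convention were adopted, the check itself is small. The sketch below uses a simplified Item enum as a stand-in for the parser's output; both the enum and the function name first_late_include are hypothetical.

```rust
/// Top-level items in source order; a simplified stand-in for the parser's output.
#[derive(Debug, PartialEq)]
enum Item {
    Include(String),
    WordDef(String),
}

/// Returns the first include that appears after a word definition, if any.
fn first_late_include(items: &[Item]) -> Option<&str> {
    let mut seen_word = false;
    for item in items {
        match item {
            Item::WordDef(_) => seen_word = true,
            // An include after any definition violates the "includes first" rule.
            Item::Include(name) if seen_word => return Some(name.as_str()),
            Item::Include(_) => {}
        }
    }
    None
}
```

The returned name could feed either a hard error or a warning, depending on how strict the language wants to be.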

7. Let-Chain Syntax (Compatibility Warning)

Location: compiler/src/resolver.rs:206-208

if let Ok(exe_path) = std::env::current_exe()
    && let Some(exe_dir) = exe_path.parent()
{

This uses let chains, which were stabilized in Rust 1.88 for the 2024 edition. Ensure your CI and documentation specify the minimum Rust version.

🔒 Security Analysis

Overall: Good security practices

Path traversal protection: Multiple layers

  • Explicit .. rejection
  • Absolute path rejection
  • Canonicalization + prefix checking

No arbitrary file system access: Limited to source directory and stdlib

⚠️ Symlink consideration: canonicalize() follows symlinks. This is fine for now but could allow escaping source directory if an attacker can create symlinks. For a compiler, this is acceptable risk (user controls their filesystem).

🧪 Test Coverage Assessment

Current state: Basic unit tests exist, but no integration tests

Recommendations:

  1. Add integration tests in tests/ directory that:

    • Create temporary files/directories
    • Test actual include resolution
    • Verify error messages for malformed includes
    • Test stdlib discovery with different configurations
  2. Add tests for the parser changes:

    • Valid include syntax variations
    • Invalid include syntax (missing quotes, malformed std: prefix, etc.)
  3. Consider adding property-based tests (e.g., with proptest) to verify that no path can escape the source directory

📊 Performance Considerations

Current implementation: Reads and parses each file synchronously

Observations:

  • ✅ Includes are cached (won't re-parse same file twice)
  • ✅ Linear complexity relative to number of unique includes
  • ⚠️ Could be slow for large include trees (but acceptable for now)

Future optimization: Could parallelize parsing of independent includes, but not needed yet.

🎯 Recommendations Summary

Must Fix (Blockers)

None - this PR is ready to merge

Should Fix (Before Merge)

  1. Add integration tests for actual file system operations
  2. Document minimum Rust version requirement (for let-chain syntax)
  3. Add test cases for error conditions (missing files, malformed syntax)

Could Fix (Follow-up PRs)

  1. Implement line number tracking for better error messages
  2. Consider enforcing includes at top of file
  3. Add property-based tests for path security
  4. Add more comprehensive examples in docs

✨ Overall Assessment

This is high-quality work that:

  • Implements the feature as designed
  • Follows security best practices
  • Has good documentation
  • Maintains code quality standards
  • Sets up for future extensibility

The main gap is test coverage - adding integration tests would make this excellent.

Recommendation: ✅ Approve with minor suggestions

Great job on the implementation! The security considerations and error handling show careful thought about edge cases.

@navicore navicore merged commit 425a374 into main Nov 26, 2025
2 checks passed
@navicore navicore deleted the modules branch November 26, 2025 01:45