Skip to content
This repository was archived by the owner on Apr 29, 2026. It is now read-only.

⏺ Issue #102 is now fixed. Here's a summary of the changes:#103

Merged
navicore merged 2 commits intomainfrom
issue-102
Dec 16, 2025
Merged

⏺ Issue #102 is now fixed. Here's a summary of the changes:#103
navicore merged 2 commits intomainfrom
issue-102

Conversation

@navicore
Copy link
Copy Markdown
Owner

for #102

Changes Made

crates/compiler/src/resolver.rs:

  1. Added project_root: Option field to the Resolver struct
  2. Set project_root on the first resolve() call (main source file's directory)
  3. Removed the early rejection of paths containing ..
  4. Changed the containment check from source_dir to project_root - this allows cross-directory includes like include "../src/lib/helper" as long as the resolved path stays within the project root

crates/compiler/Cargo.toml:

  • Added tempfile as a dev-dependency for testing

New tests added:

  • test_cross_directory_include_allowed - verifies .. paths work within project
  • test_cross_directory_include_outside_project_rejected - verifies .. paths escaping the project root are rejected
  • test_project_root_set_on_first_resolve - verifies project root is set correctly
  • test_dotdot_within_same_directory_structure - verifies deep paths with .. work

The fix enables the SeqLisp use case from issue #102 where tests can include source files from sibling directories:
project/
src/ tokenizer.seq tests/ test.seq ← can now do: include "../src/tokenizer"

  Changes Made

  crates/compiler/src/resolver.rs:
  1. Added project_root: Option<PathBuf> field to the Resolver struct
  2. Set project_root on the first resolve() call (main source file's directory)
  3. Removed the early rejection of paths containing ..
  4. Changed the containment check from source_dir to project_root - this allows cross-directory includes like include "../src/lib/helper" as long as the resolved path stays within the project root

  crates/compiler/Cargo.toml:
  - Added tempfile as a dev-dependency for testing

  New tests added:
  - test_cross_directory_include_allowed - verifies .. paths work within project
  - test_cross_directory_include_outside_project_rejected - verifies .. paths escaping the project root are rejected
  - test_project_root_set_on_first_resolve - verifies project root is set correctly
  - test_dotdot_within_same_directory_structure - verifies deep paths with .. work

  The fix enables the SeqLisp use case from issue #102 where tests can include source files from sibling directories:
  project/
    src/
      tokenizer.seq
    tests/
      test.seq  ← can now do: include "../src/tokenizer"
@claude
Copy link
Copy Markdown

claude Bot commented Dec 16, 2025

PR Review: Issue #102 - Cross-Directory Include Support

Summary

This PR successfully addresses issue #102 by allowing include paths with .. segments while maintaining security through project-root boundary checking. The implementation is well-designed and thoroughly tested.

Code Quality ✅

Strengths

  1. Clear Intent: The new project_root field and its documentation clearly explain the security model
  2. Good Documentation: Inline comments explain the rationale (e.g., lines 84-85, 328-330)
  3. Consistent Style: Follows existing code patterns in the resolver module
  4. Comprehensive Tests: 4 new tests cover the core scenarios effectively

Minor Suggestions

resolver.rs:336-337 - Fallback logic could be simplified:

// Current:
let source_dir_buf = source_dir.to_path_buf();
let root = self.project_root.as_ref().unwrap_or(&source_dir_buf);

// Consider:
let root = self.project_root.as_ref().unwrap_or(source_dir);

This avoids the temporary allocation since source_dir is already a &Path.

Security Review 🔒

Excellent Security Practices

  1. Defense in Depth: Path canonicalization (line 331-333) catches symlinks and encoded bypass attempts
  2. Clear Boundaries: Project root boundary checking (line 342) prevents directory traversal attacks
  3. Fail-Safe Design: Errors are returned rather than silently allowing unsafe operations

Security Considerations ✅

  • Symlink Handling: canonicalize() resolves symlinks, so symlinks pointing outside project root are correctly rejected
  • Race Conditions: TOCTOU (Time-of-check-time-of-use) is mitigated by checking the canonical path after resolution
  • Path Traversal: The removal of early .. rejection combined with canonical path checking is the correct approach

No security concerns identified.

Potential Bugs 🔍

Edge Case: First File in Non-Existent Directory

resolver.rs:86-89 - If source_path is valid but its parent directory doesn't exist (unusual but possible), unwrap_or(Path::new(".")) will set project_root to . instead of failing. Consider:

let source_dir = source_path.parent().ok_or_else(||
    format!("Source path '{}' has no parent directory", source_path.display())
)?;

However, this is likely academic since canonicalize() on line 77-79 would have already failed for such paths.

Edge Case: Empty Relative Path

What happens with include ""? The path resolution will create ".seq" which might be confusing. Consider adding validation:

if rel_path.is_empty() {
    return Err("Include path cannot be empty".to_string());
}

Question: Absolute Path Error Message

resolver.rs:314 - The error says "paths cannot be absolute" but the documentation (INCLUDE_SYSTEM.md:39) says "No absolute paths allowed". Consider being more specific about why:

"Include path '{}' is invalid: relative paths only (absolute paths not supported)"

Performance Considerations ⚡

Minor Performance Notes

  1. Canonicalization Overhead: Multiple canonicalize() calls (lines 331, 339) involve filesystem operations. This is acceptable for include resolution but worth noting.

  2. String Allocation: Line 319 allocates a new string for the .seq extension. This is fine for include resolution which isn't performance-critical.

No performance concerns for typical use cases.

Test Coverage ✅

Excellent Test Coverage

The 4 new tests cover:

  • ✅ Cross-directory includes within project (allowed)
  • ✅ Cross-directory includes outside project (rejected)
  • ✅ Project root initialization
  • ✅ Complex paths with multiple .. segments

Additional Test Suggestions

  1. Symlink Test: Verify that symlinks pointing outside project root are rejected:
#[test]
fn test_symlink_escape_blocked() {
    // Create project/link -> /tmp/outside
    // Verify include via symlink is blocked
}
  1. Root Directory Edge Case: What if the project root is / or C:\?
#[test]
fn test_project_at_filesystem_root() {
    // Verify behavior when project_root is filesystem root
}
  1. Relative Path Normalization: Test paths like foo/../bar (should work) and ./foo (should work):
#[test]
fn test_dotslash_and_dotdot_normalization() {
    // include "./foo" and "bar/../baz"
}
  1. Error Message Quality: Verify error messages are helpful:
#[test]
fn test_error_message_shows_resolved_path() {
    // When blocked, error should show what the path resolved to
}

Architectural Considerations 🏗️

Design Questions

  1. Project Root Mutability: project_root is set on first resolve() call. What if someone calls resolve() with different entry points? Should subsequent calls verify they're within the same project? This seems acceptable for current use cases.

  2. Fallback Behavior: The fallback to source_dir (line 337) means if project_root isn't set, we use the narrower boundary. This is safe but worth documenting.

  3. Future Consideration: If patch-seq ever supports workspace/multi-project builds, project_root might need to be configurable rather than auto-detected.

Documentation 📚

Suggestions

  1. Update INCLUDE_SYSTEM.md to document the new cross-directory capability:
## Rules

3. **Quoted string** - Path relative to the including file
   - No absolute paths allowed
   - Paths can use `..` to reference parent directories
   - All resolved paths must stay within the project root
   - Example: `include "../src/lib/helpers"` (from tests directory)
  1. Add a security section to INCLUDE_SYSTEM.md explaining the project-root boundary.

Alignment with Issue #102

The implementation perfectly addresses the issue:

  • ✅ Enables tests/seq/test_tokenizer.seq to include ../../src/tokenizer
  • ✅ Maintains security by preventing escapes outside project root
  • ✅ Aligns with the use case described in the issue

Recommendation

APPROVE with minor suggestions

This is a well-implemented feature with excellent test coverage. The security model is sound, the code is clean, and it solves the stated problem. The suggestions above are minor improvements, not blockers.

Before Merge

  • Consider the empty path validation
  • Consider updating INCLUDE_SYSTEM.md with the new capability
  • Optional: Add symlink and edge case tests for completeness

Great work! This significantly improves the developer experience for organizing patch-seq projects.


Review generated with assistance from Claude Code

  Agreed and implemented:

  1. Simplified fallback logic - Changed from temporary allocation to as_deref():
  // Before:
  let source_dir_buf = source_dir.to_path_buf();
  let root = self.project_root.as_ref().unwrap_or(&source_dir_buf);

  // After:
  let root = self.project_root.as_deref().unwrap_or(source_dir);
  2. Empty path validation - Added check at the start of resolve_relative_path():
  if rel_path.is_empty() {
      return Err("Include path cannot be empty".to_string());
  }
  3. Documentation update - Updated docs/INCLUDE_SYSTEM.md:
    - Added .. capability to rule #3
    - Added new "Security" section documenting the project root boundary, canonicalization, and validation
  4. New test - test_empty_include_path_rejected

  Not implemented (and why):

  - Symlink test - The existing canonicalization already handles this correctly; an additional test would be platform-specific and doesn't add much value
  - Root directory edge case - Academic case; unlikely in practice since the project root is derived from the source file
  - Error message improvements - The current messages are clear enough
@claude
Copy link
Copy Markdown

claude Bot commented Dec 16, 2025

Pull Request Review - Issue #102: Cross-Directory Include Support

Summary

This PR successfully implements project-root-relative include path resolution, allowing test files to include source files from sibling directories using .. paths. The implementation is well-designed and thoroughly tested.


Code Quality & Best Practices ✅

Strengths:

  • Clean architecture: The project_root field is elegantly integrated into the Resolver struct
  • Good separation of concerns: Security validation is isolated in resolve_relative_path()
  • Excellent documentation: Inline comments explain the security model and behavior clearly
  • Updated documentation: docs/INCLUDE_SYSTEM.md now includes security section and .. path examples
  • Idiomatic Rust: Proper use of as_deref(), unwrap_or(), and error handling patterns

Minor observations:

  • The fallback from project_root to source_dir at resolver.rs:341 is defensive programming - good practice even though project_root should always be set after the first resolve() call
  • Variable naming is clear and consistent throughout

Potential Bugs or Issues ⚠️

1. Potential issue: Project root not set for programmatic API usage

Location: resolver.rs:87-89

If someone uses the Resolver API directly without calling resolve() first, project_root remains None and falls back to source_dir. This could create inconsistent behavior:

let mut resolver = Resolver::new(None);
// If resolve_include() is called directly without resolve() first:
resolver.resolve_include(&include, &some_dir); // project_root is None

Impact: Low - The fallback behavior is reasonable, but it means the security boundary could be different depending on call order.

Recommendation: Consider one of:

  • Add a debug assertion or comment documenting that resolve() should be called first
  • Add a constructor that takes project_root explicitly for programmatic use cases
  • Document this behavior in the public API docs

2. Edge case: Symlink pointing to file outside project

Location: resolver.rs:336-338

The canonicalization correctly handles symlinks, but there's an edge case:

  • If a symlink inside the project points to a file outside the project, canonicalization will resolve it to the external path
  • The containment check will correctly reject it ✅
  • However, the error message could be clearer about what happened

Impact: Very Low - The security check works correctly; only the error message could be more informative.

Example:

project/src/evil.seq -> /etc/passwd
include "src/evil" -> Error: "Include path 'src/evil' resolves outside the project directory"

Recommendation: Consider enhancing the error message to mention symlink resolution when detected.


Performance Considerations ✅

Positive aspects:

  • canonicalize() is called exactly once per include path - appropriate for security validation
  • HashMap lookups for duplicate detection are O(1)
  • project_root is set once and reused - good caching
  • as_deref() avoids unnecessary allocations (resolver.rs:341)

No performance concerns identified. The overhead of path canonicalization is acceptable for include path resolution (happens once per file at compile time).


Security Concerns ✅

Security model is sound:

  1. Defense in depth: Multiple layers of validation:

    • Empty path rejection (resolver.rs:311-313)
    • Absolute path rejection (resolver.rs:317-322)
    • Canonicalization to resolve symlinks and .. (resolver.rs:336-338)
    • Containment check against project root (resolver.rs:346-351)
  2. Correctly handles attack vectors:

    • ✅ Symlinks pointing outside project (caught by canonicalization + containment check)
    • ✅ Multiple .. segments to escape (caught by containment check)
    • ✅ Absolute paths (/etc/passwd, C:\Windows\...)
    • ✅ Empty paths
    • ✅ Encoded paths (canonicalization normalizes them)
  3. Project root boundary: Setting the root to the main source file's directory is the right choice - it allows flexibility within projects while preventing escape.

One consideration:

  • TOCTOU (Time-of-check-time-of-use): Between the path.exists() check (line 325) and the canonicalize() call (line 336), a file could theoretically be replaced by a symlink. However:
    • Impact is minimal - the symlink would still be caught by the containment check
    • This is a compile-time operation, not a runtime security boundary
    • Real-world risk is negligible

Security verdict: No concerns. The implementation properly prevents directory traversal attacks.


Test Coverage ✅

Excellent test coverage:

New tests comprehensively cover the feature:

  1. test_cross_directory_include_allowed - Positive case: .. within project
  2. test_cross_directory_include_outside_project_rejected - Security: reject escape attempts
  3. test_project_root_set_on_first_resolve - Verify initialization
  4. test_dotdot_within_same_directory_structure - Deep paths with ..
  5. test_empty_include_path_rejected - Input validation

Test quality observations:

  • Tests use tempfile for isolation - excellent practice
  • Test names are descriptive and follow Rust conventions
  • Both positive and negative cases are covered
  • Tests verify actual behavior, not just absence of panics

Suggested additional tests (optional, not blocking):

  • Test behavior when project_root directory is deleted between resolve calls (should fail gracefully)
  • Test with Unicode or special characters in paths (likely already works, but worth verifying)
  • Test deeply nested .. paths like ../../../../a/b/c/../../d

Test coverage verdict: Comprehensive. The core functionality and security properties are well-tested.


Overall Assessment

Verdict: Approve with minor suggestions ✅

This is a well-executed PR that:

Recommended actions before merge:

  1. Consider adding a note in the public API documentation about the project_root initialization behavior
  2. (Optional) Enhance error messages to distinguish symlink-related path escapes from other cases

Non-blocking suggestions for future work:

  • Consider exposing a with_project_root() constructor for programmatic use cases
  • Consider a test for Unicode paths if cross-platform support is important

Great work on this implementation!

@navicore navicore merged commit 5a7ee4d into main Dec 16, 2025
2 checks passed
@navicore navicore deleted the issue-102 branch December 16, 2025 23:50
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant