fix(vfs): stabilize file identity ingress #129

Merged
hongjr03 merged 5 commits into master from fix/vfs-identity-ingress
May 24, 2026

hongjr03 commented May 24, 2026

This change moves file identity to the VFS ingress boundary and makes diagnostics publish against explicit document targets. Previously, the same physical file could enter through loader paths, LSP URIs, watcher paths, symlinks, different Windows path spellings, or external diagnostic paths, and end up with inconsistent FileId, root ownership, open-buffer, or diagnostic state. Behavior therefore depended on ingress order and path spelling.

Previous Shape

flowchart TD
    Loader["Loader path"] --> VFS["VFS path map"]
    LSP["LSP URI path"] --> VFS
    Watcher["Watcher path"] --> VFS
    External["External diagnostics path"] --> VFS

    VFS --> FileId["FileId by first path spelling"]
    FileId --> FileSet["FileSet root classification"]
    FileId --> MemDocs["MemDocs by FileId"]
    FileId --> Diagnostics["Diagnostics cache by FileId"]

    MemDocs --> Version["Document version from current buffer"]
    VFS --> Uri["Primary VFS URI"]
    Version --> Publish["publishDiagnostics"]
    Uri --> Publish

New Shape

flowchart TD
    Loader["Loader path"] --> Identity["VFS identity service"]
    LSP["LSP URI path"] --> Identity
    Watcher["Watcher path"] --> Identity
    Save["Save/change/delete path"] --> Identity
    External["External diagnostics path"] --> Identity

    Identity --> FileId["Stable canonical FileId"]
    Identity --> Aliases["Alias evidence set"]

    FileId --> FileSet["Identity-aware file-set partition"]
    FileId --> MemDocs["One canonical analysis buffer"]
    Aliases --> OpenDocs["Open URI versions"]

    OpenDocs --> Targets["DiagnosticPublishTarget: FileId + URI + version"]
    Targets --> Cache["Diagnostics cache by FileId + URI"]
    Cache --> Publish["publishDiagnostics per open URI"]
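
The identity step in the new shape can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `IdentityService`, `PlatformKey`, `resolve`, and `canonical` are hypothetical names, the platform key is modeled as an opaque pair of integers, and normalized path keys are left out for brevity. It shows exact path aliases and platform identity evidence resolving to one `FileId`, with late evidence merging a duplicate id into the older owner.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the VFS file identifier.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct FileId(pub u32);

/// Hypothetical platform identity key (for example a device/inode pair).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct PlatformKey(pub u64, pub u64);

#[derive(Default)]
pub struct IdentityService {
    by_alias: HashMap<String, FileId>,
    by_platform: HashMap<PlatformKey, FileId>,
    redirects: HashMap<FileId, FileId>,
    next: u32,
}

impl IdentityService {
    /// Follow merge redirects until the stable owner is reached.
    pub fn canonical(&self, mut id: FileId) -> FileId {
        while let Some(&owner) = self.redirects.get(&id) {
            id = owner;
        }
        id
    }

    /// Register one path spelling, optionally with platform identity evidence.
    pub fn resolve(&mut self, path: &str, key: Option<PlatformKey>) -> FileId {
        let by_path = self.by_alias.get(path).map(|&id| self.canonical(id));
        let by_key = key
            .and_then(|k| self.by_platform.get(&k).copied())
            .map(|id| self.canonical(id));
        let id = match (by_path, by_key) {
            (Some(a), Some(b)) if a != b => {
                // Late evidence: two ids name one file. Keep the older id
                // as owner and redirect the duplicate to it.
                let (owner, dup) = if a < b { (a, b) } else { (b, a) };
                self.redirects.insert(dup, owner);
                owner
            }
            (Some(a), _) => a,
            (None, Some(b)) => b,
            (None, None) => {
                let id = FileId(self.next);
                self.next += 1;
                id
            }
        };
        // Record the alias evidence against the canonical id.
        self.by_alias.insert(path.to_owned(), id);
        if let Some(k) = key {
            self.by_platform.insert(k, id);
        }
        id
    }
}
```

In this sketch, a path first seen without platform evidence and an alias first seen with it get different ids; once the first path is resolved again with the same key, both spellings resolve to the older id.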

Flow Difference

sequenceDiagram
    participant Client
    participant VFS
    participant MemDocs
    participant Diagnostics

    Client->>VFS: didOpen(alias URI)
    VFS->>VFS: register alias evidence and resolve canonical FileId
    VFS->>MemDocs: map alias URI to canonical FileId
    MemDocs->>MemDocs: keep URI-local version
    MemDocs->>Diagnostics: refresh diagnostic targets for FileId
    Diagnostics->>Client: publish to every attached open URI target

    Client->>MemDocs: didClose(alias URI)
    MemDocs->>Diagnostics: refresh diagnostic targets for FileId
    Diagnostics->>Client: clear stale alias diagnostics
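
The open/close refresh in this sequence can be sketched as below. Only `DiagnosticPublishTarget` and its `(FileId, URI, version)` shape come from the description; `OpenDocs`, `refresh`, and the field names are assumptions made for illustration. The sketch recomputes the targets for a file from the currently open URIs and reports which previously published URIs need their diagnostics cleared.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for the VFS file identifier.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct FileId(pub u32);

/// One push-diagnostics destination: a matching (FileId, URI, version) tuple.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct DiagnosticPublishTarget {
    pub file: FileId,
    pub uri: String,
    pub version: i32,
}

#[derive(Default)]
pub struct OpenDocs {
    /// Open URI -> (canonical file, URI-local LSP version).
    open: BTreeMap<String, (FileId, i32)>,
    /// (file, URI) pairs that currently have published diagnostics.
    published: BTreeSet<(FileId, String)>,
}

impl OpenDocs {
    pub fn did_open(&mut self, uri: &str, file: FileId, version: i32) {
        self.open.insert(uri.to_owned(), (file, version));
    }

    pub fn did_close(&mut self, uri: &str) {
        self.open.remove(uri);
    }

    /// Recompute targets for `file`. Returns the targets to publish to and
    /// the URIs whose stale diagnostics should be cleared.
    pub fn refresh(&mut self, file: FileId) -> (Vec<DiagnosticPublishTarget>, Vec<String>) {
        let targets: Vec<DiagnosticPublishTarget> = self
            .open
            .iter()
            .filter(|(_, (f, _))| *f == file)
            .map(|(uri, (_, v))| DiagnosticPublishTarget {
                file,
                uri: uri.clone(),
                version: *v,
            })
            .collect();
        let stale: Vec<String> = self
            .published
            .iter()
            .filter(|(f, uri)| *f == file && !self.open.contains_key(uri))
            .map(|(_, uri)| uri.clone())
            .collect();
        for uri in &stale {
            self.published.remove(&(file, uri.clone()));
        }
        for t in &targets {
            self.published.insert((file, t.uri.clone()));
        }
        (targets, stale)
    }
}
```

Calling `refresh` after both `did_open` and `did_close` is what lets a closed alias URI be cleared even when the file contents did not change.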

What Changed

  • Adds a VFS identity service that records exact path aliases, normalized path keys, and platform file identity keys without keeping long-lived OS file handles.
  • Handles late identity evidence by merging duplicate FileIds into a stable owner and redirecting aliases instead of silently preserving split identity.
  • Adds an explicit VFS ingress registration path for LSP opens, so URI aliases can be resolved before deciding whether incoming text should become VFS contents.
  • Makes file-set partitioning consume VFS alias evidence, so project/root ownership no longer depends only on the first path spelling inserted into VFS.
  • Separates exact single-file loads from directory scan roots in the project model.
  • Updates MemDocs to keep one canonical analysis buffer per FileId while tracking every open URI spelling with its own LSP version.
  • Detaches divergent alias opens from the canonical analysis buffer instead of letting a second URI’s alternate text overwrite or later corrupt the shared buffer.
  • Makes didChange atomic: invalid LSP ranges no longer advance URI versions or partially mutate document text.
  • Introduces explicit DiagnosticPublishTarget values so push diagnostics always use a matching (FileId, URI, version) tuple.
  • Scopes the diagnostics cache by (FileId, URI) and refreshes targets on both open and close, so alias documents receive diagnostics and stale alias diagnostics are cleared promptly.
  • Adds diagnostics freshness revisions so stale background diagnostic batches cannot publish or clear newer diagnostic state.
  • Keeps workspace diagnostics able to report unopened workspace files through a separate workspace target path, while push diagnostics stay scoped to open documents.
  • Adds Qihe run identity so logs, failures, and diagnostics from older runs cannot commit after a newer run starts.
  • Makes Qihe diagnostics respect pull-diagnostics clients by requesting diagnostic refresh instead of force-publishing.
  • Treats workspace reload as transactional: reload errors keep the existing workspace model instead of committing partial results.
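
As one illustration of the list above, the atomic `didChange` behavior can be sketched like this. It is an assumption-laden simplification: `Document`, `Edit`, and `InvalidRange` are hypothetical names, and edits use byte offsets instead of real LSP line/character ranges. The point is the all-or-nothing shape: edits are applied to a scratch copy, and text and version are committed only if every range is valid.

```rust
/// A simplified edit: replace bytes `start..end` with `text`.
/// Real LSP ranges are line/character positions; byte offsets keep this short.
pub struct Edit {
    pub start: usize,
    pub end: usize,
    pub text: String,
}

/// Hypothetical open document with a URI-local version.
pub struct Document {
    pub text: String,
    pub version: i32,
}

#[derive(Debug, PartialEq, Eq)]
pub struct InvalidRange;

impl Document {
    /// Apply all edits or none: work on a scratch copy, commit on success.
    pub fn did_change(&mut self, version: i32, edits: &[Edit]) -> Result<(), InvalidRange> {
        let mut scratch = self.text.clone();
        for e in edits {
            let valid = e.start <= e.end
                && e.end <= scratch.len()
                && scratch.is_char_boundary(e.start)
                && scratch.is_char_boundary(e.end);
            if !valid {
                // `self` has not been touched, so text and version stay in sync.
                return Err(InvalidRange);
            }
            scratch.replace_range(e.start..e.end, &e.text);
        }
        self.text = scratch;
        self.version = version;
        Ok(())
    }
}
```

A batch whose second edit is out of range leaves both the text and the version as they were, even though the first edit on its own was valid.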

Alias Buffer Policy

This keeps the current one-analysis-buffer-per-physical-file model explicit. If the same physical file is opened through multiple URI spellings with matching text, those URIs share the canonical analysis buffer and keep URI-local versions. If an alias opens with divergent text, the URI remains tracked for close/version bookkeeping but is detached from the canonical analysis buffer. That avoids silently mixing edits from incompatible URI-local text states.
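
The policy above might look like the following sketch. `MemDocs` is named in the description, but its fields, the `Attachment` enum, and the method signatures here are assumptions. The sketch also assumes that the first open for a file seeds the canonical buffer, which glosses over how the real ingress path decides whether incoming text becomes VFS contents.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the VFS file identifier.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct FileId(pub u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Attachment {
    /// The URI shares the canonical analysis buffer.
    Attached,
    /// The URI is tracked for version and close bookkeeping only.
    Detached,
}

#[derive(Default)]
pub struct MemDocs {
    /// One canonical analysis buffer per file.
    buffers: HashMap<FileId, String>,
    /// Every open URI spelling with its own version and attachment state.
    open: HashMap<String, (FileId, i32, Attachment)>,
}

impl MemDocs {
    pub fn did_open(&mut self, uri: &str, file: FileId, version: i32, text: &str) -> Attachment {
        // Assumption: the first open seeds the canonical buffer.
        let canonical = self.buffers.entry(file).or_insert_with(|| text.to_owned());
        let state = if canonical.as_str() == text {
            Attachment::Attached
        } else {
            // Divergent alias text never overwrites the canonical buffer.
            Attachment::Detached
        };
        self.open.insert(uri.to_owned(), (file, version, state));
        state
    }

    pub fn analysis_text(&self, file: FileId) -> Option<&str> {
        self.buffers.get(&file).map(String::as_str)
    }

    pub fn version(&self, uri: &str) -> Option<i32> {
        self.open.get(uri).map(|&(_, v, _)| v)
    }
}
```

Here a third URI that opens with different text still has its version recorded, but the text analysis sees is unchanged.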

Benefits

  • Prevents duplicate FileIds for the same physical file across symlink, junction, case, verbatim path, and open-before-load scenarios.
  • Makes root classification stable across aliases and ingress order.
  • Avoids permanently holding one file handle per file in VFS.
  • Preserves canonical analysis text while making LSP document versions URI-local.
  • Prevents divergent alias text from corrupting the canonical analysis buffer.
  • Prevents invalid didChange ranges from creating version/text skew.
  • Prevents diagnostics from pairing a primary VFS URI with an alias document version.
  • Prevents diagnostics cache hits for one URI from suppressing publication to another open URI.
  • Clears diagnostics from alias URIs when those documents close, even if file contents do not change.
  • Prevents stale diagnostics workers from overwriting or clearing newer diagnostics.
  • Prevents stale Qihe runs from replacing current diagnostics or progress state.
  • Avoids committing partial workspace models when project reload has errors.
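
The stale-worker protection listed above can be illustrated with a per-file revision counter. This is a sketch under assumed names (`DiagnosticsState`, `begin`, `commit`), with diagnostics reduced to plain strings; the description does not show how the actual freshness revisions are stored.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the VFS file identifier.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct FileId(pub u32);

#[derive(Default)]
pub struct DiagnosticsState {
    /// Latest revision handed out per file.
    revisions: HashMap<FileId, u64>,
    /// Last committed diagnostics per file.
    committed: HashMap<FileId, Vec<String>>,
}

impl DiagnosticsState {
    /// Start a new computation; older batches for `file` become stale.
    pub fn begin(&mut self, file: FileId) -> u64 {
        let rev = self.revisions.entry(file).or_insert(0);
        *rev += 1;
        *rev
    }

    /// Commit a finished batch only if it is still the newest one.
    /// Returns whether the batch was accepted.
    pub fn commit(&mut self, file: FileId, revision: u64, diagnostics: Vec<String>) -> bool {
        if self.revisions.get(&file) != Some(&revision) {
            // Stale worker: it may neither publish nor clear newer state.
            return false;
        }
        self.committed.insert(file, diagnostics);
        true
    }

    pub fn current(&self, file: FileId) -> &[String] {
        self.committed.get(&file).map(Vec::as_slice).unwrap_or(&[])
    }
}
```

An older batch that finishes late, even one carrying an empty result, is rejected and therefore cannot clear the newer diagnostics.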

Scope

This change closes the file-ingress, open-document, diagnostics-target, Qihe diagnostics freshness, and workspace reload transaction issues in this area. It does not replace the whole task runtime or external-process controller. Full external-process cancellation, temp workspace ownership, and log backpressure remain separate runtime/Qihe lifecycle work.

Validation

  • cargo test -p vfs
  • cargo test -p project-model
  • cargo test -p vizsla
  • cargo fmt --all -- --check
  • cargo clippy --workspace --all-targets -- -D warnings
  • rustup run nightly cargo test --workspace
  • git diff --check

hongjr03 merged commit a9f0890 into master May 24, 2026
7 checks passed
hongjr03 deleted the fix/vfs-identity-ingress branch May 24, 2026 15:48