
feat: Shared Schema Cache (SchemaStore) — Step 6.1#538

Merged
tnaum-ms merged 8 commits into feature/shell-integration from dev/tnaum/schema-store
Mar 24, 2026

@tnaum-ms
Collaborator

Step 6.1: Shared Schema Cache (SchemaStore)

Introduces a SchemaStore singleton that accumulates schema data per {clusterId, databaseName, collectionName}, enabling cross-tab and scratchpad schema sharing.

What Changed

New: SchemaStore singleton (src/documentdb/SchemaStore.ts)

  • Map<"clusterId::db::coll", SchemaAnalyzer> — shared schema cache
  • Debounced onDidChangeSchema event (1s per key) to avoid excessive consumer churn
  • clearSchema() / clearCluster() / clearDatabase() for lifecycle cleanup
  • Registered as VS Code Disposable in extension.ts
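The keyed cache described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/documentdb/SchemaStore.ts: `FieldAccumulator`, `SchemaCacheSketch`, and `schemaKey` are hypothetical names, and the accumulator is a simplified stand-in for `SchemaAnalyzer` that only tracks top-level field names. Only the "clusterId::db::coll" key format comes from the PR description.

```typescript
// Hypothetical stand-in for SchemaAnalyzer: it only records top-level field names.
class FieldAccumulator {
    private readonly fields = new Set<string>();

    addDocuments(docs: ReadonlyArray<Record<string, unknown>>): void {
        for (const doc of docs) {
            for (const name of Object.keys(doc)) {
                this.fields.add(name);
            }
        }
    }

    getKnownFields(): string[] {
        return Array.from(this.fields).sort();
    }
}

// Key format taken from the PR description: "clusterId::db::coll".
function schemaKey(clusterId: string, databaseName: string, collectionName: string): string {
    return [clusterId, databaseName, collectionName].join('::');
}

// Hypothetical cache: one accumulator per key, created lazily on first write.
class SchemaCacheSketch {
    private readonly analyzers = new Map<string, FieldAccumulator>();

    addDocuments(
        clusterId: string,
        databaseName: string,
        collectionName: string,
        docs: ReadonlyArray<Record<string, unknown>>,
    ): void {
        const key = schemaKey(clusterId, databaseName, collectionName);
        let analyzer = this.analyzers.get(key);
        if (analyzer === undefined) {
            analyzer = new FieldAccumulator();
            this.analyzers.set(key, analyzer);
        }
        analyzer.addDocuments(docs);
    }

    getKnownFields(clusterId: string, databaseName: string, collectionName: string): string[] {
        const analyzer = this.analyzers.get(schemaKey(clusterId, databaseName, collectionName));
        return analyzer ? analyzer.getKnownFields() : [];
    }
}

// Two surfaces (say, a Collection View tab and the scratchpad) write to the same key.
const sharedCache = new SchemaCacheSketch();
sharedCache.addDocuments('c1', 'shop', 'users', [{ _id: 1, name: 'Ada' }]);
sharedCache.addDocuments('c1', 'shop', 'users', [{ _id: 2, email: 'ada@example.com' }]);
const usersFields = sharedCache.getKnownFields('c1', 'shop', 'users');
```

Because every writer resolves to the same map entry, a reader on any surface sees the union of fields contributed so far, which is the sharing behavior this PR is after.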

Refactored: ClusterSession delegates to SchemaStore

  • Removed private _schemaAnalyzer and _highestPageAccumulated fields
  • getKnownFields() and getCurrentPageAsTable() now delegate to SchemaStore
  • Removed unused getCurrentSchema() method
  • Added _clusterId, _databaseName, _collectionName for SchemaStore key construction
  • Page-based dedup dropped — stats are already documented as approximate

New: Scratchpad results feed SchemaStore

  • Extended ExecutionResult with source.namespace (extracted from @mongosh ShellResult.source)
  • executeScratchpadCode.ts feeds document results (very conservative: only Cursor and Document types)
  • Running db.users.find() in the scratchpad now populates field completions for the users collection across all Collection View tabs
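As a rough illustration of the conservative filter described above, the sketch below accepts only 'Cursor' and 'Document' result types and requires a namespace with a collection name. The `ExecutionResultSketch` shape is an assumption made for this example (in particular the `type` and `value` fields); only `source.namespace` and the two accepted type names come from the PR text.

```typescript
// Assumed shape for illustration; the real ExecutionResult may differ.
interface ExecutionResultSketch {
    type: string | null;
    value: unknown;
    source?: { namespace?: { db: string; collection?: string } };
}

interface SchemaFeed {
    databaseName: string;
    collectionName: string;
    documents: Record<string, unknown>[];
}

function isPlainObject(value: unknown): value is Record<string, unknown> {
    return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Returns the documents worth feeding to the schema cache, or undefined to skip.
function selectDocumentsForSchema(result: ExecutionResultSketch): SchemaFeed | undefined {
    // Only result types that carry documents are considered.
    if (result.type !== 'Cursor' && result.type !== 'Document') {
        return undefined;
    }
    // Without a database and collection there is no key to store the schema under.
    const namespace = result.source?.namespace;
    if (!namespace || !namespace.collection) {
        return undefined;
    }
    const candidates: unknown[] = Array.isArray(result.value) ? result.value : [result.value];
    const documents = candidates.filter(isPlainObject);
    if (documents.length === 0) {
        return undefined;
    }
    return { databaseName: namespace.db, collectionName: namespace.collection, documents };
}
```

Skipping everything else (write results, scalars, results without a namespace) keeps the cache from learning field names that do not belong to a stored collection.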

Wired: Schema cleanup on disconnect/drop

  • removeConnection → clearCluster()
  • deleteCollection → clearSchema()
  • deleteDatabase → clearDatabase()
  • ClustersClient.deleteClient → clearCluster() (lazy import to avoid circular dep)
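One plausible way to implement the three cleanup scopes on top of "clusterId::db::coll" keys is prefix matching, sketched below. The free functions and the generic `Map` parameter are assumptions made for this example; the PR only names the methods, not how they find matching keys.

```typescript
const KEY_SEPARATOR = '::';

// Deletes every key that starts with the given parts followed by the separator.
// Appending the separator keeps cluster "c1" from also matching "c10".
function deleteKeysWithPrefix<T>(cache: Map<string, T>, parts: string[]): number {
    const prefix = parts.join(KEY_SEPARATOR) + KEY_SEPARATOR;
    let removed = 0;
    for (const key of Array.from(cache.keys())) {
        if (key.startsWith(prefix)) {
            cache.delete(key);
            removed += 1;
        }
    }
    return removed;
}

// Collection scope: exact key.
function clearSchema<T>(cache: Map<string, T>, clusterId: string, databaseName: string, collectionName: string): boolean {
    return cache.delete([clusterId, databaseName, collectionName].join(KEY_SEPARATOR));
}

// Database scope: every collection under "clusterId::db::".
function clearDatabase<T>(cache: Map<string, T>, clusterId: string, databaseName: string): number {
    return deleteKeysWithPrefix(cache, [clusterId, databaseName]);
}

// Cluster scope: every key under "clusterId::".
function clearCluster<T>(cache: Map<string, T>, clusterId: string): number {
    return deleteKeysWithPrefix(cache, [clusterId]);
}
```

A string-joined key assumes the separator does not occur inside an identifier or name; if that cannot be guaranteed, a nested map per cluster and database would avoid accidental matches.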

Tests

  • 22 unit tests covering read/write ops, key isolation, event debouncing, lifecycle, singleton behavior
  • All existing scratchpad tests pass (52 tests across 4 suites)
  • Manual test plan (6 scenarios) in docs/plan/06.1-shared-schema-cache.md

Key Behaviors

| Scenario | Before | After |
| --- | --- | --- |
| Two tabs on same collection | Independent schemas | Shared; Tab B gets Tab A's field completions |
| Scratchpad find() result | Documents discarded | Fed to SchemaStore → completions available |
| Tab closed, new tab opened | Schema lost | Schema persists in SchemaStore |
| Connection removed | Schema leaked | Schema cleared via clearCluster() |
| Collection dropped | Schema leaked | Schema cleared via clearSchema() |

Plan

Full design doc with architecture, decisions (D1–D7), resolved questions (Q1–Q3), deviations log: docs/plan/06.1-shared-schema-cache.md (gitignored, local only)

…1 WI-1)

Introduce SchemaStore — a shared, cluster-scoped schema cache that
accumulates schema data per {clusterId, databaseName, collectionName}.

Key features:
- Lazy singleton with Map<key, SchemaAnalyzer>
- Debounced onDidChangeSchema event (1s per key)
- clearSchema fires immediately (not debounced)
- clearCluster removes all schemas for a cluster
- Implements vscode.Disposable

21 unit tests covering read/write ops, key isolation, event
debouncing, and lifecycle management.
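
The per-key debounce with an immediate path for clears could look roughly like the sketch below. It assumes a trailing debounce that restarts on each write, which the commit message does not spell out. `KeyedDebouncer`, `Schedule`, and `timerSchedule` are hypothetical names, and a plain callback stands in for the VS Code event emitter so the example runs outside the extension host.

```typescript
type Cancel = () => void;
// Injectable scheduler so the timing can be replaced in tests.
type Schedule = (callback: () => void, delayMs: number) => Cancel;

const timerSchedule: Schedule = (callback, delayMs) => {
    const handle = setTimeout(callback, delayMs);
    return () => clearTimeout(handle);
};

class KeyedDebouncer {
    private readonly pending = new Map<string, Cancel>();
    private readonly listener: (key: string) => void;
    private readonly delayMs: number;
    private readonly schedule: Schedule;

    constructor(listener: (key: string) => void, delayMs: number = 1000, schedule: Schedule = timerSchedule) {
        this.listener = listener;
        this.delayMs = delayMs;
        this.schedule = schedule;
    }

    // Debounced path: repeated writes to one key collapse into a single notification.
    notify(key: string): void {
        this.cancelPending(key);
        const cancel = this.schedule(() => {
            this.pending.delete(key);
            this.listener(key);
        }, this.delayMs);
        this.pending.set(key, cancel);
    }

    // Immediate path, as described for clearSchema: drop any pending timer and fire now.
    notifyNow(key: string): void {
        this.cancelPending(key);
        this.listener(key);
    }

    // Cancels all outstanding timers without firing them.
    dispose(): void {
        this.pending.forEach((cancel) => cancel());
        this.pending.clear();
    }

    private cancelPending(key: string): void {
        const cancel = this.pending.get(key);
        if (cancel !== undefined) {
            cancel();
            this.pending.delete(key);
        }
    }
}
```

Keeping one timer per key means a busy collection cannot delay notifications for a quiet one.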

Remove _schemaAnalyzer and _highestPageAccumulated from ClusterSession.
Schema operations now delegate to the shared SchemaStore singleton:

- addDocuments → SchemaStore.addDocuments()
- getKnownFields → SchemaStore.getKnownFields()
- getPropertyNamesAtLevel → SchemaStore.getPropertyNamesAtLevel()
- Remove unused getCurrentSchema() method

ClusterSession stores _clusterId, _databaseName, _collectionName
for SchemaStore key construction.

Register SchemaStore as a VS Code disposable in extension.ts.

Drop page-based dedup (_highestPageAccumulated) — schema accumulates
monotonically and statistics are already documented as approximate.

Extend ExecutionResult with optional source.namespace extracted from
@mongosh's ShellResult.source. This preserves the database/collection
name that was previously discarded.

executeScratchpadCode now feeds document results to SchemaStore using
a very conservative approach: only 'Cursor' and 'Document' result
types are fed. All other types (InsertOneResult, UpdateResult, etc.)
are skipped.

This enables cross-surface schema sharing — running db.users.find()
in the scratchpad populates field completions for the users collection
across all Collection View tabs.

Wire SchemaStore cleanup into existing lifecycle hooks:

- removeConnection: clearCluster() alongside CredentialCache cleanup
- deleteCollection: clearSchema() for the dropped collection
- deleteDatabase: clearDatabase() for all collections in the database
- ClustersClient.deleteClient: clearCluster() on client teardown

Add clearDatabase() method to SchemaStore for database-scoped cleanup.
Add unit test for clearDatabase().
@tnaum-ms tnaum-ms requested a review from a team as a code owner March 24, 2026 12:15
@tnaum-ms tnaum-ms requested a review from Copilot March 24, 2026 12:25
Contributor

Copilot AI left a comment


Pull request overview

Introduces a shared, extension-wide schema cache (SchemaStore) keyed by {clusterId, databaseName, collectionName} to enable cross-tab and scratchpad-driven schema sharing, and wires lifecycle cleanup into disconnect/drop flows.

Changes:

  • Added SchemaStore singleton with per-collection schema accumulation, debounced change events, and clear/reset APIs.
  • Refactored ClusterSession to feed/query schema via SchemaStore instead of per-session accumulation.
  • Extended scratchpad execution results with source.namespace and feeds document results into SchemaStore; added schema cleanup hooks on connection/database/collection removal.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/extension.ts | Registers SchemaStore for disposal with extension lifecycle. |
| src/documentdb/SchemaStore.ts | Implements singleton schema cache, debounced change events, and clear/reset APIs. |
| src/documentdb/SchemaStore.test.ts | Adds unit tests for SchemaStore behavior (read/write, isolation, debounce, lifecycle). |
| src/documentdb/ClusterSession.ts | Delegates schema operations to SchemaStore and feeds query results into shared cache. |
| src/documentdb/ClustersClient.ts | Clears SchemaStore data when a client is deleted. |
| src/documentdb/scratchpad/types.ts | Extends ExecutionResult with optional source.namespace metadata. |
| src/documentdb/scratchpad/ScratchpadEvaluator.ts | Extracts source.namespace from @mongosh shell results. |
| src/commands/scratchpad/executeScratchpadCode.ts | Feeds scratchpad document/cursor results into SchemaStore. |
| src/commands/removeConnection/removeConnection.ts | Clears cluster schema cache on connection removal. |
| src/commands/deleteDatabase/deleteDatabase.ts | Clears database schema cache on database deletion. |
| src/commands/deleteCollection/deleteCollection.ts | Clears collection schema cache on collection deletion. |

No circular dependency exists between ClustersClient and SchemaStore.
Replace dynamic import() with a normal static import.

Addresses PR review finding #5.
@tnaum-ms
Collaborator Author

Re: lazy import in deleteClient — Fixed in 700a4fd. Replaced the lazy import() with a static import. Confirmed no circular dependency exists: SchemaStore imports from schema-analyzer and vscode; ClustersClient imports from mongodb and local auth/credential modules. The two are independent.

@tnaum-ms
Collaborator Author

PR Review Responses

Finding 1 (live cross-tab updates): Acknowledged. Severity assessed as Low. The shared SchemaStore data is consumed on the next query — tabs do not receive push updates. The onDidChangeSchema event was designed for Step 7 (Scratchpad CompletionItemProvider), which will be the first real consumer needing push semantics. Wiring a tRPC subscription for Collection View tabs is out of scope for the infrastructure PR. No code change.

Finding 2 (silent invalidation): Acknowledged. Severity assessed as Low. clearCluster/clearDatabase intentionally do not fire per-key events (plan §D4: "No granular events — consumers should treat cluster disconnect as full invalidation"). No production code subscribes to onDidChangeSchema, so events would have zero effect today. Invalidation contract will be designed with Step 7. No code change.

Finding 3 (WithId<Document> typing): Acknowledged. The type predicate in feedResultToSchemaStore is technically unsound — it does not verify _id exists. However, the underlying SchemaAnalyzer.addDocuments() in the schema-analyzer package also requires WithId<Document> because its primary data source is the driver's find() which always returns WithId<Document>. Loosening SchemaStore alone would still hit the same constraint at SchemaAnalyzer. The scratchpad path almost always returns _id (omitted only with explicit { _id: 0 } projections). Deferring a decision on whether to loosen the schema-analyzer package API. No code change for now.

Finding 4 (namespace.collection optionality): Acknowledged. The defensive guard is intentional. When @mongosh provides a namespace, collection is always present. The guard costs nothing and protects against future upstream changes. No code change.

Finding 5 (lazy import): Fixed in 700a4fd — replaced with static import.

Add '_id' in d to the type predicate in feedResultToSchemaStore so the
narrowing to WithId<Document> is sound. Documents from projections
with { _id: 0 } are now correctly excluded — their artificial shapes
should not feed schema analysis.

Addresses PR review finding #3.
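
The difference between the unsound and the sound predicate can be illustrated as below. `DocumentSketch` and `WithIdSketch` are simplified stand-ins for the MongoDB driver's `Document` and `WithId<Document>` types, and `hasId` is a hypothetical name. Only the `'_id' in d` check itself comes from the commit message.

```typescript
// Simplified stand-ins for the driver's Document and WithId<Document> types.
type DocumentSketch = Record<string, unknown>;
type WithIdSketch = DocumentSketch & { _id: unknown };

// Sound predicate: the narrowing to WithIdSketch is backed by a runtime check for _id.
function hasId(d: unknown): d is WithIdSketch {
    return typeof d === 'object' && d !== null && !Array.isArray(d) && '_id' in d;
}

// Mixed input: a regular document, a { _id: 0 } projection result, and non-documents.
const rawResults: unknown[] = [
    { _id: 1, name: 'Ada' },
    { name: 'projected without _id' },
    null,
    [1, 2, 3],
];

// filter() uses the predicate to produce a correctly typed array.
const feedable: WithIdSketch[] = rawResults.filter(hasId);
```

Without the `'_id' in d` clause the predicate would still type-check, but the compiler would be told that every object has an `_id` when some do not.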
@tnaum-ms
Collaborator Author

Re: unsound type predicate — Fixed in d756cb6. Added '_id' in d to the type predicate so the narrowing to WithId<Document> is sound. Documents from { _id: 0 } projections are now excluded — their artificial shapes should not feed schema analysis.

@tnaum-ms tnaum-ms merged commit d6fcbf4 into feature/shell-integration Mar 24, 2026
5 checks passed
@tnaum-ms tnaum-ms deleted the dev/tnaum/schema-store branch March 24, 2026 13:38