PDX-492: feat(mcp) — provar_org_describe tool, H2a (Thread F) by mrdailey99 · Pull Request #188 · ProvarTesting/provardx-cli

mrdailey99 · 2026-05-19T21:46:08Z

Context

Adds provar_org_describe, a read-only MCP tool that surfaces cached Salesforce describe data from the Provar IDE workspace .metadata directory. This is H2a of the PDX-H2 plan (Thread F, single PR). The sibling H2b PR in Thread C will consume this tool's output to populate provar_testcase_generate field-type hints.

Why

provar_testcase_generate currently has no source of truth for which fields on a Salesforce object are required and what their types are. Agents either guess (brittle), hard-code names, or call the live SF API (slow + auth-dependent). The Provar IDE already caches describe data on disk after a connection is loaded — this PR exposes that cache as a read-only MCP tool so it becomes useful outside the IDE.

Workspace discovery heuristic

The tool walks three candidate directories in order; the first one that exists wins:

1. <parent-of-project>/workspace-<basename>/ (sibling workspace pattern; default for Provar IDE in this layout).
2. <parent-of-project>/Provar_Workspaces/workspace-<name-dashes>/ (shared Provar_Workspaces directory).
3. ~/Provar/workspace-<name-dashes>/ (user-home fallback).

<name-dashes> = basename(project_path).trim().replace(/\s+/g, '-').toLowerCase().
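The candidate list can be sketched as follows. This is an illustration only, not the code in orgDescribeTools.ts: the helper name nameDashes and the homeDir parameter are hypothetical (the latter exists just to make the function testable), and the sketch assumes the first candidate uses the raw basename while the other two use the dashed form.

```typescript
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical helper: applies the <name-dashes> formula from the PR description.
function nameDashes(projectPath: string): string {
  return path.basename(projectPath).trim().replace(/\s+/g, '-').toLowerCase();
}

// Returns the three candidate workspace directories in priority order.
// Nothing here touches the filesystem; existence checks happen later.
function workspaceCandidates(projectPath: string, homeDir: string = os.homedir()): string[] {
  const parent = path.dirname(projectPath);
  const base = path.basename(projectPath);
  const dashed = nameDashes(projectPath);
  return [
    path.join(parent, `workspace-${base}`),                         // 1. sibling pattern
    path.join(parent, 'Provar_Workspaces', `workspace-${dashed}`),  // 2. shared directory
    path.join(homeDir, 'Provar', `workspace-${dashed}`),            // 3. user-home fallback
  ];
}
```

Keeping candidate generation free of filesystem access means any path-policy check can run on each candidate before it is probed.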

Cache schema (for H2b consumers)

No prior .metadata reader exists in the codebase, so this PR designs the schema H2b should produce/consume. It is intentionally simple — one file per object, JSON preferred:

// <workspace>/.metadata/<connection_name>/<ObjectApiName>.json
{
  "name": "Account",
  "fields": [
    { "name": "Name",  "type": "string", "defaultValue": null, "nillable": false },
    { "name": "Phone", "type": "phone",  "defaultValue": null, "nillable": true  }
  ]
}

The tool also falls back to .xml and .object (legacy CustomObject metadata) files for migration paths.
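For H2b consumers, a reader for the JSON form of this schema might look like the sketch below. The type names CachedObject and CachedField appear in the diff; the function name parseCachedObject and the defaults it applies (type 'unknown', nillable true when the flag is absent) are assumptions made for illustration.

```typescript
interface CachedField {
  name: string;
  type: string;
  defaultValue: string | null;
  nillable: boolean;
}

interface CachedObject {
  name: string;
  fields: CachedField[];
}

// Hypothetical parser: validates the top-level shape and normalises each field.
function parseCachedObject(raw: string): CachedObject {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== 'object' || data === null) {
    throw new Error('cache file is not a JSON object');
  }
  const rec = data as Record<string, unknown>;
  const objectName = rec['name'];
  const fieldsRaw = rec['fields'];
  if (typeof objectName !== 'string' || !Array.isArray(fieldsRaw)) {
    throw new Error('cache file must have a string "name" and an array "fields"');
  }
  const fields: CachedField[] = [];
  for (const item of fieldsRaw as unknown[]) {
    if (typeof item !== 'object' || item === null) continue;
    const f = item as Record<string, unknown>;
    const fieldName = f['name'];
    if (typeof fieldName !== 'string') continue; // skip entries without a name
    const type = f['type'];
    const defaultValue = f['defaultValue'];
    fields.push({
      name: fieldName,
      type: typeof type === 'string' ? type : 'unknown',
      defaultValue: typeof defaultValue === 'string' ? defaultValue : null,
      nillable: f['nillable'] !== false, // assumption: absent flag means optional
    });
  }
  return { name: objectName, fields };
}
```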

Output shape

{
  workspace_path: string | null;       // null when no workspace discovered
  cache_age_ms: number | null;         // mtime delta of cache dir
  objects: Array<{
    name: string;
    exists: boolean | null;            // null when cache missing entirely
    required_fields: Array<{ name, type, default_value, nillable }>;
    field_count: number;
  }>;
  details?: { suggestion: string };    // populated on cache-miss
}
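To make the relationship between the cache schema and this output concrete, here is a minimal sketch of how one objects[] entry could be derived from cached fields. It assumes a field is required exactly when it is not nillable (as the XML reader's comment states) and that field_count counts every cached field, not only the required ones; the function name toObjectEntry is hypothetical.

```typescript
interface CachedField {
  name: string;
  type: string;
  defaultValue: string | null;
  nillable: boolean;
}

interface RequiredField {
  name: string;
  type: string;
  default_value: string | null;
  nillable: boolean;
}

interface ObjectEntry {
  name: string;
  exists: boolean | null;
  required_fields: RequiredField[];
  field_count: number;
}

// Hypothetical mapper from cached fields to one output entry.
function toObjectEntry(name: string, fields: CachedField[]): ObjectEntry {
  const required_fields = fields
    .filter((f) => !f.nillable) // assumption: required = !nillable
    .map((f) => ({
      name: f.name,
      type: f.type,
      default_value: f.defaultValue, // camelCase in the cache, snake_case in the output
      nillable: f.nillable,
    }));
  return { name, exists: true, required_fields, field_count: fields.length };
}
```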

Cache-miss behaviour

Returns a structured response (not isError) with details.suggestion telling the agent how to recover:

"Open this project in Provar IDE and load the 'MyOrg' connection, or pass field-type hints inline to provar_testcase_generate."

This is the same advisory shape provar_properties_read uses for divergence warnings, so consumers don't need a special error path.
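A cache-miss response in this shape could be assembled roughly as below. The function name cacheMissResponse is hypothetical; the null values for cache_age_ms and exists follow the output table in docs/mcp.md, and the suggestion text mirrors the example quoted above.

```typescript
interface ObjectEntry {
  name: string;
  exists: boolean | null;
  required_fields: Array<{ name: string; type: string; default_value: string | null; nillable: boolean }>;
  field_count: number;
}

interface DescribeResult {
  workspace_path: string | null;
  cache_age_ms: number | null;
  objects: ObjectEntry[];
  details?: { suggestion: string };
}

// Hypothetical builder: returns a normal (non-error) result that tells the agent how to recover.
function cacheMissResponse(
  workspacePath: string | null,
  connectionName: string,
  objectNames: string[],
): DescribeResult {
  return {
    workspace_path: workspacePath,
    cache_age_ms: null, // no cache directory, so no mtime to measure
    objects: objectNames.map((name) => ({
      name,
      exists: null, // unknown: the cache is missing entirely
      required_fields: [],
      field_count: 0,
    })),
    details: {
      suggestion:
        `Open this project in Provar IDE and load the '${connectionName}' connection, ` +
        'or pass field-type hints inline to provar_testcase_generate.',
    },
  };
}
```

A consumer can then branch on the presence of details.suggestion instead of wrapping the call in an error handler.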

Path safety

assertPathAllowed is called on both project_path and the composed connection_dir so a connection_name like ../../etc cannot escape the workspace.
connection_name is additionally rejected outright if it contains a path separator or .. segment (returns PATH_TRAVERSAL).
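The two layers can be sketched like this. PathPolicyError and the PATH_TRAVERSAL code come from the diff, but the class below is a local stand-in, and isInside is a hypothetical containment check used in place of assertPathAllowed, whose real signature is not shown in this PR.

```typescript
import * as path from 'node:path';

// Local stand-in for the project's PathPolicyError (assumed shape: code + message).
class PathPolicyError extends Error {
  readonly code: string;
  constructor(code: string, message: string) {
    super(message);
    this.code = code;
  }
}

// Layer 1: reject connection names that contain a separator or are a bare "..".
function assertSafeConnectionName(connectionName: string): void {
  const hasSeparator = connectionName.includes('/') || connectionName.includes('\\');
  if (hasSeparator || connectionName === '..') {
    throw new PathPolicyError(
      'PATH_TRAVERSAL',
      `Invalid connection_name (path separator or '..' segment): ${connectionName}`,
    );
  }
}

// Layer 2 (hypothetical): true when candidate resolves to root or a descendant of it.
function isInside(root: string, candidate: string): boolean {
  const rel = path.relative(path.resolve(root), path.resolve(candidate));
  if (rel === '') return true;
  return rel !== '..' && !rel.startsWith('..' + path.sep) && !path.isAbsolute(rel);
}
```

Checking the name first gives a clear error for obviously hostile input, while the containment check on the composed directory catches anything the name check misses.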

Files changed

src/mcp/tools/orgDescribeTools.ts — new tool
src/mcp/server.ts — registered under existing inspect tool group
test/unit/mcp/orgDescribeTools.test.ts — 14 unit tests, all 7 plan scenarios
scripts/mcp-smoke.cjs — 2 new RPC calls (happy path + cache miss)
docs/mcp.md — new "Org metadata access" section with TOC entry
docs/mcp-pilot-guide.md — new Scenario 13 (org-aware generation)

Test plan

node_modules/.bin/tsc -p . — clean compile
yarn lint — clean
node_modules/.bin/mocha "test/**/*.test.ts" — 1169 passing, 1 pending (baseline)
node_modules/.bin/mocha "test/unit/mcp/orgDescribeTools.test.ts" — 14 passing covering scenarios (a)–(g)
node scripts/mcp-smoke.cjs --profile inspect — both new entries PASS
Full smoke run — 57 passing, 0 failed
Reviewer to verify docs render in GitHub preview

Out of scope (H2b — sibling Thread C PR)

Adding field_type_hints / required_fields_hint to provar_testcase_generate
Wiring provar_org_describe output into the generator

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…etadata cache)

Introduces provar_org_describe, a read-only MCP tool that surfaces cached Salesforce describe data from the Provar IDE workspace .metadata directory so authoring tools (provar_testcase_generate) can produce field-correct data steps without a live SF API call.

Workspace discovery walks three candidates in order:
1. <parent>/workspace-<basename>/ (sibling pattern)
2. <parent>/Provar_Workspaces/workspace-<name-dashes>/
3. ~/Provar/workspace-<name-dashes>/ (user-home fallback)

Returns a structured cache-miss response with details.suggestion when the connection cache is absent, so the agent can either prompt the user to load the connection in Provar IDE or fall back to inline field hints. Registered under the existing 'inspect' tool group. H2b (sibling thread) consumes this tool's output to populate generator hints.

RCA: provar_testcase_generate had no source of truth for which fields on a Salesforce object are required and what their types are. Agents either guessed (producing brittle tests), hard-coded names, or called the live SF API (slow + auth-dependent). The Provar IDE already caches describe data on disk after a connection is loaded; this PR exposes that cache as a read-only MCP tool so the cache becomes useful outside the IDE.

Fix: New tool src/mcp/tools/orgDescribeTools.ts with strict path-policy checks on both project_path and connection_name (separator/traversal rejected). Cache schema is one file per object (.json preferred, .xml / .object accepted) so the existing IDE writer needs no change. Cache miss returns a stable shape with suggestion rather than an isError response, so callers do not need a try/catch path. 14 unit tests cover all 7 plan scenarios (workspace discovery, fallback, cache miss, path policy, happy path, field_filter, objects filter). Two smoke entries cover happy + miss.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-05-19T21:46:27Z

Quality Orchestrator

🟢 LOW · 4 / 100 · All changed files have mapped tests.

🧪 Tests to Run · Running 2 of 54 tests

unit/mcp/server.test.ts
unit/mcp/orgDescribeTools.test.ts

▶ Run command

npx vitest run \
  unit/mcp/server.test.ts \
  unit/mcp/orgDescribeTools.test.ts


Copilot

Pull request overview

Adds a new read-only MCP inspection tool (provar_org_describe) to surface Salesforce object/field describe metadata from the Provar IDE workspace .metadata cache, enabling downstream authoring tools to avoid live SF API calls and rely on on-disk cached schema.

Changes:

Implemented provar_org_describe tool with workspace discovery, path-policy enforcement, and JSON/XML/Object cache readers.
Registered the tool under the existing inspect tool group in the MCP server.
Added unit tests, smoke-script coverage, and documentation updates (MCP reference + pilot guide scenario).

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.


| File | Description |
| --- | --- |
| `src/mcp/tools/orgDescribeTools.ts` | New MCP tool implementation: workspace discovery, cache reading/parsing, response shaping, path policy checks. |
| `src/mcp/server.ts` | Registers org-describe tools under the `inspect` tool group. |
| `test/unit/mcp/orgDescribeTools.test.ts` | Unit tests for workspace discovery, cache-miss behavior, path policy, filters, and happy path. |
| `scripts/mcp-smoke.cjs` | Adds smoke calls for cache-miss and happy-path flows. |
| `docs/mcp.md` | Adds “Org metadata access” section and `provar_org_describe` reference docs. |
| `docs/mcp-pilot-guide.md` | Adds Scenario 13 describing org-aware generation workflow using the new tool. |


+    fields.push({
+      name,
+      type: (f['type'] as string | undefined) ?? 'unknown',
+      defaultValue: (f['defaultValue'] as string | undefined) ?? null,
+      // XML defaults: required = !nillable. In the .object format, "required" is rare,
+      // so we default to nillable=true (optional) unless explicitly required.
+      nillable: f['required'] === 'true' ? false : true,


+/** Returns the first candidate workspace that exists, or null. */
+export function discoverWorkspace(projectPath: string): string | null {
+  for (const candidate of workspaceCandidates(projectPath)) {
+    try {
+      if (fs.existsSync(candidate) && fs.statSync(candidate).isDirectory()) {
+        return candidate;
+      }
+    } catch {
+      // Permission errors etc. — skip and try next candidate
+    }
+  }
+  return null;


+    cached = path.extname(file) === '.json' ? readJsonCacheFile(file) : readXmlCacheFile(file);
+  } catch (e) {
+    log('warn', 'org_describe: failed to parse cache file', { file, error: (e as Error).message });
+    return { name: objectName, exists: false, required_fields: [], field_count: 0 };


+| Output field         | Description                                                                                                                                         |
+| -------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `workspace_path`     | Absolute resolved path to the discovered workspace, or `null` when none of the three candidate directories exists.                                  |
+| `cache_age_ms`       | `mtime` delta in milliseconds of the connection cache directory, or `null` when the cache is missing.                                               |
+| `objects[]`          | Array of `{ name, exists, required_fields, field_count }`. `exists` is `true` (cached), `false` (requested but not cached), or `null` (cache miss). |
+| `details.suggestion` | Present **only** on cache miss. Tells the agent how to populate the cache (open Provar IDE) or how to proceed without it (inline hints).            |
+


+/**
+ * Parse a legacy .object XML file (CustomObject metadata) into the canonical shape.
+ * Only extracts the bare minimum the tool needs: field name, type, nillable.
+ */
+function readXmlCacheFile(filePath: string): CachedObject {
+  const raw = fs.readFileSync(filePath, 'utf-8');
+  const parsed = XML_PARSER.parse(raw) as Record<string, unknown>;
+  const root = (parsed['CustomObject'] ?? parsed['toolingObjectInfo'] ?? {}) as Record<string, unknown>;
+  const fieldsRaw = root['fields'];
+  if (!Array.isArray(fieldsRaw)) return { name: path.basename(filePath, path.extname(filePath)), fields: [] };
+
+  const fields: CachedField[] = [];
+  for (const f of fieldsRaw as Array<Record<string, unknown>>) {
+    const name = (f['fullName'] ?? f['name']) as string | undefined;
+    if (!name) continue;
+    fields.push({
+      name,
+      type: (f['type'] as string | undefined) ?? 'unknown',
+      defaultValue: (f['defaultValue'] as string | undefined) ?? null,
+      // XML defaults: required = !nillable. In the .object format, "required" is rare,
+      // so we default to nillable=true (optional) unless explicitly required.
+      nillable: f['required'] === 'true' ? false : true,
+    });
+  }
+  return { name: path.basename(filePath, path.extname(filePath)), fields };
+}
+
+/** Look up the cache file for one object, trying .json then .xml. */
+function findObjectCacheFile(connectionDir: string, objectName: string): string | null {
+  const jsonPath = path.join(connectionDir, `${objectName}.json`);
+  if (fs.existsSync(jsonPath)) return jsonPath;
+  const xmlPath = path.join(connectionDir, `${objectName}.xml`);
+  if (fs.existsSync(xmlPath)) return xmlPath;
+  // Legacy CustomObject layout (.object extension)
+  const objPath = path.join(connectionDir, `${objectName}.object`);
+  if (fs.existsSync(objPath)) return objPath;
+  return null;
+}


+  if (connectionName.includes('/') || connectionName.includes('\\') || connectionName.split(/[/\\]+/).includes('..')) {
+    throw new PathPolicyError(
+      'PATH_TRAVERSAL',
+      `Invalid connection_name (contains path separators): ${connectionName}`


…lag bug, exists-true on parse error, docs/tests

RCA: Copilot review of PR #188 flagged six issues across correctness, security, and contract:
(1) the .xml/.object fallback compared required==='true' as a string, but fast-xml-parser with parseTagValue=true (its default) coerces the value to boolean true, silently misclassifying required fields as nillable;
(2) discoverWorkspace probed fs.existsSync/statSync against candidate dirs (including the ~/Provar home fallback) BEFORE any path-policy check, contradicting the project's --allowed-paths contract and potentially touching paths outside the policy;
(3) when a cache file existed but failed to parse, readObject returned exists=false, indistinguishable from "object not cached", so callers could not detect corrupt/unsupported cache files;
(4) docs/examples omitted the requestId field that the tool actually returns, making the documented shape drift from runtime;
(5) unit tests covered only the .json cache path, leaving the legacy .xml and .object parsers (where the required-flag bug lived) untested;
(6) the PATH_TRAVERSAL message read "contains path separators" but the validator also rejects bare ".." with no separators, so the message was inaccurate for that branch.

Fix:
(1) readXmlCacheFile now treats both boolean true and string "true" as required, so nillable is computed correctly regardless of parser config;
(2) discoverWorkspace accepts allowedPaths and runs assertPathAllowed per candidate BEFORE fs.existsSync/statSync; a candidate outside policy is silently skipped (not a hard error) so discovery falls through to the next candidate naturally;
(3) readObject parse failures now return exists=true with field_count=0 and a per-object error_message describing the parse failure, letting callers distinguish corrupt from missing;
(4) docs/mcp.md adds requestId to the output table, adds it to both example responses, documents the new error_message field shape, and adds a third example showing the parse-error response;
(5) added (h.1) .xml format test (regression guard for the required-flag bug), (h.2) .object format test, (i) parse-error test asserting exists=true + error_message, and (j) bare ".." connection_name test asserting the broadened message;
(6) the PATH_TRAVERSAL message now reads "must not contain path separators or directory-traversal segments ('..')", covering both rejection conditions.

19/19 orgDescribe tests pass, full mocha 1174/1174, yarn lint clean, yarn compile clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
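The required-flag fix in item (1) comes down to accepting both value types the parser may produce. A minimal sketch, with hypothetical helper names isRequired and nillableFrom, and plain objects standing in for parsed XML nodes:

```typescript
// True when the parsed <required> value means "required", whether the XML parser
// left it as the string "true" or coerced it to the boolean true.
function isRequired(value: unknown): boolean {
  return value === true || value === 'true';
}

// A field is nillable unless it is explicitly marked required.
function nillableFrom(field: Record<string, unknown>): boolean {
  return !isRequired(field['required']);
}

// The original comparison, kept here only to show the failure mode.
function nillableFromBuggy(field: Record<string, unknown>): boolean {
  return field['required'] === 'true' ? false : true;
}
```

With a coerced boolean, nillableFromBuggy({ required: true }) reports the field as nillable, which is the silent misclassification the review flagged; nillableFrom handles both shapes.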

mrdailey99

Addressed all 6 review comments in f04eeff:

1. readXmlCacheFile required-flag: now treats both boolean true and string "true" as required, fixing the silent misclassification under fast-xml-parser's default parseTagValue=true.
2. discoverWorkspace: takes allowedPaths and runs assertPathAllowed per candidate BEFORE fs.existsSync/statSync. Out-of-policy candidates (including the ~/Provar fallback when home is outside --allowed-paths) are silently skipped, so discovery never touches denied filesystem paths.
3. Parse failure on a present cache file now returns exists: true with field_count: 0 and a per-object error_message, distinguishing corrupt/unsupported from "object not cached".
4. docs/mcp.md adds requestId to the output table and to both example responses, documents the new error_message field, and adds a third parse-error response example.
5. New unit tests added for .xml (regression guard for the required-flag bug), .object, parse-error (asserts exists: true + error_message), and bare .. connection_name (broadened message). 19/19 orgDescribe tests pass; full mocha 1174/1174; yarn lint + yarn compile clean.
6. PATH_TRAVERSAL message now reads "must not contain path separators or directory-traversal segments ('..')", covering both rejection conditions.
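The policy-before-probe ordering described for discoverWorkspace can be sketched as below. This is an assumption-laden illustration: isAllowed is a hypothetical stand-in for assertPathAllowed (whose signature is not shown here), and the function takes a precomputed candidate list rather than a project path.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

// Hypothetical policy check: candidate must resolve inside one of the allowed roots.
function isAllowed(candidate: string, allowedPaths: string[]): boolean {
  const resolved = path.resolve(candidate);
  return allowedPaths.some((root) => {
    const rel = path.relative(path.resolve(root), resolved);
    return rel === '' || (rel !== '..' && !rel.startsWith('..' + path.sep) && !path.isAbsolute(rel));
  });
}

// Returns the first in-policy candidate that is an existing directory, or null.
function discoverWorkspace(candidates: string[], allowedPaths: string[]): string | null {
  for (const candidate of candidates) {
    // Policy check first: out-of-policy candidates are skipped without any fs call.
    if (!isAllowed(candidate, allowedPaths)) continue;
    try {
      if (fs.statSync(candidate).isDirectory()) return candidate;
    } catch {
      // Missing path or permission error: fall through to the next candidate.
    }
  }
  return null;
}
```

Skipping rather than throwing keeps the fallthrough behaviour: a denied home-directory fallback does not prevent an allowed sibling workspace from being found.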

Copilot AI review requested due to automatic review settings May 19, 2026 21:46

Copilot started reviewing on behalf of mrdailey99 May 19, 2026 21:46

Copilot AI reviewed May 19, 2026


mrdailey99 commented May 20, 2026


Merge branch 'develop' into feature/PDX-492-org-describe-tool

cf8e35e

mrdailey99 merged commit 39c3b92 into develop May 20, 2026
4 checks passed

mrdailey99 mentioned this pull request May 20, 2026

PDX-0: release v1.5.2 — Sessions 6/7 feedback (warnings, valueClass, org-describe) #190

Merged

13 tasks

mrdailey99 merged 3 commits into develop from feature/PDX-492-org-describe-tool
