PDX-492: feat(mcp) — provar_org_describe tool, H2a (Thread F)#188
Merged
Conversation
…etadata cache) Introduces provar_org_describe, a read-only MCP tool that surfaces cached Salesforce describe data from the Provar IDE workspace .metadata directory so authoring tools (provar_testcase_generate) can produce field-correct data steps without a live SF API call. Workspace discovery walks three candidates in order: 1. <parent>/workspace-<basename>/ (sibling pattern) 2. <parent>/Provar_Workspaces/workspace-<name-dashes>/ 3. ~/Provar/workspace-<name-dashes>/ (user-home fallback) Returns a structured cache-miss response with details.suggestion when the connection cache is absent, so the agent can either prompt the user to load the connection in Provar IDE or fall back to inline field hints. Registered under the existing 'inspect' tool group. H2b (sibling thread) consumes this tool's output to populate generator hints. RCA: provar_testcase_generate had no source of truth for which fields on a Salesforce object are required and what their types are. Agents either guessed (producing brittle tests), hard-coded names, or called the live SF API (slow + auth-dependent). The Provar IDE already caches describe data on disk after a connection is loaded — this PR exposes that cache as a read-only MCP tool so the cache becomes useful outside the IDE. Fix: New tool src/mcp/tools/orgDescribeTools.ts with strict path-policy checks on both project_path and connection_name (separator/traversal rejected). Cache schema is one file per object (.json preferred, .xml / .object accepted) so the existing IDE writer needs no change. Cache miss returns a stable shape with suggestion rather than an isError response, so callers do not need a try/catch path. 14 unit tests cover all 7 plan scenarios (workspace discovery, fallback, cache miss, path policy, happy path, field_filter, objects filter). Two smoke entries cover happy + miss. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Quality Orchestrator🟢 LOW · 🧪 Tests to Run · Running 2 of 54 tests
▶ Run commandnpx vitest run \
unit/mcp/server.test.ts \
unit/mcp/orgDescribeTools.test.ts⚡ quality-orchestrator · |
Contributor
There was a problem hiding this comment.
Pull request overview
Adds a new read-only MCP inspection tool (provar_org_describe) to surface Salesforce object/field describe metadata from the Provar IDE workspace .metadata cache, enabling downstream authoring tools to avoid live SF API calls and rely on on-disk cached schema.
Changes:
- Implemented
provar_org_describetool with workspace discovery, path-policy enforcement, and JSON/XML/Object cache readers. - Registered the tool under the existing
inspecttool group in the MCP server. - Added unit tests, smoke-script coverage, and documentation updates (MCP reference + pilot guide scenario).
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
src/mcp/tools/orgDescribeTools.ts |
New MCP tool implementation: workspace discovery, cache reading/parsing, response shaping, path policy checks. |
src/mcp/server.ts |
Registers org-describe tools under the inspect tool group. |
test/unit/mcp/orgDescribeTools.test.ts |
Unit tests for workspace discovery, cache-miss behavior, path policy, filters, and happy path. |
scripts/mcp-smoke.cjs |
Adds smoke calls for cache-miss and happy-path flows. |
docs/mcp.md |
Adds “Org metadata access” section and provar_org_describe reference docs. |
docs/mcp-pilot-guide.md |
Adds Scenario 13 describing org-aware generation workflow using the new tool. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+145
to
+151
| fields.push({ | ||
| name, | ||
| type: (f['type'] as string | undefined) ?? 'unknown', | ||
| defaultValue: (f['defaultValue'] as string | undefined) ?? null, | ||
| // XML defaults: required = !nillable. In the .object format, "required" is rare, | ||
| // so we default to nillable=true (optional) unless explicitly required. | ||
| nillable: f['required'] === 'true' ? false : true, |
Comment on lines
+101
to
+112
| /** Returns the first candidate workspace that exists, or null. */ | ||
| export function discoverWorkspace(projectPath: string): string | null { | ||
| for (const candidate of workspaceCandidates(projectPath)) { | ||
| try { | ||
| if (fs.existsSync(candidate) && fs.statSync(candidate).isDirectory()) { | ||
| return candidate; | ||
| } | ||
| } catch { | ||
| // Permission errors etc. — skip and try next candidate | ||
| } | ||
| } | ||
| return null; |
| cached = path.extname(file) === '.json' ? readJsonCacheFile(file) : readXmlCacheFile(file); | ||
| } catch (e) { | ||
| log('warn', 'org_describe: failed to parse cache file', { file, error: (e as Error).message }); | ||
| return { name: objectName, exists: false, required_fields: [], field_count: 0 }; |
Comment on lines
+1718
to
+1724
| | Output field | Description | | ||
| | -------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------- | | ||
| | `workspace_path` | Absolute resolved path to the discovered workspace, or `null` when none of the three candidate directories exists. | | ||
| | `cache_age_ms` | `mtime` delta in milliseconds of the connection cache directory, or `null` when the cache is missing. | | ||
| | `objects[]` | Array of `{ name, exists, required_fields, field_count }`. `exists` is `true` (cached), `false` (requested but not cached), or `null` (cache miss). | | ||
| | `details.suggestion` | Present **only** on cache miss. Tells the agent how to populate the cache (open Provar IDE) or how to proceed without it (inline hints). | | ||
|
|
Comment on lines
+130
to
+167
| /** | ||
| * Parse a legacy .object XML file (CustomObject metadata) into the canonical shape. | ||
| * Only extracts the bare minimum the tool needs: field name, type, nillable. | ||
| */ | ||
| function readXmlCacheFile(filePath: string): CachedObject { | ||
| const raw = fs.readFileSync(filePath, 'utf-8'); | ||
| const parsed = XML_PARSER.parse(raw) as Record<string, unknown>; | ||
| const root = (parsed['CustomObject'] ?? parsed['toolingObjectInfo'] ?? {}) as Record<string, unknown>; | ||
| const fieldsRaw = root['fields']; | ||
| if (!Array.isArray(fieldsRaw)) return { name: path.basename(filePath, path.extname(filePath)), fields: [] }; | ||
|
|
||
| const fields: CachedField[] = []; | ||
| for (const f of fieldsRaw as Array<Record<string, unknown>>) { | ||
| const name = (f['fullName'] ?? f['name']) as string | undefined; | ||
| if (!name) continue; | ||
| fields.push({ | ||
| name, | ||
| type: (f['type'] as string | undefined) ?? 'unknown', | ||
| defaultValue: (f['defaultValue'] as string | undefined) ?? null, | ||
| // XML defaults: required = !nillable. In the .object format, "required" is rare, | ||
| // so we default to nillable=true (optional) unless explicitly required. | ||
| nillable: f['required'] === 'true' ? false : true, | ||
| }); | ||
| } | ||
| return { name: path.basename(filePath, path.extname(filePath)), fields }; | ||
| } | ||
|
|
||
| /** Look up the cache file for one object, trying .json then .xml. */ | ||
| function findObjectCacheFile(connectionDir: string, objectName: string): string | null { | ||
| const jsonPath = path.join(connectionDir, `${objectName}.json`); | ||
| if (fs.existsSync(jsonPath)) return jsonPath; | ||
| const xmlPath = path.join(connectionDir, `${objectName}.xml`); | ||
| if (fs.existsSync(xmlPath)) return xmlPath; | ||
| // Legacy CustomObject layout (.object extension) | ||
| const objPath = path.join(connectionDir, `${objectName}.object`); | ||
| if (fs.existsSync(objPath)) return objPath; | ||
| return null; | ||
| } |
| if (connectionName.includes('/') || connectionName.includes('\\') || connectionName.split(/[/\\]+/).includes('..')) { | ||
| throw new PathPolicyError( | ||
| 'PATH_TRAVERSAL', | ||
| `Invalid connection_name (contains path separators): ${connectionName}` |
…lag bug, exists-true on parse error, docs/tests RCA: Copilot review of PR #188 flagged six issues across correctness, security, and contract: (1) the .xml/.object fallback compared required==='true' as a string, but fast-xml-parser with parseTagValue=true (its default) coerces the value to boolean true, silently misclassifying required fields as nillable; (2) discoverWorkspace probed fs.existsSync/statSync against candidate dirs (including the ~/Provar home fallback) BEFORE any path-policy check, contradicting the project's --allowed-paths contract and potentially touching paths outside the policy; (3) when a cache file existed but failed to parse, readObject returned exists=false — indistinguishable from "object not cached", so callers could not detect corrupt/unsupported cache files; (4) docs/examples omitted the requestId field that the tool actually returns, making the documented shape drift from runtime; (5) unit tests covered only the .json cache path, leaving the legacy .xml and .object parsers (where the required-flag bug lived) untested; (6) the PATH_TRAVERSAL message read "contains path separators" but the validator also rejects bare ".." with no separators, so the message was inaccurate for that branch. Fix: (1) readXmlCacheFile now treats both boolean true and string "true" as required, so nillable is computed correctly regardless of parser config; (2) discoverWorkspace accepts allowedPaths and runs assertPathAllowed per candidate BEFORE fs.existsSync/statSync — a candidate outside policy is silently skipped (not a hard error) so discovery falls through to the next candidate naturally; (3) readObject parse failures now return exists=true with field_count=0 and a per-object error_message describing the parse failure, letting callers distinguish corrupt from missing; (4) docs/mcp.md adds requestId to the output table, adds it to both example responses, documents the new error_message field shape, and adds a third example showing the parse-error response; (5) added (h.1) .xml format test (regression guard for the required-flag bug), (h.2) .object format test, (i) parse-error test asserting exists=true + error_message, and (j) bare ".." connection_name test asserting the broadened message; (6) the PATH_TRAVERSAL message now reads "must not contain path separators or directory-traversal segments ('..')", covering both rejection conditions. 19/19 orgDescribe tests pass, full mocha 1174/1174, yarn lint clean, yarn compile clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mrdailey99
commented
May 20, 2026
Collaborator
Author
mrdailey99
left a comment
There was a problem hiding this comment.
Addressed all 6 review comments in f04eeff:
readXmlCacheFilerequired-flag: now treats both booleantrueand string"true"as required, fixing the silent misclassification under fast-xml-parser's defaultparseTagValue=true.discoverWorkspace: takesallowedPathsand runsassertPathAllowedper candidate BEFOREfs.existsSync/statSync. Out-of-policy candidates (including the~/Provarfallback when home is outside--allowed-paths) are silently skipped, so discovery never touches denied filesystem paths.- Parse failure on a present cache file now returns
exists: truewithfield_count: 0and a per-objecterror_message, distinguishing corrupt/unsupported from "object not cached". docs/mcp.mdaddsrequestIdto the output table and to both example responses, documents the newerror_messagefield, and adds a third parse-error response example.- New unit tests added for
.xml(regression guard for the required-flag bug),.object, parse-error (assertsexists: true+error_message), and bare..connection_name (broadened message). 19/19 orgDescribe tests pass; full mocha 1174/1174;yarn lint+yarn compileclean. - PATH_TRAVERSAL message now reads "must not contain path separators or directory-traversal segments ('..')", covering both rejection conditions.
13 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Context
Adds
provar_org_describe, a read-only MCP tool that surfaces cached Salesforce describe data from the Provar IDE workspace.metadatadirectory. This is H2a of the PDX-H2 plan (Thread F, single PR). The sibling H2b PR in Thread C will consume this tool's output to populateprovar_testcase_generatefield-type hints.Why
provar_testcase_generatecurrently has no source of truth for which fields on a Salesforce object are required and what their types are. Agents either guess (brittle), hard-code names, or call the live SF API (slow + auth-dependent). The Provar IDE already caches describe data on disk after a connection is loaded — this PR exposes that cache as a read-only MCP tool so it becomes useful outside the IDE.Workspace discovery heuristic
The tool walks three candidate directories in order; the first one that exists wins:
<parent-of-project>/workspace-<basename>/— sibling workspace pattern (default for Provar IDE in this layout).<parent-of-project>/Provar_Workspaces/workspace-<name-dashes>/— sharedProvar_Workspacesdirectory.~/Provar/workspace-<name-dashes>/— user-home fallback.<name-dashes>=basename(project_path).trim().replace(/\s+/g, '-').toLowerCase().Cache schema (for H2b consumers)
No prior
.metadatareader exists in the codebase, so this PR designs the schema H2b should produce/consume. It is intentionally simple — one file per object, JSON preferred:The tool also falls back to
.xmland.object(legacy CustomObject metadata) files for migration paths.Output shape
Cache-miss behaviour
Returns a structured response (not
isError) withdetails.suggestiontelling the agent how to recover:This is the same advisory shape
provar_properties_readuses for divergence warnings, so consumers don't need a special error path.Path safety
assertPathAllowedis called on bothproject_pathand the composedconnection_dirso aconnection_namelike../../etccannot escape the workspace.connection_nameis additionally rejected outright if it contains a path separator or..segment (returnsPATH_TRAVERSAL).Files changed
src/mcp/tools/orgDescribeTools.ts— new toolsrc/mcp/server.ts— registered under existinginspecttool grouptest/unit/mcp/orgDescribeTools.test.ts— 14 unit tests, all 7 plan scenariosscripts/mcp-smoke.cjs— 2 new RPC calls (happy path + cache miss)docs/mcp.md— new "Org metadata access" section with TOC entrydocs/mcp-pilot-guide.md— new Scenario 13 (org-aware generation)Test plan
node_modules/.bin/tsc -p .— clean compileyarn lint— cleannode_modules/.bin/mocha "test/**/*.test.ts"— 1169 passing, 1 pending (baseline)node_modules/.bin/mocha "test/unit/mcp/orgDescribeTools.test.ts"— 14 passing covering scenarios (a)–(g)node scripts/mcp-smoke.cjs --profile inspect— both new entries PASSOut of scope (H2b — sibling Thread C PR)
field_type_hints/required_fields_hinttoprovar_testcase_generateprovar_org_describeoutput into the generatorCo-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com