feat(sparql): cross-vault index + runtime materialization (#3281)

## Summary

Cross-vault analytical queries currently reach only ~19% recall, for three reasons: `sparql index` has no `--also` flag, materialisation runs per vault only, and `query --use-cache --also` loads independent per-vault caches without materialising across them.

## User Story / JTBD

**As a** knowledge worker running analytical SPARQL over a primary vault + archive vault
**I want** a single combined index that materialises prototype-chain inheritance across vault boundaries
**So that** queries like "all ems__Effort in Q1-26 under TBank area chain" return complete results instead of 19% recall

## Background

**Empirical evidence (2026-05-27 cross-vault audit, ~220K triples):**

Running the following reproducer on a live vault returns 91 results where Python-verified ground truth is 488:

```bash
cd /Users/kitelev/vault-2025 && time npx @kitelev/exocortex-cli query \
  --vault /Users/kitelev/vault-2025 \
  --also /Users/kitelev/vault-2025-archive \
  --use-cache --format json /tmp/q1-26-tbank-final.sparql
# Returns 91, expected 488 (~19% recall; ~81% of expected results are missing)
```

**Root cause trace (codegraph-verified):**

Archive tasks contain frontmatter `exo__Asset_prototype: "[[fb3d12b2-...]]"` where the prototype file lives in `vault-2025/assetspaces/shared-identities/`. When building the archive cache (`sparql index --vault vault-2025-archive`), `NoteToRDFConverter.ts:1089-1090` cannot find the target file in the archive vault and falls back to a **synthesized basename-only IRI**, `obsidian://vault/fb3d12b2-...md`. After the `--also` merge, the prototype's own subject IRI is `obsidian://vault/assetspaces/shared-identities/fb3d12b2-...md` (real path), so the JOIN between these two IRI forms fails silently.

The same UID `fb3d12b2-9552-4866-a31e-2b5f65ea433c` appears as object of `Asset_prototype` in **3 distinct forms** (two IRIs and one bare literal) in the combined store:
- `obsidian://vault/assetspaces/shared-identities/<uid>.md` — 6 refs (from vault-2025 assets)
- `obsidian://vault/<uid>.md` — 96 refs (synth-A fallback from archive assets)
- bare literal `<uid>` — 6 refs

`PrototypeChainMaterializer.ts` correctly supports combined stores via `store: ITripleStore` interface, but it is never invoked on the combined store because materialisation happens only per-vault at index time.
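
To make the mismatch concrete, here is a minimal sketch that only uses the three forms listed above. `uidOf` is a hypothetical helper introduced for illustration and is not part of the codebase; it just shows that the forms agree once reduced to the UID, while exact term comparison (which is what a join does) never matches.

```typescript
// Illustrative only: the three forms in which the same prototype UID appears
// as an object of Asset_prototype in the combined store.
const uid = "fb3d12b2-9552-4866-a31e-2b5f65ea433c";

const realPathIri: string = `obsidian://vault/assetspaces/shared-identities/${uid}.md`;
const basenameIri: string = `obsidian://vault/${uid}.md`; // fallback form from archive assets
const bareLiteral: string = uid;

// A join compares terms for exact equality, so none of these pairs match.
const joinsAsIs =
  realPathIri === basenameIri ||
  basenameIri === bareLiteral ||
  realPathIri === bareLiteral;

// Hypothetical helper: pull the trailing UUID out of a term so forms can be compared.
function uidOf(term: string): string | undefined {
  const match = term.match(/([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})(?:\.md)?$/i);
  return match ? match[1].toLowerCase() : undefined;
}

const sameUid = [realPathIri, basenameIri, bareLiteral].every((t) => uidOf(t) === uid);

console.log({ joinsAsIs, sameUid });
```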

## Related Issues

- Depends on: none (this is the root infrastructure issue)
- Blocks: #IRI-CANONICALIZATION (Issue #6), #CROSS-VAULT-SHACL (Issue #4)
- Related: #3219 (`query --also` path-prefix strip — adjacent parser bug)

## BoK References

| Body of Knowledge | Chapter/Section | Relevance |
|-------------------|-----------------|-----------|
| SWEBOK v3 | Ch. 2 Software Design | Federated store architecture, IRI resolution layer |
| SWEBOK v3 | Ch. 3 Software Construction | CLI flag design, backward-compat constraints |
| DMBOK v2 | Ch. 8 Data Integration | Multi-source data loading, IRI identity resolution |
| PMBOK v7 | Project Work | Regression baseline required before ship |

## Technical Approach

### Architecture Context

```
Current flow:
  sparql index --vault A          → cache-A (only vault-A triples + materialisation)
  sparql index --vault B          → cache-B (only vault-B triples + materialisation)
  query --vault A --also B        → load cache-A + cache-B independently → store-union
                                     (no cross-vault materialisation on union)

Target flow:
  sparql index --vault A --also B → combined-cache (union triples + cross-vault materialisation)
  query --vault A --also B        → load combined-cache OR run runtime materialisation
```
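
The difference between the two flows can be shown with a toy model. This is a sketch under stated assumptions: `Triple`, `materialize`, and the `exo:`/`ems:`/`area:` names are simplified stand-ins, not the real `ITripleStore` or `PrototypeChainMaterializer`, and the inheritance rule (copy a prototype's properties unless the asset already has that property) is an assumption about the intended semantics.

```typescript
// Toy model, not the real ITripleStore or PrototypeChainMaterializer.
type Triple = { s: string; p: string; o: string };

const PROTOTYPE = "exo:Asset_prototype";

// Copy each prototype's properties onto assets that reference it, repeating
// until a full pass adds nothing. Returns the number of triples added.
function materialize(store: Triple[]): number {
  let added = 0;
  let changed = true;
  while (changed) {
    changed = false;
    for (const link of store.filter((t) => t.p === PROTOTYPE)) {
      const inheritable = store.filter((t) => t.s === link.o && t.p !== PROTOTYPE);
      for (const inherited of inheritable) {
        // An asset's own value for a property wins over the prototype's.
        const hasOwn = store.some((t) => t.s === link.s && t.p === inherited.p);
        if (!hasOwn) {
          store.push({ s: link.s, p: inherited.p, o: inherited.o });
          added++;
          changed = true;
        }
      }
    }
  }
  return added;
}

// The prototype lives in vault A, the task that references it in vault B.
const vaultA: Triple[] = [{ s: "proto", p: "ems:Effort_area", o: "area:TBank" }];
const vaultB: Triple[] = [{ s: "task", p: PROTOTYPE, o: "proto" }];

// Current flow: materialise each vault on its own, then union.
const perVaultAdded = materialize([...vaultA]) + materialize([...vaultB]);

// Target flow: union first, then materialise once.
const union: Triple[] = [...vaultA, ...vaultB];
const unionAdded = materialize(union);

console.log({ perVaultAdded, unionAdded });
```

In this model the per-vault passes add nothing because neither vault contains both ends of the prototype link, while the single pass over the union adds the inherited area triple for the task.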

### Implementation Steps

1. **Sub-task A: `index --also <path>` repeatable flag**
   - Add `--also` option to `sparql-index.ts` (mirrors existing `sparql-query.ts` pattern)
   - Collect all vault paths, build union triple store before passing to `PrototypeChainMaterializer`
   - Write combined cache to `<primary-vault>/.exocortex/cache/triples-combined.json` (or hash-keyed filename per `--also` set)
   - `PrototypeChainMaterializer` already accepts `store: ITripleStore` — no changes needed there

2. **Sub-task B: `query --inference` flag → runtime materialisation on combined store**
   - Add `--inference` flag to `sparql-query.ts`
   - When `--also` provided and `--inference` set, run `PrototypeChainMaterializer` on union store before query
   - Cache key includes `--also` paths so combined-cache hit avoids re-materialisation

3. **Sub-task C: Regression test**
   - Integration test: query Q1-26 ems__Effort in TBank area chain → assert result count ≥ 464 (95% of 488 baseline)
   - Test fixture: minimal vault pair with prototype in vault-A, task in vault-B referencing it via `[[<uid>]]`
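
One way to read Sub-tasks A and B together is as a small decision function on the query side. The sketch below is an assumption about how the flags could interact, not the existing `sparql-query.ts` logic; `QueryPlan`, `QueryOptions`, and `planQuery` are hypothetical names.

```typescript
// Hypothetical names; a sketch of how the query command could pick a strategy.
type QueryPlan =
  | "single-vault"             // no --also: behaviour unchanged
  | "combined-cache"           // pre-built combined cache is used as-is
  | "runtime-materialisation"  // union store materialised at query time
  | "plain-union";             // current behaviour: union without materialisation

interface QueryOptions {
  also: string[];      // values of the repeatable --also flag
  useCache: boolean;   // --use-cache
  inference: boolean;  // --inference (proposed)
}

function planQuery(opts: QueryOptions, combinedCacheExists: boolean): QueryPlan {
  if (opts.also.length === 0) return "single-vault";
  // A combined-cache hit already contains cross-vault materialisation,
  // so there is no need to re-materialise.
  if (opts.useCache && combinedCacheExists) return "combined-cache";
  if (opts.inference) return "runtime-materialisation";
  return "plain-union";
}

console.log(planQuery({ also: ["B"], useCache: true, inference: false }, true));
```

Keeping the decision in a pure function like this would make the flag interactions unit-testable without touching the file system.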

### Code Example

```typescript
// packages/cli/src/commands/sparql-index.ts: add --also support
// Sketch: buildUnionStore and writeCache are new helpers to be written.
program
  .option(
    '--also <path>',
    'Additional vault to include (repeatable)',
    (value: string, previous: string[]) => [...previous, value],  // collect repeated flags
    [] as string[],
  )
  .action(async (options: { vault: string; also: string[] }) => {
    const vaultPaths = [options.vault, ...options.also];
    const store = await buildUnionStore(vaultPaths);            // new helper
    await PrototypeChainMaterializer.materialize(store);        // existing, no changes
    await writeCache(options.vault, store, { alsoVaults: options.also });  // hash-keyed
  });
```

## Techniques Applied

- **Federated triple store**: union of N per-vault stores before materialisation
- **Content-addressed cache**: cache filename keyed on `hash(primary + sorted(also))` to support multiple `--also` combinations
- **Flag parity**: `--also` already exists in `sparql-query.ts` — reuse same semantics in `sparql-index.ts`
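
A minimal sketch of the `hash(primary + sorted(also))` key follows. The function names, the SHA-256 choice, the 16-character truncation, and the `triples-combined-<key>.json` file name are assumptions made for illustration. Note that this key is derived from the vault paths rather than from vault contents, so it distinguishes `--also` combinations but does not by itself detect stale data.

```typescript
import { createHash } from "node:crypto";
import { resolve } from "node:path";

// Hypothetical helper: stable key for a primary vault plus a set of --also vaults.
function combinedCacheKey(primary: string, also: string[]): string {
  // Resolve to absolute paths, drop duplicates, and sort so flag order is irrelevant.
  const secondary = [...new Set(also.map((p) => resolve(p)))].sort();
  const payload = JSON.stringify({ primary: resolve(primary), also: secondary });
  return createHash("sha256").update(payload).digest("hex").slice(0, 16);
}

// Hypothetical file name scheme for the combined cache.
function combinedCacheFile(primary: string, also: string[]): string {
  return `triples-combined-${combinedCacheKey(primary, also)}.json`;
}

console.log(combinedCacheFile("/vaults/a", ["/vaults/c", "/vaults/b"]));
```

Serialising the paths with `JSON.stringify` instead of plain concatenation avoids accidental collisions between path lists such as `["/a", "/b/c"]` and `["/a/b", "/c"]`.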

## Test Plan

### Unit Tests

- `buildUnionStore([pathA, pathB])` returns store containing triples from both vaults
- Cache key differs for different `--also` sets
- `PrototypeChainMaterializer` resolves prototype chain when prototype is in secondary vault

### Integration Tests

- Vault-pair fixture: task in vault-B references prototype in vault-A → after `index --also`, property path `ems:Effort_area/ems:Area_parent*` resolves correctly
- Regression: result count for TBank Q1-26 query ≥ 464 (95% recall baseline)

### BDD Scenarios

```gherkin
Feature: Cross-vault SPARQL index

  Scenario: Combined index resolves cross-vault prototype references
    Given vault-2025 (primary) and vault-2025-archive (secondary)
    When running: exocortex-cli index --vault vault-2025 --also vault-2025-archive
    Then single combined cache is built with materialisation on the union store
    And prototype refs to cross-vault targets resolve to consistent IRI form

  Scenario: Cross-vault property path query returns correct recall
    Given combined cache built for vault-2025 + vault-2025-archive
    When running cross-vault query with property path "Effort_area/Area_parent*"
    Then tasks with prototype-inherited area residing in another vault are matched
    And result count is at least 95% of the Python-verified baseline (≥ 464 of 488)

  Scenario: Runtime inference flag substitutes for pre-built combined cache
    Given no combined cache exists
    When running: exocortex-cli query --vault A --also B --inference
    Then PrototypeChainMaterializer runs on combined triple store at query time
    And results match pre-built combined-cache results
```

## Deliverables

- [ ] `--also` flag added to `sparql-index` command
- [ ] `--inference` flag added to `sparql-query` command (runtime materialisation)
- [ ] Combined cache written to content-addressed path
- [ ] Integration test: cross-vault prototype resolution
- [ ] Regression test: TBank Q1-26 ≥ 95% recall
- [ ] CLI help text updated for both flags
- [ ] CHANGELOG entry

## Quality Criteria

- Cross-vault query recall ≥ 95% of Python-verified baseline (488 tasks → ≥ 464)
- Combined cache build time ≤ 2× single-vault index time
- Backward-compat: `sparql index --vault A` (no `--also`) behaviour unchanged
- No regressions in existing `sparql-query` tests
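
The recall figures used throughout this issue can be reproduced with a few lines of arithmetic. The helper names below are hypothetical; the inputs (91 observed, 488 baseline, 95% target) come from the audit described in Background.

```typescript
// Smallest result count that satisfies a recall target given as a whole percent.
function minResultsForRecall(baseline: number, targetPercent: number): number {
  // Multiply before dividing so the product stays an exact integer.
  return Math.ceil((baseline * targetPercent) / 100);
}

// Recall as a percentage, rounded to one decimal place.
function recallPercent(found: number, baseline: number): number {
  return Math.round((found / baseline) * 1000) / 10;
}

const threshold = minResultsForRecall(488, 95); // 463.6 rounded up to 464
const observed = recallPercent(91, 488);        // 18.6, reported above as ~19%

console.log({ threshold, observed });
```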

## Acceptance Criteria

- [ ] `index --vault A --also B` produces combined cache without error
- [ ] `query --vault A --also B --use-cache` hits combined cache when available
- [ ] `query --vault A --also B --inference` runs materialisation at query time
- [ ] TBank Q1-26 regression test passes (≥ 464 results)
- [ ] Existing CLI tests unaffected

## Definition of Done

- [ ] Implementation complete and tested
- [ ] Code review approved
- [ ] Tests passing (unit + integration)
- [ ] Documentation updated
- [ ] PR merged to main

## RACI

| Activity | Responsible | Accountable | Consulted | Informed |
|----------|-------------|-------------|-----------|----------|
| Implementation | AI Agent | Tech Lead | — | Team |
| Testing | AI Agent | QA | — | Team |
| Documentation | AI Agent | Tech Lead | — | Stakeholders |

## Risks

| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| Combined cache roughly doubles cache disk usage (adds ~60MB on top of the ~43MB + ~16MB per-vault caches) | High | Low | Content-addressed naming; document in CLI help |
| Cache invalidation logic breaks when `--also` set changes | Medium | Medium | Hash-keyed cache files; stale detection via mtime |
| PrototypeChainMaterializer performance on 2× triples | Low | Medium | Benchmark before ship; add `--no-inference` escape hatch |
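
For the mtime-based stale detection named in the mitigation column, a minimal sketch might look like the following. `isCacheStale` and `mtimeOrUndefined` are hypothetical helpers, and how the list of source files is collected is left open.

```typescript
import { statSync } from "node:fs";

// Pure check: the cache is stale if it is missing or any source file is newer.
function isCacheStale(cacheMtimeMs: number | undefined, sourceMtimesMs: number[]): boolean {
  if (cacheMtimeMs === undefined) return true;
  return sourceMtimesMs.some((m) => m > cacheMtimeMs);
}

// Modification time in milliseconds, or undefined if the path cannot be read.
function mtimeOrUndefined(path: string): number | undefined {
  try {
    return statSync(path).mtimeMs;
  } catch {
    return undefined;
  }
}

// Usage sketch: compare a (hypothetical) cache file against note files from every vault.
const stale = isCacheStale(mtimeOrUndefined("/tmp/does-not-exist.json"), []);
console.log({ stale });
```

The check needs per-file mtimes from every vault in the `--also` set: a directory's own mtime changes when entries are added, removed, or renamed, but not when an existing note's contents are edited.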

## Rollback Plan

1. `--also` flag is additive — removing it restores per-vault-only behaviour
2. Combined cache is a separate file — deleting it forces fallback to per-vault caches
3. Feature flag `EXOCORTEX_COMBINED_INDEX=0` as escape hatch if needed
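
The escape hatch in step 3 could be read as in the sketch below. The semantics are an assumption (enabled by default, disabled only by an explicit `0`), and `combinedIndexEnabled` is a hypothetical helper name.

```typescript
// Assumed semantics: the combined index is on unless the variable is exactly "0".
function combinedIndexEnabled(env: Record<string, string | undefined>): boolean {
  return env["EXOCORTEX_COMBINED_INDEX"] !== "0";
}

console.log(combinedIndexEnabled({ EXOCORTEX_COMBINED_INDEX: "0" }));
```

In the CLI this would be called with `process.env`; taking the environment as a parameter keeps the check testable.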

## Dependencies

- **blockedBy**: none
- **Enables**: Issue #4 (cross-vault SHACL), Issue #6 (IRI canonicalization uses combined store)

## Estimates

| Task | Effort |
|------|--------|
| `sparql-index.ts` — add `--also` flag + union store builder | 3h |
| `sparql-query.ts` — add `--inference` flag + runtime materialisation | 2h |
| Cache naming / invalidation logic | 2h |
| Integration tests + regression test | 3h |
| **Total** | **10h** |

## Labels

`enhancement`, `sparql`, `cli`, `package:cli`, `priority:P0`, `epic:sparql-engine`, `size:large`

## Best Practices Checklist

- [ ] `--also` flag semantics match existing `sparql-query.ts` implementation
- [ ] Cache files use content-addressed names (do not overwrite the shared `triples.json`)
- [ ] PrototypeChainMaterializer invoked exactly once on combined store (not per-vault)
- [ ] CLI `--help` updated for both new flags
- [ ] No mutation of per-vault cache when combined flag used

## Review Checklist

- [ ] Code follows project conventions
- [ ] Tests are comprehensive (unit + integration + regression)
- [ ] Documentation is clear
- [ ] No security vulnerabilities
- [ ] Backward-compat: no `--also` = same behaviour as before
