Wire Graphify extraction into wiki_ingest_folder()

## Parent Epic

Part of #5 — Integrate Graphify for zero-cost code entity extraction

## Task

Modify `wiki_ingest_folder()` in `agent_notes/services/wiki_backend.py` to automatically run Graphify extraction when code files are present and the package is available.

## File

`agent_notes/services/wiki_backend.py` — function `wiki_ingest_folder()` (lines 338-441)

## Current Flow

```
wiki_ingest_folder(folder_path)
    ├── Walk files, filter by extension/.gitignore/_SKIP_DIRS
    ├── Concatenate with "--- FILE: <rel> ---" markers
    ├── Chunk if > 2MB
    └── Call wiki_ingest(concepts=caller_provided, entities=caller_provided)
```

**Problem**: `concepts` and `entities` are almost always `None` when called programmatically — the caller (LLM agent or CLI) doesn't know what's in the code yet.

## New Flow

```
wiki_ingest_folder(folder_path)
    ├── Walk files, filter by extension/.gitignore/_SKIP_DIRS
    ├── Track has_code flag during walk
    ├── Concatenate with "--- FILE: <rel> ---" markers
    │
    ├── [NEW] if has_code and graphify_available():
    │   ├── extract_code_graph(folder_path, extensions, skip_dirs)
    │   ├── graph_to_wiki_terms(graph_data)
    │   ├── save_graph_json(wiki_root, slug, graph_data)
    │   └── Merge discovered terms with caller-provided ones
    │
    ├── Chunk if > 2MB
    └── Call wiki_ingest(concepts=merged, entities=merged)
```

## Implementation Details

### Step 1: Add `_CODE_EXTENSIONS` constant (near line 306)

```python
_CODE_EXTENSIONS = {
    ".py", ".ts", ".js", ".tsx", ".jsx",
    ".go", ".rs", ".java", ".cpp", ".c", ".h",
    ".rb", ".swift", ".kt", ".cs", ".scala",
    ".php", ".lua", ".groovy",
}
```

### Step 2: Track `has_code` during file walk (inside the for loop, lines 364-387)

Add before the loop:
```python
has_code = False
```

Inside the loop, after the extension filter passes (after line 374):
```python
if file.suffix in _CODE_EXTENSIONS:
    has_code = True
```
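A minimal standalone sketch of the flag-tracking pattern, for illustration only. `scan_for_code`, `CODE_EXTENSIONS`, and `ALLOWED_EXTENSIONS` are hypothetical stand-ins for the real loop and filters in `wiki_backend.py`, and lowercasing the suffix is an assumption rather than existing behaviour:

```python
from pathlib import Path

# Subset of the issue's _CODE_EXTENSIONS, for illustration only.
CODE_EXTENSIONS = {".py", ".ts", ".go"}
# Hypothetical allow-list standing in for the real extension filter.
ALLOWED_EXTENSIONS = CODE_EXTENSIONS | {".md", ".yaml"}


def scan_for_code(paths):
    """Return (kept_paths, has_code) for an iterable of path-like values."""
    kept = []
    has_code = False
    for raw in sorted(paths):
        file = Path(raw)
        suffix = file.suffix.lower()  # assumption: treat ".PY" like ".py"
        if suffix not in ALLOWED_EXTENSIONS:
            continue  # the extension filter runs first
        kept.append(file)
        if suffix in CODE_EXTENSIONS:
            has_code = True  # set once, never reset
    return kept, has_code
```

Because the check sits after the extension filter, a code file that the caller's filter excludes never sets the flag, so Graphify is not invoked for folders whose code is filtered out.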

### Step 3: Insert Graphify extraction block (after line 389, before line 391)

```python
    # ── Graphify auto-extraction (zero-cost entity discovery) ────────
    graphify_concepts: list[str] = []
    graphify_entities: list[str] = []

    if has_code:
        try:
            from .code_graph import (
                graphify_available,
                extract_code_graph,
                graph_to_wiki_terms,
                save_graph_json,
            )

            if graphify_available():
                graph_data = extract_code_graph(
                    folder_path,
                    # Narrow to code extensions only when the caller overrode the defaults.
                    extensions=(
                        (allowed_exts & _CODE_EXTENSIONS)
                        if allowed_exts != _DEFAULT_EXTENSIONS
                        else None
                    ),
                    skip_dirs=_SKIP_DIRS,
                )
                if graph_data["stats"]["nodes"] > 0:
                    wiki_terms = graph_to_wiki_terms(graph_data)
                    graphify_entities = wiki_terms["entities"]
                    graphify_concepts = wiki_terms["concepts"]

                    # Persist graph alongside raw content
                    # Reuse `slug` if it is already defined in this function; otherwise derive one.
                    _slug_name = slug if 'slug' in locals() else _slug(title or folder_path.name)
                    save_graph_json(wiki_root, _slug_name, graph_data)
        except Exception:
            pass  # Graphify failure must never break ingestion
```

### Step 4: Merge terms before wiki_ingest() calls

Add helper function:
```python
def _merge_unique(base: list[str], extra: list[str]) -> list[str]:
    """Merge two lists preserving order, removing duplicates (case-insensitive)."""
    seen = {x.lower() for x in base}
    result = list(base)
    for item in extra:
        if item.lower() not in seen:
            seen.add(item.lower())
            result.append(item)
    return result
```

Before both `wiki_ingest()` calls (lines 421 and 432), merge:
```python
    merged_concepts = _merge_unique(concepts or [], graphify_concepts)
    merged_entities = _merge_unique(entities or [], graphify_entities)
```

Then pass `concepts=merged_concepts, entities=merged_entities` instead of `concepts=concepts, entities=entities`.

### Step 5: Add graph.json reference to source page (optional enhancement)

In `wiki_ingest()`, if a graph.json was saved, add its path to the `sources` list in the source page frontmatter. This is optional — the graph.json is discoverable by convention (`raw/<slug>-graph.json`).
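One way the convention could be resolved when building the `sources` list, sketched with hypothetical helpers (`graph_json_path`, `graph_source_entry`). It assumes `raw/` sits directly under the wiki root, which this issue implies but does not state:

```python
from pathlib import Path
from typing import Optional


def graph_json_path(wiki_root, slug: str) -> Path:
    """Path of the saved graph under the raw/<slug>-graph.json convention."""
    return Path(wiki_root) / "raw" / f"{slug}-graph.json"


def graph_source_entry(wiki_root, slug: str) -> Optional[str]:
    """Return a wiki-relative path for frontmatter, or None if no graph was saved."""
    path = graph_json_path(wiki_root, slug)
    if not path.is_file():
        return None
    return path.relative_to(wiki_root).as_posix()
```

Returning `None` when the file is absent lets the frontmatter code append the entry only when Graphify actually produced a graph.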

## Insertion Points (approximate line references)

| What | Where | Line |
|---|---|---|
| `_CODE_EXTENSIONS` constant | After `_DEFAULT_EXTENSIONS` | ~306 |
| `has_code = False` | Before `for file in sorted(...)` | ~363 |
| `has_code = True` | Inside loop, after extension check | ~375 |
| Graphify extraction block | After `raw_content = "".join(parts)` | ~390 |
| `_merge_unique()` helper | Before `wiki_ingest_folder()` or as module-level | ~337 |
| Merged args to `wiki_ingest()` (chunked) | Replace `concepts=concepts` | ~428 |
| Merged args to `wiki_ingest()` (single) | Replace `concepts=concepts` | ~439 |

## Edge Cases

1. **Folder with no code files** (only .md/.yaml): `has_code` stays False, Graphify block skipped entirely. Zero overhead.
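2. **Graphify not installed**: `graphify_available()` returns False and the block is skipped. The cost is one availability check per call. Python does not cache failed imports in `sys.modules`, so `graphify_available()` should memoize its result if it probes by importing.
3. **Graphify extraction returns empty**: the `graph_data["stats"]["nodes"] > 0` check skips term mapping. Falls through to original behavior.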
4. **Graphify crashes**: `except Exception: pass` catches everything. Ingestion continues without entity discovery.
5. **Caller provides entities AND Graphify discovers more**: `_merge_unique()` combines both, deduplicating case-insensitively. Caller's entities come first (higher priority).
6. **Very large folder (1000+ files)**: `extract()` may take 10-30s. This is acceptable for a one-time ingest. Tree-sitter is O(n) in file size.
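
Edge case 2 depends on the availability check being cheap on repeated calls. A minimal sketch of a memoized probe follows. `module_available` is a hypothetical name, and the real `graphify_available()` from #7 may be implemented differently:

```python
import importlib.util
from functools import lru_cache


@lru_cache(maxsize=None)
def module_available(name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(name) is not None
```

`find_spec` locates the module without executing it, and `lru_cache` ensures the lookup runs once per name for the lifetime of the process. The trade-off is that a package installed while the process is running is not picked up until restart.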

## Testing

See #11 for test specifications.

## Dependencies

- #6 (optional dependency)
- #7 (code_graph.py module)
