Security
- SSRF guard thread-safety: Replaced the global socket.getaddrinfo monkey-patch with per-connection _SSRFGuardedHTTPConnection/_SSRFGuardedHTTPSConnection subclasses. DNS is resolved once, the IP is validated, and the connection is made to that exact address. This closes the concurrent-thread race and the underlying TOCTOU gap.
- Prompt injection mitigation: Untrusted source files are now wrapped in <untrusted_source path="..." sha256="..."> XML delimiters during LLM extraction. Jailbreak sentinel tokens are neutralised and the system prompt is hardened.
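The resolve-once, validate, connect-to-that-IP flow described for the SSRF guard can be sketched roughly as follows. This is an illustration only, not Graphify's actual code: the class name GuardedHTTPConnection and the helper is_public_ip are hypothetical, and the sketch assumes that "validated" means "globally routable" and ignores proxies and tunnels.

```python
import http.client
import ipaddress
import socket


def is_public_ip(ip: str) -> bool:
    """Return True only for globally routable addresses (assumed policy)."""
    return ipaddress.ip_address(ip).is_global


class GuardedHTTPConnection(http.client.HTTPConnection):
    """Hypothetical sketch: resolve once, validate, connect to that exact IP."""

    def connect(self) -> None:
        # Resolve the hostname a single time for this connection.
        infos = socket.getaddrinfo(self.host, self.port, type=socket.SOCK_STREAM)
        ip = infos[0][4][0]
        if not is_public_ip(ip):
            raise ConnectionError(f"blocked non-public address {ip} for {self.host}")
        # Connect to the validated IP itself, so the name is never resolved again.
        self.sock = socket.create_connection((ip, self.port), self.timeout)
```

Because the check and the connection both use the same resolved address, a second DNS answer cannot slip in between them, and no process-wide state is patched. An HTTPS variant would additionally need to wrap the socket in TLS while still presenting the original hostname for certificate verification.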
Fixes
- Obsidian/Canvas export crash: Fixed a KeyError raised when a community contains a node absent from the graph. Dangling members are now skipped gracefully.
- macOS NFC/NFD re-extraction loop: Office sidecar filenames are now NFC-normalised before hashing, fixing the bug where --update re-extracted all Office files on every run on HFS+/APFS.
- Data JSON orphan nodes: The JSON extractor now only processes config/manifest JSON (package.json, tsconfig.json, etc.). Data JSON files (arrays, generic key/value) are skipped by the AST pass, which eliminates 561+ orphan nodes on real repos.
- OpenAI reasoning models temperature error: temperature=0 is now auto-omitted for the o1/o3/o4/gpt-5 series. Override with GRAPHIFY_LLM_TEMPERATURE.
- Corporate Windows / EDR hang: datasketch and scipy have been removed and replaced with a self-contained pure-numpy MinHash/MinHashLSH implementation. This eliminates the numpy.testing → platform.machine() subprocess spawn at import time that EDR software (CrowdStrike, SentinelOne) was intercepting.
- Dedup merges distinct same-named symbols: A Config class in app.py and a Config class in db.py are no longer collapsed. Code nodes are now deduplicated by ID only.
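To illustrate the NFC/NFD fix listed above: macOS can hand back the same filename in decomposed (NFD) form, so hashing the raw string yields a different key than the precomposed (NFC) spelling. The sketch below assumes the filename string itself feeds the hash; sidecar_key is a hypothetical helper, not a Graphify function.

```python
import hashlib
import unicodedata


def sidecar_key(filename: str) -> str:
    """Hash a filename after NFC normalisation so both spellings agree."""
    normalised = unicodedata.normalize("NFC", filename)
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


nfc_name = "r\u00e9sum\u00e9.docx"    # precomposed "é"
nfd_name = "re\u0301sume\u0301.docx"  # "e" followed by a combining acute accent

# The raw strings differ, but their normalised hashes match.
assert nfc_name != nfd_name
assert sidecar_key(nfc_name) == sidecar_key(nfd_name)
```

Without the normalisation step the two spellings would hash differently, which is consistent with the re-extraction on every run that the fix describes.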
Performance
- detect() 34% faster on large repos: Ignore-pattern checks are memoized per scan. Each ancestor directory is evaluated once across all sibling files, eliminating ~42M redundant fnmatch calls on 2k+ file corpora.
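The per-scan memoization described above might look something like this minimal sketch. It assumes patterns are matched against individual path components with fnmatch; make_ignore_checker and its cache layout are hypothetical and are not taken from Graphify's detect().

```python
import fnmatch
from pathlib import PurePosixPath


def make_ignore_checker(patterns: list[str]):
    """Return is_ignored(path); each directory is evaluated at most once."""
    cache: dict[PurePosixPath, bool] = {}  # lives for one scan only

    def dir_ignored(directory: PurePosixPath) -> bool:
        if directory in cache:
            return cache[directory]
        if directory == directory.parent:  # reached "." or "/"
            result = False
        else:
            # A directory is ignored if any ancestor is, or if its own name matches.
            result = dir_ignored(directory.parent) or any(
                fnmatch.fnmatch(directory.name, pat) for pat in patterns
            )
        cache[directory] = result
        return result

    def is_ignored(path: str) -> bool:
        p = PurePosixPath(path)
        return dir_ignored(p.parent) or any(
            fnmatch.fnmatch(p.name, pat) for pat in patterns
        )

    return is_ignored


is_ignored = make_ignore_checker(["node_modules", "*.pyc"])
```

With this shape, sibling files under the same directory share one cached verdict for every ancestor, so only the file's own name is matched again on each call.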
Features
- GRAPHIFY_MAX_GRAPH_BYTES: Override the 512 MiB graph.json size cap (e.g. 700MB, 2GB). The cap error message now cites this env var. graphify export html auto-falls back to community-aggregation view when over cap.
- Stronger CLAUDE.md instructions: "MANDATORY: Before using Read/Grep/Glob/Bash you MUST run graphify first". The instructions also explicitly say to forward the rule to every subagent prompt.
- GRAPHIFY_LLM_TEMPERATURE: Override the LLM temperature for any backend (set it to none to omit the parameter entirely).
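One way to picture how GRAPHIFY_LLM_TEMPERATURE interacts with the reasoning-model default is the sketch below. It is an assumption-laden illustration: completion_kwargs is a hypothetical helper, and detecting reasoning models by name prefix is a guess, not Graphify's documented behaviour.

```python
import os

# Assumed detection rule: model names starting with these prefixes.
REASONING_PREFIXES = ("o1", "o3", "o4", "gpt-5")


def completion_kwargs(model: str, env=os.environ) -> dict:
    """Build request kwargs, deciding whether to send a temperature at all."""
    kwargs = {"model": model}
    override = env.get("GRAPHIFY_LLM_TEMPERATURE")
    if override is not None:
        # An explicit override wins; "none" means send no temperature.
        if override.strip().lower() != "none":
            kwargs["temperature"] = float(override)
    elif not model.startswith(REASONING_PREFIXES):
        # Default: temperature=0, except for reasoning models that reject it.
        kwargs["temperature"] = 0
    return kwargs
```

Under these assumptions, a reasoning model with no override gets no temperature key, while any backend can be forced either way through the environment variable.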
CI
- Self-graph release asset: Every release now ships graphify-self-graph.tar.gz, containing graph.json, graph.html, and GRAPH_REPORT.md from Graphify analysing its own source. Download it and open graph.html locally to see what Graphify produces; no install is required.
Upgrade
uv tool upgrade graphifyy
# or
pip install --upgrade graphifyy