fix: consolidate _default_db_path to utils.py — closes #40#71
Conversation
The default-db-path logic was triplicated across:
- scripts/commands/graph_schema.py:29-31 (_default_db_path)
- scripts/graph_model.py:161-163 (_default_db_path)
- scripts/persistent_registry.py:132-133 (__init__ os.path.join)
- scripts/persistent_registry.py:858 (db_exists os.path.join)
All four sites returned os.path.join(workspace, '.codelens', 'codelens.db')
verbatim. Any future change (e.g. honoring CODELENS_DB_PATH env var, moving
the db out of .codelens/) had to be made in four places — easy to miss one
and create a silent bug where graph-schema and graph_model queries hit
different databases.
This refactor consolidates to a single source of truth:
utils.default_db_path(workspace: str) -> str, with a docstring.
Changes per file:
- scripts/utils.py: add default_db_path helper (single source of truth).
- scripts/commands/graph_schema.py: delete local _default_db_path; import
default_db_path from utils; update call site.
- scripts/graph_model.py: replace local _default_db_path definition with
import + backward-compat alias
(preserves the underscore-private symbol used by existing tests and
test_hybrid_type_resolver.py); internal call sites now use the public
name.
- scripts/persistent_registry.py: __init__ and db_exists() now call
default_db_path(workspace). DB_FILENAME constant preserved as a
module-level export (no longer used internally; comment notes it's
kept for backwards-compat with external importers).
- tests/test_graph_model.py: add TestDefaultDbPathConsistency test that
asserts all three call sites resolve to the same path for the same
workspace (and graph_schema imports the same callable, not a
re-definition).
Pure refactor — no behavior change. The default path is still
<workspace>/.codelens/codelens.db.
Verification:
- python3 -m pytest tests/test_graph_model.py -v → 21/21 pass
(20 existing + 1 new TestDefaultDbPathConsistency)
- python3 -m pytest tests/test_graph_model.py tests/test_graph_incremental.py
tests/test_hybrid_type_resolver.py tests/test_persistent_registry.py
tests/test_persistent_registry_extra.py tests/test_registry.py
→ 129/129 pass (no regressions in any storage-related test)
- python3 -m pytest tests/test_cli.py → 15/15 pass
- python3 -c "from utils import default_db_path; print(default_db_path('/tmp/test'))"
→ /tmp/test/.codelens/codelens.db
- python3 scripts/codelens.py graph-schema benchmarks/fixtures/clean_app
→ runs clean (zero-shaped before scan; 31 nodes/134 edges/6 indexes
after scan — same as before refactor)
- All modified files byte-compile with py_compile (no syntax errors)
- All lines I added respect the 120-char limit (PEP 8 + repo convention)
Out of scope (not fixed):
- tests/test_vuln_staleness.py::test_vulnerable_app_has_stale_cache_info
fails on both main and this branch — pre-existing date-based flake
('VULN_DB is 485 days old'), unrelated to this refactor.
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
Review - APPROVE with minor nitReviewed full diff (5 files, +115/-17). Clean, well-executed refactor. Strengths
Minor nit (non-blocking)The test hardcodes CI noteThe VerdictMerging. The nit can be addressed in a follow-up PR if desired. Reviewed by BOS (orchestrate-workers skill) - diff read in full before approval. |
|



Summary
Consolidates the default-db-path logic from 4 duplicated sites into a single helper in
scripts/utils.py. Pure refactor — no behavior change. The default path is still<workspace>/.codelens/codelens.db.Before this PR, the same one-liner
os.path.join(workspace, ".codelens", "codelens.db")lived in four places:scripts/commands/graph_schema.py_default_db_path()functionscripts/graph_model.py_default_db_path()function (identical copy)scripts/persistent_registry.pyos.path.joininPersistentRegistry.__init__scripts/persistent_registry.pyos.path.joinindb_exists()Any future change to the default path — e.g., honoring a
CODELENS_DB_PATHenv var, or moving the db out of.codelens/— had to be made in four places. Forgetting one would silently create a bug wheregraph-schemaandgraph_modelqueries hit different databases.Why
Closes #40.
Single source of truth. The issue itself flags the drift risk: "Any future change to the default path ... must be made in both places. Forgetting one creates a silent bug where
graph-schemaandgraph_modelqueries hit different databases."This PR extracts the helper to
utils.default_db_path(workspace: str) -> strand replaces all four call sites with imports / calls to it. Future changes to the path logic now happen in exactly one place.Files changed
scripts/utils.pydefault_db_path(workspace: str) -> strhelper with full docstring (single source of truth).scripts/commands/graph_schema.py_default_db_pathdef; addedfrom utils import default_db_path; updated the one call site inget_graph_schema.scripts/graph_model.py_default_db_pathdef withfrom utils import default_db_path+ a backward-compat alias_default_db_path = default_db_path(preserves the underscore-private symbol imported bytests/test_graph_model.pyandtests/test_hybrid_type_resolver.py— both outside the file allowlist for this PR, so removing the symbol would have broken them). Internal call sites now use the public name.scripts/persistent_registry.py__init__anddb_exists()now calldefault_db_path(workspace).DB_FILENAME = "codelens.db"constant preserved as a module-level export with a comment noting it's kept for backwards-compat with external importers (no longer used internally).tests/test_graph_model.pyTestDefaultDbPathConsistency.test_default_db_path_consistency— asserts all three call sites (utils, graph_model alias, graph_schema import) resolve to the same path for the same workspace, and thatgraph_schema.default_db_path is utils.default_db_path(same callable, not a re-definition). Also coversPersistentRegistrydefault anddb_existsreturningFalsefor a non-existent workspace.No other stacks or files touched. Total: 5 files, +115 / −17.
Architectural decisions
Why keep
graph_model._default_db_pathas an alias instead of deleting it. The constraint "Hanya touch file: ... tests/test_graph_model.py" plus scope-discipline forbids modifyingtests/test_hybrid_type_resolver.py, which imports_default_db_pathfromgraph_modelin 9 places. Deleting the symbol would break those imports and force touching a file outside the allow-list. The alias_default_db_path = default_db_pathpreserves the convention-based API while makingutils.default_db_paththe single source of truth for the path logic. The new test asserts the alias is the same callable, so any future drift (e.g. someone re-defining_default_db_pathlocally) will be caught.Why also update
db_exists()in persistent_registry.py (not just__init__). The DoD explicitly calls out__init__, butdb_exists()at L858 is a second path-construction site in the same file using the sameos.path.join(workspace, ".codelens", DB_FILENAME)pattern. Leaving it untouched would defeat the single-source-of-truth goal (the issue text says "three sources of truth actually"). It's the same file already in scope, so updating it keeps the refactor consistent. Noted explicitly here for reviewer awareness.Why preserve
DB_FILENAME. The constraint says "PreserveDB_FILENAME = "codelens.db"constant in persistent_registry.py if ada code lain yang pakai". A repo-wide grep shows it's no longer referenced anywhere internally after this refactor, but it's a module-level public symbol that external plugins/importers could depend on. Keeping it (with a comment) is the conservative choice; deleting it would be a separate breaking-API decision for the BOS to make.Verification
1. Functional test — new
TestDefaultDbPathConsistencyproves the consolidation worksThe test asserts:
graph_model._default_db_path(ws) == utils.default_db_path(ws)(alias works)graph_schema.default_db_path is utils.default_db_path(same callable, not a re-definition)graph_schema.default_db_path(ws) == utils.default_db_path(ws)(resolves to same path)PersistentRegistry(ws).db_path == utils.default_db_path(ws)(registry uses the helper)db_exists(ws) is Falsefor a non-existent workspace (helper doesn't crash).codelens/codelens.db(sanity)2. All existing tests still pass — no regressions in any storage-related test
(20 existing + 1 new = 21.)
Also ran broader engine + parser suites (architecture, circular, complexity, dataflow, deadcode, history, smell, search, secrets, semantic, perfhint, codelens, css/html/js/rust parsers, hybrid engine, vuln_staleness): 393 passed, 1 pre-existing failure (see "Out of scope" below), 12 skipped.
3. DoD #8 —
python3 -c "from utils import default_db_path; print(default_db_path('/tmp/test'))"4. DoD #7 —
python3 scripts/codelens.py graph-schemaworks on test fixturesgraph-schemaworks both pre-scan (zero-shaped) and post-scan (31 nodes / 134 edges / 6 indexes — same as before refactor). The.codelens/artifacts were cleaned up after the test so they don't pollute the PR.5. All modules import cleanly + alias is the same callable
6. PEP 8 / line length
All lines I added respect the 120-character limit (verified by a per-line check on the diff). The 5 pre-existing 120+ char lines in
scripts/utils.py(L354-362, incheck_gitignore_healthrecommendations) are untouched by this PR — out of scope per scope-discipline.7. Byte-compile check
Out of scope (not fixed)
tests/test_vuln_staleness.py::test_vulnerable_app_has_stale_cache_infofails on bothmainand this branch — pre-existing date-based flake (WARNING: Built-in VULN_DB is 485 days old (last updated 2025-02-28)). Verified by checking outmainand re-running the same test: same failure, same warning. Unrelated to this refactor.Issue link
Closes #40: #40