Releases: ClayGendron/vfs
v0.0.22
What's Changed
Added
- `PostgresFileSystem` (story 003): native Postgres backend with pgvector-backed embeddings, native lexical (`tsvector` + `plainto_tsquery`) and semantic search, regex pushdown, and explicit schema verification. Respects the model-declared pgvector operator class (`vector_cosine_ops`, `vector_ip_ops`, `vector_l2_ops`) for both distance operator selection (`<=>`, `<#>`, `<->`) and score normalization.
- Native Postgres pattern-search contract: partial trigram GIN + B-tree `text_pattern_ops` indexes on `vfs_objects.path` and `.content`; `verify_native_search_schema` fails fast if the required artifacts are missing.
- Native Postgres `meeting_subgraph` (story 004): server-side PL/pgSQL Steiner-tree traversal installed via `install_native_graph_schema()` and verified at startup.
- Native MSSQL `meeting_subgraph`: T-SQL stored procedure with a TVP (`GroverSeedList`) seed-list input; the override routes through a raw `aioodbc` cursor since SQLAlchemy `bindparam` can't plumb TVPs.
- Native MSSQL pushdown for `predecessors`, `successors`, `ancestors`, `descendants`, `neighborhood`: previously delegated to the in-memory rustworkx path. Seed/exclusion sets bind as JSON strings unpacked via `OPENJSON` with an explicit `NVARCHAR(450)` schema; multi-hop traversals drive a Python BFS loop over single-hop SQL queries to avoid MSSQL's recursive-CTE `UNION ALL` cycle hazard.
- `/.vfs` sidecar namespace (story 002): canonical `/.vfs/<path>/__meta__/...` layout for edge storage and metadata. Replaces the prior child-of-file metadata namespace; `mkconn` is removed.
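
The multi-hop strategy described for the MSSQL graph pushdown (a Python BFS loop over single-hop SQL queries) can be sketched roughly as follows. This is a minimal illustration, not the library's code: it uses the standard-library `sqlite3` module as a stand-in for SQL Server, and the `edges(source, target)` table, `build_demo_graph`, and `descendants` are hypothetical names. The point it shows is that a visited set in Python keeps the traversal finite even when the graph contains cycles.

```python
import sqlite3


def build_demo_graph() -> sqlite3.Connection:
    """Create a tiny in-memory edge table containing a cycle (a -> b -> c -> a)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (source TEXT, target TEXT)")
    conn.executemany(
        "INSERT INTO edges VALUES (?, ?)",
        [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")],
    )
    return conn


def descendants(conn, seeds, max_depth=None):
    """Breadth-first traversal where each hop is one plain (non-recursive) query.

    Seeds are excluded from the result. Nodes already seen are never
    re-expanded, so cycles cannot make the loop run forever.
    """
    visited = set(seeds)
    frontier = set(seeds)
    found = set()
    depth = 0
    while frontier and (max_depth is None or depth < max_depth):
        placeholders = ",".join("?" * len(frontier))
        rows = conn.execute(
            f"SELECT DISTINCT target FROM edges WHERE source IN ({placeholders})",
            sorted(frontier),
        ).fetchall()
        next_frontier = {target for (target,) in rows} - visited
        found |= next_frontier
        visited |= next_frontier
        frontier = next_frontier
        depth += 1
    return found
```

Each iteration issues exactly one single-hop query, so the number of round trips is bounded by the traversal depth rather than by the number of nodes.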
Changed
- Result schema unified: the `Candidate`/`Detail` chain is collapsed into a flat `Entry` row with operation metadata hoisted onto the envelope. Backends and the query executor now reason in terms of column projection instead of shape dispatch.
- Constitution rewritten around four primitives (Namespace, Entry, Revision, Operation) with RFC 2119 precedence language.
- `in_degree`/`out_degree` dropped from `VFSObjectBase`, `Entry` projection, and result defaults: graph degrees are no longer part of the entry payload.
- `path`/`parent_path` max length restored to 4096 so `SQLModel.metadata.create_all` targets SQL Server's `VARCHAR(8000)` ceiling; deepest sidecar paths (`/.vfs/<path>/__meta__/versions`) now have headroom.
- `MSSQLFileSystem._grep_impl`: WHERE body extracted so the two FROM / ORDER BY branches no longer duplicate the filter body.
- `context/` workspace: `research/` migrated into `context/standards/`, `context/stories/`, `context/learnings/`; legacy docs and demo DB fixtures pruned.
Fixed
- `VectorType.process_result_value`: reject `str`/`bytes` before the iterable fallback so the expected `ValueError("expected iterable pgvector value")` is raised instead of a downstream per-character float parse error.
- CI coverage restored above the 99% gate with focused tests for Postgres inverse-edge projections, path/vector helper error handling, model/result rendering, and routing/parser/base/permission edge cases.
- CI lint / format / type-check regressions from the native search backend work: touched files now pass `ruff check`, `ruff format --check`, and `ty check src/`.
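
The `VectorType.process_result_value` fix hinges on ordering: strings and bytes are themselves iterable, so they have to be rejected before any generic iterable handling runs. A minimal sketch of that ordering is below; `coerce_vector` is a hypothetical standalone function, and passing `None` through unchanged is an assumption rather than something the release notes state.

```python
from collections.abc import Iterable


def coerce_vector(value):
    """Convert a driver-returned vector value into a list of floats."""
    if value is None:
        # Assumption: NULL columns pass through unchanged.
        return None
    # str and bytes are iterable, so check them first. Otherwise "[1,2]"
    # would be iterated character by character and fail on float("[").
    if isinstance(value, (str, bytes)):
        raise ValueError("expected iterable pgvector value")
    if isinstance(value, Iterable):
        return [float(component) for component in value]
    raise ValueError("expected iterable pgvector value")
```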
Full Changelog: v0.0.21...v0.0.22
v0.0.21
What's Changed
Changed
- Python package renamed from `grover` to `vfs`: imports change from `from grover import ...` to `from vfs import ...`. Matches the `vfs-py` PyPI distribution name. This is a breaking change with no compatibility shim.
- Class identifiers renamed to a `VFS*` / `VirtualFileSystem` scheme:
  - `GroverFileSystem` → `VirtualFileSystem` (in `vfs.base`)
  - `Grover` → `VFSClient` (sync facade)
  - `GroverAsync` → `VFSClientAsync` (async router)
  - `GroverResult` → `VFSResult`
  - `GroverObject` / `GroverObjectBase` → `VFSObject` / `VFSObjectBase`
  - `GroverError` → `VFSError`
- DB table renamed `grover_objects` → `vfs_objects` (and the `ix_grover_objects_ext_kind` index → `ix_vfs_objects_ext_kind`). No migration script is shipped; existing databases need their tables recreated by the consumer.
- `scripts/bump_version.py`: updated to point at `src/vfs/__init__.py`.
- `README.md`: examples, install commands, badges, and class names updated for the new package and identifiers.
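
Because no compatibility shim ships, consumer code has to be updated by hand or by script. The following is a rough, hypothetical helper (not part of the project) that applies the rename table above to a string of source text using whole-word matching. It is deliberately naive: it also rewrites matches inside comments and string literals, and it does not adjust import locations such as `VirtualFileSystem` living in `vfs.base`, so treat its output as a starting point for review.

```python
import re

# Old name -> new name, taken from the rename list in these release notes.
RENAMES = {
    "GroverFileSystem": "VirtualFileSystem",
    "GroverObjectBase": "VFSObjectBase",
    "GroverObject": "VFSObject",
    "GroverResult": "VFSResult",
    "GroverAsync": "VFSClientAsync",
    "GroverError": "VFSError",
    "Grover": "VFSClient",
    "grover_objects": "vfs_objects",
    "grover": "vfs",
}

# Whole-word alternation; longest names first so the intent is explicit.
_NAMES = sorted(RENAMES, key=len, reverse=True)
_PATTERN = re.compile(r"\b(" + "|".join(re.escape(name) for name in _NAMES) + r")\b")


def migrate_source(text: str) -> str:
    """Return text with every whole-word old identifier replaced."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], text)
```

For example, `migrate_source("from grover import Grover, GroverAsync")` yields `"from vfs import VFSClient, VFSClientAsync"`.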
Full Changelog: v0.0.20...v0.0.21
v0.0.20
What's Changed
Changed
- PyPI package renamed from `grover` to `vfs-py`: install with `pip install vfs-py`. Python imports are unchanged (`import grover`). Versions `0.0.18` and earlier remain available on PyPI under the `grover` name; new releases publish to `vfs-py`.
Full Changelog: v0.0.18...v0.0.20
v0.0.18
What's Changed
Changed
- DB read paths now hydrate `Candidate.content`: `ls`, `glob`, `delete`, `move`, `tree`, and `lexical_search` always populate `content` on the candidates they emit. The underlying `select(self._model)` was already pulling the column over the wire; the `include_content=False` default on `to_candidate` was discarding it during projection. Removed the parameter entirely so every read-path projection returns content, eliminating the redundant follow-up `read(...)` round trip when a downstream stage needs the content.
- MSSQL `_grep_impl`, `_glob_impl`, and `_lexical_search_impl` delegate to the base class when `candidates` is supplied: once the candidate set has been transferred to Python (and now carries content), there is nothing left for SQL Server to do. The base class runs the regex via `_collect_line_matches` and BM25 via `BM25Scorer` over the in-memory content with zero round trips. Full-tree pushdowns (`CONTAINSTABLE`, `REGEXP_LIKE`) are unchanged for the no-candidates path.
- MSSQL `_glob_impl` no-candidates branch: `SELECT path, kind, content` instead of `SELECT path, kind`, so glob results carry content directly out of the pushdown.
- MSSQL `_lexical_search_impl` no-candidates branch: follows the `CONTAINSTABLE` pushdown with one small batched `SELECT path, content WHERE path IN (top_k)` so the top-k results return hydrated. `k` is bounded (default 15), so the second round trip is tiny.
- Base `_lexical_search_impl`: threads content through `_LexicalDoc` into the result candidates (previously dropped during result construction).
- Base `_glob_impl` upstream-candidates branch: preserves prior content and metrics via `Candidate.model_copy(...)` instead of constructing a fresh `Candidate` that drops them.
- `_read_impl`: skips the `SELECT` for already-hydrated candidates and only fetches the gaps. Makes `read(candidates=...)` cheap when content is already on the candidates from a prior stage.
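
The gap-filling behavior described for `_read_impl` can be illustrated with a small sketch. Everything here is a simplified assumption: the `Candidate` dataclass is a stand-in for the real model, and `fetch_contents` is a hypothetical callable representing the single batched `SELECT` for the missing paths.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Candidate:
    path: str
    content: Optional[str] = None


def read_candidates(
    candidates: List[Candidate],
    fetch_contents: Callable[[List[str]], Dict[str, str]],
) -> List[Candidate]:
    """Hydrate only candidates lacking content; skip the fetch if none do."""
    missing = [c.path for c in candidates if c.content is None]
    fetched = fetch_contents(missing) if missing else {}
    return [
        c if c.content is not None else replace(c, content=fetched.get(c.path))
        for c in candidates
    ]
```

When an upstream stage such as `glob` has already populated content, `missing` is empty and no query is issued at all.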
Removed
- `include_content` parameter on `GroverObjectBase.to_candidate`: always populates content now. The 5 callers that explicitly passed `include_content=True` (`_write_impl`, `_read_impl`, bulk write) drop the redundant kwarg.
- MSSQL `_grep_with_candidate_chunks` helper: dead after the candidates path delegates to the base class.
Fixed
- `glob | grep` on MSSQL drops from three round trips to one: previously the executor pre-hydrated content via `read(...)`, then MSSQL `_grep_impl` ignored the hydrated content and re-queried with `REGEXP_LIKE` against the same paths. Now glob returns hydrated content directly and grep runs the regex in Python on the in-memory candidates.
Full Changelog: v0.0.17...v0.0.18
v0.0.17
What's Changed
Fixed
- `glob` and `grep` mount-prefix routing: absolute patterns and literal `paths` filters are now stripped of the mount prefix before dispatch to each mount. Previously the non-candidate fanout in `_route_fanout` forwarded the full pattern to every mount, but mounts store paths mount-relative, so `glob('/data/**/*.py')` and `grep` with `paths=('/data/src',)` silently returned empty while `read` on the same path worked. New dedicated `_route_glob_fanout` / `_route_grep_fanout` use exact-rewrite when provable (literal-prefix or single-segment glob consumption against the mount name) and fall back to a `/**` superset query plus router-side authoritative re-filter when the pattern's leading segment is `**`. `globs_not` is never silently dropped: exclusions that cannot be exactly pushed are enforced at the router after rebase. `max_count` is deferred to the router whenever any mount uses the post-filter fallback so it cannot truncate pre-filter candidates. Wildcard mount selectors (`/*/`, `/d?ta/`, `/d[ae]ta/`) and multi-hop chains (`GroverAsync` → router → router → leaf) work correctly. The candidate-input path (`glob`/`grep` with `candidates=`) had the same mismatch and is fixed by filtering at the router with the original absolute pattern before grouping by terminal.
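
The two routing cases in this fix (exact rewrite when the first segment provably consumes the mount name, superset query plus router-side re-filter when the pattern starts with `**`) can be sketched as below. This is an illustrative approximation only: `MountPlan` and `plan_glob_for_mount` are hypothetical names, not the signatures of the library's `rewrite_glob_for_mount` or `GlobMountPlan`, and single-segment matching is delegated to the standard-library `fnmatch` module.

```python
from fnmatch import fnmatchcase
from typing import NamedTuple, Optional


class MountPlan(NamedTuple):
    pattern: str          # mount-relative pattern to send to the mount
    needs_refilter: bool  # True if the router must re-check results


def plan_glob_for_mount(pattern: str, mount: str) -> Optional[MountPlan]:
    """Rebase an absolute glob onto one mount, or return None to skip it."""
    segments = pattern.strip("/").split("/")
    first, rest = segments[0], segments[1:]
    if first == "**":
        # "**" may or may not consume the mount name, so ask the mount for
        # everything and let the router filter with the original pattern.
        return MountPlan("/**", True)
    if not fnmatchcase(mount, first):
        return None  # the first segment cannot match this mount
    # The first segment consumed exactly the mount name: exact rewrite.
    relative = "/" + "/".join(rest) if rest else "/"
    return MountPlan(relative, False)
```

With a mount named `data`, the pattern `/data/**/*.py` becomes the mount-relative `/**/*.py` with no re-filter needed, whereas `/**/*.py` produces the superset plan.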
Added
- `grover.routing` module: pure helpers (`rewrite_glob_for_mount`, `rewrite_path_for_mount`, `first_segment`, `glob_segment_matches`) and plan dataclasses (`GlobMountPlan`, `GrepMountPlan`) backing the new fanout strategy.
Changed
- `MSSQLFileSystem` glob pushdown: `_glob_impl` now structurally decomposes glob patterns via the new `decompose_glob()` helper into a literal path prefix and a trailing `**/*.<ext>` tail, then pushes both into SQL: the prefix becomes a sargable `LIKE` predicate and the ext narrows the `(ext, kind)` composite index seek. When the pattern is fully expressible as `prefix + **/*.<ext>` the authoritative `REGEXP_LIKE` residual is dropped entirely and `kind` is narrowed to `'file'` so the planner picks `ix_grover_objects_ext_kind` instead of relying on `ext IS NULL` for directory exclusion. Caller-supplied `ext` and `paths` remain authoritative (intersected with the decomposed values) so explicit narrowing is never broadened by the optimization. Patterns the decomposer doesn't recognize fall through to the existing regex path with no behavior change.
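
To make the `prefix + **/*.<ext>` decomposition concrete, here is a rough sketch of recognizing that shape and turning the prefix into a `LIKE` operand. The function names and the regular expression are assumptions for illustration; the actual `decompose_glob()` helper may accept more shapes or return a different structure.

```python
import re
from typing import Optional, Tuple

# Literal segments (no glob metacharacters), then exactly "/**/*.<ext>".
_PREFIX_EXT = re.compile(
    r"^(?P<prefix>(?:/[^*?\[\]/]+)*)/\*\*/\*\.(?P<ext>[A-Za-z0-9]+)$"
)


def split_prefix_and_ext(pattern: str) -> Optional[Tuple[str, str]]:
    """Return (literal_prefix, ext), or None if the pattern has another shape."""
    match = _PREFIX_EXT.match(pattern)
    if match is None:
        return None  # caller falls back to the regex path
    return match.group("prefix"), match.group("ext")


def like_for_prefix(prefix: str) -> str:
    """Build a LIKE operand for 'everything under prefix'.

    Escapes LIKE wildcards with a backslash, so the SQL must declare a
    matching ESCAPE character. Bracket wildcards need no handling because
    the prefix regex above already rejects brackets.
    """
    escaped = prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return escaped + "/%"
```

A leading literal prefix keeps the `LIKE` predicate sargable because the wildcard appears only at the end of the operand.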
Full Changelog: v0.0.16...v0.0.17
v0.0.16
What's Changed
Added
- Ripgrep-compatible filter surface on `grep` and `glob`: structural filters (`ext`, positional `paths`, `globs`, output modes `files`/`lines`/`count`, context windows `-A`/`-B`/`-C`, `case_mode`, `max_count`) now push into SQL through a composable clause builder instead of forcing every search through a full-content scan. `DatabaseFileSystem` still issues a single query per grep/glob call; `MSSQLFileSystem` picks between four SQL templates (`CONTAINSTABLE`/Direct × lines/files) and skips content transfer entirely for `-l` (files-only) mode.
- Indexed `ext` column on `grover_objects`: derived from `path` and indexed so `-t py` on a million-row corpus becomes an index seek rather than a table scan. Maintained automatically on write.
- `docs/ai_agent_glob_grep_patterns.md`: reference for agents on rg-equivalent query patterns against Grover.
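
As an illustration of deriving an `ext` value from `path`, the sketch below uses the standard-library `pathlib`. The exact derivation rules are not spelled out in these notes, so the choices here (last suffix only, lowercased, `None` when there is no suffix) are assumptions, and `derive_ext` is a hypothetical name.

```python
from pathlib import PurePosixPath
from typing import Optional


def derive_ext(path: str) -> Optional[str]:
    """Return the final extension without its dot, or None if absent."""
    suffix = PurePosixPath(path).suffix  # e.g. ".py", or "" when absent
    return suffix[1:].lower() or None
```

Under these assumptions `derive_ext("/src/app.py")` gives `"py"`, while a directory-like path such as `"/src"` gives `None`.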
Changed
- `grep`/`glob` kwargs aligned with ripgrep: `case_sensitive` → `case_mode` (`smart`/`insensitive`/`sensitive`), `max_results` → `max_count`. This is a breaking change for callers of the old kwargs.
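
For readers unfamiliar with ripgrep's smart-case convention, the three `case_mode` values can be approximated in Python as shown below. This is a simplified sketch: it treats any uppercase character in the pattern as a request for case sensitivity, and `regex_flags` is a hypothetical helper, not part of the library.

```python
import re


def regex_flags(pattern: str, case_mode: str) -> int:
    """Map a case_mode string to flags for the standard re module."""
    if case_mode == "sensitive":
        return 0
    if case_mode == "insensitive":
        return re.IGNORECASE
    if case_mode == "smart":
        # Smart case: ignore case unless the pattern contains an uppercase
        # character (simplified; escape sequences are not special-cased).
        return 0 if any(ch.isupper() for ch in pattern) else re.IGNORECASE
    raise ValueError(f"unknown case_mode: {case_mode!r}")
```

So under `smart`, searching for `todo` would also match `TODO`, while searching for `Todo` would match only that exact casing.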
Full Changelog: v0.0.15...v0.0.16
v0.0.15
What's Changed
Fixed
- `MSSQLFileSystem` schema resolution for raw `text()` SQL: raw `text()` queries in `verify_fulltext_schema`, `_lexical_search_impl`, `_grep_impl`, `_grep_with_candidate_chunks`, and `_glob_impl` used `self._model.__tablename__` directly, bypassing SQLAlchemy's `schema_translate_map` (which only applies when compiling `Table` references). Mounts pointing at a non-default schema hit `Invalid object name 'grover_objects'` on every search call. Fixed by adding a `schema` kwarg to `GroverFileSystem` that stores `self._schema` and applies `schema_translate_map={None: schema}` to every session via `_use_session()` so ORM queries continue to resolve correctly, plus a `_resolve_table()` helper on `MSSQLFileSystem` that qualifies the bare `__tablename__` with `self._schema` for raw SQL. Works uniformly across `engine=` and `session_factory=` construction and supports multiple filesystems sharing one factory with different schemas (per-session connection options). Closes #3.
- `verify_fulltext_schema` DDL hint key column: the suggested `CREATE UNIQUE NONCLUSTERED INDEX` referenced `(path)`, but `path` is `max_length=4096` and exceeds SQL Server's 900-byte index key limit, so the DDL would always fail. The Full-Text `KEY INDEX` now targets `(id)`, the 36-character UUID primary key.
Added
- `schema` kwarg on `GroverFileSystem`: optional, forwarded through `DatabaseFileSystem.__init__` and `MSSQLFileSystem.__init__`. When set, `_use_session()` applies `schema_translate_map={None: schema}` per session so ORM queries resolve unqualified tables, and `MSSQLFileSystem` raw queries qualify the table name with it.
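
Since raw SQL strings are not rewritten by `schema_translate_map`, the table name has to be qualified by hand before it is interpolated. The sketch below shows one way such a helper might look; it is not the library's `_resolve_table()`. The function name and the decision to bracket-quote identifiers (doubling any `]`, per T-SQL quoting rules) are assumptions.

```python
from typing import Optional


def resolve_table(tablename: str, schema: Optional[str] = None) -> str:
    """Return a T-SQL table reference, schema-qualified when a schema is set."""

    def quote(identifier: str) -> str:
        # T-SQL bracket quoting: a literal "]" is escaped by doubling it.
        return "[" + identifier.replace("]", "]]") + "]"

    if schema:
        return f"{quote(schema)}.{quote(tablename)}"
    return quote(tablename)
```

A raw query would then be assembled along the lines of `f"SELECT path FROM {resolve_table('grover_objects', 'tenant_a')}"`, while ORM queries keep relying on the per-session translate map.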
Full Changelog: v0.0.14...v0.0.15
v0.0.14
What's Changed
Added
- `MSSQLFileSystem` (alpha): SQL Server / Azure SQL backend with full-text search and native regex pushdown. Subclass of `DatabaseFileSystem` that overrides `_lexical_search_impl`, `_grep_impl`, and `_glob_impl` to push work into SQL Server 2025+ via `CONTAINSTABLE` and `REGEXP_LIKE`. CRUD, versions, chunks, connections, graph, and vector search are inherited unchanged. Includes a `verify_fulltext_schema()` startup check, a dialect parameter budget of 2000, and a Docker dev environment (SQL Server 2025 + Full-Text Search + ODBC Driver 18) with `mssql_up.sh` / `mssql_down.sh` / `mssql_test.sh` helpers. Install via `grover[mssql]` (requires `aioodbc>=0.5` and `pyodbc>=5.0`). Operators must provision the Full-Text catalog and index outside the application. Integration tests gated on `pytest --mssql`; helpers run unconditionally in CI. `src/grover/backends/mssql.py` is excluded from the coverage gate until a SQL Server 2025 service container is wired into CI.
- Mount-level permissions: `read` / `read_write` flag on `add_mount()` for coarse-grained access control. Read-only mounts reject all write operations at the facade boundary.
- Directory-level permissions via `PermissionMap`: fine-grained per-directory permission rules layered on top of mount permissions. Routing checks both mount and directory permissions before dispatching to the backend.
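
The layering of directory rules on top of a mount-level flag can be pictured with a small sketch. The semantics below are assumptions made for illustration (the most specific matching directory rule applies, and the result can never exceed the mount's own permission); `Permission`, `effective_permission`, and the plain-dict rule format are hypothetical and do not reflect the real `PermissionMap` API.

```python
from enum import IntEnum
from typing import Mapping


class Permission(IntEnum):
    READ = 1
    READ_WRITE = 2


def effective_permission(
    mount_perm: Permission,
    dir_rules: Mapping[str, Permission],
    path: str,
) -> Permission:
    """Combine the mount flag with the longest matching directory rule."""
    best_len = -1
    dir_perm = mount_perm  # no matching rule: the mount permission stands
    for directory, perm in dir_rules.items():
        root = directory.rstrip("/")
        matches = path == root or path.startswith(root + "/")
        if matches and len(root) > best_len:
            best_len = len(root)
            dir_perm = perm
    # A directory rule may restrict, but never widen, the mount permission.
    return min(mount_perm, dir_perm)
```

A write would then be allowed only when `effective_permission(...)` equals `Permission.READ_WRITE`.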
Full Changelog: v0.0.13...v0.0.14
v0.0.13
What's Changed
Added
- `GroverObjectBase.clone()`: fast (~1.7µs) method to create a detached copy of a model instance with independent SQLAlchemy state. Uses shallow copy + fresh `InstanceState` so clones can be safely added to any session.
Fixed
- `write(objects=...)` no longer mutates input objects: `_group_objects_by_terminal` now clones objects before stripping mount prefixes, preserving the caller's original list.
- `add_prefix` path normalization: prefixes are now normalized via `normalize_path()` before concatenation, ensuring paths always have a leading `/` regardless of prefix format.
- `strip_prefix` safety: now validates that the prefix matches the start of the path and raises `ValueError` on mismatch instead of blindly slicing. Prefixes are normalized before comparison.
- `_rederive_path_fields` normalization: calls `normalize_path()` as a safety net, guaranteeing all post-mutation paths are valid before reaching the database.
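
The prefix fixes above amount to normalizing first, then validating on a segment boundary. The sketch below shows that behavior with simplified stand-ins; the real `normalize_path()`, `add_prefix`, and `strip_prefix` may differ in argument order and in the normalization rules they apply.

```python
import posixpath


def normalize_path(path: str) -> str:
    """Simplified stand-in: one leading slash, no trailing slash."""
    return posixpath.normpath("/" + path.strip("/"))


def add_prefix(prefix: str, path: str) -> str:
    """Join a mount prefix and a mount-relative path."""
    prefix, path = normalize_path(prefix), normalize_path(path)
    if prefix == "/":
        return path
    return prefix if path == "/" else prefix + path


def strip_prefix(prefix: str, path: str) -> str:
    """Remove a mount prefix, raising if the path is not under it."""
    prefix, path = normalize_path(prefix), normalize_path(path)
    if prefix == "/":
        return path
    if path == prefix:
        return "/"
    # Compare on a segment boundary so "/data" does not match "/database".
    if path.startswith(prefix + "/"):
        return path[len(prefix):]
    raise ValueError(f"path {path!r} does not start with prefix {prefix!r}")
```

Checking against `prefix + "/"` rather than the bare prefix is what prevents the blind-slicing failure mode the fix describes.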
Full Changelog: v0.0.12...v0.0.13
v0.0.12
What's Changed
Changed
- Unified client API: all `Grover` sync methods now return `GroverResult`, matching `GroverFileSystem` exactly. Single-path CRUD methods (`read`, `write`, `edit`, `delete`, `stat`, `mkdir`, `mkconn`) no longer unwrap to `Candidate`.
- `add_mount` simplified: accepts both `"data"` and `"/data"`, rejects nested paths. No more factory kwargs (`engine_url`, `session_factory`, etc.); construct `DatabaseFileSystem` explicitly and pass it in.
- No overrides in facades: mount normalization, engine disposal, and `close()` live on `GroverFileSystem`. `GroverAsync` is now a one-liner subclass. The `Grover` sync wrapper is a pure delegation layer.
- Batch parameters added to sync `Grover`: `candidates` param on `read`, `stat`, `edit`, `delete`, `ls`; `edits` list on `edit`; `moves`/`copies` batch lists on `move`/`copy`; `objects` on `write`.
Fixed
- Path length limit test: account for the `/.versions/1` suffix when testing max path length against the 4096-char column limit.
Full Changelog: v0.0.11...v0.0.12