feat: Add version tracking to FeatureView #6101
franciscojavierarceo wants to merge 11 commits into master
…emandFeatureView
Every `feast apply` now creates a version snapshot. Users can pin a feature view to a specific historical version declaratively via `version="v2"`. By default, the latest version is always served.
- New proto: FeatureViewVersion.proto with version record/history
- Added `version` field to FeatureViewSpec, StreamFeatureViewSpec, OnDemandFeatureViewSpec and version metadata to their Meta messages
- New version_utils module for parsing/normalizing version strings
- Version-aware apply_feature_view in both SQL and file registries
- New `list_feature_view_versions` API on FeatureStore and registries
- CLI: `feast feature-views versions <name>` subcommand
- Updated all 14 templates with explicit `version="latest"`
- Unit tests (28) and integration tests (7) for versioning
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix current_version_number=0 being silently dropped during proto deserialization in FeatureView, OnDemandFeatureView (proto3 int32 default 0 is falsy in Python); use spec.version to disambiguate
- Add current_version_number restoration in StreamFeatureView.from_proto (was missing entirely)
- Use timezone-aware UTC datetime in SqlRegistry.list_feature_view_versions for consistency with the rest of the codebase
- Add test for v0 proto roundtrip
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
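The falsy-zero pitfall named in the first fix can be reproduced without Feast. The sketch below is a minimal illustration only: `Meta` is a hypothetical stand-in for a proto3 meta message, and the `has_version` flag is an assumption used to play the role that `spec.version` plays in the commit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Meta:
    """Hypothetical stand-in for a proto3 meta message (not Feast's class)."""

    current_version_number: int = 0  # proto3 int32 fields read back as 0 when unset
    has_version: bool = False        # illustrative second signal; an assumption


def restore_buggy(meta: Meta) -> Optional[int]:
    # Truthiness check: 0 is falsy, so a genuine v0 is silently dropped.
    return meta.current_version_number if meta.current_version_number else None


def restore_fixed(meta: Meta) -> Optional[int]:
    # Consult a second signal to tell "unset" apart from "explicitly 0".
    return meta.current_version_number if meta.has_version else None


v0 = Meta(current_version_number=0, has_version=True)
unset = Meta()
```

With these definitions, `restore_buggy(v0)` loses the version while `restore_fixed(v0)` keeps it, and an unset message still restores to `None`.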
- Add Versioning section to feature-view.md concept page covering automatic snapshots, version pinning, version string formats, CLI usage, and Python SDK API
- Add `feast feature-views versions` command to CLI reference
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
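For the version string formats mentioned here, the PR summary lists `"latest"`, `"v2"`, and `"version2"` as accepted, case-insensitively. A parser for exactly those forms might look like the sketch below. The function name, return convention, and error type are assumptions for illustration, not the contents of `version_utils.py`.

```python
import re
from typing import Optional

# Matches "v<N>" or "version<N>", ignoring case.
_VERSION_RE = re.compile(r"^(?:v|version)(\d+)$", re.IGNORECASE)


def parse_version(version: str) -> Optional[int]:
    """Return None for "latest", otherwise the numeric version."""
    text = version.strip()
    if text.lower() == "latest":
        return None
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1))
```

Returning `None` for `"latest"` keeps "no pin" distinct from version 0, which matters given the falsy-zero issue fixed earlier in this PR.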
- Fix current_version_number roundtrip bug: version="latest" (always truthy) caused None to become 0 after proto roundtrip; now check that spec.version is not "latest" before treating 0 as intentional
- Use write_engine (not read_engine) for pre/post apply reads in SqlRegistry to avoid read replica lag causing missed version snapshots
- Remove redundant version check in StreamFeatureView.__eq__ (parent FeatureView.__eq__ already checks it)
- Add else clause to StreamFeatureView.from_proto for consistency
- Add test for latest/None roundtrip preservation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ntly
- delete_feature_view now also deletes version history records, preventing IntegrityError when re-creating a previously deleted FV
- _get_next_version_number uses write_engine instead of read_engine to avoid stale version numbers with read replicas
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add step-by-step walkthrough showing how versions auto-increment on changes and skip on identical re-applies
- Add CLI example showing the apply/change/apply cycle
- Clarify that pinning ignores constructor params and uses the snapshot
- Explain how to return to auto-incrementing after a pin/revert
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
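The apply/change/apply cycle can be pictured with a toy in-memory history. This is only a sketch of the behavior described in these commits (increment on change, skip on identical re-apply, clear history on delete). `VersionHistory` is a hypothetical class, it does not model pinning, and it is not how either registry stores snapshots.

```python
from typing import Dict, List


class VersionHistory:
    """Toy in-memory history; a stand-in for the registry, not Feast code."""

    def __init__(self) -> None:
        self._snapshots: Dict[str, List[bytes]] = {}

    def apply(self, name: str, definition: bytes) -> int:
        """Record a snapshot only if the definition changed; return the current version."""
        versions = self._snapshots.setdefault(name, [])
        if not versions or versions[-1] != definition:
            versions.append(definition)
        return len(versions) - 1

    def delete(self, name: str) -> None:
        # Dropping history with the view lets a re-created view start at v0 again.
        self._snapshots.pop(name, None)


history = VersionHistory()
first = history.apply("driver_stats", b"schema-a")    # new view
repeat = history.apply("driver_stats", b"schema-a")   # identical re-apply
changed = history.apply("driver_stats", b"schema-b")  # definition changed
```

Here `first` and `repeat` both refer to version 0, and `changed` refers to version 1.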
Devin Review found 1 new potential issue.
🐛 1 issue in files not directly in the diff
🐛 BatchFeatureView missing version parameter — incomplete transformation (sdk/python/feast/batch_feature_view.py:80-102)
FeatureView, StreamFeatureView, and OnDemandFeatureView all accept a version parameter in their constructors, but BatchFeatureView.__init__ (sdk/python/feast/batch_feature_view.py:80-102) does not. Furthermore, BatchFeatureView.__init__ does not pass version to super().__init__() at sdk/python/feast/batch_feature_view.py:144-158. This means users cannot construct a BatchFeatureView with a pinned version (e.g., BatchFeatureView(..., version="v2") raises TypeError), and any BatchFeatureView created directly always defaults to version="latest". The proto round-trip path works because FeatureView.from_proto sets feature_view.version after construction, but the constructor API is inconsistent with the other feature view types.
Raises FeatureViewPinConflict when a user pins to an older version while also modifying the feature view definition (schema, source, etc.). Fixes FeatureView.__copy__() to include description and owner fields, which was causing false positive conflict detection. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
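One way to picture the conflict check is to compare the requested definition against the active one with the version field neutralized. The sketch below assumes that is roughly what the check does; `Definition`, `PinConflict`, and `check_pin` are hypothetical names, and the real `FeatureViewPinConflict` logic may differ.

```python
from dataclasses import dataclass, replace
from typing import Tuple


class PinConflict(Exception):
    """Illustrative error; the PR's class is FeatureViewPinConflict."""


@dataclass(frozen=True)
class Definition:
    name: str
    schema: Tuple[str, ...]
    description: str = ""
    owner: str = ""
    version: str = "latest"


def check_pin(requested: Definition, active: Definition) -> None:
    """Reject a pin that is combined with edits to the definition."""
    if requested.version == "latest":
        return
    # Compare everything except the version field itself.
    if replace(requested, version=active.version) != active:
        raise PinConflict(
            f"{requested.name}: cannot pin to {requested.version} and change the definition"
        )


active = Definition("driver_stats", ("trips",), owner="ml")
```

A comparison like this is only as reliable as the copies being compared, which is presumably why a `__copy__` that dropped `description` and `owner` surfaced as false conflicts.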
- Add version parameter to BatchFeatureView constructor for consistency with FeatureView, StreamFeatureView, and OnDemandFeatureView
- Clean up version history records in file registry delete_feature_view to prevent orphaned records on re-creation
- Fix current_version_number proto roundtrip: preserve 0 when version="latest" (after first apply) instead of incorrectly returning None
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Clarify that versioning provides definition management and rollback, not concurrent multi-version serving. Document recommended approaches (separate projects or distinct FV names) for A/B testing scenarios. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extends feature view versioning with support for reading features from specific
versions at query time using the syntax: "driver_stats@v2:trips_today"
Core changes:
- Add _parse_feature_ref() to parse version-qualified feature references
- Update all feature reference parsing to use _parse_feature_ref()
- Add get_feature_view_by_version() to BaseRegistry and all implementations
- Add FeatureViewProjection.version_tag for multi-version query support
- Add version-aware _table_id() in SQLite online store (v0→unversioned, v1+→_v{N})
- Add VersionedOnlineReadNotSupported error for unsupported stores
Features:
- "driver_stats:trips" = "driver_stats@latest:trips" (backward compatible)
- "driver_stats@v2:trips" reads from v2 snapshot using _v2 table suffix
- Multiple versions in same query: ["driver@v1:trips", "driver@v2:daily"]
- Version parameter added to all decorator functions for consistency
Backward compatibility:
- Unversioned table serves as v0, only v1+ get _v{N} suffix
- All existing queries work unchanged
- SQLite-only for now, other stores raise clear error
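The table-naming rule above (unversioned table for v0, `_v{N}` suffix from v1 onward) is small enough to sketch directly. The helper name and signature below are assumptions for illustration and are not the SQLite store's `_table_id()`.

```python
from typing import Optional


def versioned_table_id(base_table: str, version_number: Optional[int]) -> str:
    """Map a version number to a table name: v0/unset keeps the base name."""
    if version_number is None or version_number == 0:
        # Existing unversioned tables double as v0, so old deployments keep working.
        return base_table
    return f"{base_table}_v{version_number}"
```

Under this rule, a hypothetical base table `proj_driver_stats` would serve v0 as-is and v2 from `proj_driver_stats_v2`.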
Documentation:
- Updated feature-view.md with @version syntax examples
- Updated feature-retrieval.md reference format
- Added version examples to how-to guides
Tests: 47 unit + 11 integration tests pass, no regressions
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
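The reference syntax in this commit, `"driver_stats@v2:trips_today"`, with an unqualified reference treated as `@latest`, can be parsed along the following lines. This is a sketch under stated assumptions: the function and tuple names are hypothetical and it is not the PR's `_parse_feature_ref()`.

```python
from typing import NamedTuple


class FeatureRef(NamedTuple):
    view: str
    version: str  # "latest" or a tag such as "v2"
    feature: str


def parse_feature_ref_sketch(ref: str) -> FeatureRef:
    """Split "view[@version]:feature" into its parts, defaulting to "latest"."""
    view_part, colon, feature = ref.partition(":")
    if not colon or not view_part or not feature:
        raise ValueError(f"Expected 'view:feature', got {ref!r}")
    view, at, version = view_part.partition("@")
    if not view or (at and not version):
        raise ValueError(f"Malformed version qualifier in {ref!r}")
    return FeatureRef(view, version.lower() if at else "latest", feature)
```

Because the default is filled in at parse time, `"driver_stats:trips"` and `"driver_stats@latest:trips"` produce the same tuple, which matches the backward-compatibility claim above.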
🟡 Version-qualified online reads with full_feature_names=True produce response columns containing @v<N> that get dropped
When using versioned feature references like driver_stats@v2:trips_today with full_feature_names=True, the requested_result_row_names set at sdk/python/feast/utils.py:1379-1381 is built by replacing : with __, producing driver_stats@v2__trips_today. However, the _populate_response_from_feature_data function at sdk/python/feast/utils.py:1112-1114 also uses table.projection.name_to_use() which includes @v2 in the name. So both sides contain @v2, and the match succeeds.
However, the @ character in column names like driver_stats@v2__trips_today is unusual and may break downstream consumers (e.g., pandas DataFrames using attribute access, or other systems expecting simple alphanumeric column names). This is a design concern rather than a runtime crash.
(Refers to lines 1379-1381)
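The downstream concern can be checked with the standard library alone: attribute-style access in Python, including pandas' `df.column` shortcut, requires the name to be a valid identifier. The helper below is hypothetical and only mirrors the `:` to `__` replacement the review describes.

```python
def response_column(ref: str) -> str:
    """Build a full-feature-name column by replacing ':' with '__'."""
    return ref.replace(":", "__")


plain = response_column("driver_stats:trips_today")
versioned = response_column("driver_stats@v2:trips_today")

# str.isidentifier() reports whether attribute access could ever work.
plain_ok = plain.isidentifier()
versioned_ok = versioned.isidentifier()
```

The unversioned name is a valid identifier and the versioned one is not, so consumers would need bracket-style access (or a renamed column) for version-qualified features.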
    # Save new as next version
    self._save_version_snapshot(
        feature_view.name,
        project,
        next_ver,
        fv_type_str,
        new_proto_bytes,
    )
🟡 SqlRegistry apply_feature_view version snapshot stores proto with stale current_version_number
In SqlRegistry.apply_feature_view at lines 721-728, new_proto_bytes is read from the database before current_version_number is updated. The new_proto_bytes was written by _apply_object which serialized the FV without the correct current_version_number (since it hadn't been assigned yet). This stale proto is then saved as the version snapshot via _save_version_snapshot. Later at lines 730-743, current_version_number is set and the active proto is updated, but the snapshot in the version history table still contains the wrong current_version_number. When someone later retrieves this version via get_feature_view_by_version, the snapshot's current_version_number in the proto meta will be incorrect (it will have whatever value was there before the update).
Prompt for agents
In sdk/python/feast/infra/registry/sql.py, in the apply_feature_view method around lines 721-743, the version snapshot is saved using new_proto_bytes that was read from the DB before current_version_number was assigned. The snapshot should instead be created after updating feature_view.current_version_number so it contains the correct version metadata. Move the _save_version_snapshot call for the new version to after line 730 (where current_version_number is set), and use the re-serialized proto bytes (feature_view.to_proto().SerializeToString()) instead of the stale new_proto_bytes. The same issue exists in the 'new FV' branch at lines 745-752 where new_proto_bytes is used before current_version_number=0 is assigned.
    new_proto_bytes = feature_view_proto.SerializeToString()
    if old_proto_bytes is not None:
        # FV changed: save old as a version if first time, then save new
        next_ver = self._next_version_number(feature_view.name, project)
        if next_ver == 0:
            self._save_version_record(
                feature_view.name, project, 0, fv_type_str, old_proto_bytes
            )
            next_ver = 1
        self._save_version_record(
            feature_view.name, project, next_ver, fv_type_str, new_proto_bytes
        )
        feature_view.current_version_number = next_ver
        feature_view_proto = feature_view.to_proto()
        feature_view_proto.spec.project = project
    else:
        # New FV: save as v0
        self._save_version_record(
            feature_view.name, project, 0, fv_type_str, new_proto_bytes
        )
        feature_view.current_version_number = 0
        feature_view_proto = feature_view.to_proto()
        feature_view_proto.spec.project = project
🟡 File Registry apply_feature_view saves new version snapshot with stale proto bytes (before current_version_number is set)
In Registry.apply_feature_view at sdk/python/feast/infra/registry/registry.py:700-722, the new version snapshot is saved using new_proto_bytes obtained at line 700 which is serialized before current_version_number is updated at line 712. The snapshot saved at line 709-711 for the new version thus contains a stale current_version_number in its proto metadata. This means that retrieving the version snapshot later via get_feature_view_by_version will return a FV whose proto metadata has the wrong version number. The same issue exists in the 'new FV' branch at lines 715-722 where the snapshot is saved from new_proto_bytes before current_version_number = 0 is assigned at line 720.
Prompt for agents
In sdk/python/feast/infra/registry/registry.py in the apply_feature_view method, the version snapshots saved at lines 709-711 and 717-719 use new_proto_bytes that was serialized before current_version_number was assigned. Fix both branches: (1) For the 'FV changed' branch (lines 701-714), move the _save_version_record call for new_proto_bytes to after line 712 where current_version_number is set, and re-serialize the proto. (2) For the 'New FV' branch (lines 715-722), set current_version_number = 0 before calling _save_version_record, and re-serialize the proto. This ensures version snapshots contain accurate current_version_number metadata.
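The ordering problem flagged for both registries comes down to serializing before the version number is assigned. The sketch below reproduces it with a toy object and JSON in place of protobuf; `ToyView` and both helper functions are hypothetical and stand in for the registry code only loosely.

```python
import json
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ToyView:
    name: str
    schema: str
    current_version_number: Optional[int] = None

    def serialize(self) -> bytes:
        return json.dumps(self.__dict__, sort_keys=True).encode()


def snapshot_stale(view: ToyView, next_ver: int, history: Dict[int, bytes]) -> None:
    payload = view.serialize()              # serialized too early
    view.current_version_number = next_ver
    history[next_ver] = payload             # snapshot still carries the old number


def snapshot_fixed(view: ToyView, next_ver: int, history: Dict[int, bytes]) -> None:
    view.current_version_number = next_ver  # assign first
    history[next_ver] = view.serialize()    # then serialize for the snapshot


stale: Dict[int, bytes] = {}
fixed: Dict[int, bytes] = {}
snapshot_stale(ToyView("driver_stats", "a"), 1, stale)
snapshot_fixed(ToyView("driver_stats", "a"), 1, fixed)
```

Decoding `stale[1]` shows no version number recorded, while `fixed[1]` carries version 1, which is the metadata a later lookup by version would expect.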
- Fix type inference issues in get_feature_view_by_version()
- Use distinct variable names for different proto types
- Ensure proper type annotations for BaseFeatureView subclasses
I haven't been able to go through the whole thing yet, but iiuc the expectation is that online stores would be expected to handle this by essentially treating fv1@v1 and fv1@v2 as two different feature views, right? (sqlite example infers table name from the version). Just wondering, maybe that's too high a bar for online store implementations. for example, if you have fv1@v1 and fv1@v2 both containing 100 features that differ by a single feature only, the online store would have to keep redundant copies of the other 99 features or do some complicated logic to diff values and deduplicate. wdyt about concentrating on introducing versioning on feature level instead? tbh, that makes more intuitive sense to me. there's a feature called
I spent some time reasoning about feature-level versioning with Claude. My initial reaction was that it's too large of a change and it only works today in a broken sense. By "in a broken sense" I mean that today we don't really version feature views or features: if someone changes the feature view or the feature, we just overwrite it and lose history. Moreover, we essentially force the materialization to be out of sync, and today that is just silently ignored. I like how we've implemented it here (i.e., creating a new table and storing the history of the metadata but allowing callers to specify exact versions and defaulting to the existing behavior) because it is far more explicit about the materialization consistency issues that arise when you change feature versions. So, I don't recommend we do feature-level versioning, as I worry it makes materialization very unreliable. We can, in fact, declare feature-view-level versioning because transformations are a collection of features mapped one-to-one with a table.
Summary
- Adds version tracking to `FeatureView`, `StreamFeatureView`, and `OnDemandFeatureView`
- Every `feast apply` creates a version snapshot in a separate history store (new SQL table / proto field)
- Pin to a specific version declaratively: `FeatureView(name="driver_stats", version="v2")`
- By default (`version="latest"` or omitted), the latest version is always served; fully backward compatible
Changes
Proto layer:
- New `FeatureViewVersion.proto` with `FeatureViewVersionRecord` / `FeatureViewVersionHistory`
- Added `version` field to `FeatureViewSpec`, `StreamFeatureViewSpec`, `OnDemandFeatureViewSpec`
- Added `current_version_number` / `version_id` to meta messages
- Added `feature_view_version_history` to `Registry` proto
Python SDK:
- `version_utils.py`: parses `"latest"`, `"v2"`, `"version2"` (case-insensitive)
- `FeatureViewVersionNotFound` error class
- `version` parameter on `FeatureView`, `StreamFeatureView`, `OnDemandFeatureView`
- Version-aware `apply_feature_view` in both SQL registry and file/proto registry
- `list_feature_view_versions(name, project)` on registries and `FeatureStore`
- CLI: `feast feature-views versions <name>` subcommand
- Templates updated with explicit `version="latest"`
Backward compatibility:
- `version` is fully optional; omitting it is identical to `version="latest"`
- `metadata.create_all(checkfirst=True)` auto-creates the new history table
Test plan
🤖 Generated with Claude Code
Resolves #2728