feat: Support Vector Search in Valkey #354
…batch functionality
- Add explicit type annotation for schema_fields to support both TagField and VectorField
- Encode project string to bytes for consistency with other hash values
- Decode doc_key bytes to string for hmget compatibility
- Fix code formatting: break long lines and remove extra blank lines
- Remove tests for multiple vector fields (Feast enforces one vector per feature view)
- Fix config type: use 'eg-valkey' (hyphen) not 'eg_valkey' (underscore)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…h-support-reads

Resolves merge conflicts and incorporates Option A implementation:
- One index per vector field with feature name in index name
- Float64 to Float32 conversion (Valkey limitation)
- Vector fields use original name for hset keys

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
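The Float64 to Float32 conversion mentioned in this commit could look roughly like the sketch below. It assumes the vector is stored as a packed little-endian FLOAT32 blob; `to_float32_bytes` is a hypothetical helper name, not a function from this PR, and the byte order is an assumption.

```python
import struct
from typing import Sequence


def to_float32_bytes(values: Sequence[float]) -> bytes:
    """Pack a vector as little-endian 32-bit floats (hypothetical helper).

    Python floats are 64-bit, so packing with the "f" format narrows each
    component to 32 bits and may lose precision.
    """
    return struct.pack(f"<{len(values)}f", *values)


blob = to_float32_bytes([1.0, 0.5, -2.0])
print(len(blob))  # 4 bytes per component
```

Narrowing is lossy for values that are not exactly representable in 32 bits, so distances computed on the stored vectors can differ slightly from distances computed on the original Float64 data.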
| """ | ||
| # Build KNN query with project filter | ||
| # Format: "(@__project__:{project})=>[KNN {top_k} @{field} $vec AS distance]" | ||
| query_str = ( |
Do we allow hyphens in project names? If so, you would have to escape them, since RediSearch interprets a hyphen as negation.
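One way to address this is to backslash-escape punctuation in the project name before interpolating it into the tag filter. This is a minimal sketch, assuming RediSearch-style tag escaping also applies to Valkey's search module; `escape_tag_value` and `build_knn_query` are hypothetical names, not functions from this PR, and the query format follows the comment in the diff above.

```python
import re

# Conservative assumption: escape every character that is not an ASCII
# letter, digit, or underscore, so "-" is not parsed as negation.
_TAG_SPECIAL = re.compile(r"([^A-Za-z0-9_])")


def escape_tag_value(value: str) -> str:
    """Backslash-escape characters that may be special in a tag filter."""
    return _TAG_SPECIAL.sub(r"\\\1", value)


def build_knn_query(project: str, top_k: int, field: str) -> str:
    """Build the KNN query string with an escaped project filter."""
    project_filter = f"@__project__:{{{escape_tag_value(project)}}}"
    return f"({project_filter})=>[KNN {top_k} @{field} $vec AS distance]"


print(build_knn_query("my-project", 5, "embedding"))
```

Escaping more characters than strictly necessary is the safer default here, since an unescaped special character changes the meaning of the query rather than raising an error.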
| search_results = [] | ||
| for doc in results.docs: | ||
| doc_key = doc.id.encode() if isinstance(doc.id, str) else doc.id | ||
| distance = float(getattr(doc, "__distance__", 0.0)) |
A distance of 0.0 means it's a perfect match, right? I think we should use something else as the default.
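A sketch of one alternative: return `None` when the distance attribute is missing, so an absent value cannot be mistaken for an exact match. `read_distance` is a hypothetical helper, and choosing `None` as the sentinel is an assumption; the PR may settle on a different default.

```python
from typing import Any, Optional


def read_distance(doc: Any, field: str = "__distance__") -> Optional[float]:
    """Return the document's distance, or None if the field is absent."""
    raw = getattr(doc, field, None)
    if raw is None:
        # Missing distance: signal "unknown" instead of a perfect match (0.0).
        return None
    return float(raw)
```

Callers then have to handle the missing case explicitly, for example by skipping the document or logging it, instead of ranking it first.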
| query = ( | ||
| Query(query_str) | ||
| .return_fields("__distance__") | ||
| .sort_by("__distance__") |
Would sort_by("distance") be ascending or descending?
sort_by() defaults to ascending order. I'll make it explicit.
I changed it a bit to infer sort order from distance metric
| table: FeatureView, | ||
| requested_features: List[str], | ||
| search_results: List[Tuple[bytes, float]], | ||
| vector_field: Field, |
Where is this being used?
| embedding: Query embedding vector | ||
| top_k: Number of results to return | ||
| distance_metric: Optional override for distance metric (COSINE, L2, IP) | ||
| query_string: Not supported in V1 (reserved for future BM25 search) |
| query_string: Not supported in V1 (reserved for future BM25 search) | |
| query_string: Not supported in V2 (reserved for future BM25 search) |
Add the third argument (vector_field.name) to _get_vector_index_name call to match the updated function signature. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
| entity_hset = dict() | ||
| entity_hset[ts_key] = ts.SerializeToString() | ||
| # Store project and entity key for vector search | ||
| entity_hset["__project__"] = project.encode() |
This produces bytes, but line 1138 expects a string. Can you test it?
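A small normalizing helper is one way to make the read path tolerant of either type. This is a sketch only: `as_text` is a hypothetical name, and UTF-8 is assumed as the encoding, matching the default of `str.encode()` used on the write path.

```python
from typing import Union


def as_text(value: Union[bytes, str], encoding: str = "utf-8") -> str:
    """Return value as str, decoding it first if it arrived as bytes."""
    return value.decode(encoding) if isinstance(value, bytes) else value


print(as_text("my_project".encode()))
```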
Resolved conflicts:
- eg_valkey.py: Keep both deserialize_entity_key import (for reads) and Query import (for FT.SEARCH)
- test_valkey.py: Keep read/search test classes from HEAD
What this PR does / why we need it:
Adds vector similarity search support to the EG Valkey online store, enabling semantic search use cases for ML features.
Which issue(s) this PR fixes:
Misc