Fixed
- Leaky abstraction in vector stores: vector store providers now use the canonical
source_id instead of the raw source_url. This fixes issue #19, where backends had to parse source IDs from URLs.
- Added source_id to StageContext for pipeline-wide canonical ID
- Renamed VectorStoreProvider.delete_by_source() to delete_by_source_id()
- Updated all vector store implementations (ChromaDB, Pinecone, Weaviate, Supabase) to filter by source_id
- Updated metadata to use source_id instead of source_url
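
As a rough illustration of the change, here is a minimal sketch of a provider that filters on source_id metadata. Only StageContext, source_id, VectorStoreProvider, and delete_by_source_id() come from this changelog; the InMemoryVectorStore class, its add() method, the record layout, the source_url field on the context, and the integer return value are assumptions made for the example and may not match the real interfaces.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class StageContext:
    # source_id is the canonical ID named in the changelog.
    # source_url is included here only for illustration (assumption).
    source_id: str
    source_url: str


class VectorStoreProvider(Protocol):
    # Renamed from delete_by_source(); the int return type is an assumption.
    def delete_by_source_id(self, source_id: str) -> int: ...


@dataclass
class InMemoryVectorStore:
    """Hypothetical backend, used only to show the metadata filter."""

    records: list[dict] = field(default_factory=list)

    def add(self, ctx: StageContext, text: str) -> None:
        # Metadata carries source_id, so no backend has to parse a URL.
        self.records.append({"text": text, "metadata": {"source_id": ctx.source_id}})

    def delete_by_source_id(self, source_id: str) -> int:
        # Keep every record whose source_id differs; report how many were dropped.
        before = len(self.records)
        self.records = [
            r for r in self.records if r["metadata"].get("source_id") != source_id
        ]
        return before - len(self.records)


store = InMemoryVectorStore()
ctx_a = StageContext(source_id="doc-a", source_url="https://example.com/a?ref=1")
ctx_b = StageContext(source_id="doc-b", source_url="https://example.com/b")
store.add(ctx_a, "first chunk")
store.add(ctx_a, "second chunk")
store.add(ctx_b, "other chunk")
removed = store.delete_by_source_id("doc-a")
```

In this sketch the deletion matches on an exact metadata value, so URL variations such as query strings do not affect which records are removed.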
Migration Note
⚠️ Existing vector stores whose records carry only source_url metadata must be re-indexed before force=True deletion will work. Alternatively, users can delete the stale records manually with the vector store's native tools.
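
To gauge how much of a store needs re-indexing, something like the following sketch could be run over exported metadata. It assumes the metadata is available as plain dicts; the split_legacy() helper and the sample records are hypothetical and not part of the project.

```python
def split_legacy(metadata_records):
    """Partition metadata dicts into (migrated, legacy).

    A record counts as legacy when it has source_url but no source_id,
    so a source_id filter would never match it.
    """
    migrated = [m for m in metadata_records if "source_id" in m]
    legacy = [
        m for m in metadata_records
        if "source_id" not in m and "source_url" in m
    ]
    return migrated, legacy


# Hypothetical sample: two records written before the change, one after.
sample = [
    {"source_url": "https://example.com/a"},
    {"source_url": "https://example.com/b"},
    {"source_id": "doc-c"},
]
migrated, legacy = split_legacy(sample)
```

Any record that lands in the legacy list would need to be re-indexed or removed by hand.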