Add support for ensuring all data in vector database is up to date (old data should be cleaned up) #2996

@murphp15

Dependencies

#2995

What is the feature request? What problem does it solve?

After #2995 is complete, we will have data jobs that periodically scrape the organisational data source and publish new data into the vector store. The vector database should reflect the current state of Confluence. This means that stale data should be deleted and new data ingested into the database incrementally.

This might require additions to the template/logic for reading from the data source.
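
As a rough illustration of the incremental update described above, the sketch below compares what the data source currently contains against what the vector store already holds, and works out which documents need to be re-embedded and which should be deleted. Everything here is an assumption made for illustration: the names `content_hash`, `SyncPlan`, and `plan_sync` are hypothetical, and the idea of storing a content hash per page is one possible way to detect changes, not something decided in this issue or in #2995.

```python
import hashlib
from dataclasses import dataclass, field


def content_hash(text: str) -> str:
    """Return a stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class SyncPlan:
    # IDs whose embeddings must be (re)computed and written to the store.
    to_upsert: list = field(default_factory=list)
    # IDs that are in the store but no longer exist in the source.
    to_delete: list = field(default_factory=list)


def plan_sync(source_pages: dict, stored_hashes: dict) -> SyncPlan:
    """Compare the source with the store and plan an incremental update.

    source_pages:  page ID -> current page text (hypothetical shape).
    stored_hashes: page ID -> content hash saved at the last ingestion
                   (hypothetical; assumes the store keeps such a column).
    """
    plan = SyncPlan()
    for page_id, text in source_pages.items():
        # New pages have no stored hash; changed pages have a different one.
        if stored_hashes.get(page_id) != content_hash(text):
            plan.to_upsert.append(page_id)
    for page_id in stored_hashes:
        # Anything stored that the source no longer has is stale.
        if page_id not in source_pages:
            plan.to_delete.append(page_id)
    plan.to_upsert.sort()
    plan.to_delete.sort()
    return plan
```

A data job step could call `plan_sync` once per run, then embed and write only the pages in `to_upsert` and remove the rows for `to_delete`, so unchanged pages are never re-embedded.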

Definition of Done

  1. Support incrementally updating the vector database in the data job from #2995 ("Create a simple datajob which reads from a datasource(confluence/jira) and writes data and embeddings to postgres")
  2. Propose a way to extend the data job template

Metadata

Labels

enhancement: New feature or request

initiative: VDK for Private AI: Initiative including the effort to support Private AI use cases of VMware with VDK
