[Store] Indexer is nice, but hard to use directly #875

@lyrixx

Hello.

In my previous POC, I did everything manually, and it worked well. Thanks for that!

With the bundle, I discovered the Indexer.

It looks good.

But it seems I cannot use it. The indexer needs a loader to work. I can override it with a specific source, but that does not fit my use case.

I consume messages from RabbitMQ through Messenger, so the part that fetches documents from a loader is not useful to me. But the next steps are really cool (filters, transformers, vectorizer, and store).

So I would expect a service that lets me do:

#[AsMessageHandler()]
final readonly class VectorizeCrawlUrlHandler
{
    public function __construct(
        // #[Target('crawl_url')]
        private IndexerInterface $indexer,
    ) {
    }

    public function __invoke(VectorizeCrawlUrl $crawlUrl): void
    {
        $this->indexer->index(new TextDocument(
            id: new Uuid($crawlUrl->id),
            content: $crawlUrl->body,
            metadata: new Metadata([
                // My stuff here
            ]),
        ));
    }
}

What do you think about this use case?

Could we consider splitting the indexer? Or we could change the API a bit to accept a document, or a collection of documents, and skip the loader in that case?
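
To make the second option more concrete, here is a rough, self-contained sketch of what "accept a document or a collection, and skip the loader" could look like. Everything in it is hypothetical: `SketchDocument`, `SketchLoader`, `SketchIndexer`, `indexSource()` and `indexDocuments()` are made-up stand-ins, not the real Store classes or API, and the vectorizer and store are reduced to an in-memory array. It only illustrates the shape of the idea.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for a document; NOT the real TextDocument.
final readonly class SketchDocument
{
    /** @param array<string, mixed> $metadata */
    public function __construct(
        public string $id,
        public string $content,
        public array $metadata = [],
    ) {
    }
}

// Hypothetical loader contract, only needed for the source-based path.
interface SketchLoader
{
    /** @return iterable<SketchDocument> */
    public function load(string $source): iterable;
}

// Hypothetical indexer with two entry points sharing one pipeline.
final class SketchIndexer
{
    /** @var list<SketchDocument> In-memory stand-in for vectorizer + store */
    public array $stored = [];

    /**
     * @param list<callable(SketchDocument): bool>           $filters      return false to drop a document
     * @param list<callable(SketchDocument): SketchDocument> $transformers applied in order
     */
    public function __construct(
        private readonly array $filters = [],
        private readonly array $transformers = [],
        private readonly ?SketchLoader $loader = null,
    ) {
    }

    // Loader-based path: fetch documents from a source, then run the pipeline.
    public function indexSource(string $source): void
    {
        if (null === $this->loader) {
            throw new LogicException('No loader configured.');
        }

        $this->indexDocuments($this->loader->load($source));
    }

    // Direct path: the caller already has the documents, so no loader is involved.
    public function indexDocuments(SketchDocument|iterable $documents): void
    {
        if ($documents instanceof SketchDocument) {
            $documents = [$documents];
        }

        foreach ($documents as $document) {
            foreach ($this->filters as $filter) {
                if (!$filter($document)) {
                    continue 2; // skip this document entirely
                }
            }

            foreach ($this->transformers as $transformer) {
                $document = $transformer($document);
            }

            // A real pipeline would vectorize and persist here.
            $this->stored[] = $document;
        }
    }
}

// Usage: no loader is configured, documents are pushed directly.
$indexer = new SketchIndexer(
    filters: [static fn (SketchDocument $d): bool => '' !== trim($d->content)],
    transformers: [
        static fn (SketchDocument $d): SketchDocument => new SketchDocument($d->id, trim($d->content), $d->metadata),
    ],
);

$indexer->indexDocuments(new SketchDocument('1', '  hello  '));
$indexer->indexDocuments([
    new SketchDocument('2', '   '),
    new SketchDocument('3', 'world'),
]);
```

With something like this, a message handler would call only the direct path, while the loader-based path could stay available for the existing use cases.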

Labels: RFC, Store