DocumentsBuilder #5700

Open
Tracked by #5330
sjrl opened this issue Sep 1, 2023 · 0 comments · May be fixed by #6636
Labels
2.x Related to Haystack v2.0 P3 Low priority, leave it in the backlog type:feature New feature or request

Comments

@sjrl
Contributor

sjrl commented Sep 1, 2023

See the proposal in #5540, and see AnswersBuilder for a related component.


LLM clients output strings, but many components expect other object types. LLMs can often produce output in a parsable format that can be converted directly into objects. Output parsers transform these strings into objects of the user's choosing.
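As a rough illustration of the idea, assuming the LLM was prompted to reply with JSON, an output parser might turn the raw string into a typed object. `Summary` and `parse_summary` are hypothetical names used only for this sketch; they are not part of Haystack or of the proposal.

```python
import json
from dataclasses import dataclass


@dataclass
class Summary:
    # Hypothetical target type, for illustration only.
    title: str
    text: str


def parse_summary(reply: str) -> Summary:
    """Convert a JSON-formatted LLM reply (a plain string) into a Summary object."""
    data = json.loads(reply)
    return Summary(title=data["title"], text=data["text"])


# The raw reply is a string, as an LLM client would return it.
reply = '{"title": "Haystack", "text": "An LLM framework."}'
summary = parse_summary(reply)
print(summary.title)
```

A real parser would also need to decide what to do when the reply is not valid JSON or is missing a key; this sketch simply lets the exception propagate.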

DocumentsBuilder is one such output parser: it takes the string replies and metadata output of an LLM and produces Document objects.

For example, a PromptNode could be used to summarize a longer document, and the user would like the result returned as a Document object. This Document could then be shown to the end user, or passed to another PromptNode, for example to answer a question.

Draft I/O for DocumentsBuilder:

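from typing import Any, Dict, List, Optional

from haystack.preview import Document, component  # assumed import path (Haystack 2.0 preview package)

@component
class DocumentsBuilder:

    @component.output_types(documents=List[List[Document]])
    def run(self, replies: List[List[str]], metadata: List[List[Dict[str, Any]]], documents: Optional[List[List[Document]]] = None):
        # If no source documents are given, pair each query with an empty list so zip() still lines up.
        source_documents = documents if documents is not None else [[] for _ in replies]
        all_documents = []
        for replies_list, meta_list, document_list in zip(replies, metadata, source_documents):
            # One Document per reply, carrying that reply's metadata plus the source documents.
            built = [Document(content=reply, metadata={**meta, "documents": document_list}) for reply, meta in zip(replies_list, meta_list)]
            all_documents.append(built)
        return {"documents": all_documents}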
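To make the intended input and output shapes concrete, here is a minimal standalone sketch of the same logic that runs without Haystack installed. The `Document` dataclass is a stand-in for Haystack's class, and `build_documents` is a hypothetical plain-function version of `run`. It assumes that `metadata` holds one dict per reply (as the type annotation suggests) and that `documents` may be omitted.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Document:
    # Stand-in for Haystack's Document so the sketch is self-contained.
    content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def build_documents(
    replies: List[List[str]],
    metadata: List[List[Dict[str, Any]]],
    documents: Optional[List[List[Document]]] = None,
) -> Dict[str, List[List[Document]]]:
    """Hypothetical plain-function version of the draft's run() method."""
    # With no source documents, use one empty list per query so zip() lines up.
    sources = documents if documents is not None else [[] for _ in replies]
    all_documents = []
    for replies_list, meta_list, document_list in zip(replies, metadata, sources):
        # Each reply becomes a Document with its own metadata and its sources.
        all_documents.append(
            [
                Document(content=reply, metadata={**meta, "documents": document_list})
                for reply, meta in zip(replies_list, meta_list)
            ]
        )
    return {"documents": all_documents}


# One query, one reply: a summary generated from a single source document.
source = Document(content="A long article about Haystack pipelines ...")
result = build_documents(
    replies=[["Haystack pipelines connect components."]],
    metadata=[[{"model": "example-model"}]],
    documents=[[source]],
)
summary_doc = result["documents"][0][0]
print(summary_doc.content)
print(summary_doc.metadata["model"])
```

The outer list has one entry per query and the inner list one entry per reply, so the output keeps the same nesting as `replies`.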
@sjrl sjrl mentioned this issue Sep 1, 2023
@sjrl sjrl added the 2.x Related to Haystack v2.0 label Sep 1, 2023
@sjrl sjrl changed the title DocumentsBuilder DocumentsBuilder Sep 1, 2023
@Timoeller Timoeller modified the milestone: 2.0-beta Oct 9, 2023
@Timoeller Timoeller added the P3 Low priority, leave it in the backlog label Oct 12, 2023
@mathislucka mathislucka added the type:feature New feature or request label Dec 22, 2023
@vrunm vrunm linked a pull request Dec 22, 2023 that will close this issue
3 participants