[pipeline] Allow for batch indexing when using Pipelines fix #1168 #1231
Conversation
@tholor @brandenchan This is the first draft of how we can introduce batch processing when using the pipeline. Please review the design changes and let me know whether I am on the right track 😎 After your approval, I will add the remaining changes to the pull request.
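As a rough illustration of the batching idea (pure Python, no Haystack imports; the node and variable names here are made up), the difference is that each step receives the whole list of files rather than being invoked once per file:

```python
# Minimal sketch of batch indexing: the converter and the indexing step
# each handle a list of files in one call.

def convert(file_paths):
    # placeholder converter: produce one "document" dict per file
    return [{"name": p, "text": f"contents of {p}"} for p in file_paths]

def index(documents, store):
    # write the whole batch to the store in a single pass
    store.extend(documents)
    return store

store = []
docs = convert(["report.pdf", "notes.txt"])
index(docs, store)
# store now holds two documents written in one batch
```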
The main changes look good. I'd just like to 1) make sure the `meta` argument is more flexible, 2) suggest using `Path.suffix`, and 3) ask whether it's possible to have the REST API also support batches of documents.
The `meta` argument could accept: a single dict that is applied to all files, one dict for each file being converted, or None. #1168
Great, thanks for making those changes! Just one last thing came to mind: I am wondering whether these REST API changes might break the file_upload functionality in the UI. Could you test this out @akkefa?
Thank you for your feedback. Which UI are you talking about?
I'm thinking about the Streamlit UI that's packaged with Haystack.
@brandenchan I updated the upload file parameter to `files` in Streamlit. The rest of the changes are working fine.
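With the parameter renamed to `files`, the UI can send several documents in one multipart request. A minimal sketch of how such a payload might be built (the helper name and the endpoint URL are assumptions, not the actual code):

```python
def build_files_payload(named_contents):
    # One ("files", (filename, bytes)) entry per document, so every file
    # arrives under the renamed `files` field of the upload endpoint.
    return [("files", (name, content)) for name, content in named_contents]

payload = build_files_payload([("doc1.txt", b"hello"), ("doc2.txt", b"world")])
# With the requests library, a batch upload would then look like:
# requests.post("http://localhost:8000/file-upload", files=payload)
```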
Looks good to me! Thanks for your work on this. This gets my thumbs up.
Proposed changes:
Status (please check what you already did):