feat: add split_by_row feature to CSVDocumentSplitter #9031
Pull Request Test Coverage Report for Build 13944618786 (Coveralls)
@Amnah199 thanks for working on this! I think we should take a slightly different approach to toggle between the two different split modes. Instead of using the boolean, we could use a Literal type:
    from typing import Literal
    SplitMode = Literal["threshold", "row-wise"]
I think this would improve readability and make it immediately clear what the expected behavior is. Let me know what you think!
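To illustrate the suggestion, here is a minimal sketch of how a Literal-typed mode could be validated at construction time. The helper name validate_split_mode is hypothetical and is not taken from the PR or from Haystack; only the SplitMode alias comes from the comment above.

```python
from typing import Literal, get_args

# Alias as proposed in the review comment.
SplitMode = Literal["threshold", "row-wise"]


def validate_split_mode(split_mode: str) -> str:
    """Hypothetical helper: reject any value that is not a valid SplitMode."""
    allowed = get_args(SplitMode)  # ("threshold", "row-wise")
    if split_mode not in allowed:
        raise ValueError(f"split_mode must be one of {allowed}, got {split_mode!r}")
    return split_mode
```

Compared with a boolean flag, a check like this fails early on a typo and leaves room for further modes later without adding more parameters.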
@Amnah199 In the |
Looks good!
Related Issues
Proposed Changes:
Add a new parameter split_by_row to CSVDocumentSplitter. When split_by_row=True, other split arguments won't be considered.
How did you test it?
Added new unit tests
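For readers unfamiliar with the idea, row-wise splitting and the kind of check a unit test might make can be sketched as follows. This is not the PR's code or its tests: split_csv_by_row is a hypothetical function built on the standard csv module, it returns plain strings rather than Haystack Document objects, and it assumes every non-empty row (including the header) becomes its own chunk.

```python
import csv
import io
from typing import List


def split_csv_by_row(content: str) -> List[str]:
    """Hypothetical sketch: return one CSV-formatted string per non-empty row."""
    chunks: List[str] = []
    for row in csv.reader(io.StringIO(content)):
        # Skip rows that are blank or contain only empty cells.
        if not any(cell.strip() for cell in row):
            continue
        buf = io.StringIO()
        # Re-serialize the row so quoting of embedded commas is preserved.
        csv.writer(buf, lineterminator="").writerow(row)
        chunks.append(buf.getvalue())
    return chunks
```

A test along these lines would pass a small CSV string in and assert that the number and content of the returned chunks match the input rows.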
Notes for the reviewer
If this is merged, we need to update the documentation of this component slightly.
Checklist
fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.