
Enable the client to specify a custom csv delimiter for parsing documents #2803

Closed


MixusMinimax
Contributor

@MixusMinimax MixusMinimax commented Sep 27, 2022

Pull Request

What does this PR do?

Fixes #2806

This PR adds the ability to specify a custom CSV delimiter, which I think is a very common need, especially with large amounts of pre-existing data.

Description

In meilisearch-lib, I added a u8 delimiter parameter to the DocumentAdditionFormat::Csv variant.

In meilisearch-http, I added the field pub csv_delimiter: Option<char> to UpdateDocumentsQuery. Thanks to serde, the actual query parameter is in camelCase: csvDelimiter.

The default delimiter is still the comma (ASCII 44), so existing requests behave as before. For example, a request for semicolon-delimited data, where %3B is the URL-encoded semicolon:

POST localhost:7700/indexes/example/documents?csvDelimiter=%3B
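
To illustrate how the optional char from the query string could become the u8 passed down to the CSV format, here is a minimal sketch. The function name resolve_csv_delimiter and the rejection of non-ASCII characters are assumptions made for this example; they are not taken from the PR's code.

```rust
/// Hypothetical helper (not from the PR): turn the optional `csvDelimiter`
/// query value into the single byte a CSV parser expects.
fn resolve_csv_delimiter(csv_delimiter: Option<char>) -> Result<u8, String> {
    match csv_delimiter {
        // No parameter given: keep the previous behaviour, a comma (ASCII 44).
        None => Ok(b','),
        // An ASCII character always fits in one byte, so the cast is lossless.
        Some(c) if c.is_ascii() => Ok(c as u8),
        // Assumption: anything wider than one byte is rejected, not truncated.
        Some(c) => Err(format!(
            "csv delimiter must be an ASCII character, found `{c}`"
        )),
    }
}
```

With this sketch, a request without csvDelimiter resolves to the comma, and csvDelimiter=%3B resolves to the semicolon byte. A u8 is a natural target type because, for example, the csv crate's ReaderBuilder::delimiter takes a u8.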

cargo test succeeded on my system, and I was able to add documents with different delimiters using Postman.

PR checklist

Please check if your PR fulfills the following requirements:

  • Does this PR fix an existing issue?
  • Have you read the contributing guidelines?
  • Have you made sure that the title is accurate and descriptive of the changes?

@curquiza
Member

Hello @MixusMinimax, thanks for your PR! Since it impacts the product, we need to discuss it with the team first, and we will review it as soon as possible 😄

@curquiza
Member

curquiza commented Sep 29, 2022

For traceability, more information about the state of this PR is available here: #2806 (comment) 😇

@curquiza curquiza marked this pull request as draft September 29, 2022 11:45
@curquiza curquiza mentioned this pull request Jan 31, 2023
2 tasks
@irevoire irevoire mentioned this pull request Feb 16, 2023
@MixusMinimax
Contributor Author

MixusMinimax commented Feb 16, 2023

Hi!
Should I update this now to be up to date with main? I think it could work like this: have a struct of options, implement FromRequest for it, and pass it to add_documents. The struct could include the CSV delimiter and the quotation mark to use (in the CSV standard, a field that itself contains the delimiter is wrapped in quotation marks).
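
A rough sketch of what such an options struct might look like, and of why the quote character matters once the delimiter is configurable. CsvOptions and split_record are hypothetical names for this illustration only; the FromRequest implementation is left out, and a real implementation would delegate parsing to a CSV library rather than split records by hand.

```rust
/// Hypothetical options bundle (names are assumptions, not Meilisearch API).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CsvOptions {
    delimiter: u8,
    quote: u8,
}

impl Default for CsvOptions {
    fn default() -> Self {
        // Comma-delimited with double quotes, the usual CSV convention.
        Self { delimiter: b',', quote: b'"' }
    }
}

/// Split one record into fields. A delimiter inside a quoted field is kept
/// as data, and a doubled quote inside a quoted field is a literal quote.
fn split_record(line: &str, options: CsvOptions) -> Vec<String> {
    let delimiter = options.delimiter as char;
    let quote = options.quote as char;
    let mut fields = Vec::new();
    let mut current = String::new();
    let mut in_quotes = false;
    let mut chars = line.chars().peekable();

    while let Some(c) = chars.next() {
        if c == quote {
            if in_quotes && chars.peek() == Some(&quote) {
                // Escaped quote: emit one quote and skip its twin.
                current.push(quote);
                chars.next();
            } else {
                // Opening or closing quote: toggle state, emit nothing.
                in_quotes = !in_quotes;
            }
        } else if c == delimiter && !in_quotes {
            // Unquoted delimiter ends the current field.
            fields.push(std::mem::take(&mut current));
        } else {
            current.push(c);
        }
    }
    fields.push(current);
    fields
}
```

For instance, with the delimiter set to a semicolon, the record a;"b;c";d splits into the three fields a, b;c and d, because the quoted semicolon is treated as data.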

Update: Ignore this; this PR has been replaced by #3505.

@bors bors bot closed this in 1e9ac00 Feb 20, 2023
@curquiza
Member

Hello @MixusMinimax,
Sorry for the late answer! Thanks for your previous work: it helped us implement #3505. You will of course be credited as a contributor in the changelog of the next release 😇

@MixusMinimax
Contributor Author

No worries, thank you!

Development

Successfully merging this pull request may close these issues.

Enable the client to specify a custom csv delimiter for parsing documents