Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: avoid embedding the user input into the error response. #2207

Merged
merged 17 commits into from
Apr 4, 2022

Conversation

CNLHC
Copy link
Contributor

@CNLHC CNLHC commented Mar 4, 2022

Pull Request

What does this PR do?

Fix #2107.

The problem is meilisearch embeds the user input to the error message.

The reason for this problem is milli throws a serde_json: Error whose Display implementation will do this embedding.

I tried to solve this problem in this PR by manually implementing the Display trait for DocumentFormatError instead of deriving automatically.

PR checklist

Please check if your PR fulfills the following requirements:

  • Does this PR fix an existing issue?
  • Have you read the contributing guidelines?
  • Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!

@CNLHC CNLHC changed the title Fix #2107 Fix #2107: avoid embedding the whole document in the error response. Mar 4, 2022
@CNLHC CNLHC changed the title Fix #2107: avoid embedding the whole document in the error response. Fix #2107: avoid embedding the user input into the error response. Mar 4, 2022
@CNLHC CNLHC marked this pull request as ready for review March 4, 2022 07:50
Copy link
Member

@Kerollmops Kerollmops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for your PR!
Apart from some esthetic changes, it's good for me.

meilisearch-lib/src/document_formats.rs Outdated Show resolved Hide resolved
@irevoire irevoire added this to the v0.27.0 milestone Mar 7, 2022
LiuHanCheng and others added 2 commits March 8, 2022 12:03
Co-authored-by: Clément Renault <renault.cle@gmail.com>
Copy link
Member

@Kerollmops Kerollmops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had some thoughts about your PR and think it removes too much information about the error, users will be lost if an error occurs and only the line and column are specified.

What do you think about ellipsing the message? By using something like 50 chars at the start and 50 chars at the end:

The `json` payload provided is malformed. `Couldn't serialize document value: invalid type: string \"my very ver...very very long body here\", expected a documents, or a sequence of documents. at line 1 column 7`.

meilisearch-lib/src/document_formats.rs Show resolved Hide resolved
meilisearch-lib/src/document_formats.rs Show resolved Hide resolved
@curquiza
Copy link
Member

Hello @CNLHC, thanks again for starting this PR. Do you need any precision or help regarding the @Kerollmops review? 🙂 The whole core team is available to help you do your contribution.

Also, we would totally understand you don't have the time to apply the changes. In this case, let me know so that we can let someone else work on it and integrate the bug fix to the next release 🙂

Hope everything is doing well on your side!

@CNLHC
Copy link
Contributor Author

CNLHC commented Mar 24, 2022

Sorry for the late response. I will submit a patch to follow the suggestion of @Kerollmops today.

@CNLHC CNLHC requested a review from Kerollmops March 25, 2022 13:34
CNLHC and others added 3 commits March 25, 2022 21:36
Co-authored-by: Clément Renault <renault.cle@gmail.com>
Co-authored-by: Clément Renault <renault.cle@gmail.com>
@curquiza curquiza changed the title Fix #2107: avoid embedding the user input into the error response. Fix: avoid embedding the user input into the error response. Mar 28, 2022
@curquiza
Copy link
Member

@CNLHC thanks a lot for these changes 🙏
Looks like there is a tiny issue to fix with our clippy CI before Kero reviews it :)

Copy link
Member

@Kerollmops Kerollmops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for your contribution ❤️
However, could you modify the tests according to the changes I proposed with the ellipsing algorithm? And could you stash everything into one final commit when we are done, please?

meilisearch-http/tests/documents/add_documents.rs Outdated Show resolved Hide resolved
Comment on lines 41 to 58
let mut serde_msg = se.to_string();
let prefix = r#"invalid type: string ""#;
if serde_msg.starts_with(prefix) {
let start_idx = prefix.len();
if let Some(end_idx) = serde_msg.rfind("\"") {
if end_idx - start_idx > 100 {
serde_msg.replace_range(start_idx + 50..end_idx - 50, " ... ");
}
} else {
serde_msg = String::from("");
}
}

write!(
f,
"The `{}` payload provided is malformed. `Couldn't serialize document value: {}`.",
b,serde_msg
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ze can simplify the algorithm by directly ellipsing the whole message. Also matching on the invalid type: string " isn't a good idea, it could happen with objects too. What do you think? Could you modify the tests accordingly, please?

Suggested change
let mut serde_msg = se.to_string();
let prefix = r#"invalid type: string ""#;
if serde_msg.starts_with(prefix) {
let start_idx = prefix.len();
if let Some(end_idx) = serde_msg.rfind("\"") {
if end_idx - start_idx > 100 {
serde_msg.replace_range(start_idx + 50..end_idx - 50, " ... ");
}
} else {
serde_msg = String::from("");
}
}
write!(
f,
"The `{}` payload provided is malformed. `Couldn't serialize document value: {}`.",
b,serde_msg
)
let mut serde_msg = se.to_string();
let ellipsis = "...";
if serde_msg.len() > 100 + ellipsis.len() {
serde_msg.replace_range(50..serde_msg.len() - 50, ellipsis);
}
write!(
f,
"The `{}` payload provided is malformed. `Couldn't serialize document value: {}`.",
b, serde_msg
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some cases, this method will lead to weird error messages. For example, if the raw message is

The `json` payload provided is malformed. `Couldn't serialize document value: invalid type: string "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789", expected a documents, or a sequence of documents. at line 1 column 102`.

Using your method, ellipsing the string without parsing the context will generate

The `json` payload provided is malformed. `Couldn't serialize document value: invalid type: string \"0123456789012345678901234567..., or a sequence of documents. at line 1 column 102`.

Some meaningful information(The expected a documents) is truncated.

Copy link
Contributor Author

@CNLHC CNLHC Mar 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The meaningful suffix length cannot be determined. But in most cases, its length is around 75.

", expected a documents, or a sequence of documents. at line <LINE_NUMBER> column <COLUMN_NUMBER>

So I suggest to change 50..serde_msg.len() - 50 to 50..serde_msg.len() - 85

LiuHanCheng and others added 3 commits March 31, 2022 10:14
Co-authored-by: Clément Renault <renault.cle@gmail.com>
Co-authored-by: Clément Renault <renault.cle@gmail.com>
@CNLHC CNLHC requested a review from Kerollmops March 31, 2022 05:49
Copy link
Member

@Kerollmops Kerollmops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, that's better and clever. Could you remove the dbg! in the test and I'll merge after that, please?
Once you have done so, could you squash everything in one commit and rebase on main, please?

meilisearch-http/tests/documents/add_documents.rs Outdated Show resolved Hide resolved
meilisearch-http/tests/documents/add_documents.rs Outdated Show resolved Hide resolved
LiuHanCheng and others added 2 commits April 1, 2022 09:30
Co-authored-by: Clément Renault <renault.cle@gmail.com>
Co-authored-by: Clément Renault <renault.cle@gmail.com>
Copy link
Member

@Kerollmops Kerollmops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for your work @CNLHC, merging now!
bors merge

@bors
Copy link
Contributor

bors bot commented Apr 4, 2022

@bors bors bot merged commit 9e344f6 into meilisearch:main Apr 4, 2022
@curquiza curquiza added the v0.27.0 PRs/issues solved in v0.27.0 label Aug 24, 2022
onyxcherry pushed a commit to onyxcherry/meilisearch that referenced this pull request Aug 31, 2022
bors bot added a commit that referenced this pull request Sep 7, 2022
2727: Don't panic when the error length is slightly over 100 r=Kerollmops a=onyxcherry

# Pull Request

## What does this PR do?
Fixes PR #2207 as [the last commit](7ece7a9) has changed number of the characters at the end to leave in place from `50` to `85` **but the lower limit of a string length wasn't changed**.
Therefore, any data (e.g. example string from issue #2680) was causing `meilisearch` to **panic**.

So I simply raised the minimum value from `100` to `135` (`50 + 85`) to ensure that `replace_range()` won't panic due to an inverted range.
At the same time I am in favor of the `85` value which was changed in the `@CNLHC's` last commit.


## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing ~issue~ pull request?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Tomasz Wiśniewski <tomasz@wisniewski.app>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
v0.27.0 PRs/issues solved in v0.27.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error message contains whole payload when the type is not correct
4 participants