Extract actual Content-Type from the header #1933

Kerollmops · 2021-11-24T13:52:37Z

This PR should fix #1866 by extracting the actual Content-Type from the header and ignoring everything after the first semicolon (;), which in most cases is a charset=utf-8 parameter. We use the mime_type method to do so.
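For illustration, the "ignore everything after the first semicolon" idea can be sketched with the standard library alone. This is only a sketch, not the code merged in this PR (which relies on the mime_type method); the `media_type` helper name is hypothetical.

```rust
/// Returns the media type of a Content-Type header value without its
/// parameters (for example `charset=utf-8`).
/// Hypothetical helper for illustration only; not the merged implementation.
fn media_type(content_type: &str) -> &str {
    content_type
        // Split on the first ';' only; anything after it is a parameter list.
        .split_once(';')
        // No ';' at all: the whole value is already the media type.
        .map_or(content_type, |(media_type, _params)| media_type)
        // Drop optional whitespace around the media type.
        .trim()
}
```

With this helper, `media_type("application/json; charset=utf-8")` and `media_type("application/json")` both yield `application/json`, which is the behaviour requested in #1866.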

meilisearch-http/src/routes/indexes/documents.rs

irevoire

@MarinPostma I'll let you merge 👍

MarinPostma

I would like to use methods provided by actix to handle that :)

MarinPostma · 2021-11-24T14:28:35Z

meilisearch-http/src/routes/indexes/documents.rs

+    match headers.get(CONTENT_TYPE) {
+        Some(value) => match value.to_str() {
+            Ok(content) => Some(
+                content
+                    .split_once(";")
+                    .map_or(content, |(ct, _)| ct.trim())
+                    .to_string(),
+            ),
+            Err(_) => Some(format!("{}", value.as_bytes().as_bstr())), // convenient bytes display
+        },
+        None => None,


I would prefer that we use this method, which is implemented on HttpRequest, and leverage the mime library instead of doing this ourselves.

Kerollmops

Wow, I didn't see this, that's awesome. The only difference I see from my function is that this method will not return a malformed content-type (one that is not UTF-8 encoded, for example), so we will not be able to include it in the error messages. But I'm not sure that's very important. Changing my PR to use this method, thank you!
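The trade-off discussed in this thread can be sketched as follows, using only the standard library. Both function names are hypothetical, UTF-8 validity stands in for whatever check the real header type performs, and `String::from_utf8_lossy` is swapped in for the bstr display used in the diff above. The strict variant drops a malformed value, while the lenient variant keeps a printable form of it for error messages.

```rust
/// Strict variant (hypothetical): a header value that is not valid UTF-8
/// yields `None`, so the malformed value cannot be echoed back in an error.
fn strict_media_type(raw: &[u8]) -> Option<String> {
    let value = std::str::from_utf8(raw).ok()?;
    let media_type = value.split_once(';').map_or(value, |(mt, _)| mt).trim();
    Some(media_type.to_string())
}

/// Lenient variant (hypothetical): falls back to a lossy rendering of the raw
/// bytes, so that an error message can still show what the client sent.
fn lenient_media_type(raw: &[u8]) -> String {
    match std::str::from_utf8(raw) {
        Ok(value) => value
            .split_once(';')
            .map_or(value, |(mt, _)| mt)
            .trim()
            .to_string(),
        // Invalid bytes are replaced with U+FFFD instead of being dropped.
        Err(_) => String::from_utf8_lossy(raw).into_owned(),
    }
}
```

For well-formed input the two variants agree; they differ only when the header bytes cannot be decoded, which is the case described above as probably not very important.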

MarinPostma

Can you update the tests to make use of the server?

meilisearch-http/tests/content_type.rs

MarinPostma

Looks good to me, thank you :)

Kerollmops requested review from MarinPostma and irevoire November 24, 2021 14:06

irevoire reviewed Nov 24, 2021

irevoire approved these changes Nov 24, 2021

Kerollmops added 2 commits November 24, 2021 15:15

Extract the actual content-type and ignore everything else

bb2f0da

Add a test to make sure that we extract the actual content-type

688b8a9

Kerollmops force-pushed the charset-content-type branch from 756a641 to 688b8a9 Compare November 24, 2021 14:16

MarinPostma suggested changes Nov 24, 2021

Kerollmops requested review from irevoire and MarinPostma November 24, 2021 14:59

Prefer using the mime type

3d7ca58

Kerollmops force-pushed the charset-content-type branch from 21b8dcc to 3d7ca58 Compare November 24, 2021 15:04

irevoire approved these changes Nov 24, 2021

MarinPostma suggested changes Nov 24, 2021

Kerollmops requested a review from MarinPostma November 25, 2021 09:47

MarinPostma approved these changes Nov 25, 2021

MarinPostma merged commit 506349d into new-update-store Nov 25, 2021

MarinPostma deleted the charset-content-type branch November 25, 2021 10:31

gmourier mentioned this pull request Nov 29, 2021

Accept charset in Content-Type header #1866

Closed
