Nested fields #2298

irevoire · 2022-04-07T13:18:13Z

There are a few things that I want to fix AFTER merging this PR, in the following RCs.

Stop the useless conversion

In search.rs I convert a Document to a Value, then the Value to a Document, then back to a Value, etc. I should stop doing all these conversions and stick to one format.
Probably by merging my permissive-json-pointer crate into meilisearch.
That would also give me the opportunity to work directly with obkvs and stop deserializing fields I don't need.

Add more tests specific to nested fields

Everything seems to work, but I should write tests to double-check that nested fields work well with the formatted field.

See how I could stop iterating on hashmaps and instead fill them correctly

This is related to milli. I very often need to iterate over a hashmap to see if a field is a subset of another field. I could probably generate a structure containing all the possible key values.
E.g. the user says doggo is an attribute to retrieve. Instead of iterating over all the attributes to retrieve to check whether doggo.name is a subset of doggo, I should insert doggo.name into the attributes-to-retrieve map.
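A minimal sketch of the idea above, using only the standard library. The names `is_under` and `expand_selectors` are hypothetical and are not taken from milli; the sketch assumes nested fields are addressed with dot-separated names such as `doggo.name`. It expands the user's selectors against the known field names once, so later checks become a single set lookup.

```rust
use std::collections::HashSet;

/// Returns true when `field` is `selector` itself or one of its nested
/// sub-fields: `doggo.name` is under `doggo`, but `doggos` is not.
fn is_under(selector: &str, field: &str) -> bool {
    match field.strip_prefix(selector) {
        Some(rest) => rest.is_empty() || rest.starts_with('.'),
        None => false,
    }
}

/// Expands the selectors against the known field names once, so that later
/// membership checks are a set lookup instead of a scan over the selectors.
fn expand_selectors(selectors: &[&str], known_fields: &[&str]) -> HashSet<String> {
    known_fields
        .iter()
        .filter(|&&field| selectors.iter().any(|&sel| is_under(sel, field)))
        .map(|&field| field.to_string())
        .collect()
}

fn main() {
    // Hypothetical field names, for illustration only.
    let known = ["doggo", "doggo.name", "doggo.age", "doggos", "catto.name"];
    let to_retrieve = expand_selectors(&["doggo"], &known);

    // `doggo.name` was inserted up front; no per-lookup iteration is needed.
    assert!(to_retrieve.contains("doggo.name"));
    assert!(!to_retrieve.contains("doggos"));
    println!("{} fields to retrieve", to_retrieve.len());
}
```

The trade-off is that the expanded set has to be rebuilt whenever the known fields or the selectors change.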

MarinPostma · 2022-04-07T13:56:23Z

meilisearch-lib/src/index/search.rs

+    // then we need to convert the `serde_json::Map` into an `IndexMap`.
+    let document = document.into_iter().collect();


I'm wondering, is there a risk that the fields in the Map are reordered, and returned in a different order than the original document?

Yes, I was concerned about that too. From what I see it works, but nothing specifies that it should, so I guess I'm just lucky.
I would like to move the permissive-json-pointer crate into meilisearch so we could make it work directly with the IndexMap.
Do you think this needs to be fixed before the RC0 or can it wait until the RC1?

@gmourier @curquiza
Basically, the jsons in hits could have their fields in a different order each time.

Actually, when the preserve_order feature of serde_json is enabled, a serde_json::Map is nothing but an IndexMap under the hood: https://docs.serde.rs/src/serde_json/map.rs.html#23-26

and it turns out that we're lucky: https://github.com/meilisearch/meilisearch/blob/main/meilisearch-lib/Cargo.toml#L45

This now begs the question: Do we need to do the back and forth conversion?

Ah! No we don't, and I guess we could just remove the IndexMap entirely and define a Document as a serde_json::Map<String, Value>!

I think we can indeed 🤔

Basically, the jsons in hits could have their fields in a different order each time.

Ok to wait for rc1 to fix this!
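To illustrate the ordering concern in this thread, here is a small standard-library sketch. It relies on the fact that, without the preserve_order feature, serde_json::Map is backed by a BTreeMap, which iterates in key order rather than insertion order. The helper names and the sample document are hypothetical, and the sketch deliberately avoids external crates, so it models the behavior rather than calling serde_json itself.

```rust
use std::collections::BTreeMap;

/// Collects the fields into a sorted map and returns the keys as iterated.
/// This models a map that does not remember insertion order.
fn keys_via_sorted_map(fields: &[(&str, &str)]) -> Vec<String> {
    let map: BTreeMap<&str, &str> = fields.iter().copied().collect();
    map.keys().map(|k| k.to_string()).collect()
}

/// Returns the keys in the order the fields were given.
/// This models an order-preserving map such as an IndexMap.
fn keys_in_insertion_order(fields: &[(&str, &str)]) -> Vec<String> {
    fields.iter().map(|(k, _)| k.to_string()).collect()
}

fn main() {
    // Hypothetical document fields, in the order the user sent them.
    let fields = [("title", "Shazam"), ("id", "1"), ("genre", "action")];

    // The sorted map returns the keys alphabetically.
    assert_eq!(keys_via_sorted_map(&fields), ["genre", "id", "title"]);
    // The order-preserving path returns them as sent.
    assert_eq!(keys_in_insertion_order(&fields), ["title", "id", "genre"]);
    println!("ok");
}
```

With preserve_order enabled, both sides of the conversion discussed above keep insertion order, which is why the round trip happened to work.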

Kerollmops

Looks good to me. We can now wait for the users to try to break this feature! I am also awaiting the benchmarks.

Kerollmops

Looks good to me. Are we waiting for meilisearch/milli#458 to be merged and released?

Kerollmops

Thank you for the clippy fix!

irevoire · 2022-04-07T17:20:56Z

Yep, we need to wait for the release because we need to publish the new Cargo.lock for the CI to run.

Kerollmops

Looks good to me, thank you @irevoire!

See meilisearch/specifications#121

irevoire · 2022-04-07T22:00:05Z

bors merge

2298: Nested fields r=irevoire a=irevoire

Co-authored-by: Tamo <tamo@meilisearch.com>

bors · 2022-04-07T22:08:44Z

Build failed:

Run Clippy

Kerollmops · 2022-04-07T22:10:15Z

Is it a bug in Clippy?

irevoire · 2022-04-07T22:42:19Z

Yeah, looks like it; it says it's an internal compiler error 😒

Also, the benchmarks just finished running and they're worse than I thought 😩

group                                                             indexing_main_4ae7aea3                 indexing_nested_fields_0438ab7d
-----                                                             ----------------------                 -------------------------------
indexing/Indexing geo_point                                       1.00       8.7±0.19s        ? ?/sec    2.81      24.5±0.17s        ? ?/sec
indexing/Indexing movies in three batches                         1.00      18.2±0.26s        ? ?/sec    1.03      18.7±0.27s        ? ?/sec
indexing/Indexing movies with default settings                    1.00      17.6±0.11s        ? ?/sec    1.01      17.9±0.08s        ? ?/sec
indexing/Indexing songs in three batches with default settings    1.00      63.4±0.41s        ? ?/sec    1.08      68.3±0.93s        ? ?/sec
indexing/Indexing songs with default settings                     1.00      53.2±0.95s        ? ?/sec    1.21      64.2±0.93s        ? ?/sec
indexing/Indexing songs without any facets                        1.00      49.2±0.96s        ? ?/sec    1.20      59.2±1.22s        ? ?/sec
indexing/Indexing songs without faceted numbers                   1.00      52.0±0.23s        ? ?/sec    1.21      62.8±0.28s        ? ?/sec
indexing/Indexing wiki                                            1.00   1007.1±10.25s        ? ?/sec    1.01   1014.7±10.74s        ? ?/sec
indexing/Indexing wiki in three batches                           1.00    1136.9±7.83s        ? ?/sec    1.01    1152.7±9.64s        ? ?/sec

I don't really understand how the geosearch indexing can be so slow.

irevoire · 2022-04-11T11:45:22Z

bors merge

bors · 2022-04-11T12:01:22Z

Build succeeded:

2313: fix(search): remove the back and forth between the IndexMap and the serde_json::Map r=irevoire a=irevoire

This is ok because we're using the preserve_order feature in serde_json which is already internally using an IndexMap. See #2298 (comment)

Co-authored-by: Tamo <tamo@meilisearch.com>

irevoire added the breaking change The related changes are breaking for the users label Apr 7, 2022

curquiza requested review from Kerollmops and MarinPostma April 7, 2022 13:23

curquiza added this to the v0.27.0 milestone Apr 7, 2022

MarinPostma suggested changes Apr 7, 2022

irevoire force-pushed the nested_fields branch 2 times, most recently from 7e45f8a to 68da743 Compare April 7, 2022 15:03

Kerollmops approved these changes Apr 7, 2022

Kerollmops previously approved these changes Apr 7, 2022

irevoire dismissed Kerollmops’s stale review via 58258c5 April 7, 2022 17:15

Kerollmops previously approved these changes Apr 7, 2022

irevoire dismissed Kerollmops’s stale review via 0fbb87d April 7, 2022 17:41

irevoire force-pushed the nested_fields branch from 58258c5 to 0fbb87d Compare April 7, 2022 17:41

irevoire requested a review from MarinPostma April 7, 2022 17:41

irevoire mentioned this pull request Apr 7, 2022

Improve the nested fields performance #2300

Closed

5 tasks

Kerollmops previously approved these changes Apr 7, 2022

feat(search): Implements the nested fields

69d3122

See meilisearch/specifications#121

irevoire dismissed Kerollmops’s stale review via 69d3122 April 7, 2022 17:47

irevoire force-pushed the nested_fields branch from 0fbb87d to 69d3122 Compare April 7, 2022 17:47

MarinPostma approved these changes Apr 7, 2022

bors bot merged commit 31584f3 into main Apr 11, 2022

bors bot deleted the nested_fields branch April 11, 2022 12:01

curquiza mentioned this pull request Apr 11, 2022

Support nested fields #2211

Closed

3 tasks

irevoire mentioned this pull request Apr 12, 2022

fix(search): remove the back and forth between the IndexMap and the serde_json::Map #2313

Merged

This was referenced Apr 20, 2022

Fix tests related to changes in placeholder hits order resolver meilisearch/meilisearch-js-plugins#737

Merged

Fix tests dependent on a specific return order for the documents meilisearch/meilisearch-swift#285

Merged

alallema mentioned this pull request Apr 27, 2022

Fix tests due to changing the way of returning documents meilisearch/meilisearch-go#289

Merged

curquiza added the v0.27.0 PRs/issues solved in v0.27.0 label Aug 24, 2022
