From 17972e42ae14d607181a52b5934789e32f6b1541 Mon Sep 17 00:00:00 2001 From: Moshe Date: Mon, 21 Jan 2019 09:22:51 +0200 Subject: [PATCH 01/19] SOLR-13129: added new nested page in Solr ref-guide --- solr/solr-ref-guide/nested-documents.adoc | 121 ++++++++++++++++++++++ 1 file changed, 121 insertions(+) create mode 100644 solr/solr-ref-guide/nested-documents.adoc diff --git a/solr/solr-ref-guide/nested-documents.adoc b/solr/solr-ref-guide/nested-documents.adoc new file mode 100644 index 000000000000..4016f37966fe --- /dev/null +++ b/solr/solr-ref-guide/nested-documents.adoc @@ -0,0 +1,121 @@ +== Nested Child Documents + +Solr supports indexing nested documents such as a blog post parent document and comments as child documents -- or products as parent documents and sizes, colors, or other variations as child documents. +The parent with all children is referred to as a "block" and it explains some of the nomenclature of related features. +At query time, the <> can search these relationships, + and the `[child]` <> can attach child documents to the result documents. +In terms of performance, indexing the relationships between documents usually yields much faster queries than an equivalent "query time join", + since the relationships are already stored in the index and do not need to be computed. +However, nested documents are less flexible than query time joins as it imposes rules that some applications may not be able to accept. + +.Note +[NOTE] +==== +A big limitation is that the whole block of parent-children documents must be updated or deleted together, not separately. +In other words, even if a single child document or the parent document is changed, the whole block of parent-child documents must be indexed together. +_Solr does not enforce this rule_; if it's violated, you may get sporadic query failures or incorrect results. +==== + +Nested documents may be indexed via either the XML or JSON data syntax, and is also supported by <> with javabin. 
+ +=== Schema Notes + + * The schema must include indexed fields field `\_root_`. The value of that field is populated automatically and is the same for all documents in the block, regardless of the inheritance depth. + Fields `\_nest_path_`, `\_nest_parent_` can be configured to store the path of the document in the hierarchy, and the unique `id` of the parent in the previous level. + These 2 fields will be used by NestedUpdateProcessor URP, which is implicitly configured under Solr 8, when `\_root_` field is defined. + * Nested documents are very much documents in their own right even if certain nested documents hold different information from the parent. + Therefore: + ** the schema must be able to represent the fields of any document + ** it may be infeasible to use `required` + ** even child documents need a unique `id` + + +=== Legacy Schema Notes + * The schema must include an indexed, non-stored field `\_root_`. The value of that field is populated automatically and is the same for all documents in the block, regardless of the inheritance depth. + * You must include a field that identifies the parent document as a parent; it can be any field that suits this purpose, and it will be used as input for the <>. + * If you associate a child document as a field (e.g., comment), that field need not be defined in the schema, and probably + shouldn't be as it would be confusing. There is no child document field type. + +=== XML Examples + +For example, here are two documents and their child documents. +It illustrates two styles of adding child documents; the first is associated via a field "comment" (preferred), +and the second is done in the classic way now referred to as an "anonymous" or "unlabelled" child document. +This field label relationship is available to the URP chain in Solr but is ultimately discarded. +Solr 8 will save the relationship. + +[source,xml] +---- + + + 1 + Solr adds block join support + parentDocument + + + 2 + SolrCloud supports it too! 
+ + + + + 3 + New Lucene and Solr release is out + parentDocument + + 4 + Lots of new features + + + +---- + +In this example, we have indexed the parent documents with the field `content_type`, which has the value "parentDocument". +We could have also used a boolean field, such as `isParent`, with a value of "true", or any other similar approach. + +=== JSON Examples + +This example is equivalent to the XML example above. +Again, the field labelled relationship is preferred. +The labelled relationship here is one child document but could have been wrapped in array brackets. +For the anonymous relationship, note the special `\_childDocuments_` key whose contents must be an array of child documents. + +[source,json] +---- +[ + { + "id": "1", + "title": "Solr adds block join support", + "content_type": "parentDocument", + "comment": { + "id": "2", + "comments": "SolrCloud supports it too!" + } + }, + { + "id": "3", + "title": "New Lucene and Solr release is out", + "content_type": "parentDocument", + "_childDocuments_": [ + { + "id": "4", + "comments": "Lots of new features" + } + ] + } +] +---- + +.Legacy Mode +[NOTE] +==== + In legacy mode, these two documents will result in the same docs being indexed(legacy mode does not honor nested relationships). + When quried, child docs will be appended to _childDocuments_ key. +==== + + +=== Querying Nested Documents + + * The `<>` Document Transformer allows to attach child documents using a specialized query language(only regular filters are available in legacy mode). 
+ * <> + From 8702a3384c3ed93e288de452730084962aa13521 Mon Sep 17 00:00:00 2001 From: Moshe Date: Wed, 23 Jan 2019 14:29:23 +0200 Subject: [PATCH 02/19] SOLR-13129: improve nested docs --- solr/solr-ref-guide/nested-documents.adoc | 121 ---------------------- 1 file changed, 121 deletions(-) delete mode 100644 solr/solr-ref-guide/nested-documents.adoc diff --git a/solr/solr-ref-guide/nested-documents.adoc b/solr/solr-ref-guide/nested-documents.adoc deleted file mode 100644 index 4016f37966fe..000000000000 --- a/solr/solr-ref-guide/nested-documents.adoc +++ /dev/null @@ -1,121 +0,0 @@ -== Nested Child Documents - -Solr supports indexing nested documents such as a blog post parent document and comments as child documents -- or products as parent documents and sizes, colors, or other variations as child documents. -The parent with all children is referred to as a "block" and it explains some of the nomenclature of related features. -At query time, the <> can search these relationships, - and the `[child]` <> can attach child documents to the result documents. -In terms of performance, indexing the relationships between documents usually yields much faster queries than an equivalent "query time join", - since the relationships are already stored in the index and do not need to be computed. -However, nested documents are less flexible than query time joins as it imposes rules that some applications may not be able to accept. - -.Note -[NOTE] -==== -A big limitation is that the whole block of parent-children documents must be updated or deleted together, not separately. -In other words, even if a single child document or the parent document is changed, the whole block of parent-child documents must be indexed together. -_Solr does not enforce this rule_; if it's violated, you may get sporadic query failures or incorrect results. -==== - -Nested documents may be indexed via either the XML or JSON data syntax, and is also supported by <> with javabin. 
- -=== Schema Notes - - * The schema must include indexed fields field `\_root_`. The value of that field is populated automatically and is the same for all documents in the block, regardless of the inheritance depth. - Fields `\_nest_path_`, `\_nest_parent_` can be configured to store the path of the document in the hierarchy, and the unique `id` of the parent in the previous level. - These 2 fields will be used by NestedUpdateProcessor URP, which is implicitly configured under Solr 8, when `\_root_` field is defined. - * Nested documents are very much documents in their own right even if certain nested documents hold different information from the parent. - Therefore: - ** the schema must be able to represent the fields of any document - ** it may be infeasible to use `required` - ** even child documents need a unique `id` - - -=== Legacy Schema Notes - * The schema must include an indexed, non-stored field `\_root_`. The value of that field is populated automatically and is the same for all documents in the block, regardless of the inheritance depth. - * You must include a field that identifies the parent document as a parent; it can be any field that suits this purpose, and it will be used as input for the <>. - * If you associate a child document as a field (e.g., comment), that field need not be defined in the schema, and probably - shouldn't be as it would be confusing. There is no child document field type. - -=== XML Examples - -For example, here are two documents and their child documents. -It illustrates two styles of adding child documents; the first is associated via a field "comment" (preferred), -and the second is done in the classic way now referred to as an "anonymous" or "unlabelled" child document. -This field label relationship is available to the URP chain in Solr but is ultimately discarded. -Solr 8 will save the relationship. - -[source,xml] ----- - - - 1 - Solr adds block join support - parentDocument - - - 2 - SolrCloud supports it too! 
- - - - - 3 - New Lucene and Solr release is out - parentDocument - - 4 - Lots of new features - - - ----- - -In this example, we have indexed the parent documents with the field `content_type`, which has the value "parentDocument". -We could have also used a boolean field, such as `isParent`, with a value of "true", or any other similar approach. - -=== JSON Examples - -This example is equivalent to the XML example above. -Again, the field labelled relationship is preferred. -The labelled relationship here is one child document but could have been wrapped in array brackets. -For the anonymous relationship, note the special `\_childDocuments_` key whose contents must be an array of child documents. - -[source,json] ----- -[ - { - "id": "1", - "title": "Solr adds block join support", - "content_type": "parentDocument", - "comment": { - "id": "2", - "comments": "SolrCloud supports it too!" - } - }, - { - "id": "3", - "title": "New Lucene and Solr release is out", - "content_type": "parentDocument", - "_childDocuments_": [ - { - "id": "4", - "comments": "Lots of new features" - } - ] - } -] ----- - -.Legacy Mode -[NOTE] -==== - In legacy mode, these two documents will result in the same docs being indexed(legacy mode does not honor nested relationships). - When quried, child docs will be appended to _childDocuments_ key. -==== - - -=== Querying Nested Documents - - * The `<>` Document Transformer allows to attach child documents using a specialized query language(only regular filters are available in legacy mode). 
- * <> - From 3058396666738c6d8e8b9e95b7edbf014be0afc5 Mon Sep 17 00:00:00 2001 From: Moshe Date: Wed, 23 Jan 2019 18:01:41 +0200 Subject: [PATCH 03/19] SOLR-13129: improve nested queries --- .../src/blockjoin-faceting.adoc | 2 +- solr/solr-ref-guide/src/other-parsers.adoc | 2 +- solr/solr-ref-guide/src/searching.adoc | 2 + .../src/transforming-result-documents.adoc | 2 +- .../uploading-data-with-index-handlers.adoc | 102 ------------------ 5 files changed, 5 insertions(+), 105 deletions(-) diff --git a/solr/solr-ref-guide/src/blockjoin-faceting.adoc b/solr/solr-ref-guide/src/blockjoin-faceting.adoc index 18a74084f38d..60ea4c189d15 100644 --- a/solr/solr-ref-guide/src/blockjoin-faceting.adoc +++ b/solr/solr-ref-guide/src/blockjoin-faceting.adoc @@ -41,7 +41,7 @@ This example shows how you could add this search components to `solrconfig.xml` This component can be added into any search request handler. This component work with distributed search in SolrCloud mode. -Documents should be added in children-parent blocks as described in <>. Examples: +Documents should be added in children-parent blocks as described in <>. Examples: .Sample document [source,xml] diff --git a/solr/solr-ref-guide/src/other-parsers.adoc b/solr/solr-ref-guide/src/other-parsers.adoc index f22dcdc0599e..7c50873d5aef 100644 --- a/solr/solr-ref-guide/src/other-parsers.adoc +++ b/solr/solr-ref-guide/src/other-parsers.adoc @@ -24,7 +24,7 @@ Many of these parsers are expressed the same way as <>. +There are two query parsers that support block joins. These parsers allow indexing and searching for relational content that has been <>. 
The example usage of the query parsers below assumes these two documents and each of their child documents have been indexed: diff --git a/solr/solr-ref-guide/src/searching.adoc b/solr/solr-ref-guide/src/searching.adoc index 9fa0e57cae84..52fbd818adff 100644 --- a/solr/solr-ref-guide/src/searching.adoc +++ b/solr/solr-ref-guide/src/searching.adoc @@ -10,6 +10,7 @@ spell-checking, + query-re-ranking, + transforming-result-documents, + + nested-documents, + suggester, + morelikethis, + pagination-of-results, + @@ -69,6 +70,7 @@ This section describes how Solr works with search requests. It covers the follow ** <>: How to use LTR to run machine learned ranking models in Solr. * <>: Detailed information about using `DocTransformers` to add computed information to individual documents +* <>: Detailed information about nested documents * <>: Detailed information about Solr's powerful autosuggest component. * <>: Detailed information about Solr's similar results query component. * <>: Detailed information about fetching paginated results for display in a UI, or for fetching all documents matching a query. diff --git a/solr/solr-ref-guide/src/transforming-result-documents.adoc b/solr/solr-ref-guide/src/transforming-result-documents.adoc index 97917b48dbd4..7063e393edf8 100644 --- a/solr/solr-ref-guide/src/transforming-result-documents.adoc +++ b/solr/solr-ref-guide/src/transforming-result-documents.adoc @@ -124,7 +124,7 @@ A default style can be configured by specifying an `args` parameter in your `sol === [child] - ChildDocTransformerFactory -This transformer returns all <> of each parent document matching your query in a flat list nested inside the matching parent document. This is useful when you have indexed nested child documents and want to retrieve the child documents for the relevant parent documents for any type of search query. +This transformer returns all <> of each parent document matching your query in a flat list nested inside the matching parent document. 
This is useful when you have indexed nested child documents and want to retrieve the child documents for the relevant parent documents for any type of search query. [source,plain] ---- diff --git a/solr/solr-ref-guide/src/uploading-data-with-index-handlers.adoc b/solr/solr-ref-guide/src/uploading-data-with-index-handlers.adoc index 3bf9b52a1d5e..1c090a09693e 100644 --- a/solr/solr-ref-guide/src/uploading-data-with-index-handlers.adoc +++ b/solr/solr-ref-guide/src/uploading-data-with-index-handlers.adoc @@ -541,105 +541,3 @@ In addition to the `/update` handler, there is an additional CSV specific reques |=== The `/update/csv` path may be useful for clients sending in CSV formatted update commands from applications where setting the Content-Type proves difficult. - -== Nested Child Documents - -Solr supports indexing nested documents such as a blog post parent document and comments as child documents -- or products as parent documents and sizes, colors, or other variations as child documents. -The parent with all children is referred to as a "block" and it explains some of the nomenclature of related features. -At query time, the <> can search these relationships, - and the `[child]` <> can attach child documents to the result documents. -In terms of performance, indexing the relationships between documents usually yields much faster queries than an equivalent "query time join", - since the relationships are already stored in the index and do not need to be computed. -However, nested documents are less flexible than query time joins as it imposes rules that some applications may not be able to accept. - -.Note -[NOTE] -==== -A big limitation is that the whole block of parent-children documents must be updated or deleted together, not separately. -In other words, even if a single child document or the parent document is changed, the whole block of parent-child documents must be indexed together. 
-_Solr does not enforce this rule_; if it's violated, you may get sporadic query failures or incorrect results. -==== - -Nested documents may be indexed via either the XML or JSON data syntax, and is also supported by <> with javabin. - -=== Schema Notes - - * The schema must include an indexed, non-stored field `\_root_`. The value of that field is populated automatically and is the same for all documents in the block, regardless of the inheritance depth. - * Nested documents are very much documents in their own right even if certain nested documents hold different information from the parent. - Therefore: - ** the schema must be able to represent the fields of any document - ** it may be infeasible to use `required` - ** even child documents need a unique `id` - * You must include a field that identifies the parent document as a parent; it can be any field that suits this purpose, and it will be used as input for the <>. - * If you associate a child document as a field (e.g., comment), that field need not be defined in the schema, and probably - shouldn't be as it would be confusing. There is no child document field type. - -=== XML Examples - -For example, here are two documents and their child documents. -It illustrates two styles of adding child documents; the first is associated via a field "comment" (preferred), -and the second is done in the classic way now referred to as an "anonymous" or "unlabelled" child document. -This field label relationship is available to the URP chain in Solr but is ultimately discarded. -Solr 8 will save the relationship. - -[source,xml] ----- - - - 1 - Solr adds block join support - parentDocument - - - 2 - SolrCloud supports it too! - - - - - 3 - New Lucene and Solr release is out - parentDocument - - 4 - Lots of new features - - - ----- - -In this example, we have indexed the parent documents with the field `content_type`, which has the value "parentDocument". 
-We could have also used a boolean field, such as `isParent`, with a value of "true", or any other similar approach. - -=== JSON Examples - -This example is equivalent to the XML example above. -Again, the field labelled relationship is preferred. -The labelled relationship here is one child document but could have been wrapped in array brackets. -For the anonymous relationship, note the special `\_childDocuments_` key whose contents must be an array of child documents. - -[source,json] ----- -[ - { - "id": "1", - "title": "Solr adds block join support", - "content_type": "parentDocument", - "comment": { - "id": "2", - "comments": "SolrCloud supports it too!" - } - }, - { - "id": "3", - "title": "New Lucene and Solr release is out", - "content_type": "parentDocument", - "_childDocuments_": [ - { - "id": "4", - "comments": "Lots of new features" - } - ] - } -] ----- From 7194813a0bc28bdc23c8066911ad55d05475979e Mon Sep 17 00:00:00 2001 From: Moshe Date: Sun, 27 Jan 2019 19:19:41 +0200 Subject: [PATCH 04/19] add nested.adoc --- solr/solr-ref-guide/src/nested-documents.adoc | 299 ++++++++++++++++++ 1 file changed, 299 insertions(+) create mode 100644 solr/solr-ref-guide/src/nested-documents.adoc diff --git a/solr/solr-ref-guide/src/nested-documents.adoc b/solr/solr-ref-guide/src/nested-documents.adoc new file mode 100644 index 000000000000..35b1018bdae6 --- /dev/null +++ b/solr/solr-ref-guide/src/nested-documents.adoc @@ -0,0 +1,299 @@ += Nested Child Documents +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. 
You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +Solr supports indexing nested documents such as a blog post parent document and comments as child documents -- or products as parent documents and sizes, colors, or other variations as child documents. +The parent with all of its children is referred to as a "block", which explains some of the nomenclature of related features. +At query time, the <> can search these relationships, + and the `[child]` <> can attach child documents to the result documents. +In terms of performance, indexing the relationships between documents usually yields much faster queries than an equivalent "query time join", + since the relationships are already stored in the index and do not need to be computed. +However, nested documents are less flexible than query time joins because they impose rules that some applications may not be able to accept. + +.Note +[NOTE] +==== +A big limitation is that the whole block of parent-children documents must be updated or deleted together, not separately. +In other words, even if a single child document or the parent document is changed, the whole block of parent-child documents must be indexed together. +_Solr does not enforce this rule_; if it's violated, you may get sporadic query failures or incorrect results. +==== + +Nested documents may be indexed via either the XML or JSON data syntax, and are also supported by <> with javabin. + +=== Schema Notes + + * The schema must include indexed fields field `\_root_`. 
The value of that field is populated automatically and is the same for all documents in the block, regardless of the inheritance depth. + Fields `\_nest_path_`, `\_nest_parent_` can be configured to store the path of the document in the hierarchy, and the unique `id` of the parent in the previous level. + These 2 fields will be used by NestedUpdateProcessor URP, which is implicitly configured under Solr 8, when `\_root_` field is defined. + * Nested documents are very much documents in their own right even if certain nested documents hold different information from the parent. + Therefore: + ** the schema must be able to represent the fields of any document + ** it may be infeasible to use `required` + ** even child documents need a unique `id` + + +=== Legacy Schema Notes + * The schema must include an indexed, non-stored field `\_root_`. The value of that field is populated automatically and is the same for all documents in the block, regardless of the inheritance depth. + * You must include a field that identifies the parent document as a parent; it can be any field that suits this purpose, and it will be used as input for the <>. + * If you associate a child document as a field (e.g., comment), that field need not be defined in the schema, and probably + shouldn't be as it would be confusing. There is no child document field type. + +=== XML Examples + +For example, here are two documents and their child documents. +It illustrates two styles of adding child documents; the first is associated via a field "comment" (preferred), +and the second is done in the classic way now referred to as an "anonymous" or "unlabelled" child document. +This field label relationship is available to the URP chain in Solr but is ultimately discarded. +Solr 8 will save the relationship. + +[source,xml] +---- + + + 1 + Solr adds block join support + parentDocument + + + 2 + SolrCloud supports it too! 
+ + + + + 3 + New Lucene and Solr release is out + parentDocument + + 4 + Lots of new features + + + +---- + +In this example, we have indexed the parent documents with the field `content_type`, which has the value "parentDocument". +We could have also used a boolean field, such as `isParent`, with a value of "true", or any other similar approach. + +=== JSON Examples + +This example is similar to the XML example above. +Again, the field labelled relationship is preferred. +The labelled relationship here is an array of child documents, but a single child document could also be given without the array brackets. +For the anonymous relationship, note the special `\_childDocuments_` key whose contents must be an array of child documents. + +[source,json] +---- +[ + { + "id": "1", + "title": "Solr adds block join support", + "content_type": "parentDocument", + "comments": [{ + "id": "2", + "content": "SolrCloud supports it too!" + }, + { + "id": "3", + "content": "New filter syntax" + } + ] + }, + { + "id": "4", + "title": "New Lucene and Solr release is out", + "content_type": "parentDocument", + "_childDocuments_": [ + { + "id": "5", + "comments": "Lots of new features" + } + ] + } +] +---- + +.Legacy Mode +[NOTE] +==== + In legacy mode, these two documents will result in the same docs being indexed (legacy mode does not honor nested relationships). + When queried, child docs will be appended to the `\_childDocuments_` key. 
+==== + + +=== Querying Nested Documents + + * `<>` Document Transformer + * <> + +=== Query Examples + +For the upcoming examples, assume the following documents have been indexed: + +==== +[source,json] +---- +[ + { + "id": "1", + "title": "Cooking Recommendations", + "tags": ["cooking", "meetup"], + "posts": [{ + "id": "2", + "title": "Cookies", + "comments": [{ + "id": "3", + "content": "Lovely recipe" + }, + { + "id": "4", + "content": "A-" + } + ] + }, + { + "id": "5", + "title": "Cakes" + } + ] + }, + { + "id": "6", + "title": "For Hire", + "tags": ["professional", "jobs"], + "posts": [{ + "id": "7", + "title": "Search Engineer", + "comments": [{ + "id": "8", + "content": "I am interested" + }, + { + "id": "9", + "content": "How large is the team?" + } + ] + }, + { + "id": "10", + "title": "Low level Engineer" + } + ] + } +] +---- +==== + +==== `<>` + * Can be used to enrich query results with the documents' descendants. + `q=id:1, + + fl=id,[child childFilter=/comments/content:recipe]` + + The child filter matches only the first comment of doc(id:1), + therefore only that particular comment will be appended to the result. + +[source,json] +---- + { "response":{"numFound":1,"start":0,"docs":[ + { + "id": "1", + "title": "Cooking Recommendations", + "tags": ["cooking", "meetup"], + "posts": [{ + "id": "2", + "title": "Cookies", + "comments": [{ + "id": "3", + "content": "Lovely recipe" + }] + }] + }] + } + } +---- + +==== <> + * Can be used to retrieve children of a matching document. + + * `q={!child of='_nest_path_:/posts'}title:"Search Engineer"` + + This query returns the comments (children) of the posts whose title matches "Search Engineer". + +[source,json] +---- + { "response":{"numFound":2,"start":0,"docs":[ + { + "id": "8", + "content": "I am interested" + }, + { + "id": "9", + "content": "How large is the team?" + } + ]} + } +---- + +==== <> + * Can be used to retrieve parents of a child document. 
+ + * Can be used to query the documents in the JSON example above, + `q={!parent which='-_nest_path_:* \*:*'}title:"Search Engineer"` + + This query returns the parent at the root (since the all-parents filter matches only root documents). + +[source,json] +---- + { "response":{"numFound":1,"start":0,"docs":[{ + "id": "6", + "title": "For Hire", + "tags": ["professional", "jobs"] + } + ]} + } +---- + +==== Combining Block Join Query Parser with Child Doc Transformer + * The combination of these two features enables seamless creation of powerful queries. + + For example, query for posts that are under a page tagged as a job and contain the words "Search Engineer". + The comments for matching posts can also be fetched, all in a single Solr query. + `q=+{!child of='-\_nest_path_:* *:*'}+tags:"jobs" &fl=*,[child] + &fq=\_nest_path_:/posts` + + This query returns all posts, together with their comments, that have "Search Engineer" in their title + and are under a page tagged with "jobs". + +[source,json] +---- + { "response":{"numFound":1,"start":0,"docs":[ + { + "id": "7", + "title": "Search Engineer", + "comments": [{ + "id": "8", + "content": "I am interested" + }, + { + "id": "9", + "content": "How large is the team?" 
+ } + ] + }, + { + "id": "10", + "title": "Low level Engineer" + }] + } + } +---- + From d52811337607f749ba875f2b5eece06ba950f9e3 Mon Sep 17 00:00:00 2001 From: Moshe Date: Sun, 27 Jan 2019 19:27:19 +0200 Subject: [PATCH 05/19] add nested docs to ref-guide index --- solr/solr-ref-guide/src/index.adoc | 23 ++++++++++++++++++++++- solr/solr-ref-guide/src/searching.adoc | 2 -- 2 files changed, 22 insertions(+), 3 deletions(-) diff --git a/solr/solr-ref-guide/src/index.adoc b/solr/solr-ref-guide/src/index.adoc index bb3c3962953b..8c2416359b48 100644 --- a/solr/solr-ref-guide/src/index.adoc +++ b/solr/solr-ref-guide/src/index.adoc @@ -1,5 +1,5 @@ = Apache Solr Reference Guide -:page-children: about-this-guide, getting-started, deployment-and-operations, using-the-solr-administration-user-interface, documents-fields-and-schema-design, understanding-analyzers-tokenizers-and-filters, indexing-and-basic-data-operations, searching, streaming-expressions, solrcloud, legacy-scaling-and-distribution, the-well-configured-solr-instance, monitoring-solr, securing-solr, client-apis, further-assistance, solr-glossary, errata, how-to-contribute +:page-children: about-this-guide, getting-started, deployment-and-operations, using-the-solr-administration-user-interface, documents-fields-and-schema-design, understanding-analyzers-tokenizers-and-filters, indexing-and-basic-data-operations, searching, nested-documents, streaming-expressions, solrcloud, legacy-scaling-and-distribution, the-well-configured-solr-instance, monitoring-solr, securing-solr, client-apis, further-assistance, solr-glossary, errata, how-to-contribute :page-notitle: :page-toc: false :page-layout: home @@ -90,6 +90,27 @@ The *<>* section guides yo **** -- +[.row.match-my-cols] +-- +.Nested Documents +[sidebar.col-sm-6.col-md-4] +**** +* <>: Detailed information about nested documents. 
+ +**** + +.Searching Documents +[sidebar.col-sm-6.col-md-4] +**** + +*<>*: This section presents an overview of the search process in Solr. It describes the main components used in searches, including request handlers, query parsers, and response writers. It lists the query parameters that can be passed to Solr, and it describes features such as boosting and faceting, which can be used to fine-tune search results. + +*<>*: A stream processing language for Solr, with a suite of functions to perform many types of queries and parallel execution tasks. + +*<>*: This section tells you how to access Solr through various client APIs, including JavaScript, JSON, and Ruby. +**** +-- + [.row] -- diff --git a/solr/solr-ref-guide/src/searching.adoc b/solr/solr-ref-guide/src/searching.adoc index 52fbd818adff..9fa0e57cae84 100644 --- a/solr/solr-ref-guide/src/searching.adoc +++ b/solr/solr-ref-guide/src/searching.adoc @@ -10,7 +10,6 @@ spell-checking, + query-re-ranking, + transforming-result-documents, + - nested-documents, + suggester, + morelikethis, + pagination-of-results, + @@ -70,7 +69,6 @@ This section describes how Solr works with search requests. It covers the follow ** <>: How to use LTR to run machine learned ranking models in Solr. * <>: Detailed information about using `DocTransformers` to add computed information to individual documents -* <>: Detailed information about nested documents * <>: Detailed information about Solr's powerful autosuggest component. * <>: Detailed information about Solr's similar results query component. * <>: Detailed information about fetching paginated results for display in a UI, or for fetching all documents matching a query. 
From 3340e5be3ac687f696fc2d55ac41ed473b23184c Mon Sep 17 00:00:00 2001 From: Moshe Date: Wed, 30 Jan 2019 17:04:09 +0200 Subject: [PATCH 06/19] SOLR-13129: improve nested links on index homepage --- solr/solr-ref-guide/src/index.adoc | 15 ++------------- 1 file changed, 2 insertions(+), 13 deletions(-) diff --git a/solr/solr-ref-guide/src/index.adoc b/solr/solr-ref-guide/src/index.adoc index 8c2416359b48..fbff54a47a36 100644 --- a/solr/solr-ref-guide/src/index.adoc +++ b/solr/solr-ref-guide/src/index.adoc @@ -95,19 +95,8 @@ The *<>* section guides yo .Nested Documents [sidebar.col-sm-6.col-md-4] **** -* <>: Detailed information about nested documents. - -**** - -.Searching Documents -[sidebar.col-sm-6.col-md-4] -**** - -*<>*: This section presents an overview of the search process in Solr. It describes the main components used in searches, including request handlers, query parsers, and response writers. It lists the query parameters that can be passed to Solr, and it describes features such as boosting and faceting, which can be used to fine-tune search results. - -*<>*: A stream processing language for Solr, with a suite of functions to perform many types of queries and parallel execution tasks. - -*<>*: This section tells you how to access Solr through various client APIs, including JavaScript, JSON, and Ruby. +* <>: Detailed information about index configuration for nested documents. +* <>: Detailed information about nested documents. 
**** -- From 9e57dd91cfb3a73297cb27bff3bae034897aa773 Mon Sep 17 00:00:00 2001 From: Moshe Date: Thu, 31 Jan 2019 11:02:09 +0200 Subject: [PATCH 07/19] SOLR-13129: change legacy mode to root only and fix index --- solr/solr-ref-guide/src/index.adoc | 2 +- solr/solr-ref-guide/src/nested-documents.adoc | 50 +++++++++++-------- 2 files changed, 31 insertions(+), 21 deletions(-) diff --git a/solr/solr-ref-guide/src/index.adoc b/solr/solr-ref-guide/src/index.adoc index fbff54a47a36..f34a1878e8ce 100644 --- a/solr/solr-ref-guide/src/index.adoc +++ b/solr/solr-ref-guide/src/index.adoc @@ -96,7 +96,7 @@ The *<>* section guides yo [sidebar.col-sm-6.col-md-4] **** * <>: Detailed information about index configuration for nested documents. -* <>: Detailed information about nested documents. +* <>: Querying nested documents how to guide. **** -- diff --git a/solr/solr-ref-guide/src/nested-documents.adoc b/solr/solr-ref-guide/src/nested-documents.adoc index 35b1018bdae6..7000c6290432 100644 --- a/solr/solr-ref-guide/src/nested-documents.adoc +++ b/solr/solr-ref-guide/src/nested-documents.adoc @@ -36,9 +36,9 @@ Nested documents may be indexed via either the XML or JSON data syntax, and is a === Schema Notes - * The schema must include indexed fields field `\_root_`. The value of that field is populated automatically and is the same for all documents in the block, regardless of the inheritance depth. - Fields `\_nest_path_`, `\_nest_parent_` can be configured to store the path of the document in the hierarchy, and the unique `id` of the parent in the previous level. - These 2 fields will be used by NestedUpdateProcessor URP, which is implicitly configured under Solr 8, when `\_root_` field is defined. + * The schema must include indexed field `\_root_`. The value of that field is populated automatically and is the same for all documents in the block, regardless of the inheritance depth. The id of the top document in every nested hierarchy is populated in this field. 
+ * `\_nest_path_` can be configured to store the path of the document in the hierarchy + * `\_nest_parent_` can be configured to store the `id` of the parent in the previous level * Nested documents are very much documents in their own right even if certain nested documents hold different information from the parent. Therefore: ** the schema must be able to represent the fields of any document @@ -46,7 +46,10 @@ Nested documents may be indexed via either the XML or JSON data syntax, and is a ** even child documents need a unique `id` -=== Legacy Schema Notes +=== Rudimentary Root-only schemas + * These schemas do not contain any other nested related fields apart for `\_root_`. + + In this mode relationship types(field names) between parents and their children are not saved. + All children are indexed under the `\_childDocuments_` field. * The schema must include an indexed, non-stored field `\_root_`. The value of that field is populated automatically and is the same for all documents in the block, regardless of the inheritance depth. * You must include a field that identifies the parent document as a parent; it can be any field that suits this purpose, and it will be used as input for the <>. * If you associate a child document as a field (e.g., comment), that field need not be defined in the schema, and probably @@ -127,18 +130,20 @@ For the anonymous relationship, note the special `\_childDocuments_` key whose c ] ---- -.Legacy Mode +.Root Only Mode [NOTE] ==== - In legacy mode, these two documents will result in the same docs being indexed(legacy mode does not honor nested relationships). + In Root-only schemas, these two documents will result in the same docs being indexed(Root-only schemas do not honor nested relationships). When quried, child docs will be appended to _childDocuments_ key. 
==== === Querying Nested Documents - * `<>` Document Transformer - * <> + * `<>` Document Transformer + * <> + * <> + * <> === Query Examples @@ -198,9 +203,11 @@ For the upcoming examples, assume the following documents have been indexed: ---- ==== -==== `<>` - * Can be used enrich query results with the documents' descendsnts. - `q=id:1, + +==== Child Doc Transformer +Can be used enrich query results with the documents' descendants. + +For a detailed explanation of this parser, click <>. + +* `q=id:1, fl=id,[child childFilter=/comments/content:recipe]` + The child Filter will only match the first comment of doc(id:1), therefore only that particular comment will be appended to the result. @@ -225,8 +232,9 @@ For the upcoming examples, assume the following documents have been indexed: } ---- -==== <> - * Can be used to retrieve children of a matching document. +==== Children Query Parser +Can be used to retrieve children of a matching document. + +For a detailed explanation of this parser, click <>. * `q={!child of='_nest_path_:/posts}content:"Search Engineer"` + This query returns the parent at the root(since all parents filter returns root documents). @@ -246,8 +254,9 @@ For the upcoming examples, assume the following documents have been indexed: } ---- -==== <> - * Can be used to retrieve parents of a child document. +==== Parents Query Parser +Can be used to retrieve parents of a child document. + +For a detailed explanation of this parser, click <>. * Can be used to query the doc in JSON example, `q={!parent which='-_nest_path_:* \*:*'}content:"Search Engineer"` + @@ -264,11 +273,12 @@ For the upcoming examples, assume the following documents have been indexed: } ---- -==== Combining Block Join Query Parser with Child Doc Transformer - * The combination of these two features enable seamless creation of powerful queries. + - For example, querying posts which are under a page tagged as a job, contain the words "Search Engineer". 
- The comments for matching posts can also be fetched, all done in a single Solr Query. - `q=+{!child of='-\_nest_path_:* *:*'}+tags:"jobs" &fl=*,[child] +==== Combining Block Join Query Parsers with Child Doc Transformer +The combination of these two features enable seamless creation of powerful queries. + +For example, querying posts which are under a page tagged as a job, contain the words "Search Engineer". +The comments for matching posts can also be fetched, all done in a single Solr Query. + + * `q=+{!child of='-\_nest_path_:* *:*'}+tags:"jobs" &fl=*,[child] &fq=\_nest_path_:/posts` + This query returns all posts and their comments, which had "Search Engineer" in their title, and were under a page tagged with "jobs". From 4c320914810484a27cd00a7537a0d8f4c5dc2809 Mon Sep 17 00:00:00 2001 From: Moshe Date: Sun, 3 Feb 2019 16:09:03 +0200 Subject: [PATCH 08/19] SOLR-13129: fix for PR review --- solr/solr-ref-guide/src/nested-documents.adoc | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/solr/solr-ref-guide/src/nested-documents.adoc b/solr/solr-ref-guide/src/nested-documents.adoc index 7000c6290432..acdf76d23a6b 100644 --- a/solr/solr-ref-guide/src/nested-documents.adoc +++ b/solr/solr-ref-guide/src/nested-documents.adoc @@ -47,9 +47,9 @@ Nested documents may be indexed via either the XML or JSON data syntax, and is a === Rudimentary Root-only schemas - * These schemas do not contain any other nested related fields apart for `\_root_`. + - In this mode relationship types(field names) between parents and their children are not saved. - All children are indexed under the `\_childDocuments_` field. + * These schemas do not contain any other nested related fields apart from `\_root_`. + + In this mode relationship types(field names) between parents and their children are not saved. + + In this case <> transformer returns all children under the `\_childDocuments_` field. * The schema must include an indexed, non-stored field `\_root_`. 
The value of that field is populated automatically and is the same for all documents in the block, regardless of the inheritance depth. * You must include a field that identifies the parent document as a parent; it can be any field that suits this purpose, and it will be used as input for the <>. * If you associate a child document as a field (e.g., comment), that field need not be defined in the schema, and probably @@ -278,7 +278,7 @@ The combination of these two features enable seamless creation of powerful queri For example, querying posts which are under a page tagged as a job, contain the words "Search Engineer". The comments for matching posts can also be fetched, all done in a single Solr Query. - * `q=+{!child of='-\_nest_path_:* *:*'}+tags:"jobs" &fl=*,[child] + * `q=+{!child of='-\_nest_path_:* \*:*'}+tags:"jobs" &fl=*,[child] &fq=\_nest_path_:/posts` + This query returns all posts and their comments, which had "Search Engineer" in their title, and were under a page tagged with "jobs". From db78d70e45adcfdb361ce370d2c2dcc89b383373 Mon Sep 17 00:00:00 2001 From: Moshe Date: Sun, 3 Feb 2019 16:10:20 +0200 Subject: [PATCH 09/19] SOLR-13129: fix for PR review --- solr/solr-ref-guide/src/nested-documents.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/solr/solr-ref-guide/src/nested-documents.adoc b/solr/solr-ref-guide/src/nested-documents.adoc index acdf76d23a6b..2eac49843458 100644 --- a/solr/solr-ref-guide/src/nested-documents.adoc +++ b/solr/solr-ref-guide/src/nested-documents.adoc @@ -19,7 +19,7 @@ Solr supports indexing nested documents such as a blog post parent document and comments as child documents -- or products as parent documents and sizes, colors, or other variations as child documents. The parent with all children is referred to as a "block" and it explains some of the nomenclature of related features. 
At query time, the <> can search these relationships, - and the `[child]` <> can attach child documents to the result documents. + and the `<` Document Transformer>> can attach child documents to the result documents. In terms of performance, indexing the relationships between documents usually yields much faster queries than an equivalent "query time join", since the relationships are already stored in the index and do not need to be computed. However, nested documents are less flexible than query time joins as it imposes rules that some applications may not be able to accept. From dbfb39bb873c51bb7ade89a86098cf95494d74aa Mon Sep 17 00:00:00 2001 From: Moshe Date: Sun, 3 Feb 2019 16:12:58 +0200 Subject: [PATCH 10/19] SOLR-13129: fix for PR review --- solr/solr-ref-guide/src/nested-documents.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/solr/solr-ref-guide/src/nested-documents.adoc b/solr/solr-ref-guide/src/nested-documents.adoc index 2eac49843458..98ca9d3684c7 100644 --- a/solr/solr-ref-guide/src/nested-documents.adoc +++ b/solr/solr-ref-guide/src/nested-documents.adoc @@ -19,7 +19,7 @@ Solr supports indexing nested documents such as a blog post parent document and comments as child documents -- or products as parent documents and sizes, colors, or other variations as child documents. The parent with all children is referred to as a "block" and it explains some of the nomenclature of related features. At query time, the <> can search these relationships, - and the `<` Document Transformer>> can attach child documents to the result documents. + and the `<>` Document Transformer can attach child documents to the result documents. In terms of performance, indexing the relationships between documents usually yields much faster queries than an equivalent "query time join", since the relationships are already stored in the index and do not need to be computed. 
However, nested documents are less flexible than query time joins as it imposes rules that some applications may not be able to accept. From 5b61c39f3a598da55d3b2a06a168812028054d62 Mon Sep 17 00:00:00 2001 From: Moshe Date: Mon, 4 Feb 2019 07:29:49 +0200 Subject: [PATCH 11/19] SOLR-13129: add space between bullets in index --- solr/solr-ref-guide/src/index.adoc | 2 ++ 1 file changed, 2 insertions(+) diff --git a/solr/solr-ref-guide/src/index.adoc b/solr/solr-ref-guide/src/index.adoc index f34a1878e8ce..d646ba2bbfca 100644 --- a/solr/solr-ref-guide/src/index.adoc +++ b/solr/solr-ref-guide/src/index.adoc @@ -95,7 +95,9 @@ The *<>* section guides yo .Nested Documents [sidebar.col-sm-6.col-md-4] **** + * <>: Detailed information about index configuration for nested documents. + * <>: Querying nested documents how to guide. **** -- From b1f8ab1b958d7caf35f096454be2a795d6b6c03d Mon Sep 17 00:00:00 2001 From: Moshe Date: Mon, 4 Feb 2019 08:11:28 +0200 Subject: [PATCH 12/19] SOLR-13129: remove redundant indexing nested json --- ...transforming-and-indexing-custom-json.adoc | 100 ------------------ 1 file changed, 100 deletions(-) diff --git a/solr/solr-ref-guide/src/transforming-and-indexing-custom-json.adoc b/solr/solr-ref-guide/src/transforming-and-indexing-custom-json.adoc index 26bd60bd8458..c2fd06ece877 100644 --- a/solr/solr-ref-guide/src/transforming-and-indexing-custom-json.adoc +++ b/solr/solr-ref-guide/src/transforming-and-indexing-custom-json.adoc @@ -777,106 +777,6 @@ curl 'http://localhost:8983/api/collections/techproducts/update/json' -H 'Conten ==== -- -== Indexing Nested Documents - -The following is an example of indexing nested documents: - -[.dynamic-tabs] --- -[example.tab-pane#v1nested] -==== -[.tab-label]*V1 API* -[source,bash] ----- -curl 'http://localhost:8983/solr/techproducts/update/json/docs?split=/|/orgs'\ - -H 'Content-type:application/json' -d '{ - "name": "Joe Smith", - "phone": 876876687, - "orgs": [ - { - "name": "Microsoft", - "city": 
"Seattle", - "zip": 98052 - }, - { - "name": "Apple", - "city": "Cupertino", - "zip": 95014 - } - ] -}' ----- -==== - -[example.tab-pane#v2nested] -==== -[.tab-label]*V2 API Standalone Solr* -[source,bash] ----- -curl 'http://localhost:8983/api/cores/techproducts/update/json?split=/|/orgs'\ - -H 'Content-type:application/json' -d '{ - "name": "Joe Smith", - "phone": 876876687, - "orgs": [ - { - "name": "Microsoft", - "city": "Seattle", - "zip": 98052 - }, - { - "name": "Apple", - "city": "Cupertino", - "zip": 95014 - } - ] -}' ----- -==== - -[example.tab-pane#v2nestedcloud] -==== -[.tab-label]*V2 API SolrCloud* -[source,bash] ----- -curl 'http://localhost:8983/api/collections/techproducts/update/json?split=/|/orgs'\ - -H 'Content-type:application/json' -d '{ - "name": "Joe Smith", - "phone": 876876687, - "orgs": [ - { - "name": "Microsoft", - "city": "Seattle", - "zip": 98052 - }, - { - "name": "Apple", - "city": "Cupertino", - "zip": 95014 - } - ] -}' ----- -==== --- - -With this example, the documents indexed would be, as follows: - -[source,json] ----- -{ - "name":"Joe Smith", - "phone":876876687, - "_childDocuments_":[ - { - "name":"Microsoft", - "city":"Seattle", - "zip":98052}, - { - "name":"Apple", - "city":"Cupertino", - "zip":95014}]} ----- - == Tips for Custom JSON Indexing . Schemaless mode: This handles field creation automatically. The field guessing may not be exactly as you expect, but it works. 
The best thing to do is to setup a local server in schemaless mode, index a few sample docs and create those fields in your real setup with proper field types before indexing From 9308248732f9076b57d33d96459c40c4b7766e04 Mon Sep 17 00:00:00 2001 From: Moshe Date: Mon, 4 Feb 2019 08:34:00 +0200 Subject: [PATCH 13/19] SOLR-13129: change index links for nested docs --- solr/solr-ref-guide/src/index.adoc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/solr/solr-ref-guide/src/index.adoc b/solr/solr-ref-guide/src/index.adoc index d646ba2bbfca..741e0b890009 100644 --- a/solr/solr-ref-guide/src/index.adoc +++ b/solr/solr-ref-guide/src/index.adoc @@ -96,9 +96,9 @@ The *<>* section guides yo [sidebar.col-sm-6.col-md-4] **** -* <>: Detailed information about index configuration for nested documents. +* <>: Detailed information about indexing and schema configuration for nested documents. -* <>: Querying nested documents how to guide. +* <>: Searching nested documents how to guide. **** -- From 62837af9bb22cd9bacabf3b445e26ea9e5d72c87 Mon Sep 17 00:00:00 2001 From: Moshe Date: Mon, 4 Feb 2019 11:23:54 +0200 Subject: [PATCH 14/19] SOLR-13129: add nested faceting and update schema with example configs for fields --- solr/solr-ref-guide/src/nested-documents.adoc | 34 +++++++++++++------ 1 file changed, 23 insertions(+), 11 deletions(-) diff --git a/solr/solr-ref-guide/src/nested-documents.adoc b/solr/solr-ref-guide/src/nested-documents.adoc index 98ca9d3684c7..e3ce9991884c 100644 --- a/solr/solr-ref-guide/src/nested-documents.adoc +++ b/solr/solr-ref-guide/src/nested-documents.adoc @@ -32,28 +32,36 @@ In other words, even if a single child document or the parent document is change _Solr does not enforce this rule_; if it's violated, you may get sporadic query failures or incorrect results. ==== +== Indexing Nested Documents + Nested documents may be indexed via either the XML or JSON data syntax, and is also supported by <> with javabin. 
-=== Schema Notes +=== Schema Configuration + +{nbsp} + +*Fields:* - * The schema must include indexed field `\_root_`. The value of that field is populated automatically and is the same for all documents in the block, regardless of the inheritance depth. The id of the top document in every nested hierarchy is populated in this field. - * `\_nest_path_` can be configured to store the path of the document in the hierarchy - * `\_nest_parent_` can be configured to store the `id` of the parent in the previous level + * The schema must include indexed field `\_root_`. The value of that field is populated automatically and is the same for all documents in the block, regardless of the inheritance depth. The id of the top document in every nested hierarchy is populated in this field. + + `` + * `\_nest_path_` is used to store the path of the document in the hierarchy. This field is optional. + + ` + ` + * `\_nest_parent_` is used to store the `id` of the parent in the previous level. This field is optional. + + `` * Nested documents are very much documents in their own right even if certain nested documents hold different information from the parent. Therefore: ** the schema must be able to represent the fields of any document ** it may be infeasible to use `required` ** even child documents need a unique `id` - + * If you associate a child document as a field (e.g., comment), that field need not be defined in the schema, and probably + shouldn't be as it would be confusing. There is no child document field type. === Rudimentary Root-only schemas + * These schemas do not contain any other nested related fields apart from `\_root_`. + In this mode relationship types(field names) between parents and their children are not saved. + In this case <> transformer returns all children under the `\_childDocuments_` field. - * The schema must include an indexed, non-stored field `\_root_`. 
The value of that field is populated automatically and is the same for all documents in the block, regardless of the inheritance depth. - * You must include a field that identifies the parent document as a parent; it can be any field that suits this purpose, and it will be used as input for the <>. - * If you associate a child document as a field (e.g., comment), that field need not be defined in the schema, and probably - shouldn't be as it would be confusing. There is no child document field type. + * Typically you should have a field that differentiates a root doc from any nested children. However this isn't strictly necessary; so long as it's possible to write a query that can select only root documents somehow. Such a query is needed for the <> and <> doc transformer to function. === XML Examples @@ -137,13 +145,13 @@ For the anonymous relationship, note the special `\_childDocuments_` key whose c When quried, child docs will be appended to _childDocuments_ key. ==== - -=== Querying Nested Documents +== Searching Nested Documents * `<>` Document Transformer * <> * <> * <> + * <> === Query Examples @@ -204,6 +212,7 @@ For the upcoming examples, assume the following documents have been indexed: ==== ==== Child Doc Transformer + Can be used enrich query results with the documents' descendants. + For a detailed explanation of this parser, click <>. @@ -233,6 +242,7 @@ For a detailed explanation of this parser, click <>. @@ -255,6 +265,7 @@ For a detailed explanation of this parser, click <>. 
@@ -274,6 +285,7 @@ For a detailed explanation of this parser, click < Date: Wed, 13 Feb 2019 08:26:32 +0200 Subject: [PATCH 15/19] SOLR-13129: PR review changes, and more detailed explanation for query parsers --- solr/solr-ref-guide/src/nested-documents.adoc | 124 +++++++++--------- 1 file changed, 63 insertions(+), 61 deletions(-) diff --git a/solr/solr-ref-guide/src/nested-documents.adoc b/solr/solr-ref-guide/src/nested-documents.adoc index e3ce9991884c..8afc64456551 100644 --- a/solr/solr-ref-guide/src/nested-documents.adoc +++ b/solr/solr-ref-guide/src/nested-documents.adoc @@ -16,7 +16,7 @@ // specific language governing permissions and limitations // under the License. -Solr supports indexing nested documents such as a blog post parent document and comments as child documents -- or products as parent documents and sizes, colors, or other variations as child documents. +Solr supports indexing nested documents such as a blog post parent document and comments as child documents -- or products as parent documents and sizes, colors, or other variations as child documents. + The parent with all children is referred to as a "block" and it explains some of the nomenclature of related features. At query time, the <> can search these relationships, and the `<>` Document Transformer can attach child documents to the result documents. @@ -24,7 +24,6 @@ In terms of performance, indexing the relationships between documents usually yi since the relationships are already stored in the index and do not need to be computed. However, nested documents are less flexible than query time joins as it imposes rules that some applications may not be able to accept. -.Note [NOTE] ==== A big limitation is that the whole block of parent-children documents must be updated or deleted together, not separately. 
@@ -38,25 +37,22 @@ Nested documents may be indexed via either the XML or JSON data syntax, and is a === Schema Configuration -{nbsp} + -*Fields:* - - * The schema must include indexed field `\_root_`. The value of that field is populated automatically and is the same for all documents in the block, regardless of the inheritance depth. The id of the top document in every nested hierarchy is populated in this field. + + * The schema must include indexed field `\_root_`. The value of that field is populated automatically and is the same for all documents in the block, regardless of the inheritance depth. The ID of the top document in every nested hierarchy is populated in this field. + `` * `\_nest_path_` is used to store the path of the document in the hierarchy. This field is optional. + ` ` - * `\_nest_parent_` is used to store the `id` of the parent in the previous level. This field is optional. + + * `\_nest_parent_` is used to store the `ID` of the parent in the previous level. This field is optional. + `` * Nested documents are very much documents in their own right even if certain nested documents hold different information from the parent. Therefore: ** the schema must be able to represent the fields of any document ** it may be infeasible to use `required` - ** even child documents need a unique `id` + ** even child documents need a unique `ID` * If you associate a child document as a field (e.g., comment), that field need not be defined in the schema, and probably shouldn't be as it would be confusing. There is no child document field type. -=== Rudimentary Root-only schemas +== Rudimentary Root-only schemas * These schemas do not contain any other nested related fields apart from `\_root_`. + In this mode relationship types(field names) between parents and their children are not saved. + @@ -75,22 +71,22 @@ Solr 8 will save the relationship. ---- - 1 + 1 Solr adds block join support parentDocument - 2 + 2 SolrCloud supports it too! 
- 3 + 3 New Lucene and Solr release is out parentDocument - 4 + 4 Lots of new features @@ -111,26 +107,26 @@ For the anonymous relationship, note the special `\_childDocuments_` key whose c ---- [ { - "id": "1", + "ID": "1", "title": "Solr adds block join support", "content_type": "parentDocument", "comments": [{ - "id": "2", + "ID": "2", "content": "SolrCloud supports it too!" }, { - "id": "3", + "ID": "3", "content": "New filter syntax" } ] }, { - "id": "4", + "ID": "4", "title": "New Lucene and Solr release is out", "content_type": "parentDocument", "_childDocuments_": [ { - "id": "5", + "ID": "5", "comments": "Lots of new features" } ] @@ -138,12 +134,10 @@ For the anonymous relationship, note the special `\_childDocuments_` key whose c ] ---- -.Root Only Mode +.Root-Only Mode [NOTE] -==== - In Root-only schemas, these two documents will result in the same docs being indexed(Root-only schemas do not honor nested relationships). - When quried, child docs will be appended to _childDocuments_ key. -==== + In Root-only schemas, these two documents will result in the same docs being indexed (Root-only schemas do not honor nested relationships). + When queried, child docs will be appended to _childDocuments_ key. 
== Searching Nested Documents @@ -157,82 +151,82 @@ For the anonymous relationship, note the special `\_childDocuments_` key whose c For the upcoming examples, assume the following documents have been indexed: -==== [source,json] ---- [ { - "id": "1", + "ID": "1", "title": "Cooking Recommendations", "tags": ["cooking", "meetup"], "posts": [{ - "id": "2", + "ID": "2", "title": "Cookies", "comments": [{ - "id": "3", + "ID": "3", "content": "Lovely recipe" }, { - "id": "4", + "ID": "4", "content": "A-" } ] }, { - "id": "5", + "ID": "5", "title": "Cakes" } ] }, { - "id": "6", + "ID": "6", "title": "For Hire", "tags": ["professional", "jobs"], "posts": [{ - "id": "7", + "ID": "7", "title": "Search Engineer", "comments": [{ - "id": "8", + "ID": "8", "content": "I am interested" }, { - "id": "9", + "ID": "9", "content": "How large is the team?" } ] }, { - "id": "10", + "ID": "10", "title": "Low level Engineer" } ] } ] ---- -==== ==== Child Doc Transformer Can be used enrich query results with the documents' descendants. + -For a detailed explanation of this parser, click <>. +For a detailed explanation of this transformer, see the section <>. -* `q=id:1, - fl=id,[child childFilter=/comments/content:recipe]` + - The child Filter will only match the first comment of doc(id:1), - therefore only that particular comment will be appended to the result. +For example, let us examine this query: +`q=ID:1, +fl=ID,[child childFilter=/comments/content:recipe]`. + +The Child Doc Transformer can be used to enrich matching docs with comments that match a particular filter. + +In this particular query, the child Filter will only match the first comment of doc(ID:1), +therefore only that particular comment will be appended to the result. 
 [source,json] ---- { "response":{"numFound":1,"start":0,"docs":[ { - "id": "1", + "ID": "1", "title": "Cooking Recommendations", "tags": ["cooking", "meetup"], "posts": [{ - "id": "2", + "ID": "2", "title": "Cookies", "comments": [{ - "id": "3", + "ID": "3", "content": "Lovely recipe" }] }] @@ -244,20 +238,23 @@ For a detailed explanation of this parser, click <>. +For a detailed explanation of this parser, see the section <>. - * `q={!child of='_nest_path_:/posts}content:"Search Engineer"` + - This query returns the parent at the root(since all parents filter returns root documents). +For example, let us examine this query: +`q={!child of='_nest_path_:/posts'}title:"Search Engineer"`. + +The `'of'` filter returns all posts. It is used to select all documents in a particular path of the hierarchy (all parents). +The second part of the query is a filter that matches some of those parents, whose children we wish to return. + +In this example, all comments of posts which have "Search Engineer" in their `title` field will be returned. [source,json] ---- { "response":{"numFound":2,"start":0,"docs":[ { - "id": "8", + "ID": "8", "content": "I am interested" }, { - "id": "9", + "ID": "9", "content": "How large is the team?" } ]} @@ -267,16 +264,19 @@ For a detailed explanation of this parser, click <>. +For a detailed explanation of this parser, see the section <>. - * Can be used to query the doc in JSON example, - `q={!parent which='-_nest_path_:* \*:*'}content:"Search Engineer"` + - This query returns the parent at the root(since all parents filter returns root documents). +For example, let us examine this query: +`q={!parent which='-_nest_path_:* \*:*'}title:"Search Engineer"`. + +The `'which'` filter returns all root documents. +The second part of this query is a filter to match some child documents. +This query returns the parent at the root (since the all-parents filter returns root documents) of each +matching child document.
 In this case, those are all child documents which have `Search Engineer` in their `title` field. [source,json] ---- { "response":{"numFound":1,"start":0,"docs":[{ - "id": "6", + "ID": "6", "title": "For Hire", "tags": ["professional", "jobs"] } @@ -290,29 +290,31 @@ The combination of these two features enable seamless creation of powerful queri For example, querying posts which are under a page tagged as a job, contain the words "Search Engineer". The comments for matching posts can also be fetched, all done in a single Solr Query. - * `q=+{!child of='-\_nest_path_:* \*:*'}+tags:"jobs" &fl=*,[child] - &fq=\_nest_path_:/posts` + - This query returns all posts and their comments, which had "Search Engineer" in their title, - and were under a page tagged with "jobs". +For example, let us examine this query: +`q=+{!child of='-\_nest_path_:* \*:*'}+tags:"jobs" &fl=*,[child] +&fq=\_nest_path_:/posts`. + +This query returns all posts and their comments, which had "Search Engineer" in their title, +and are indexed under a page tagged with "jobs". +The comments are appended to the matching posts, since the ChildDocTransformer is specified under the `fl` parameter. [source,json] ---- { "response":{"numFound":1,"start":0,"docs":[ { - "id": "7", + "ID": "7", "title": "Search Engineer", "comments": [{ - "id": "8", + "ID": "8", "content": "I am interested" }, { - "id": "9", + "ID": "9", "content": "How large is the team?"
} ] }, { - "id": "10", + "ID": "10", "title": "Low level Engineer" }] } From 4426018b6f77f3342d835b75f434f9dfaa8b2c4b Mon Sep 17 00:00:00 2001 From: internet Date: Wed, 13 Feb 2019 13:22:44 +0200 Subject: [PATCH 16/19] SOLR-13129: fix nested docs links --- solr/solr-ref-guide/src/json-facet-api.adoc | 2 +- solr/solr-ref-guide/src/json-faceting-domain-changes.adoc | 2 +- solr/solr-ref-guide/src/other-parsers.adoc | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/solr/solr-ref-guide/src/json-facet-api.adoc b/solr/solr-ref-guide/src/json-facet-api.adoc index fbccf9112a88..02feb62ce010 100644 --- a/solr/solr-ref-guide/src/json-facet-api.adoc +++ b/solr/solr-ref-guide/src/json-facet-api.adoc @@ -757,7 +757,7 @@ Most stat facet functions (`avg`, `sumsq`, etc.) allow users to perform math com === uniqueBlock() and Block Join Counts -When a collection contains <>, the `blockChildren` and `blockParent` <> can be useful when searching for parent documents and you want to compute stats against all of the affected children documents (or vice versa). +When a collection contains <>, the `blockChildren` and `blockParent` <> can be useful when searching for parent documents and you want to compute stats against all of the affected children documents (or vice versa). But if you only need to know the _count_ of all the blocks that exist in the current domain, a more efficient option is the `uniqueBlock()` aggregate function. Suppose we have products with multiple SKUs, and we want to count products for each color. 
diff --git a/solr/solr-ref-guide/src/json-faceting-domain-changes.adoc b/solr/solr-ref-guide/src/json-faceting-domain-changes.adoc index 6f57a695a5e9..b4d27e8dd1e7 100644 --- a/solr/solr-ref-guide/src/json-faceting-domain-changes.adoc +++ b/solr/solr-ref-guide/src/json-faceting-domain-changes.adoc @@ -177,7 +177,7 @@ NOTE: While a `query` domain can be combined with an additional domain `filter`, == Block Join Domain Changes -When a collection contains <>, the `blockChildren` or `blockParent` domain options can be used transform an existing domain containing one type of document, into a domain containing the documents with the specified relationship (child or parent of) to the documents from the original domain. +When a collection contains <>, the `blockChildren` or `blockParent` domain options can be used transform an existing domain containing one type of document, into a domain containing the documents with the specified relationship (child or parent of) to the documents from the original domain. Both of these options work similarly to the corresponding <> by taking in a single String query that exclusively matches all parent documents in the collection. If `blockParent` is used, then the resulting domain will contain all parent documents of the children from the original domain. If `blockChildren` is used, then the resulting domain will contain all child documents of the parents from the original domain. diff --git a/solr/solr-ref-guide/src/other-parsers.adoc b/solr/solr-ref-guide/src/other-parsers.adoc index 7c50873d5aef..abcaf7f709bb 100644 --- a/solr/solr-ref-guide/src/other-parsers.adoc +++ b/solr/solr-ref-guide/src/other-parsers.adoc @@ -24,7 +24,7 @@ Many of these parsers are expressed the same way as <>. +There are two query parsers that support block joins. These parsers allow indexing and searching for relational content that has been <>. 
The example usage of the query parsers below assumes these two documents and each of their child documents have been indexed: From 9fe24182d73d6b97058ab6d90ee3d55258190b62 Mon Sep 17 00:00:00 2001 From: moshebla Date: Mon, 25 Feb 2019 13:06:56 +0200 Subject: [PATCH 17/19] SOLR-13129: split nested-documents page into separate indexing and searching nested docs pages --- .../src/blockjoin-faceting.adoc | 2 +- solr/solr-ref-guide/src/index.adoc | 18 +- ...ts.adoc => indexing-nested-documents.adoc} | 196 +---------------- solr/solr-ref-guide/src/json-facet-api.adoc | 2 +- .../src/json-faceting-domain-changes.adoc | 2 +- solr/solr-ref-guide/src/other-parsers.adoc | 2 +- .../src/searching-nested-documents.adoc | 202 ++++++++++++++++++ solr/solr-ref-guide/src/searching.adoc | 4 +- .../src/transforming-result-documents.adoc | 2 +- 9 files changed, 220 insertions(+), 210 deletions(-) rename solr/solr-ref-guide/src/{nested-documents.adoc => indexing-nested-documents.adoc} (52%) create mode 100644 solr/solr-ref-guide/src/searching-nested-documents.adoc diff --git a/solr/solr-ref-guide/src/blockjoin-faceting.adoc b/solr/solr-ref-guide/src/blockjoin-faceting.adoc index 60ea4c189d15..94b51108f4de 100644 --- a/solr/solr-ref-guide/src/blockjoin-faceting.adoc +++ b/solr/solr-ref-guide/src/blockjoin-faceting.adoc @@ -41,7 +41,7 @@ This example shows how you could add this search components to `solrconfig.xml` This component can be added into any search request handler. This component work with distributed search in SolrCloud mode. -Documents should be added in children-parent blocks as described in <>. Examples: +Documents should be added in children-parent blocks as described in <>. 
Examples: .Sample document [source,xml] diff --git a/solr/solr-ref-guide/src/index.adoc b/solr/solr-ref-guide/src/index.adoc index 741e0b890009..b196b5acd27e 100644 --- a/solr/solr-ref-guide/src/index.adoc +++ b/solr/solr-ref-guide/src/index.adoc @@ -1,5 +1,5 @@ = Apache Solr Reference Guide -:page-children: about-this-guide, getting-started, deployment-and-operations, using-the-solr-administration-user-interface, documents-fields-and-schema-design, understanding-analyzers-tokenizers-and-filters, indexing-and-basic-data-operations, searching, nested-documents, streaming-expressions, solrcloud, legacy-scaling-and-distribution, the-well-configured-solr-instance, monitoring-solr, securing-solr, client-apis, further-assistance, solr-glossary, errata, how-to-contribute +:page-children: about-this-guide, getting-started, deployment-and-operations, using-the-solr-administration-user-interface, documents-fields-and-schema-design, understanding-analyzers-tokenizers-and-filters, indexing-and-basic-data-operations, searching, indexing-nested-documents, streaming-expressions, solrcloud, legacy-scaling-and-distribution, the-well-configured-solr-instance, monitoring-solr, securing-solr, client-apis, further-assistance, solr-glossary, errata, how-to-contribute :page-notitle: :page-toc: false :page-layout: home @@ -75,6 +75,8 @@ The *<>* section guides yo *<>*: This section describes how Solr organizes data in the index. It explains how a Solr schema defines the fields and field types which Solr uses to organize data within the document files it indexes. +*<>*: Detailed information about indexing and schema configuration for nested documents. + *<>*: This section explains how Solr prepares text for indexing and searching. Analyzers parse text and produce a stream of tokens, lexical units used for indexing and searching. Tokenizers break field data down into tokens. Filters perform other transformational or selective work on token streams. 
**** @@ -86,19 +88,9 @@ The *<>* section guides yo *<>*: A stream processing language for Solr, with a suite of functions to perform many types of queries and parallel execution tasks. -*<>*: This section tells you how to access Solr through various client APIs, including JavaScript, JSON, and Ruby. -**** --- +*<>*: Searching nested documents how to guide. -[.row.match-my-cols] --- -.Nested Documents -[sidebar.col-sm-6.col-md-4] -**** - -* <>: Detailed information about indexing and schema configuration for nested documents. - -* <>: Searching nested documents how to guide. +*<>*: This section tells you how to access Solr through various client APIs, including JavaScript, JSON, and Ruby. **** -- diff --git a/solr/solr-ref-guide/src/nested-documents.adoc b/solr/solr-ref-guide/src/indexing-nested-documents.adoc similarity index 52% rename from solr/solr-ref-guide/src/nested-documents.adoc rename to solr/solr-ref-guide/src/indexing-nested-documents.adoc index 8afc64456551..fc789c9b1f2b 100644 --- a/solr/solr-ref-guide/src/nested-documents.adoc +++ b/solr/solr-ref-guide/src/indexing-nested-documents.adoc @@ -1,4 +1,4 @@ -= Nested Child Documents += Indexing Nested Child Documents // Licensed to the Apache Software Foundation (ASF) under one // or more contributor license agreements. See the NOTICE file // distributed with this work for additional information @@ -23,6 +23,7 @@ At query time, the <> with javabin. [NOTE] ==== @@ -31,11 +32,7 @@ In other words, even if a single child document or the parent document is change _Solr does not enforce this rule_; if it's violated, you may get sporadic query failures or incorrect results. ==== -== Indexing Nested Documents - -Nested documents may be indexed via either the XML or JSON data syntax, and is also supported by <> with javabin. - -=== Schema Configuration +== Schema Configuration * The schema must include indexed field `\_root_`. 
The value of that field is populated automatically and is the same for all documents in the block, regardless of the inheritance depth. The ID of the top document in every nested hierarchy is populated in this field. + `` @@ -56,8 +53,8 @@ Nested documents may be indexed via either the XML or JSON data syntax, and is a * These schemas do not contain any other nested related fields apart from `\_root_`. + In this mode relationship types(field names) between parents and their children are not saved. + - In this case <> transformer returns all children under the `\_childDocuments_` field. - * Typically you should have a field that differentiates a root doc from any nested children. However this isn't strictly necessary; so long as it's possible to write a query that can select only root documents somehow. Such a query is needed for the <> and <> doc transformer to function. + In this case <> transformer returns all children under the `\_childDocuments_` field. + * Typically you should have a field that differentiates a root doc from any nested children. However this isn't strictly necessary; so long as it's possible to write a query that can select only root documents somehow. Such a query is needed for the <> and <> doc transformer to function. === XML Examples @@ -138,186 +135,3 @@ For the anonymous relationship, note the special `\_childDocuments_` key whose c [NOTE] In Root-only schemas, these two documents will result in the same docs being indexed (Root-only schemas do not honor nested relationships). When queried, child docs will be appended to _childDocuments_ key. 
- -== Searching Nested Documents - - * `<>` Document Transformer - * <> - * <> - * <> - * <> - -=== Query Examples - -For the upcoming examples, assume the following documents have been indexed: - -[source,json] ----- -[ - { - "ID": "1", - "title": "Cooking Recommendations", - "tags": ["cooking", "meetup"], - "posts": [{ - "ID": "2", - "title": "Cookies", - "comments": [{ - "ID": "3", - "content": "Lovely recipe" - }, - { - "ID": "4", - "content": "A-" - } - ] - }, - { - "ID": "5", - "title": "Cakes" - } - ] - }, - { - "ID": "6", - "title": "For Hire", - "tags": ["professional", "jobs"], - "posts": [{ - "ID": "7", - "title": "Search Engineer", - "comments": [{ - "ID": "8", - "content": "I am interested" - }, - { - "ID": "9", - "content": "How large is the team?" - } - ] - }, - { - "ID": "10", - "title": "Low level Engineer" - } - ] - } -] ----- - -==== Child Doc Transformer - -Can be used enrich query results with the documents' descendants. + -For a detailed explanation of this transformer, see the section <>. - -For example, let us examine this query: -`q=ID:1, -fl=ID,[child childFilter=/comments/content:recipe]`. + -The Child Doc Transformer can be used to enrich matching docs with comments that match a particular filter. + -In this particular query, the child Filter will only match the first comment of doc(ID:1), -therefore only that particular comment will be appended to the result. - -[source,json] ----- - { "response":{"numFound":1,"start":0,"docs":[ - { - "ID": "1", - "title": "Cooking Recommendations", - "tags": ["cooking", "meetup"], - "posts": [{ - "ID": "2", - "title": "Cookies", - "comments": [{ - "ID": "3", - "content": "Lovely recipe" - }] - }] - }] - } - } ----- - -==== Children Query Parser - -Can be used to retrieve children of a matching document. + -For a detailed explanation of this parser, see the section <>. - -For example, let us examine this query: -`q={!child of='_nest_path_:/posts}content:"Search Engineer"`. 
+ -The `'of'` filter returns all posts. This is used to filter out all documents in a particular path of the hierarchy(all parents). -The second part of the query is a filter for some parents, which we wish to return their children. + -In this example, all comments of posts which had "Search Engineer in their `content` field will be returned. - -[source,json] ----- - { "response":{"numFound":2,"start":0,"docs":[ - { - "ID": "8", - "content": "I am interested" - }, - { - "ID": "9", - "content": "How large is the team?" - } - ]} - } ----- - -==== Parents Query Parser - -Can be used to retrieve parents of a child document. + -For a detailed explanation of this parser, see the section <>. - -For example, let us examine this query: -`q={!parent which='-_nest_path_:* \*:*'}title:"Search Engineer"`. + -The `'which'` filter returns all root documents. -The second part of this query is a filter to match some child documents. -This query returns the parent at the root(since all parents filter returns root documents) of each -matching child document. In this case, all child documents which had `Search Engineer` in their `title` field. - -[source,json] ----- - { "response":{"numFound":1,"start":0,"docs":[{ - "ID": "6", - "title": "For Hire", - "tags": ["professional", "jobs"] - } - ]} - } ----- - -==== Combining Block Join Query Parsers with Child Doc Transformer - -The combination of these two features enable seamless creation of powerful queries. + -For example, querying posts which are under a page tagged as a job, contain the words "Search Engineer". -The comments for matching posts can also be fetched, all done in a single Solr Query. - -For example, let us examine this query: -`q=+{!child of='-\_nest_path_:* \*:*'}+tags:"jobs" &fl=*,[child] -&fq=\_nest_path_:/posts`. + -This query returns all posts and their comments, which had "Search Engineer" in their title, -and are indexed under a page tagged with "jobs". 
-The comments are appended to the matching posts, since the ChildDocTransformer is specified under the `fl` parameter. - -[source,json] ----- - { "response":{"numFound":1,"start":0,"docs":[ - { - "ID": "7", - "title": "Search Engineer", - "comments": [{ - "ID": "8", - "content": "I am interested" - }, - { - "ID": "9", - "content": "How large is the team?" - } - ] - }, - { - "ID": "10", - "title": "Low level Engineer" - }] - } - } ----- - diff --git a/solr/solr-ref-guide/src/json-facet-api.adoc b/solr/solr-ref-guide/src/json-facet-api.adoc index 02feb62ce010..61b96f4cce67 100644 --- a/solr/solr-ref-guide/src/json-facet-api.adoc +++ b/solr/solr-ref-guide/src/json-facet-api.adoc @@ -757,7 +757,7 @@ Most stat facet functions (`avg`, `sumsq`, etc.) allow users to perform math com === uniqueBlock() and Block Join Counts -When a collection contains <>, the `blockChildren` and `blockParent` <> can be useful when searching for parent documents and you want to compute stats against all of the affected children documents (or vice versa). +When a collection contains <>, the `blockChildren` and `blockParent` <> can be useful when searching for parent documents and you want to compute stats against all of the affected children documents (or vice versa). But if you only need to know the _count_ of all the blocks that exist in the current domain, a more efficient option is the `uniqueBlock()` aggregate function. Suppose we have products with multiple SKUs, and we want to count products for each color. 
diff --git a/solr/solr-ref-guide/src/json-faceting-domain-changes.adoc b/solr/solr-ref-guide/src/json-faceting-domain-changes.adoc index b4d27e8dd1e7..60a842ed8c7e 100644 --- a/solr/solr-ref-guide/src/json-faceting-domain-changes.adoc +++ b/solr/solr-ref-guide/src/json-faceting-domain-changes.adoc @@ -177,7 +177,7 @@ NOTE: While a `query` domain can be combined with an additional domain `filter`, == Block Join Domain Changes -When a collection contains <>, the `blockChildren` or `blockParent` domain options can be used transform an existing domain containing one type of document, into a domain containing the documents with the specified relationship (child or parent of) to the documents from the original domain. +When a collection contains <>, the `blockChildren` or `blockParent` domain options can be used transform an existing domain containing one type of document, into a domain containing the documents with the specified relationship (child or parent of) to the documents from the original domain. Both of these options work similarly to the corresponding <> by taking in a single String query that exclusively matches all parent documents in the collection. If `blockParent` is used, then the resulting domain will contain all parent documents of the children from the original domain. If `blockChildren` is used, then the resulting domain will contain all child documents of the parents from the original domain. diff --git a/solr/solr-ref-guide/src/other-parsers.adoc b/solr/solr-ref-guide/src/other-parsers.adoc index abcaf7f709bb..430c29307799 100644 --- a/solr/solr-ref-guide/src/other-parsers.adoc +++ b/solr/solr-ref-guide/src/other-parsers.adoc @@ -24,7 +24,7 @@ Many of these parsers are expressed the same way as <>. +There are two query parsers that support block joins. These parsers allow indexing and searching for relational content that has been <>. 
The example usage of the query parsers below assumes these two documents and each of their child documents have been indexed: diff --git a/solr/solr-ref-guide/src/searching-nested-documents.adoc b/solr/solr-ref-guide/src/searching-nested-documents.adoc new file mode 100644 index 000000000000..27e394a07f79 --- /dev/null +++ b/solr/solr-ref-guide/src/searching-nested-documents.adoc @@ -0,0 +1,202 @@ += Searching Nested Child Documents +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +This section presents techniques which can be used for searching deeply nested documents, +showcasing how more complex queries can be constructed using some of Solr's query parsers and Doc Transformers. +These features require `\_root_` and `\_nest_path_` to be declared in the index's schema. + +Please refer to the <> +section for more details about schema and index configuration. + + +[NOTE] +This section does not showcase faceting on nested documents. For nested document faceting, please refer to the +<> section.
+ +== Query Examples + +For the upcoming examples, assume the following documents have been indexed: + +[source,json] +---- +[ + { + "ID": "1", + "title": "Cooking Recommendations", + "tags": ["cooking", "meetup"], + "posts": [{ + "ID": "2", + "title": "Cookies", + "comments": [{ + "ID": "3", + "content": "Lovely recipe" + }, + { + "ID": "4", + "content": "A-" + } + ] + }, + { + "ID": "5", + "title": "Cakes" + } + ] + }, + { + "ID": "6", + "title": "For Hire", + "tags": ["professional", "jobs"], + "posts": [{ + "ID": "7", + "title": "Search Engineer", + "comments": [{ + "ID": "8", + "content": "I am interested" + }, + { + "ID": "9", + "content": "How large is the team?" + } + ] + }, + { + "ID": "10", + "title": "Low level Engineer" + } + ] + } +] +---- + +=== Child Doc Transformer + +Can be used to enrich query results with the documents' descendants. + +For a detailed explanation of this transformer, see the section <>. + +For example, let us examine this query: +`q=ID:1, +fl=ID,[child childFilter=/comments/content:recipe]`. + +The Child Doc Transformer can be used to enrich matching docs with comments that match a particular filter. + +In this particular query, the child filter matches only the first comment of the document with ID 1, +so only that comment is appended to the result. + +[source,json] +---- + { "response":{"numFound":1,"start":0,"docs":[ + { + "ID": "1", + "title": "Cooking Recommendations", + "tags": ["cooking", "meetup"], + "posts": [{ + "ID": "2", + "title": "Cookies", + "comments": [{ + "ID": "3", + "content": "Lovely recipe" + }] + }] + }] + } + } +---- + +=== Children Query Parser + +Can be used to retrieve children of a matching document. + +For a detailed explanation of this parser, see the section <>. + +For example, let us examine this query: +`q={!child of='_nest_path_:/posts'}content:"Search Engineer"`. + +The `'of'` filter returns all posts.
This is used to select all documents at a particular path of the hierarchy (all parents). +The second part of the query is a filter for some parents, whose children we wish to return. + +In this example, all comments of posts which had "Search Engineer" in their `content` field will be returned. + +[source,json] +---- + { "response":{"numFound":2,"start":0,"docs":[ + { + "ID": "8", + "content": "I am interested" + }, + { + "ID": "9", + "content": "How large is the team?" + } + ]} + } +---- + +=== Parents Query Parser + +Can be used to retrieve parents of a child document. + +For a detailed explanation of this parser, see the section <>. + +For example, let us examine this query: +`q={!parent which='-_nest_path_:* \*:*'}title:"Search Engineer"`. + +The `'which'` filter returns all root documents. +The second part of this query is a filter to match some child documents. +This query returns the root-level parent (since the parents filter matches only root documents) of each +matching child document. In this case, the matching child documents are those with `Search Engineer` in their `title` field. + +[source,json] +---- + { "response":{"numFound":1,"start":0,"docs":[{ + "ID": "6", + "title": "For Hire", + "tags": ["professional", "jobs"] + } + ]} + } +---- + +=== Combining Block Join Query Parsers with Child Doc Transformer + +The combination of these two features enables seamless creation of powerful queries. + +For example, querying posts which are under a page tagged as a job and contain the words "Search Engineer". +The comments for matching posts can also be fetched, all in a single Solr query. + +For example, let us examine this query: +`q=+{!child of='-\_nest_path_:* \*:*'}+tags:"jobs" &fl=*,[child] +&fq=\_nest_path_:/posts`. + +This query returns all posts, together with their comments, that have "Search Engineer" in their title +and are indexed under a page tagged with "jobs".
+The comments are appended to the matching posts, since the ChildDocTransformer is specified under the `fl` parameter. + +[source,json] +---- + { "response":{"numFound":1,"start":0,"docs":[ + { + "ID": "7", + "title": "Search Engineer", + "comments": [{ + "ID": "8", + "content": "I am interested" + }, + { + "ID": "9", + "content": "How large is the team?" + } + ] + }, + { + "ID": "10", + "title": "Low level Engineer" + }] + } + } +---- diff --git a/solr/solr-ref-guide/src/searching.adoc b/solr/solr-ref-guide/src/searching.adoc index 9fa0e57cae84..5e0775c7771f 100644 --- a/solr/solr-ref-guide/src/searching.adoc +++ b/solr/solr-ref-guide/src/searching.adoc @@ -27,7 +27,8 @@ realtime-get, + exporting-result-sets, + parallel-sql-interface, + - analytics + analytics, + + searching-nested-documents // Licensed to the Apache Software Foundation (ASF) under one // or more contributor license agreements. See the NOTICE file @@ -68,6 +69,7 @@ This section describes how Solr works with search requests. It covers the follow * <>: Detailed information about re-ranking top scoring documents from simple queries using more complex scores. ** <>: How to use LTR to run machine learned ranking models in Solr. +* <>: Detailed information about constructing nested and hierarchical queries. * <>: Detailed information about using `DocTransformers` to add computed information to individual documents * <>: Detailed information about Solr's powerful autosuggest component. * <>: Detailed information about Solr's similar results query component. 
diff --git a/solr/solr-ref-guide/src/transforming-result-documents.adoc b/solr/solr-ref-guide/src/transforming-result-documents.adoc index 7063e393edf8..49d91f808f6c 100644 --- a/solr/solr-ref-guide/src/transforming-result-documents.adoc +++ b/solr/solr-ref-guide/src/transforming-result-documents.adoc @@ -124,7 +124,7 @@ A default style can be configured by specifying an `args` parameter in your `sol === [child] - ChildDocTransformerFactory -This transformer returns all <> of each parent document matching your query in a flat list nested inside the matching parent document. This is useful when you have indexed nested child documents and want to retrieve the child documents for the relevant parent documents for any type of search query. +This transformer returns all <> of each parent document matching your query in a flat list nested inside the matching parent document. This is useful when you have indexed nested child documents and want to retrieve the child documents for the relevant parent documents for any type of search query. [source,plain] ---- From b7a20a7761d8aa6f1a29ca2b48f41c57181a9b93 Mon Sep 17 00:00:00 2001 From: moshebla Date: Wed, 27 Feb 2019 08:46:16 +0200 Subject: [PATCH 18/19] SOLR-13129: added a brief discussion about how updates to parent-child blocks are handled --- .../src/indexing-nested-documents.adoc | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/solr/solr-ref-guide/src/indexing-nested-documents.adoc b/solr/solr-ref-guide/src/indexing-nested-documents.adoc index fc789c9b1f2b..d9ff7d78f5e3 100644 --- a/solr/solr-ref-guide/src/indexing-nested-documents.adoc +++ b/solr/solr-ref-guide/src/indexing-nested-documents.adoc @@ -135,3 +135,15 @@ For the anonymous relationship, note the special `\_childDocuments_` key whose c [NOTE] In Root-only schemas, these two documents will result in the same docs being indexed (Root-only schemas do not honor nested relationships). 
When queried, child docs will be appended to _childDocuments_ key. + +== Updating Nested Documents + +Currently Solr supports updating whole hierarchies using atomic updates. Documents should be updated by the Root (top) +document's ID, and the update should contain all its children. This is needed considering Solr deletes the old hierarchy, +since the update term is `\_root_:id`. In case some child documents are omitted from the update command, +said documents will be deleted from the index. + +.Updating By a Child Document's ID +[NOTE] + An update by ID to a child document will index a new document with the same ID as the one in the nested hierarchy, + yet the new document will not be indexed as a child, but rather as a new document outside of the nested hierarchy. \ No newline at end of file From 713fdb7b1656673dcdd3cd7548cb0025ce7b84eb Mon Sep 17 00:00:00 2001 From: moshebla Date: Wed, 27 Feb 2019 13:11:13 +0200 Subject: [PATCH 19/19] SOLR-13129: link indexing and searching nested docs --- solr/solr-ref-guide/src/index.adoc | 6 +----- .../src/indexing-and-basic-data-operations.adoc | 4 +++- solr/solr-ref-guide/src/indexing-nested-documents.adoc | 5 ++++- solr/solr-ref-guide/src/other-parsers.adoc | 2 +- solr/solr-ref-guide/src/searching.adoc | 4 ++-- 5 files changed, 11 insertions(+), 10 deletions(-) diff --git a/solr/solr-ref-guide/src/index.adoc b/solr/solr-ref-guide/src/index.adoc index b196b5acd27e..bb3c3962953b 100644 --- a/solr/solr-ref-guide/src/index.adoc +++ b/solr/solr-ref-guide/src/index.adoc @@ -1,5 +1,5 @@ = Apache Solr Reference Guide -:page-children: about-this-guide, getting-started, deployment-and-operations, using-the-solr-administration-user-interface, documents-fields-and-schema-design, understanding-analyzers-tokenizers-and-filters, indexing-and-basic-data-operations, searching, indexing-nested-documents, streaming-expressions, solrcloud, legacy-scaling-and-distribution, the-well-configured-solr-instance, monitoring-solr, securing-solr, 
client-apis, further-assistance, solr-glossary, errata, how-to-contribute +:page-children: about-this-guide, getting-started, deployment-and-operations, using-the-solr-administration-user-interface, documents-fields-and-schema-design, understanding-analyzers-tokenizers-and-filters, indexing-and-basic-data-operations, searching, streaming-expressions, solrcloud, legacy-scaling-and-distribution, the-well-configured-solr-instance, monitoring-solr, securing-solr, client-apis, further-assistance, solr-glossary, errata, how-to-contribute :page-notitle: :page-toc: false :page-layout: home @@ -75,8 +75,6 @@ The *<>* section guides yo *<>*: This section describes how Solr organizes data in the index. It explains how a Solr schema defines the fields and field types which Solr uses to organize data within the document files it indexes. -*<>*: Detailed information about indexing and schema configuration for nested documents. - *<>*: This section explains how Solr prepares text for indexing and searching. Analyzers parse text and produce a stream of tokens, lexical units used for indexing and searching. Tokenizers break field data down into tokens. Filters perform other transformational or selective work on token streams. **** @@ -88,8 +86,6 @@ The *<>* section guides yo *<>*: A stream processing language for Solr, with a suite of functions to perform many types of queries and parallel execution tasks. -*<>*: Searching nested documents how to guide. - *<>*: This section tells you how to access Solr through various client APIs, including JavaScript, JSON, and Ruby. 
**** -- diff --git a/solr/solr-ref-guide/src/indexing-and-basic-data-operations.adoc b/solr/solr-ref-guide/src/indexing-and-basic-data-operations.adoc index 40b5f3f57825..e145a71986f0 100644 --- a/solr/solr-ref-guide/src/indexing-and-basic-data-operations.adoc +++ b/solr/solr-ref-guide/src/indexing-and-basic-data-operations.adoc @@ -1,5 +1,5 @@ = Indexing and Basic Data Operations -:page-children: introduction-to-solr-indexing, post-tool, uploading-data-with-index-handlers, uploading-data-with-solr-cell-using-apache-tika, uploading-structured-data-store-data-with-the-data-import-handler, updating-parts-of-documents, detecting-languages-during-indexing, de-duplication, content-streams +:page-children: introduction-to-solr-indexing, post-tool, uploading-data-with-index-handlers, indexing-nested-documents, uploading-data-with-solr-cell-using-apache-tika, uploading-structured-data-store-data-with-the-data-import-handler, updating-parts-of-documents, detecting-languages-during-indexing, de-duplication, content-streams // Licensed to the Apache Software Foundation (ASF) under one // or more contributor license agreements. See the NOTICE file // distributed with this work for additional information @@ -27,6 +27,8 @@ This section describes how Solr adds data to its index. It covers the following * *<>*: Index any JSON of your choice +* *<>*: Detailed information about indexing and schema configuration for nested documents. + * *<>*: Information about using the Solr Cell framework to upload data for indexing. * *<>*: Information about uploading and indexing data from a structured data store. diff --git a/solr/solr-ref-guide/src/indexing-nested-documents.adoc b/solr/solr-ref-guide/src/indexing-nested-documents.adoc index d9ff7d78f5e3..ce07a3126356 100644 --- a/solr/solr-ref-guide/src/indexing-nested-documents.adoc +++ b/solr/solr-ref-guide/src/indexing-nested-documents.adoc @@ -16,7 +16,10 @@ // specific language governing permissions and limitations // under the License. 
-Solr supports indexing nested documents such as a blog post parent document and comments as child documents -- or products as parent documents and sizes, colors, or other variations as child documents.
+
+Solr supports indexing nested documents for creating stronger bonds and relationships between documents,
+to be used for updates and <>.
+
+Nested documents in Solr can be used to bind a blog post parent document and comments as child documents
+-- or products as parent documents and sizes, colors, or other variations as child documents.
+
 The parent with all children is referred to as a "block" and it explains some of the nomenclature of related features.
 At query time, the <> can search these relationships,
  and the `<>` Document Transformer can attach child documents to the result documents.
diff --git a/solr/solr-ref-guide/src/other-parsers.adoc b/solr/solr-ref-guide/src/other-parsers.adoc
index 430c29307799..266080c8087a 100644
--- a/solr/solr-ref-guide/src/other-parsers.adoc
+++ b/solr/solr-ref-guide/src/other-parsers.adoc
@@ -24,7 +24,7 @@ Many of these parsers are expressed the same way as <>.
+There are two query parsers that support block joins. These parsers allow indexing and searching for relational content that has been <>.
 The example usage of the query parsers below assumes these two documents and each of their child documents have been indexed:
diff --git a/solr/solr-ref-guide/src/searching.adoc b/solr/solr-ref-guide/src/searching.adoc
index 5e0775c7771f..8367105a14b7 100644
--- a/solr/solr-ref-guide/src/searching.adoc
+++ b/solr/solr-ref-guide/src/searching.adoc
@@ -10,6 +10,7 @@
  spell-checking, +
  query-re-ranking, +
  transforming-result-documents, +
+ searching-nested-documents, +
  suggester, +
  morelikethis, +
  pagination-of-results, +
@@ -27,8 +28,7 @@
  realtime-get, +
  exporting-result-sets, +
  parallel-sql-interface, +
- analytics, +
- searching-nested-documents
+ analytics
 // Licensed to the Apache Software Foundation (ASF) under one
 // or more contributor license agreements. See the NOTICE file