From 6428ddb10e975f8b8955aebb9b59e5de201face0 Mon Sep 17 00:00:00 2001 From: Cassandra Targett Date: Thu, 2 Nov 2017 13:18:09 -0500 Subject: [PATCH] SOLR-11144: Add Analytics Component docs to the Ref Guide --- .../src/analytics-expression-sources.adoc | 91 ++ .../src/analytics-mapping-functions.adoc | 360 ++++++++ .../src/analytics-reduction-functions.adoc | 120 +++ solr/solr-ref-guide/src/analytics.adoc | 819 ++++++++++++++++++ solr/solr-ref-guide/src/searching.adoc | 3 +- 5 files changed, 1392 insertions(+), 1 deletion(-) create mode 100644 solr/solr-ref-guide/src/analytics-expression-sources.adoc create mode 100644 solr/solr-ref-guide/src/analytics-mapping-functions.adoc create mode 100644 solr/solr-ref-guide/src/analytics-reduction-functions.adoc create mode 100644 solr/solr-ref-guide/src/analytics.adoc diff --git a/solr/solr-ref-guide/src/analytics-expression-sources.adoc b/solr/solr-ref-guide/src/analytics-expression-sources.adoc new file mode 100644 index 000000000000..c56337f59c5a --- /dev/null +++ b/solr/solr-ref-guide/src/analytics-expression-sources.adoc @@ -0,0 +1,91 @@ += Analytics Expression Sources +:page-tocclass: right +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. 
+ +Expression sources are the source of the data being aggregated in <>. + +These sources can be either Solr fields indexed with docValues, or constants. + +== Supported Field Types + +The following <> are supported. +Fields of these types can be either multi-valued or single-valued. + +All fields used in analytics expressions *must* have <> enabled. + + +// Since Trie* fields are deprecated as of 7.0, we should consider removing Trie* fields from this list... + +[horizontal] +String:: + StrField +Boolean:: + BoolField +Integer:: + TrieIntField + + IntPointField +Long:: + TrieLongField + + LongPointField +Float:: + TrieFloatField + + FloatPointField +Double:: + TrieDoubleField + + DoublePointField +Date:: + TrieDateField + + DatePointField + +.Multi-valued Field De-duplication +[WARNING] +==== +All multi-valued field types, except for PointFields, are de-duplicated, meaning duplicate values for the same field are removed during indexing. +In order to save duplicates, you must use PointField types. +==== + +== Constants + +Constants can be included in expressions to use alongside fields and functions. The available constants are shown below. +Constants do not need to be surrounded by any function to define them; they can be used exactly like fields in an expression. + +=== Strings + +There are two possible ways of specifying constant strings, as shown below. + +* Surrounded by double quotes; inside the quotes, both `"` and `\` must be escaped with a `\` character. ++ +`"Inside of 'double' \\ \"quotes\""` \=> `Inside of 'double' \ "quotes"` +* Surrounded by single quotes; inside the quotes, both `'` and `\` must be escaped with a `\` character. ++ +`'Inside of "single" \\ \'quotes\''` \=> `Inside of "single" \ 'quotes'` + +=== Dates + +Dates can be specified in the same way as they are in Solr queries. Just use ISO-8601 format. +For more information, refer to the <> section. 
+ +* `2017-07-17T19:35:08Z` + +=== Numeric + +Any non-decimal number will be read as an integer, or as a long if it is too large for an integer. All decimal numbers will be read as doubles. + +* `-123421`: Integer +* `800000000000`: Long +* `230.34`: Double diff --git a/solr/solr-ref-guide/src/analytics-mapping-functions.adoc b/solr/solr-ref-guide/src/analytics-mapping-functions.adoc new file mode 100644 index 000000000000..f6de0de41cda --- /dev/null +++ b/solr/solr-ref-guide/src/analytics-mapping-functions.adoc @@ -0,0 +1,360 @@ += Analytics Mapping Functions +:page-tocclass: right +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +Mapping functions map values for each Solr Document or Reduction. + +Below is a list of all mapping functions provided by the Analytics Component. +These mappings can be chained together to implement more complex functionality. + +== Numeric Functions + +=== Negation +Negates the result of a numeric expression. + +`neg(< _Numeric_ T >)` \=> `< T >`:: + * `neg(10.53)` \=> `-10.53` + * `neg([1, -4])` \=> `[-1, 4]` + +=== Absolute Value +Returns the absolute value of the numeric expression. 
+ +`abs(< _Numeric_ T >)` \=> `< T >`:: + * `abs(-10.53)` \=> `10.53` + * `abs([1, -4])` \=> `[1, 4]` + +[[analytics-round]] +=== Round +Rounds the numeric expression to the nearest `Integer` or `Long` value. + +`round(< _Float_ >)` \=> `< _Int_ >`:: +`round(< _Double_ >)` \=> `< _Long_ >`:: + * `round(-1.5)` \=> `-1` + * `round([1.75, 100.34])` \=> `[2, 100]` + +=== Ceiling +Rounds the numeric expression to the nearest `Integer` or `Long` value that is greater than or equal to the original value. + +`ceil(< _Float_ >)` \=> `< _Int_ >`:: +`ceil(< _Double_ >)` \=> `< _Long_ >`:: + * `ceil(5.01)` \=> `6` + * `ceil([-4.999, 6.99])` \=> `[-4, 7]` + +[[analytics-floor]] +=== Floor +Rounds the numeric expression to the nearest `Integer` or `Long` value that is less than or equal to the original value. + +`floor(< _Float_ >)` \=> `< _Int_ >`:: +`floor(< _Double_ >)` \=> `< _Long_ >`:: + * `floor(5.75)` \=> `5` + * `floor([-4.001, 6.01])` \=> `[-5, 6]` + +=== Addition +Adds the values of the numeric expressions. + +`add(< _Multi Double_ >)` \=> `< _Single Double_ >`:: + * `add([1, -4])` \=> `-3.0` +`add(< _Single Double_ >, < _Multi Double_ >)` \=> `< _Multi Double_ >`:: + * `add(3.5, [1, -4])` \=> `[4.5, -0.5]` +`add(< _Multi Double_ >, < _Single Double_ >)` \=> `< _Multi Double_ >`:: + * `add([1, -4], 3.5)` \=> `[4.5, -0.5]` +`add(< _Single Double_ >, ...)` \=> `< _Single Double_ >`:: + * `add(3.5, 100, -27.6)` \=> `75.9` + +=== Subtraction +Subtracts the values of the numeric expressions. + +`sub(< _Single Double_ >, < _Single Double_ >)` \=> `< _Single Double_ >`:: + * `sub(3.5, 100)` \=> `-96.5` +`sub(< _Single Double_ >, < _Multi Double_ >)` \=> `< _Multi Double_ >`:: + * `sub(3.5, [1, -4])` \=> `[2.5, 7.5]` +`sub(< _Multi Double_ >, < _Single Double_ >)` \=> `< _Multi Double_ >`:: + * `sub([1, -4], 3.5)` \=> `[-2.5, -7.5]` + +=== Multiplication +Multiplies the values of the numeric expressions. 
+ +`mult(< _Multi Double_ >)` \=> `< _Single Double_ >`:: + * `mult([1, -4])` \=> `-4.0` +`mult(< _Single Double_ >, < _Multi Double_ >)` \=> `< _Multi Double_ >`:: + * `mult(3.5, [1, -4])` \=> `[3.5, -14.0]` +`mult(< _Multi Double_ >, < _Single Double_ >)` \=> `< _Multi Double_ >`:: + * `mult([1, -4], 3.5)` \=> `[3.5, -14.0]` +`mult(< _Single Double_ >, ...)` \=> `< _Single Double_ >`:: + * `mult(3.5, 100, -27.6)` \=> `-9660` + +=== Division +Divides the values of the numeric expressions. + +`div(< _Single Double_ >, < _Single Double_ >)` \=> `< _Single Double_ >`:: + * `div(3.5, 100)` \=> `0.035` +`div(< _Single Double_ >, < _Multi Double_ >)` \=> `< _Multi Double_ >`:: + * `div(3.5, [1, -4])` \=> `[3.5, -0.875]` +`div(< _Multi Double_ >, < _Single Double_ >)` \=> `< _Multi Double_ >`:: + * `div([1, -4], 25)` \=> `[0.04, -0.16]` + +=== Power +Takes one numeric expression to the power of another. + +*NOTE:* The square root function `sqrt(< _Double_ >)` can be used as shorthand for `pow(< _Double_ >, .5)` + +`pow(< _Single Double_ >, < _Single Double_ >)` \=> `< _Single Double_ >`:: + * `pow(2, 4)` \=> `16.0` +`pow(< _Single Double_ >, < _Multi Double_ >)` \=> `< _Multi Double_ >`:: + * `pow(16, [-1, 0])` \=> `[0.0625, 1]` +`pow(< _Multi Double_ >, < _Single Double_ >)` \=> `< _Multi Double_ >`:: + * `pow([1, 16], .25)` \=> `[1.0, 2.0]` + +=== Logarithm +Takes the logarithm of a numeric expression, with an optional second numeric expression as the base. +If only one expression is given, the natural log is used. 
+ +`log(< _Double_ >)` \=> `< _Double_ >`:: + * `log(5)` \=> `1.6094...` + * `log([1.0, 100.34])` \=> `[0.0, 4.6085...]` +`log(< _Single Double_ >, < _Single Double_ >)` \=> `< _Single Double_ >`:: + * `log(2, 4)` \=> `0.5` +`log(< _Single Double_ >, < _Multi Double_ >)` \=> `< _Multi Double_ >`:: + * `log(16, [2, 4])` \=> `[4, 2]` +`log(< _Multi Double_ >, < _Single Double_ >)` \=> `< _Multi Double_ >`:: + * `log([81, 3], 9)` \=> `[2.0, 0.5]` + +== Logic + +[[analytics-logic-neg]] +=== Negation +Negates the result of a boolean expression. + +`neg(< _Bool_ >)` \=> `< _Bool_ >`:: + * `neg(F)` \=> `T` + * `neg([F, T])` \=> `[T, F]` + +[[analytics-and]] +=== And +ANDs the values of the boolean expressions. + +`and(< _Multi Bool_ >)` \=> `< _Single Bool_ >`:: + * `and([T, F, T])` \=> `F` +`and(< _Single Bool_ >, < _Multi Bool_ >)` \=> `< _Multi Bool_ >`:: + * `and(F, [T, T])` \=> `[F, F]` +`and(< _Multi Bool_ >, < _Single Bool_ >)` \=> `< _Multi Bool_ >`:: + * `and([F, T], T)` \=> `[F, T]` +`and(< _Single Bool_ >, ...)` \=> `< _Single Bool_ >`:: + * `and(T, T, T)` \=> `T` + +[[analytics-or]] +=== Or +ORs the values of the boolean expressions. + +`or(< _Multi Bool_ >)` \=> `< _Single Bool_ >`:: + * `or([T, F, T])` \=> `T` +`or(< _Single Bool_ >, < _Multi Bool_ >)` \=> `< _Multi Bool_ >`:: + * `or(F, [F, T])` \=> `[F, T]` +`or(< _Multi Bool_ >, < _Single Bool_ >)` \=> `< _Multi Bool_ >`:: + * `or([F, T], T)` \=> `[T, T]` +`or(< _Single Bool_ >, ...)` \=> `< _Single Bool_ >`:: + * `or(F, F, F)` \=> `F` + +=== Exists +Checks whether any value(s) exist for the expression. + +`exists( T )` \=> `< _Single Bool_ >`:: + * `exists([1, 2, 3])` \=> `T` + * `exists([])` \=> `F` + * `exists(_empty_)` \=> `F` + * `exists('abc')` \=> `T` + +== Comparison + +=== Equality +Checks whether two expressions' values are equal. The parameters must be the same type, after implicit casting. 
+ +`equal(< _Single_ T >, < _Single_ T >)` \=> `< _Single Bool_ >`:: + * `equal(F, F)` \=> `T` +`equal(< _Single_ T >, < _Multi_ T >)` \=> `< _Multi Bool_ >`:: + * `equal("a", ["a", "ab"])` \=> `[T, F]` +`equal(< _Multi_ T >, < _Single_ T >)` \=> `< _Multi Bool_ >`:: + * `equal([1.5, -3.0], -3)` \=> `[F, T]` + +=== Greater Than +Checks whether a numeric or `Date` expression's values are greater than another expression's values. +The parameters must be the same type, after implicit casting. + +`gt(< _Single Numeric/Date_ T >, < _Single_ T >)` \=> `< _Single Bool_ >`:: + * `gt(1800-01-02, 1799-12-20)` \=> `T` +`gt(< _Single Numeric/Date_ T >, < _Multi_ T >)` \=> `< _Multi Bool_ >`:: + * `gt(30.756, [30, 100])` \=> `[T, F]` +`gt(< _Multi Numeric/Date_ T >, < _Single_ T >)` \=> `< _Multi Bool_ >`:: + * `gt([30, 75.6], 30)` \=> `[F, T]` + +=== Greater Than or Equals +Checks whether a numeric or `Date` expression's values are greater than or equal to another expression's values. +The parameters must be the same type, after implicit casting. + +`gte(< _Single Numeric/Date_ T >, < _Single_ T >)` \=> `< _Single Bool_ >`:: + * `gte(1800-01-02, 1799-12-20)` \=> `T` +`gte(< _Single Numeric/Date_ T >, < _Multi_ T >)` \=> `< _Multi Bool_ >`:: + * `gte(30.756, [30, 100])` \=> `[T, F]` +`gte(< _Multi Numeric/Date_ T >, < _Single_ T >)` \=> `< _Multi Bool_ >`:: + * `gte([30, 75.6], 30)` \=> `[T, T]` + +=== Less Than +Checks whether a numeric or `Date` expression's values are less than another expression's values. +The parameters must be the same type, after implicit casting. 
+ +`lt(< _Single Numeric/Date_ T >, < _Single_ T >)` \=> `< _Single Bool_ >`:: + * `lt(1800-01-02, 1799-12-20)` \=> `F` +`lt(< _Single Numeric/Date_ T >, < _Multi_ T >)` \=> `< _Multi Bool_ >`:: + * `lt(30.756, [30, 100])` \=> `[F, T]` +`lt(< _Multi Numeric/Date_ T >, < _Single_ T >)` \=> `< _Multi Bool_ >`:: + * `lt([30, 75.6], 30)` \=> `[F, F]` + +=== Less Than or Equals +Checks whether a numeric or `Date` expression's values are less than or equal to another expression's values. +The parameters must be the same type, after implicit casting. + +`lte(< _Single Numeric/Date_ T >, < _Single_ T >)` \=> `< _Single Bool_ >`:: + * `lte(1800-01-02, 1799-12-20)` \=> `F` +`lte(< _Single Numeric/Date_ T >, < _Multi_ T >)` \=> `< _Multi Bool_ >`:: + * `lte(30.756, [30, 100])` \=> `[F, T]` +`lte(< _Multi Numeric/Date_ T >, < _Single_ T >)` \=> `< _Multi Bool_ >`:: + * `lte([30, 75.6], 30)` \=> `[T, F]` + +[[analytics-top]] +=== Top +Returns the maximum of the numeric, `Date` or `String` expression(s)' values. +The parameters must be the same type, after implicit casting. +(Currently the only type not compatible is `Boolean`, which will be converted to a `String` implicitly in order to compile the expression) + +`top(< _Multi_ T >)` \=> `< _Single_ T >`:: + * `top([30, 400, -10, 0])` \=> `400` +`top(< _Single_ T >, ...)` \=> `< _Single_ T >`:: + * `top("a", 1, "d")` \=> `"d"` + +=== Bottom +Returns the minimum of the numeric, `Date` or `String` expression(s)' values. +The parameters must be the same type, after implicit casting. 
+(Currently the only type not compatible is `Boolean`, which will be converted to a `String` implicitly in order to compile the expression) + +`bottom(< _Multi_ T >)` \=> `< _Single_ T >`:: + * `bottom([30, 400, -10, 0])` \=> `-10` +`bottom(< _Single_ T >, ...)` \=> `< _Single_ T >`:: + * `bottom("a", 1, "d")` \=> `"1"` + +== Conditional + +[[analytics-if]] +=== If +Returns the value(s) of the `THEN` or `ELSE` expressions depending on whether the boolean conditional expression's value is `true` or `false`. +The `THEN` and `ELSE` expressions must be of the same type and cardinality after implicit casting is done. + +`if(< _Single Bool_>, < T >, < T >)` \=> `< T >`:: + * `if(true, "abc", [1,2])` \=> `["abc"]` + * `if(false, "abc", 123)` \=> `"123"` + +=== Replace +Replace all values from the 1^st^ expression that are equal to the value of the 2^nd^ expression with the value of the 3^rd^ expression. +All parameters must be the same type after implicit casting is done. + +`replace(< T >, < _Single_ T >, < _Single_ T >)` \=> `< T >`:: + * `replace([1,3], 3, "4")` \=> `["1", "4"]` + * `replace("abc", "abc", 18)` \=> `"18"` + * `replace("abc", 1, "def")` \=> `"abc"` + +=== Fill Missing +If the 1^st^ expression does not have values, fill it with the values for the 2^nd^ expression. +Both expressions must be of the same type and cardinality after implicit casting is done + +`fill_missing(< T >, < T >)` \=> `< T >`:: + * `fill_missing([], 3)` \=> `[3]` + * `fill_missing(_empty_, "abc")` \=> `"abc"` + * `fill_missing("abc", [1])` \=> `["abc"]` + +=== Remove +Remove all occurrences of the 2^nd^ expression's value from the values of the 1^st^ expression. 
+Both expressions must be of the same type after implicit casting is done. + +`remove(< T >, < _Single_ T >)` \=> `< T >`:: + * `remove([1,2,3,2], 2)` \=> `[1, 3]` + * `remove("1", 1)` \=> `_empty_` + * `remove(1, "abc")` \=> `"1"` + +=== Filter +Return the values of the 1^st^ expression if the value of the 2^nd^ expression is `true`, otherwise return no values. + +`filter(< T >, < _Single Boolean_ >)` \=> `< T >`:: + * `filter([1,2,3], true)` \=> `[1,2,3]` + * `filter([1,2,3], false)` \=> `[]` + * `filter("abc", false)` \=> `_empty_` + * `filter("abc", true)` \=> `"abc"` + +== Date + +=== Date Parse +Explicitly converts the values of a `String` or `Long` expression into `Dates`. + +`date(< _String_ >)` \=> `< _Date_ >`:: + * `date('1800-01-02')` \=> `1800-01-02T00:00:00Z` + * `date(['1800-01-02', '2016-05-23'])` \=> `[1800-01-02T..., 2016-05-23T...]` +`date(< _Long_ >)` \=> `< _Date_ >`:: + * `date(1232343246648)` \=> `2009-01-19T05:34:06Z` + * `date([1232343246648, 223234324664])` \=> `[2009-01-19T..., 1977-01-27T...]` + +[[analytics-date-math]] +=== Date Math +Compute the given date math strings for the values of a `Date` expression. The date math strings *must* be <>. + +`date_math(< _Date_ >, < _Constant String_ >...)` \=> `< _Date_ >`:: + * `date_math(1800-04-15, '+1DAY', '-1MONTH')` \=> `1800-03-16` + * `date_math([1800-04-15,2016-05-24], '+1DAY', '-1MONTH')` \=> `[1800-03-16, 2016-04-25]` + +== String + +=== Explicit Casting +Explicitly casts the expression to a `String` expression. + +`string(< _String_ >)` \=> `< _String_ >`:: + * `string(1)` \=> `'1'` + * `string([1.5, -2.0])` \=> `['1.5', '-2.0']` + +=== Concatenation +Concatenates the values of the `String` expression(s) together. 
+ +`concat(< _Multi String_ >)` \=> `< _Single String_ >`:: + * `concat(['a','b','c'])` \=> `'abc'` +`concat(< _Single String_ >, < _Multi String_ >)` \=> `< _Multi String_ >`:: + * `concat(1, ['a','b','c'])` \=> `['1a','1b','1c']` +`concat(< _Multi String_ >, < _Single String_ >)` \=> `< _Multi String_ >`:: + * `concat(['a','b','c'], 1)` \=> `['a1','b1','c1']` +`concat(< _Single String_ >...)` \=> `< _Single String_ >`:: + * `concat('a','b','c')` \=> `'abc'` + * `concat('a',_empty_,'c')` \=> `'ac'` + + _Empty values are ignored_ + +=== Separated Concatenation +Concatenates the values of the `String` expression(s) together using the given <> value as a separator. + +`concat_sep(< _Constant String_ >, < _Multi String_ >)` \=> `< _Single String_ >`:: + * `concat_sep('-', ['a','b'])` \=> `'a-b'` +`concat_sep(< _Constant String_ >, < _Single String_ >, < _Multi String_ >)` \=> `< _Multi String_ >`:: + * `concat_sep(2,1,['a','b'])` \=> `['12a','12b']` +`concat_sep(< _Constant String_ >, < _Multi String_ >, < _Single String_ >)` \=> `< _Multi String_ >`:: + * `concat_sep(2,['a','b'],1)` \=> `['a21','b21']` +`concat_sep(< _Constant String_ >, < _Single String_ >...)` \=> `< _Single String_ >`:: + * `concat_sep('-','a',2,3)` \=> `'a-2-3'` + * `concat_sep(';','a',_empty_,'c')` \=> `'a;c'` + +_Empty values are ignored_ diff --git a/solr/solr-ref-guide/src/analytics-reduction-functions.adoc b/solr/solr-ref-guide/src/analytics-reduction-functions.adoc new file mode 100644 index 000000000000..60c65fab3def --- /dev/null +++ b/solr/solr-ref-guide/src/analytics-reduction-functions.adoc @@ -0,0 +1,120 @@ += Analytics Reduction Functions +:page-tocclass: right +:page-toclevels: 2 +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. 
You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +Reduction functions reduce the values of <> +and/or unreduced <> +for every Solr Document to a single value. + +Below is a list of all reduction functions provided by the Analytics Component. +These can be combined using mapping functions to implement more complex functionality. + +== Counting Reductions + +=== Count +The number of existing values for an expression. For single-valued expressions, this is equivalent to `doc_count`. +If no expression is given, the number of matching documents is returned. + +`count()` \=> `< _Single Long_ >` +`count(< T >)` \=> `< _Single Long_ >` + +=== Doc Count +The number of documents for which an expression has existing values. For single-valued expressions, this is equivalent to `count`. +If no expression is given, the number of matching documents is returned. + +`doc_count()` \=> `< _Single Long_ >` + +`doc_count(< T >)` \=> `< _Single Long_ >` + +=== Missing +The number of documents for which an expression has no existing value. + +`missing(< T >)` \=> `< _Single Long_ >` + +[[analytics-unique]] +=== Unique +The number of unique values for an expression. This function accepts `Numeric`, `Date` and `String` expressions. + +`unique(< T >)` \=> `< _Single Long_ >` + +== Math Reductions + +=== Sum +Returns the sum of all values for the expression. + +`sum(< _Double_ >)` \=> `< _Single Double_ >` + +=== Variance +Returns the variance of all values for the expression. + +`variance(< _Double_ >)` \=> `< _Single Double_ >` + +=== Standard Deviation +Returns the standard deviation of all values for the expression. 
+ +`stddev(< _Double_ >)` \=> `< _Single Double_ >` + +=== Mean +Returns the arithmetic mean of all values for the expression. + +`mean(< _Double_ >)` \=> `< _Single Double_ >` + +=== Weighted Mean +Returns the arithmetic mean of all values for the second expression weighted by the values of the first expression. + +`wmean(< _Double_ >, < _Double_ >)` \=> `< _Single Double_ >` + +NOTE: The expressions must satisfy the rules for `mult` function parameters. + +== Ordering Reductions + +=== Minimum +Returns the minimum value for the expression. This function accepts `Numeric`, `Date` and `String` expressions. + +`min(< T >)` \=> `< _Single_ T >` + +=== Maximum +Returns the maximum value for the expression. This function accepts `Numeric`, `Date` and `String` expressions. + +`max(< T >)` \=> `< _Single_ T >` + +=== Median +Returns the median of all values for the expression. This function accepts `Numeric` and `Date` expressions. + +`median(< T >)` \=> `< _Single_ T >` + +=== Percentile +Calculates the given percentile of all values for the expression. +This function accepts `Numeric`, `Date` and `String` expressions for the 2^nd^ parameter. + +The percentile, given as the 1^st^ parameter, must be a <> between [0, 100). + +`percentile(, < T >)` \=> `< _Single_ T >` + +=== Ordinal +Calculates the given ordinal of all values for the expression. +This function accepts `Numeric`, `Date` and `String` expressions for the 2^nd^ parameter. +The ordinal, given as the 1^st^ parameter, must be a <>. +*0 is not accepted as an ordinal value.* + +If the ordinal is positive, the returned value will be the _n_^th^ smallest value. + +If the ordinal is negative, the returned value will be the _n_^th^ largest value. 
+ +`ordinal(, < T >)` \=> `< _Single_ T >` diff --git a/solr/solr-ref-guide/src/analytics.adoc b/solr/solr-ref-guide/src/analytics.adoc new file mode 100644 index 000000000000..fe9b1105ce21 --- /dev/null +++ b/solr/solr-ref-guide/src/analytics.adoc @@ -0,0 +1,819 @@ += Analytics Component +:page-children: analytics-expression-sources, analytics-mapping-functions, analytics-reduction-functions +:page-tocclass: right +:page-toclevel: 2 +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +The Analytics Component allows users to calculate complex statistical aggregations over result sets. + +The component enables interacting with data in a variety of ways, both through a diverse set of analytics functions as well as powerful faceting functionality. +The standard facets are supported within the analytics component with additions that leverage its analytical capabilities. + +== Analytics Configuration + +The Analytics component is in a contrib module, therefore it will need to be enabled in the `solrconfig.xml` for each collection where you would like to use it. + +Since the Analytics framework is a _search component_, it must be declared as such and added to the search handler. 
+ +For distributed analytics requests over cloud collections, the component uses the `AnalyticsHandler` strictly for inter-shard communication. +The Analytics Handler should not be used by users to submit analytics requests. + +To configure Solr to use the Analytics Component, the first step is to add a `lib` directive so Solr loads the Analytics Component classes (for more about the `lib` directive, see <>). In the section of `solrconfig.xml` where the default `lib` directives are, add a line: + +[source,xml] + + +Next you need to enable the request handler and search component. Add the following lines to `solrconfig.xml`, near the definitions for other request handlers: + +[source,xml] +.solrconfig.xml +---- + + + + + + analytics + + + + + +---- + +For these changes to take effect, restart Solr or reload the core or collection. + +== Request Syntax + +An Analytics request is passed to Solr with the parameter `analytics` in a request sent to the +<>. +Since the analytics request is sent inside of a search handler request, it will compute results based on the result set determined by the search handler. + +For example, this curl command encodes and POSTs a simple analytics request to the search handler: + +[source,bash] +---- +curl --data-urlencode 'analytics={ + "expressions" : { + "revenue" : "sum(mult(price,quantity))" + } + }' \ + 'http://localhost:8983/solr/sales/select?q=*:*&wt=json&rows=0' +---- + +There are 3 main parts of any analytics request: + +Expressions:: +A list of calculations to perform over the entire result set. Expressions aggregate the search results into a single value to return. +This list is entirely independent of the expressions defined in each of the groupings. Find out more about them in the section <>. + +Functions:: +One or more <> to be used throughout the rest of the request. These are essentially lambda functions and can be combined in a number of ways. 
+These functions can be used in the expressions defined in `expressions` as well as `groupings`. + +Groupings:: +The list of <> to calculate in addition to the expressions. +Groupings hold a set of facets and a list of expressions to compute over those facets. +The expressions defined in a grouping are only calculated over the facets defined in that grouping. + +[NOTE] +.Optional Parameters +Either the `expressions` or the `groupings` parameter must be present in the request, or else there will be no analytics to compute. +The `functions` parameter is always optional. + +[source,json] +.Example Analytics Request +---- +{ + "functions": { + "sale()": "mult(price,quantity)" + }, + "expressions" : { + "max_sale" : "max(sale())", + "med_sale" : "median(sale())" + }, + "groupings" : { + "sales" : { + "expressions" : { + "stddev_sale" : "stddev(sale())", + "min_price" : "min(price)", + "max_quantity" : "max(quantity)" + }, + "facets" : { + "category" : { + "type" : "value", + "expression" : "fill_missing(category, 'No Category')", + "sort" : { + "criteria" : [ + { + "type" : "expression", + "expression" : "min_price", + "direction" : "ascending" + }, + { + "type" : "facetvalue", + "direction" : "descending" + } + ], + "limit" : 10 + } + }, + "temps" : { + "type" : "query", + "queries" : { + "hot" : "temp:[90 TO *]", + "cold" : "temp:[* TO 50]" + } + } + } + } + } +} +---- + +== Expressions + +Expressions are the way to request pieces of information from the analytics component. These are the statistical expressions that you want computed and returned in your response. + +=== Constructing an Expression + +==== Expression Components + +An expression is built using fields, constants, mapping functions and reduction functions. The ways that these can be defined are described below. + +Sources:: +* Constants: The values defined in the expression. +The supported constant types are described in the <>. + +* Fields: Solr fields that are read from the index. 
+The supported fields are listed in the <>. + +Mapping Functions:: +Mapping functions map values for each Solr Document or Reduction. +The provided mapping functions are detailed in the <>. + +* Unreduced Mapping: Mapping a Field with another Field or Constant returns a value for every Solr Document. +Unreduced mapping functions can take fields, constants as well as other unreduced mapping functions as input. + +* Reduced Mapping: Mapping a Reduction Function with another Reduction Function or Constant returns a single value. + +Reduction Functions:: +Functions that reduce the values of sources and/or unreduced mapping functions for every Solr Document to a single value. +The provided reduction functions are detailed in the <>. + +==== Component Ordering + +The expression components must be used in the following order to create valid expressions. + +. Reduced Mapping Function +.. Constants +.. Reduction Function +... Sources +... Unreduced Mapping Function +.... Sources +.... Unreduced Mapping Function +.. Reduced Mapping Function +. Reduction Function + +This ordering is based on the following rules: + +* No reduction function can be an argument of another reduction function. +Since all reduction is done together in one step, one reduction function cannot rely on the result of another. +* No fields can be left unreduced, since the analytics component cannot return a list of values for an expression (one for every document). +Every expression must be reduced to a single value. +* Mapping functions are not necessary when creating functions, however as many nested mappings as needed can be used. +* Nested mapping functions must be the same type, so either both must be unreduced or both must be reduced. +A reduced mapping function cannot take an unreduced mapping function as a parameter and vice versa. 
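+
+As an illustrative sketch of these rules (not taken from the Solr distribution), the request below contains only expressions that follow the ordering above. The field names `price` and `quantity` are assumptions borrowed from the other examples in this guide and are presumed to be numeric fields with docValues enabled. Here `revenue` is a reduction function over an unreduced mapping function, `max_price` is a reduction function over a field, and `revenue_per_doc` is a reduced mapping function over two reduction functions. By contrast, `sum(max(price))` would break the first rule, since one reduction function is nested inside another, and a bare `mult(price,quantity)` would break the second, since it is never reduced to a single value.
+
+[source,json]
+----
+{
+  "expressions" : {
+    "revenue" : "sum(mult(price,quantity))",
+    "max_price" : "max(price)",
+    "revenue_per_doc" : "div(sum(mult(price,quantity)),doc_count())"
+  }
+}
+----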
+ +==== Example Construction + +With the above definitions and ordering, an example expression can be broken up into its components: + +[source,bash] +div(sum(a,fill_missing(b,0)),add(10.5,count(mult(a,c)))) + +As a whole, this is a reduced mapping function. The `div` function is a reduced mapping function since it is a <> and has reduced arguments. + +If we break down the expression further: + +* `sum(a,fill_missing(b,0))`: Reduction Function + +`sum` is a <>. +** `a`: Field +** `fill_missing(b,0)`: Unreduced Mapping Function + +`fill_missing` is an unreduced mapping function since it is a <> and has a field argument. +*** `b`: Field +*** `0`: Constant + +* `add(10.5,count(mult(a,c)))`: Reduced Mapping Function + +`add` is a reduced mapping function since it is a <> and has a reduction function argument. +** `10.5`: Constant +** `count(mult(a,c))`: Reduction Function + +`count` is a <> +*** `mult(a,c)`: Unreduced Mapping Function + +`mult` is an unreduced mapping function since it is a <> and has two field arguments. +**** `a`: Field +**** `c`: Field + +=== Expression Cardinality (Multi-Valued and Single-Valued) + +The root of all multi-valued expressions are multi-valued fields. Single-valued expressions can be started with constants or single-valued fields. +All single-valued expressions can be treated as multi-valued expressions that contain one value. + +Single-valued expressions and multi-valued expressions can be used together in many mapping functions, as well as multi-valued expressions being used alone, and many single-valued expressions being used together. For example: + +`add(, , ...)`:: +Returns a single-valued double expression where the values of each expression are added. + +`add(, )`:: +Returns a multi-valued double expression where each value of the second expression is added to the single value of the first expression. + +`add(, )`:: +Acts the same as the above function. 
+ +`add()`:: +Returns a single-valued double expression which is the sum of the multiple values of the parameter expression. + +=== Types and Implicit Casting + +The new analytics component currently supports the types listed in the below table. +These types have one-way implicit casting enabled for the following relationships: + +[cols="20s,80",options="header"] +|=== +| Type | Implicitly Casts To +| Boolean | String +| Date | Long, String +| Integer | Long, Float, Double, String +| Long | Double, String +| Float | Double, String +| Double | String +| String | _none_ +|=== + +An implicit cast means that if a function requires a certain type of value as a parameter, arguments will be automatically converted to that type if it is possible. + +For example, `concat()` only accepts string parameters and since all types can be implicitly cast to strings, any type is accepted as an argument. + +This also goes for dynamically typed functions. `fill_missing()` requires two arguments of the same type. However, two types that implicitly cast to the same type can also be used. + +For example, `fill_missing(,)` will be cast to `fill_missing(,)` since long cannot be cast to float and float cannot be cast to long implicitly. + +There is an ordering to implicit casts, where the more specialized type is ordered ahead of the more general type. +Therefore even though both long and float can be implicitly cast to double and string, they will be cast to double. +This is because double is a more specialized type than string, which every type can be cast to. + +The ordering is the same as their order in the above table. + +Cardinality can also be implicitly cast. +Single-valued expressions can always be implicitly cast to multi-valued expressions, since all single-valued expressions are multi-valued expressions with one value. + +Implicit casting will only occur when an expression will not "compile" without it. 
+If an expression follows all typing rules initially, no implicit casting will occur. +Certain functions such as `string()`, `date()`, `round()`, `floor()`, and `ceil()` act as explicit casts, declaring the type that is desired. +However, `round()`, `floor()` and `ceil()` can return either int or long, depending on the argument type. + +== Variable Functions + +Variable functions are a way to shorten your expressions and make writing analytics queries easier. They are essentially lambda functions defined in a request. + +[source,json] +.Example Basic Function +---- +{ + "functions" : { + "sale()" : "mult(price,quantity)" + }, + "expressions" : { + "max_sale" : "max(sale())", + "med_sale" : "median(sale())" + } +} +---- + +In the above request, instead of writing `mult(price,quantity)` twice, a function `sale()` was defined to abstract this idea. That function is then used in multiple expressions. + +Suppose that we want to look at the sales of specific categories: + +[source,json] +---- +{ + "functions" : { + "clothing_sale()" : "filter(mult(price,quantity),equal(category,'Clothing'))", + "kitchen_sale()" : "filter(mult(price,quantity),equal(category,\"Kitchen\"))" + }, + "expressions" : { + "max_clothing_sale" : "max(clothing_sale())" + , "med_clothing_sale" : "median(clothing_sale())" + , "max_kitchen_sale" : "max(kitchen_sale())" + , "med_kitchen_sale" : "median(kitchen_sale())" + } +} +---- + +=== Arguments + +Instead of making a function for each category, it would be much easier to use `category` as an input to the `sale()` function. 
+An example of this functionality is shown below: + +[source,json] +.Example Function with Arguments +---- +{ + "functions" : { + "sale(cat)" : "filter(mult(price,quantity),equal(category,cat))" + }, + "expressions" : { + "max_clothing_sale" : "max(sale(\"Clothing\"))" + , "med_clothing_sale" : "median(sale('Clothing'))" + , "max_kitchen_sale" : "max(sale(\"Kitchen\"))" + , "med_kitchen_sale" : "median(sale('Kitchen'))" + } +} +---- + +Variable Functions can take any number of arguments and use them in the function expression as if they were a field or constant. + +=== Variable Length Arguments + +Some analytics functions take a variable number of parameters. +Therefore there are use cases where variable functions need to take a variable number of parameters as well. + +For example, the price of a product might have several components, with the number of components not known in advance. +Functions can take a variable number of parameters if the last parameter is followed by `..`. + +[source,json] +.Example Function with a Variable Length Argument +---- +{ + "functions" : { + "sale(cat, costs..)" : "filter(mult(add(costs),quantity),equal(category,cat))" + }, + "expressions" : { + "max_clothing_sale" : "max(sale('Clothing', material, tariff, tax))" + , "med_clothing_sale" : "median(sale('Clothing', material, tariff, tax))" + , "max_kitchen_sale" : "max(sale('Kitchen', material, construction))" + , "med_kitchen_sale" : "median(sale('Kitchen', material, construction))" + } +} +---- + +In the above example a variable length argument is used to encapsulate all of the costs to use for a product. +There is no definite number of arguments requested for the variable length parameter, therefore the clothing expressions can use 3 and the kitchen expressions can use 2. +When the `sale()` function is called, `costs` is expanded to the arguments given. 
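The expansion of a variable length parameter can be pictured as a textual substitution. The Python below is a rough sketch under that assumption; the `expand` helper is hypothetical and is not how Solr's parser is implemented. It also ignores edge cases such as a parameter name appearing inside a string constant.

```python
import re


def expand(params, body, args):
    """Expand a variable function body with the arguments of one call.

    params: parameter names; the last one may end with '..' to mark a
            variable length parameter.
    body:   the function's expression string.
    args:   the argument strings supplied at the call site.
    """
    bindings = {}
    for i, param in enumerate(params):
        if param.endswith(".."):
            # The variable length parameter absorbs all remaining arguments.
            bindings[param[:-2]] = ",".join(args[i:])
            break
        bindings[param] = args[i]
    if not bindings:
        return body
    # Match whole identifiers only, so 'cat' is not replaced inside 'category'.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, bindings)) + r")\b")
    return pattern.sub(lambda m: bindings[m.group(1)], body)


body = "filter(mult(add(costs),quantity),equal(category,cat))"
clothing = expand(["cat", "costs.."], body, ["'Clothing'", "material", "tariff", "tax"])
kitchen = expand(["cat", "costs.."], body, ["'Kitchen'", "material", "construction"])
```

Here `clothing` contains `add(material,tariff,tax)` and `kitchen` contains `add(material,construction)`, so the same definition serves calls with different numbers of cost fields.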
+ +Therefore in the above request, inside of the `sale` function: + +* `add(costs)` + +is expanded to both of the following: + +* `add(material, tariff, tax)` +* `add(material, construction)` + +=== For-Each Functions + +[CAUTION] +.Advanced Functionality +==== +The following function details are for advanced requests. +==== + +Although the above functionality allows for an undefined number of arguments to be passed to a function, it does not allow for interacting with those arguments. + +Many times we might want to wrap each argument in additional functions. +For example maybe we want to be able to look at multiple categories at the same time. +So we want to see if `category EQUALS x *OR* category EQUALS y` and so on. + +In order to do this we need to use for-each lambda functions, which transform each value of the variable length parameter. +The for-each is started with the `:` character after the variable length parameter. + +[source,json] +.Example Function with a For-Each +---- +{ + "functions" : { + "sale(cats..)" : "filter(mult(price,quantity),or(cats:equal(category,_)))" + }, + "expressions" : { + "max_sale_1" : "max(sale('Clothing', 'Kitchen'))" + , "med_sale_1" : "median(sale('Clothing', 'Kitchen'))" + , "max_sale_2" : "max(sale('Electronics', 'Entertainment', 'Travel'))" + , "med_sale_2" : "median(sale('Electronics', 'Entertainment', 'Travel'))" + } +} +---- + +In this example, `cats:` is the syntax that starts a for-each lambda function over every parameter `cats`, and the `\_` character is used to refer to the value of `cats` in each iteration in the for-each. +When `sale("Clothing", "Kitchen")` is called, the lambda function `equal(category,_)` is applied to both Clothing and Kitchen inside of the `or()` function. 
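A for-each can be sketched the same way, as a textual rewrite. The Python below is a hypothetical illustration, not Solr's parser: `expand_for_each` is an invented helper, it handles only lambdas that are function calls, and it emits the expanded copies in call order, which is an assumption (the order does not change the result of `or()`).

```python
import re

# A lone underscore, not one that is part of a name such as fill_missing.
PLACEHOLDER = re.compile(r"(?<![A-Za-z0-9_])_(?![A-Za-z0-9_])")


def expand_for_each(body, param, args):
    """Replace 'param:<lambda>' in body with one copy of the lambda per argument."""
    marker = param + ":"
    start = body.find(marker)
    if start < 0:
        return body  # no for-each over this parameter
    lam_start = start + len(marker)
    # Find the end of the lambda by balancing its parentheses.
    depth = 0
    end = lam_start
    for end in range(lam_start, len(body)):
        if body[end] == "(":
            depth += 1
        elif body[end] == ")":
            depth -= 1
            if depth == 0:
                break
    lam = body[lam_start:end + 1]
    # Substitute each argument for the '_' placeholder.
    expanded = ",".join(PLACEHOLDER.sub(lambda m: arg, lam) for arg in args)
    return body[:start] + expanded + body[end + 1:]


body = "filter(mult(price,quantity),or(cats:equal(category,_)))"
two = expand_for_each(body, "cats", ['"Clothing"', '"Kitchen"'])
```

With two arguments, `or()` receives two `equal(category, ...)` calls; with three arguments it would receive three.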
+ +Using all of these rules, the expression: + +[source,text] +`sale("Clothing","Kitchen")` + +is expanded to: + +[source,text] +`filter(mult(price,quantity),or(equal(category,"Kitchen"),equal(category,"Clothing")))` + +by the expression parser. + +== Groupings And Facets + +Facets, much like in other parts of Solr, allow analytics results to be broken up and grouped by attributes of the data that the expressions are being calculated over. + +The currently available facets for use in the analytics component are Value Facets, Pivot Facets, Range Facets and Query Facets. +Each facet is required to have a unique name within the grouping it is defined in, and no facet can be defined outside of a grouping. + +Groupings allow users to calculate the same set of expressions over a set of facets. +Groupings must have both `expressions` and `facets` given. + +[source,json] +.Example Base Facet Request +---- +{ + "functions" : { + "sale()" : "mult(price,quantity)" + }, + "groupings" : { + "sales_numbers" : { + "expressions" : { + "max_sale" : "max(sale())", + "med_sale" : "median(sale())" + }, + "facets" : { + "" : "< facet request >" + } + } + } +} +---- + +[source,json] +.Example Base Facet Response +---- +{ + "analytics_response" : { + "groupings" : { + "sales_numbers" : { + "" : "< facet response >" + } + } + } +} +---- + +=== Facet Sorting + +Some Analytics facets allow for complex sorting of their results. +The two current sortable facets are <> and <>. + +==== Parameters + +`criteria`:: +The list of criteria to sort the facet by. ++ +It takes the following parameters: + +`type`::: The type of sort. There are two possible values: +* `expression`: Sort by the value of an expression defined in the same grouping. +* `facetvalue`: Sort by the string representation of the facet value. + +`direction`::: +_(Optional)_ The direction to sort. 
+* `ascending` _(Default)_ +* `descending` + +`expression`::: +When `type = expression`, the name of an expression defined in the same grouping. + +`limit`:: +Limit the number of returned facet values to the top _N_. _(Optional)_ + +`offset`:: + When a limit is set, skip the top _N_ facet values. _(Optional)_ + +[source,json] +.Example Sort Request +---- +{ + "criteria" : [ + { + "type" : "expression", + "expression" : "max_sale", + "direction" : "ascending" + }, + { + "type" : "facetvalue", + "direction" : "descending" + } + ], + "limit" : 10, + "offset" : 5 +} +---- + +=== Value Facets + +Value Facets are used to group documents by the value of a mapping expression applied to each document. +Mapping expressions are expressions that do not include a reduction function. + +For more information, refer to the <>. + +* `mult(quantity, add(price, tax))`: break up documents by the revenue generated +* `fill_missing(state, "N/A")`: break up documents by state, where N/A is used when the document doesn't contain a state + +Value Facets can be sorted. + +==== Parameters + +`expression`:: The expression to choose a facet bucket for each document. +`sort`:: A <> for the results of the facet. + +[NOTE] +.Optional Parameters +The `sort` parameter is optional. + +[source,json] +.Example Value Facet Request +---- +{ + "type" : "value", + "expression" : "fill_missing(category,'No Category')", + "sort" : {} +} +---- + +[source,json] +.Example Value Facet Response +---- +[ + { "..." : "..." }, + { + "value" : "Electronics", + "results" : { + "max_sale" : 103.75, + "med_sale" : 15.5 + } + }, + { + "value" : "Kitchen", + "results" : { + "max_sale" : 88.25, + "med_sale" : 11.37 + } + }, + { "..." : "..." } +] +---- + +[NOTE] +.Field Facets +This is a replacement for Field Facets in the original Analytics Component. +Field Facet functionality is maintained in Value Facets by using the name of a field as the expression. 
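Conceptually, a value facet evaluates a mapping expression for every document, buckets the documents by the resulting value, and then computes the grouping's expressions once per bucket. The Python below is a minimal in-memory sketch of that idea, not Solr's implementation: `value_facet`, the sample documents, and the choice to order buckets by facet value are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def value_facet(docs, bucket_of, expressions):
    """Group docs by a mapping expression and reduce each bucket.

    bucket_of:   function from a document to its facet value (the mapping expression).
    expressions: dict of result name to a function that reduces a list of documents.
    """
    buckets = defaultdict(list)
    for doc in docs:
        buckets[bucket_of(doc)].append(doc)
    # Assumed ordering for this sketch: ascending by facet value.
    return [
        {"value": value, "results": {name: fn(group) for name, fn in expressions.items()}}
        for value, group in sorted(buckets.items())
    ]


def sale(doc):
    # Stand-in for the variable function sale() = mult(price,quantity).
    return doc["price"] * doc["quantity"]


docs = [
    {"category": "Kitchen", "price": 2.0, "quantity": 3},
    {"category": "Electronics", "price": 10.0, "quantity": 1},
    {"price": 1.5, "quantity": 2},  # no category field
    {"category": "Kitchen", "price": 4.0, "quantity": 1},
]
expressions = {
    "max_sale": lambda group: max(sale(d) for d in group),
    "med_sale": lambda group: median(sale(d) for d in group),
}
# Stand-in for a mapping expression that substitutes 'No Category' when the field is missing.
result = value_facet(docs, lambda d: d.get("category", "No Category"), expressions)
```

Each entry of `result` has the same `value` and `results` shape as the example response above, with one entry per distinct facet value.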
+ +=== Analytic Pivot Facets + +Pivot Facets are used to group documents by the value of multiple mapping expressions applied to each document. + +Pivot Facets work much like layers of <>. +A list of pivots is required, and the order of the list directly impacts the results returned. +The first pivot given will be treated like a normal value facet. +The second pivot given will be treated like one value facet for each value of the first pivot. +Each of these second-level value facets will be limited to the documents in their first-level facet bucket. +This continues for however many pivots are provided. + +Sorting is enabled on a per-pivot basis. This means that if your top pivot has a sort with `limit:1`, then only that first value of the facet will be drilled down into. Sorting in each pivot is independent of the other pivots. + +==== Parameters + +`pivots`:: The list of pivots to calculate a drill-down facet for. The list is ordered by top-most to bottom-most level. +`name`::: The name of the pivot. +`expression`::: The expression to choose a facet bucket for each document. +`sort`::: A <> for the results of the pivot. + +[NOTE] +.Optional Parameters +The `sort` parameter within the pivot object is optional, and can be given in any, none or all of the provided pivots. + +[source,json] +.Example Pivot Facet Request +---- +{ + "type" : "pivot", + "pivots" : [ + { + "name" : "country", + "expression" : "country", + "sort" : {} + }, + { + "name" : "state", + "expression" : "fill_missing(state, fill_missing(providence, territory))" + }, + { + "name" : "city", + "expression" : "fill_missing(city, 'N/A')", + "sort" : {} + } + ] +} +---- + + +[source,json] +.Example Pivot Facet Response +---- +[ + { "..." : "..." }, + { + "pivot" : "Country", + "value" : "USA", + "results" : { + "max_sale" : 103.75, + "med_sale" : 15.5 + }, + "children" : [ + { "..." : "..." 
}, + { + "pivot" : "State", + "value" : "Texas", + "results" : { + "max_sale" : 99.2, + "med_sale" : 20.35 + }, + "children" : [ + { "..." : "..." }, + { + "pivot" : "City", + "value" : "Austin", + "results" : { + "max_sale" : 94.34, + "med_sale" : 17.60 + } + }, + { "..." : "..." } + ] + }, + { "..." : "..." } + ] + }, + { "..." : "..." } +] +---- + +=== Analytics Range Facets + +Range Facets are used to group documents by the value of a field into a given set of ranges. +The inputs for analytics range facets are identical to those used for Solr range facets. +Refer to the <> for additional details on their use. + +==== Parameters + +`field`:: Field to be faceted over +`start`:: The bottom end of the range +`end`:: The top end of the range +`gap`:: A list of range gaps to generate facet buckets. If the buckets do not add up to fit the `start` to `end` range, +then the last `gap` value will be repeated as many times as needed to fill any unused range. +`hardend`:: Whether to cut off the last facet bucket range at the `end` value if it spills over. Defaults to `false`. +`include`:: The boundaries to include in the facet buckets. Defaults to `lower`. +* `lower` - All gap-based ranges include their lower bound. +* `upper` - All gap-based ranges include their upper bound. +* `edge` - The first and last gap ranges include their edge bounds (lower for the first one, upper for the last one) even if the corresponding upper/lower option is not specified. +* `outer` - The `before` and `after` ranges will be inclusive of their bounds, even if the first or last ranges already include those boundaries. +* `all` - Includes all options: `lower`, `upper`, `edge`, and `outer` +`others`:: Additional ranges to include in the facet. Defaults to `none`. +* `before` - All records with field values lower than the lower bound of the first range. +* `after` - All records with field values greater than the upper bound of the last range. 
+* `between` - All records with field values between the lower bound of the first range and the upper bound of the last range. +* `none` - Include facet buckets for none of the above. +* `all` - Include facet buckets for `before`, `after` and `between`. + +[NOTE] +.Optional Parameters +The `hardend`, `include` and `others` parameters are all optional. + +[source,json] +.Example Range Facet Request +---- +{ + "type" : "range", + "field" : "price", + "start" : "0", + "end" : "100", + "gap" : [ + "5", + "10", + "10", + "25" + ], + "hardend" : true, + "include" : [ + "lower", + "upper" + ], + "others" : [ + "after", + "between" + ] +} +---- + +[source,json] +.Example Range Facet Response +---- +[ + { + "value" : "[0 TO 5]", + "results" : { + "max_sale" : 4.75, + "med_sale" : 3.45 + } + }, + { + "value" : "[5 TO 15]", + "results" : { + "max_sale" : 13.25, + "med_sale" : 10.20 + } + }, + { + "value" : "[15 TO 25]", + "results" : { + "max_sale" : 22.75, + "med_sale" : 18.50 + } + }, + { + "value" : "[25 TO 50]", + "results" : { + "max_sale" : 47.55, + "med_sale" : 30.33 + } + }, + { + "value" : "[50 TO 75]", + "results" : { + "max_sale" : 70.25, + "med_sale" : 64.54 + } + }, + { "..." : "..." } +] +---- + +=== Query Facets + +Query Facets are used to group documents by a given set of queries. + +==== Parameters + +`queries`:: The list of queries to facet by. 
+ +[source,json] +.Example Query Facet Request +---- +{ + "type" : "query", + "queries" : { + "high_quantity" : "quantity:[ 5 TO 14 ] AND price:[ 100 TO * ]", + "low_quantity" : "quantity:[ 1 TO 4 ] AND price:[ 100 TO * ]" + } +} +---- + +[source,json] +.Example Query Facet Response +---- +[ + { + "value" : "high_quantity", + "results" : { + "max_sale" : 4.75, + "med_sale" : 3.45 + } + }, + { + "value" : "low_quantity", + "results" : { + "max_sale" : 13.25, + "med_sale" : 10.20 + } + } +] +---- diff --git a/solr/solr-ref-guide/src/searching.adoc b/solr/solr-ref-guide/src/searching.adoc index 6b9c49c61655..724f379202ce 100644 --- a/solr/solr-ref-guide/src/searching.adoc +++ b/solr/solr-ref-guide/src/searching.adoc @@ -1,5 +1,5 @@ = Searching -:page-children: overview-of-searching-in-solr, velocity-search-ui, relevance, query-syntax-and-parsing, json-request-api, faceting, highlighting, spell-checking, query-re-ranking, transforming-result-documents, suggester, morelikethis, pagination-of-results, collapse-and-expand-results, result-grouping, result-clustering, spatial-search, the-terms-component, the-term-vector-component, the-stats-component, the-query-elevation-component, response-writers, near-real-time-searching, realtime-get, exporting-result-sets, streaming-expressions, parallel-sql-interface +:page-children: overview-of-searching-in-solr, velocity-search-ui, relevance, query-syntax-and-parsing, json-request-api, faceting, highlighting, spell-checking, query-re-ranking, transforming-result-documents, suggester, morelikethis, pagination-of-results, collapse-and-expand-results, result-grouping, result-clustering, spatial-search, the-terms-component, the-term-vector-component, the-stats-component, the-query-elevation-component, response-writers, near-real-time-searching, realtime-get, exporting-result-sets, streaming-expressions, parallel-sql-interface, analytics // Licensed to the Apache Software Foundation (ASF) under one // or more contributor license 
agreements. See the NOTICE file // distributed with this work for additional information @@ -55,3 +55,4 @@ This section describes how Solr works with search requests. It covers the follow * <>: Functionality to export large result sets out of Solr. * <>: A stream processing language for Solr, with a suite of functions to perform many types of queries and parallel execution tasks. * <>: An interface for sending SQL statements to Solr, and using advanced parallel query processing and relational algebra for complex data analysis. +* <>: A framework to compute complex analytics over a result set.