From 33204ff774b30b4c817405a595059a41b0f0eef3 Mon Sep 17 00:00:00 2001
From: pdelboca
Date: Mon, 20 May 2019 09:40:06 -0300
Subject: [PATCH 1/7] Add section Search in Detail to user-guide.rst

---
 doc/user-guide.rst | 104 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 104 insertions(+)

diff --git a/doc/user-guide.rst b/doc/user-guide.rst
index cb40061736e..b0f391d4542 100644
--- a/doc/user-guide.rst
+++ b/doc/user-guide.rst
@@ -472,6 +472,110 @@ using the "Follow" button on the dataset page. See the section
 :ref:`managing_your_news_feed` below. You must have a user account and be
 logged in to use this feature.
 
+Search in detail
+================
+
+CKAN supports 2 search modes, both are used from the same search field.
+If the search terms entered into the search field contain no colon (":")
+CKAN will perform a simple search. If the search expression does contain at
+least one colon (":") CKAN will perform and advanced search.
+
+Simple Search
+-------------
+
+CKAN defers most of the search to Solr and by default it uses the `DisMax Query
+Parser `_
+that was primarily designed to be easy to use and to accept almost any input
+without returning an error.
+
+The search words typed by the user in the search box defines the main "query"
+constituting the essence of the search. The + and - characters are
+treated as **mandatory** and **prohibited** modifiers for terms. Text wrapped
+in balanced quote characters (for example, "San Jose") is treated as a phrase.
+By default, all words or phrases specified by the user are treated as
+**optional** unless they are preceded by a "+" or a "-".
+
+.. note::
+
+    CKAN will search for the **complete** word; when doing a simple search,
+    wildcards are not supported.
+
+Simple search examples:
+
+* ``census`` will search for all the datasets containing the word "census" in
+  the query fields.
+
+* ``census +2019`` will search for all the datasets containing the word "census"
+  and filter only those matching also "2019" as it is treated as mandatory.
+
+* ``census -2019`` will search for all the datasets containing the word
+  "census" and will exclude "2019" from the results as it is treated as
+  prohibited.
+
+* ``"european census"`` will search for all the datasets containing the phrase
+  "european census".
+
+Solr applies some preprocessing and stemming when searching. Stemmers remove
+morphological affixes from words, leaving only the word stem. This may cause,
+for example, that searching for "testing" or "tested" will show also results
+containing the word "test".
+
+* ``Testing`` will search for all the datasets containing the word "Testing"
+  and also "Test" as it is the stem of "Testing".
+
+.. note::
+
+    If the Name of the dataset contains words separated by "-" it will consider
+    each word independently in the search.
+
+
+Advanced Search
+---------------
+
+If the query has a colon in it it will be considered a fielded search and the
+query sintax of Solr be used to search. This will allow us to use wilcards
+"*", proximity matching "~" and general features described in Solr docs.
+The basic syntax is ``field:term``.
+
+Advanced Search Examples:
+
+* ``title:european`` this will look for all the datasets containing in its
+  title a the word "european".
+
+* ``title:europ*`` this will look for all the datasets containing in its title
+  a word that starts with "europ" like "europe" and "european".
+
+* ``title:europe || title:africa`` will look for datasets containing "europe"
+  or "africa" in its title.
+
+* ``title: "european census" ~ 4`` A proximity search looks for terms that
+  are within a specific distance from one another. This example will look for
+  datasets which title contains the words "european" and "census" within a
+  distance of 4 words.
+
+* ``author:powell~`` CKAN supports fuzzy searches based on the Levenshtein
+  Distance, or Edit Distance algorithm. To do a fuzzy search use the "~"
+  symbol at the end of a single-word term. In this example words like
+  "jowell" or "pomell" will also be found.
+
+
+.. note::
+
+    Field names used in advanced search may differ from Datasets Attributes,
+    the mapping rules are defined in the ``schema.xml`` file. You can use ``title``
+    to search by the dataset name and ``text`` to look in a catch-all field that
+    includes author, license, mantainer, tags, etc.
+
+.. warning::
+
+    CKAN uses Apache Solr as its search engine. For further details check the
+    `Solr documentation
+    `_.
+    Please note that CKAN sometimes uses different values than what is mentioned
+    in that documentation. Also note that not the whole functionality is offered
+    through the simplified search interface in CKAN or it can differ due to
+    extensions or local development in you CKAN instance.
+
 Personalization
 ===============
 
From d1c574c7820ee512ecc099a4b5f0f0f2cfbdab5b Mon Sep 17 00:00:00 2001
From: Patricio Del Boca
Date: Tue, 21 May 2019 10:08:42 -0300
Subject: [PATCH 2/7] Update doc/user-guide.rst

Co-Authored-By: David Read
---
 doc/user-guide.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/user-guide.rst b/doc/user-guide.rst
index b0f391d4542..5309ff10402 100644
--- a/doc/user-guide.rst
+++ b/doc/user-guide.rst
@@ -478,7 +478,7 @@ Search in detail
 CKAN supports 2 search modes, both are used from the same search field.
 If the search terms entered into the search field contain no colon (":")
 CKAN will perform a simple search. If the search expression does contain at
-least one colon (":") CKAN will perform and advanced search.
+least one colon (":") CKAN will perform an advanced search.
 
 Simple Search
 -------------

From 124225bee49447767386ac026fbbbb5168d019a8 Mon Sep 17 00:00:00 2001
From: Patricio Del Boca
Date: Tue, 21 May 2019 10:08:54 -0300
Subject: [PATCH 3/7] Update doc/user-guide.rst

Co-Authored-By: David Read
---
 doc/user-guide.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/user-guide.rst b/doc/user-guide.rst
index 5309ff10402..656de985853 100644
--- a/doc/user-guide.rst
+++ b/doc/user-guide.rst
@@ -475,7 +475,7 @@ logged in to use this feature.
 Search in detail
 ================
 
-CKAN supports 2 search modes, both are used from the same search field.
+CKAN supports two search modes, both are used from the same search field.
 If the search terms entered into the search field contain no colon (":")
 CKAN will perform a simple search. If the search expression does contain at
 least one colon (":") CKAN will perform an advanced search.

From 548e9aa9a9977e7ea7d40288b5b4282e9f5d8af2 Mon Sep 17 00:00:00 2001
From: Patricio Del Boca
Date: Tue, 21 May 2019 13:05:36 -0300
Subject: [PATCH 4/7] Update doc/user-guide.rst

Co-Authored-By: David Read
---
 doc/user-guide.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/user-guide.rst b/doc/user-guide.rst
index 656de985853..c7d8364ae41 100644
--- a/doc/user-guide.rst
+++ b/doc/user-guide.rst
@@ -533,7 +533,7 @@ Advanced Search
 ---------------
 
 If the query has a colon in it it will be considered a fielded search and the
-query sintax of Solr be used to search. This will allow us to use wilcards
+query syntax of Solr will be used to search. This will allow us to use wildcards
 "*", proximity matching "~" and general features described in Solr docs.
 The basic syntax is ``field:term``.
 
From 0da0c73a66392f7e6d0c1d49bc1580b37b8b66d4 Mon Sep 17 00:00:00 2001
From: Patricio Del Boca
Date: Tue, 21 May 2019 13:05:45 -0300
Subject: [PATCH 5/7] Update doc/user-guide.rst

Co-Authored-By: David Read
---
 doc/user-guide.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/user-guide.rst b/doc/user-guide.rst
index c7d8364ae41..26104bae865 100644
--- a/doc/user-guide.rst
+++ b/doc/user-guide.rst
@@ -540,7 +540,7 @@ The basic syntax is ``field:term``.
 Advanced Search Examples:
 
 * ``title:european`` this will look for all the datasets containing in its
-  title a the word "european".
+  title the word "european".
 
 * ``title:europ*`` this will look for all the datasets containing in its title
   a word that starts with "europ" like "europe" and "european".

From da358c6c08bd20692827d5313dfc8399e0688c7d Mon Sep 17 00:00:00 2001
From: Patricio Del Boca
Date: Wed, 22 May 2019 08:41:59 -0300
Subject: [PATCH 6/7] Update doc/user-guide.rst

Co-Authored-By: Ian Ward
---
 doc/user-guide.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/user-guide.rst b/doc/user-guide.rst
index 26104bae865..6e07931489b 100644
--- a/doc/user-guide.rst
+++ b/doc/user-guide.rst
@@ -574,7 +574,7 @@ Advanced Search Examples:
     Please note that CKAN sometimes uses different values than what is mentioned
     in that documentation. Also note that not the whole functionality is offered
     through the simplified search interface in CKAN or it can differ due to
-    extensions or local development in you CKAN instance.
+    extensions or local development in your CKAN instance.
 
 Personalization
 ===============

From cd829b94b03fe464e26f1e87663fc29746329061 Mon Sep 17 00:00:00 2001
From: Patricio Del Boca
Date: Wed, 22 May 2019 09:00:33 -0300
Subject: [PATCH 7/7] Changed warning to note.

---
 doc/user-guide.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/user-guide.rst b/doc/user-guide.rst
index 6e07931489b..9dc11ab5cbd 100644
--- a/doc/user-guide.rst
+++ b/doc/user-guide.rst
@@ -566,7 +566,7 @@ Advanced Search Examples:
     to search by the dataset name and ``text`` to look in a catch-all field that
     includes author, license, mantainer, tags, etc.
 
-.. warning::
+.. note::
 
     CKAN uses Apache Solr as its search engine. For further details check the
     `Solr documentation