KmSYS/edismax-solr-spring

Java CI with Maven

Get Started

Problem

  • How to add the qf, bq, and mm parameters to the Solr query generated by Spring Data Solr, OR
  • How to use the standard query parser and dismax parameters in the same query generated by Spring Data Solr, OR
  • How to use edismax parameters in the Solr query generated by Spring Data Solr

First, let's take a brief look at the standard query parser, the dismax query parser, and finally the edismax query parser.

Standard Query Parser

The standard query parser has a rich syntax, which can be tricky to use correctly:

  • Conditional Searches: AND, OR, NOT
  • Wildcard Searches: *
  • Fuzzy searches: ~
  • Search Term inclusion/Exclusion: +/-

Dismax Query parser

DisMax supports a simplified subset of the standard query parser syntax. Its purpose is to process queries in the simplest way possible, with fewer error messages for the user to deal with. It still supports phrase searches and search term inclusion/exclusion (+/-), and it is controlled through parameters such as q, qf, and bq. Don't worry, we will discuss them in the eDisMax section.

eDismax Query Parser

The eDisMax (Extended DisMax) query parser is an improved version of the DisMax query parser. It supports both the DisMax and the standard query parser syntax and gives better control over the fields that can be queried. So edismax:

  • supports the standard query parser syntax, for example (non-exhaustive list) boolean operators such as AND (+, &&), OR (||), and NOT (-).
  • includes improved smart partial escaping in the case of syntax errors; fielded queries, +/-, and phrase queries are still supported in this mode.
  • improves proximity boosting by using word shingles; you do not need the query to match all words in the document before proximity boosting is applied.

The commonly used query parameters are listed below. Extended DisMax accepts all of the DisMax query parser parameters and adds some of its own, such as boost:

  • q (query): Defines the raw input strings for the query.

  • q.alt (Alternative Query): defines a query that will be executed when q is blank or absent.

  • qf (Query Fields): specifies the fields in the index on which to perform the query. If absent, defaults to df.

  • bq (Boost Query): specifies an additional, optional query clause that is added to the user's main query to influence the score. Documents that match it are ranked higher; its contribution is added to the main score.

  • mm (Minimum Should Match): every clause in a query is one of three types: mandatory (+), prohibited (-), or optional. By default, all words or phrases specified in the q parameter are treated as optional clauses unless they are preceded by a "+" or a "-", and mm applies only to the optional clauses. Some examples:

  • if mm=3, at least 3 optional clauses must match

  • if mm=-3, at least (total - 3) optional clauses must match

  • if mm=90%, at least 90% of the optional clauses must match (the computed number is rounded down)

  • if mm=-10%, at most 10% of the optional clauses may be missing (the computed number is rounded down before it is subtracted from the total). This usually gives the same result as mm=90%, but rounding can make the two differ: with 5 optional clauses, mm=90% requires 4 matches while mm=-10% requires all 5

  • If mm is not specified, the effective q.op decides the default: if q.op is AND, then mm=100%; if q.op is OR, then mm=1. If you do not want the default to depend on q.op, set a default value for the mm parameter in the solrconfig.xml file.

  • boost (Boost Function): one or more function queries whose results are multiplied into the score of the main query for all matching documents. Unlike bq, whose contribution is added to the score, boost is multiplicative.
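
To make the parameters above more concrete, here is a minimal Java sketch (Java 10+, standard library only). It is not Spring Data Solr or SolrJ code: the class EdismaxSketch, its two methods, and the field names title, description, and category are made up for illustration. requiredOptionalClauses follows the rules described above for plain integer and percentage mm values only (conditional expressions such as 2<75% are not handled), and toQueryString simply URL-encodes the parameters you would send to a /select request handler.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Illustrative helper only; hypothetical, not part of Spring Data Solr or SolrJ. */
final class EdismaxSketch {

    /**
     * Resolves a simple mm value (integer or percentage, optionally negative)
     * to the number of optional clauses that must match.
     */
    static int requiredOptionalClauses(String mm, int optionalClauses) {
        String spec = mm.trim();
        int required;
        if (spec.endsWith("%")) {
            int percent = Integer.parseInt(spec.substring(0, spec.length() - 1));
            // Integer division truncates toward zero, i.e. the share is rounded down.
            int share = (optionalClauses * percent) / 100;
            // A negative percentage says how many clauses may be missing.
            required = percent < 0 ? optionalClauses + share : share;
        } else {
            int count = Integer.parseInt(spec);
            // A negative count is subtracted from the total.
            required = count < 0 ? optionalClauses + count : count;
        }
        // Never require more clauses than exist, and never a negative number.
        return Math.max(0, Math.min(optionalClauses, required));
    }

    /** Encodes request parameters as a URL query string, keeping insertion order. */
    static String toQueryString(Map<String, String> params) {
        StringJoiner joined = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            joined.add(URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("defType", "edismax");          // select the eDisMax parser
        params.put("q", "solr spring data");       // raw user input
        params.put("qf", "title^2 description");   // hypothetical fields, title weighted higher
        params.put("bq", "category:book^5");       // hypothetical boost query
        params.put("mm", "2");                     // at least 2 optional clauses must match
        System.out.println("/select?" + toQueryString(params));

        // With 5 optional clauses, 90% and -10% resolve to different minimums.
        System.out.println(requiredOptionalClauses("90%", 5));
        System.out.println(requiredOptionalClauses("-10%", 5));
    }
}
```

In a real project you would normally hand these parameters to your Solr client rather than build the URL by hand; the sketch only shows which parameter names and values end up in the request.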