Solr contextual synonyms using payloads
Add "contextual" synonyms to Solr using payloads.

Contextual Synonyms

A single term can relate to different concepts within the same field or document. We call a "contextual synonym" a synonym that is applied only to one (or more) specific tokens within a field, rather than to every occurrence of the term.

The principle behind this component is explained in this post. This code is provided as support for that post, although a very similar approach was used in a production environment.


To build the project, just execute:

mvn -e package


You can grab the .jar file from the target/ directory and add it to your Solr/Fusion installation. After that, you need to add the filter to one of your field types:

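<fieldtype name="payloads" stored="false" indexed="true" class="solr.TextField" >
  <analyzer>
    <!-- Split on whitespace so that "term|payload" stays together as one token -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Move the text after "|" into the token payload, stored as raw bytes -->
    <filter class="solr.DelimitedPayloadTokenFilterFactory" delimiter="|" encoder="identity"/>
    <!-- Custom filter from this project; must come after the payload filter -->
    <filter class="solr.custom.PayloadSynonymTokenFilterFactory"/>
  </analyzer>
</fieldtype>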

Once your fieldtype is defined, we can use the Analysis page of the Solr Admin UI to check that things are working as expected. If we enter the test string Bill|Clinton talked about the bill in the Field Value (Index) input and select our payloads fieldtype, we see output similar to what is shown in the figure.

Solr Admin UI

A quick inspection reveals that the tokens Bill and Clinton have the same positional information. Also, the Clinton token has the type SYNONYM, while the second occurrence of bill gets no synonym at all.
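
To make the expected output easier to reason about without a running Solr instance, here is a minimal plain-Java sketch that mimics what the analysis chain appears to do with the test string. It is not the code of this project and does not use the Lucene API: the class PayloadSynonymSketch, the Token type and the analyze method are hypothetical names made up for illustration. It assumes that the payload (the text after the first "|") is emitted as an extra token at the same position as the original token with type SYNONYM, that ordinary tokens have type word, and that positions start at 1 as they do on the Analysis page.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative simulation only: hypothetical names, no Lucene/Solr classes involved.
public class PayloadSynonymSketch {

    /** Simplified token: text, 1-based position and token type. */
    public static final class Token {
        public final String text;
        public final int position;
        public final String type;

        Token(String text, int position, String type) {
            this.text = text;
            this.position = position;
            this.type = type;
        }

        @Override
        public String toString() {
            return text + "@" + position + "[" + type + "]";
        }
    }

    /**
     * Splits the input on whitespace, then splits each token at the first '|'.
     * The part after the delimiter is emitted as a SYNONYM token that shares
     * the position of the token it was attached to (assumed behavior).
     */
    public static List<Token> analyze(String input) {
        List<Token> tokens = new ArrayList<>();
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return tokens;
        }
        int position = 0;
        for (String raw : trimmed.split("\\s+")) {
            position++;
            int delimiter = raw.indexOf('|');
            if (delimiter < 0) {
                // No payload: a plain token with no synonym.
                tokens.add(new Token(raw, position, "word"));
                continue;
            }
            tokens.add(new Token(raw.substring(0, delimiter), position, "word"));
            String payload = raw.substring(delimiter + 1);
            if (!payload.isEmpty()) {
                // Same position as the original token, i.e. a position increment of 0.
                tokens.add(new Token(payload, position, "SYNONYM"));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token token : analyze("Bill|Clinton talked about the bill")) {
            System.out.println(token);
        }
    }
}
```

Running the sketch on the test string yields six tokens: Bill and Clinton both at position 1 (Clinton typed as SYNONYM), followed by talked, about, the and bill at positions 2 to 5. Only the annotated occurrence of Bill receives the synonym, which is the point of applying synonyms per token instead of per term.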

