Payload Component

This Solr plugin returns the payloads of the terms that matched your query. It was developed to support [Keyword in Context highlighting](https://en.wikipedia.org/wiki/Key_Word_in_Context) of OCR'ed content. You can see it in action in this demonstration project: https://github.com/o19s/pdf-discovery-demo/.

Example document:

{
  "id": "my sample doc",
  "payload_content": "Look|ignored at this|wow"
}
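
The `token|payload` notation used in `payload_content` can be illustrated with a short Python sketch. It only imitates what a whitespace tokenizer followed by a `|`-delimited payload filter would produce for the example document; it is not Solr's analyzer, and the function name `split_payloads` is hypothetical.

```python
def split_payloads(text, delimiter="|"):
    """Split whitespace-separated tokens into (term, payload) pairs.

    Assumption: tokens are separated by whitespace and the payload is
    whatever follows the first delimiter in a token. Tokens without a
    delimiter carry no payload (None).
    """
    pairs = []
    for token in text.split():
        term, sep, payload = token.partition(delimiter)
        pairs.append((term, payload if sep else None))
    return pairs


pairs = split_payloads("Look|ignored at this|wow")
for term, payload in pairs:
    print(term, payload)
```

In this sketch `Look` carries the payload `ignored`, `at` carries none, and `this` carries `wow`.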

Querying for `payload_content:this` would generate a response like the following:

{
  "response":{
    "docs":[
      {
        "id":"my sample doc",
        "payload_content":"Look|ignored at this|wow"
      }
    ]
  },
  "payloads":{
    "my sample doc":{
      "payload_content":{
        "this":[
          "wow"
        ]
      }
    }
  }
}

Because `wow` is the payload attached to the token `this`, and `this` appears in the query, `wow` comes back in the `payloads` section of the response.
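
A client could read the payloads out of such a response along the following lines. This is a minimal sketch that embeds the sample response as a string instead of calling Solr. It assumes that entries under `payloads` are keyed by the document `id`, then by field name, then by matched term. The helper name `payloads_for` is hypothetical.

```python
import json

# Sample response modeled on the example above, embedded as text so the
# sketch runs without a Solr server. Assumption: payloads are keyed by id.
raw = """
{
  "response": {"docs": [
    {"id": "my sample doc", "payload_content": "Look|ignored at this|wow"}
  ]},
  "payloads": {
    "my sample doc": {"payload_content": {"this": ["wow"]}}
  }
}
"""


def payloads_for(response, doc_id, field, term):
    """Return the payloads for a matched term, or [] if nothing matched."""
    by_doc = response.get("payloads", {})
    return by_doc.get(doc_id, {}).get(field, {}).get(term, [])


response = json.loads(raw)
for doc in response["response"]["docs"]:
    print(doc["id"], payloads_for(response, doc["id"], "payload_content", "this"))
```

Returning an empty list for missing documents, fields, or terms keeps the caller free of key checks when a document matched on a token that has no payload.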

Why?

This project was originally conceived as a solution for storing bounding boxes with terms for OCR highlighting.

See it in action at https://github.com/o19s/pdf-discovery-demo/.

Requirements

Solr 8.7
A field type that utilizes payloads
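
As an illustration of the second requirement, the sketch below builds a Solr Schema API `add-field-type` request body for a payload-carrying text field. It uses Solr's stock `WhitespaceTokenizerFactory` and `DelimitedPayloadTokenFilterFactory`. The field type name `payload_text` is hypothetical, and the `identity` encoder is an assumption chosen to match the string payloads in the example document; this plugin may expect a different configuration. The body is only printed here, not sent to a server.

```python
import json

# Assumption: a text field type whose tokens carry "|"-delimited payloads.
# "payload_text" is a made-up name for illustration.
field_type = {
    "add-field-type": {
        "name": "payload_text",
        "class": "solr.TextField",
        "analyzer": {
            "tokenizer": {"class": "solr.WhitespaceTokenizerFactory"},
            "filters": [
                {
                    "class": "solr.DelimitedPayloadTokenFilterFactory",
                    "encoder": "identity",  # store the payload bytes as-is
                    "delimiter": "|",
                }
            ],
        },
    }
}

# The resulting JSON could be POSTed to the collection's /schema endpoint.
print(json.dumps(field_type, indent=2))
```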

Installing as a Solr Package

To install as a Solr package, first ensure Solr has been started with `-Denable.packages=true`, then add the repo:

bin/solr package add-repo osc https://raw.githubusercontent.com/o19s/payload-component/master/repo

To install the package run:

bin/solr package install solr-payloads:1.1.4