Customizing Solr Indexing and Search

jkim-ru edited this page Nov 30, 2017 · 9 revisions

Based on: https://github.com/dibbs-vdc/ccql/tree/v0.2

TODO: Try to look over and incorporate ideas from this related thread -- https://groups.google.com/forum/#!topic/samvera-tech/cTXJo6BS6CU

How to Configure for Discovery in General

Steps for adding basic discovery of your properties are described here: http://samvera.github.io/customize-metadata-discovery.html

One step further - Indexing and Searching for Vdc::Person

In a vanilla Hyrax application, the default creator of either a work or collection is plain text, and that creator field is indexed in Solr. In our customized application for the VDC, we wanted our work Vdc::Resource to contain a property that points to a Vdc::Person object, so that searches for certain Person properties would yield hits on related works. For our purposes, just modifying the Solr document model and catalog controller wasn't enough.

Some things to note:

  • A Vdc::Person extends ActiveFedora::Base and is created when an administrator approves a user to use the application.
  • The vdc_creator property of a Vdc::Resource stores a Vdc::Person's id (Fedora UUID), so that's our link.
  • We wanted to index only some of the properties of a Vdc::Person (like preferred name, discipline, and organization) so that searches for those values would produce hits on our works and collections.

Here are some additional hints/steps we took to get the Vdc::Person's properties indexed with our Vdc::Resource and Collection.

Indexing Vdc::Person Information in Solr

By virtue of inheriting from ActiveFedora::Base, Vdc::Person objects are automatically indexed in Solr when you save them. To view any existing Person objects, query Solr directly with the following link (the example assumes a local dev environment):

http://localhost:8983/solr/hydra-development/select?fq=has_model_ssim:%22Vdc::Person%22&indent=on&q=*:*&wt=json
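
The same query can also be built in code, which avoids hand-escaping the quotes and colons. This is a minimal sketch using only Ruby's standard library. The method name person_query_url is hypothetical, and the host, port, and core name are copied from the link above, so they assume the same local dev environment.

```ruby
require 'uri'

# Build a Solr select URL that filters on a model name via has_model_ssim.
# The default base URL is the local development core from the link above (an assumption).
def person_query_url(model_name = 'Vdc::Person',
                     base = 'http://localhost:8983/solr/hydra-development/select')
  params = {
    fq: %(has_model_ssim:"#{model_name}"), # filter query: only this model
    indent: 'on',
    q: '*:*',                              # match everything, then filter
    wt: 'json'
  }
  "#{base}?#{URI.encode_www_form(params)}"
end

puts person_query_url
```

Passing a different model name (for example 'Vdc::Resource') should list the indexed works instead.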

Getting Vdc::Person indexed and searchable from a work

  1. Override the Hyrax::WorkIndexer so that the correct Vdc::Person id is stored as part of the work's Solr document.

    (solr_doc[Solrizer.solr_name('person_ids', :symbol)] ||= []) << object.vdc_creator
    

    There are a few things to take note of when doing this.

    • Solrizer.solr_name('person_ids', :symbol) makes sure that the correct key is used. In this case, the key resolves to person_ids_ssim. The _ssim suffix means string, stored, indexed, multivalued.
    • You need to add the Vdc::Person's Fedora ID to person_ids_ssim.
  2. Override the Hyrax::CatalogSearchBuilder to do the appropriate join from person id to the work using the Solr document key person_ids_ssim.

    # Attempting to override from Hyrax Gem 1.0.4:
    #   app/search_builders/hyrax/catalog_search_builder.rb
     
    module Hyrax
      module Vdc
        module CatalogSearchBuilderOverride
    
          # the {!lucene} gives us the OR syntax
          def new_query
            "{!lucene}#{interal_query(dismax_query)} #{interal_query(join_for_works_from_files)} #{interal_query(join_for_works_from_person)}"
          end
    
          # join from person id to work relationship via solrized person_ids_ssim
          def join_for_works_from_person
            "{!join from=#{ActiveFedora.id_field} to=person_ids_ssim}#{dismax_query}"
          end
    
        end
      end
    end
    

    I still don't understand all the specifics of how this works, but I modeled it on how Hyrax already joins FileSets to works (join_for_works_from_files).

  3. Update app/models/solr_document.rb and app/controllers/catalog_controller.rb with the Vdc::Person properties that you want indexed and searchable. In this case, we want the Vdc::Person's preferred_name and organization. We also want discipline, but that seems to be added already by the work and collection. (Side note: does this mean that I can't prevent a Person's discipline from being searchable this way?)

    In app/controllers/catalog_controller.rb:

    config.add_show_field solr_name("preferred_name", :stored_searchable)
    config.add_show_field solr_name("organization", :stored_searchable)
    

    In app/models/solr_document.rb:

    def preferred_name
      self[Solrizer.solr_name('preferred_name')]
    end
    
    def organization
      self[Solrizer.solr_name('organization')]
    end 
    
  4. Update config/application.rb so that the override modules are prepended to the original Hyrax classes. Here's a potentially useful Stack Overflow post on prepend: https://stackoverflow.com/questions/5944278/overriding-method-by-another-defined-in-module
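
To make the search builder override in step 2 more concrete, here is a standalone sketch of the query string it assembles. It does not load Hyrax. The bodies of dismax_query, interal_query, and join_for_works_from_files are assumptions about what Hyrax 1.0.4 defines (check the gem's catalog_search_builder.rb before relying on them), and id_field is a stand-in for ActiveFedora.id_field, assumed here to resolve to id.

```ruby
# Stand-in for ActiveFedora.id_field (assumed to be "id").
def id_field
  'id'
end

# Assumed Hyrax helper: run the user's query through the dismax parser.
def dismax_query
  '{!dismax v=$user_query}'
end

# Assumed Hyrax helper (name spelled as in the override above):
# wraps a sub-query so the lucene parser can combine it with others.
def interal_query(query_value)
  "_query_:\"#{query_value}\""
end

# Assumed Hyrax helper: works whose file sets match the query.
def join_for_works_from_files
  "{!join from=#{id_field} to=file_set_ids_ssim}#{dismax_query}"
end

# From the override: works whose person_ids_ssim holds the id of a matching Person.
def join_for_works_from_person
  "{!join from=#{id_field} to=person_ids_ssim}#{dismax_query}"
end

# Three space-separated clauses, combined by the lucene parser.
def new_query
  "{!lucene}#{interal_query(dismax_query)} #{interal_query(join_for_works_from_files)} " \
    "#{interal_query(join_for_works_from_person)}"
end

puts new_query
```

Read this way, the third clause first finds Vdc::Person documents that match the user's query, collects their ids, and then returns any document whose person_ids_ssim contains one of those ids. That is why the indexer has to write the Person's Fedora ID into person_ids_ssim.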

Getting Vdc::Person indexed and searchable from a collection

  1. Override the Hyrax::CollectionIndexer so that the correct Vdc::Person id is stored as part of the collection's Solr document.

    # Attempting to override from Hyrax Gem 1.0.4:
    #  app/indexers/hyrax/collection_indexer.rb
    # See config/application.rb for prepend statement
    
    module Hyrax
      module Vdc
        module CollectionIndexerOverride
    
          def generate_solr_document
            super.tap do |solr_doc|
              (solr_doc[Solrizer.solr_name('person_ids', :symbol)] ||= []) << object.vdc_creator
            end
          end
    
        end
      end
    end
    
  2. Override the Hyrax::CatalogSearchBuilder. (See above. You may have already done this when getting this working for a work.)

  3. Update app/models/solr_document.rb and app/controllers/catalog_controller.rb. (See above. You may have already done this when getting this working for a work.)

  4. Update config/application.rb so that the override module is prepended to the original Hyrax class.

     # Overrides
     config.to_prepare do
       # ...
       Hyrax::CollectionIndexer.prepend Hyrax::Vdc::CollectionIndexerOverride
       # ...
     end
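
The prepend and super.tap pattern used by these overrides can be tried outside of Rails. The sketch below uses stand-in classes only: FakeIndexer, FakeCollection, and PersonIdsOverride are hypothetical names, not Hyrax classes, and the Solr key is hard-coded as person_ids_ssim instead of being resolved through Solrizer.solr_name.

```ruby
# Stand-in for Hyrax::CollectionIndexer; the real class builds a much larger document.
class FakeIndexer
  attr_reader :object

  def initialize(object)
    @object = object
  end

  def generate_solr_document
    { 'id' => object.id }
  end
end

# Same shape as the override module above, with the key hard-coded.
module PersonIdsOverride
  def generate_solr_document
    # super calls the original method; tap lets us add a key and still return the hash.
    super.tap do |solr_doc|
      (solr_doc['person_ids_ssim'] ||= []) << object.vdc_creator
    end
  end
end

# prepend places the module ahead of the class in method lookup,
# so the module's method runs first and super reaches the original.
FakeIndexer.prepend PersonIdsOverride

# Stand-in for a collection that links to a person by id.
FakeCollection = Struct.new(:id, :vdc_creator)

doc = FakeIndexer.new(FakeCollection.new('col-1', 'person-42')).generate_solr_document
p doc
```

Because the module is prepended instead of included, the original class does not need to be reopened or edited, which keeps the override isolated from the gem's code.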
    

Related Issues: https://github.com/dibbs-vdc/ccql/issues/37