This repository was archived by the owner on Jul 26, 2019. It is now read-only.

Setting up a SOLR test server

Ken Price edited this page Jan 30, 2016 · 3 revisions

Installing SOLR

Follow this tutorial, up to and including step 2.

Creating A Collection

Note: I've provided a sample schema.xml and solrconfig.xml at the end of this doc. Use those!

  1. Navigate to /var/solr
  2. From here on out, run su - solr so that any new files you create are owned by solr. At a minimum, make sure all of the files within /var/solr are owned by solr, since the SOLR service itself runs as the solr user.
  3. Run mkdir corenamehere, where corenamehere is your desired core name.
  5. In the conf directory, create two files, schema.xml and solrconfig.xml. Use these to configure your core and define your schema.
  6. In your browser, navigate to http://YOUR_IP_HERE:8983/solr.
  7. Under Core Admin, click 'Add Core'. Pick whatever name you want, but make sure the instanceDir you specify matches the corenamehere directory you created in step 3. In the screen shot below, I created a directory test for my core, so I specified test as my instanceDir.
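
Steps 1 through 5, plus the 'Add Core' action from step 7, can be sketched as a small shell script. This is only a sketch under a few assumptions: the function names make_core_dirs and core_create_url are made up for illustration, the script expects to be run as the solr user, and it uses SOLR's CoreAdmin CREATE request as a command-line alternative to clicking through the admin page.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of SOLR.
# On a real server, run this as the solr user (su - solr) so the files
# end up owned by solr.

make_core_dirs() {
    # $1 = SOLR home (e.g. /var/solr), $2 = core name
    mkdir -p "$1/$2/conf" || return 1
    # Empty placeholders; paste the sample schema and config into them.
    touch "$1/$2/conf/schema.xml" "$1/$2/conf/solrconfig.xml"
}

core_create_url() {
    # $1 = host or IP, $2 = core name (also used as instanceDir)
    printf 'http://%s:8983/solr/admin/cores?action=CREATE&name=%s&instanceDir=%s\n' \
        "$1" "$2" "$2"
}

# Example usage (not executed here):
#   make_core_dirs /var/solr test
#   curl -sS "$(core_create_url YOUR_IP_HERE test)"
```

The CREATE request will only succeed once schema.xml and solrconfig.xml contain valid configuration, so fill them in before calling it.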

If successful, you should see something like this:

Creating Test Data

The following instructions are for populating test data manually.

  1. Navigate to the SOLR admin page.

  2. Select the core you created from the last section using the Core Selector dropdown.

  3. Click 'Documents' in the options. Here you can create a document entry. The input format is JSON, and you can specify any fields from schema.xml. Here's an example input:

{
"title":"Reification of Naught: Repercussions of the Fundamental Singularity",
"authors":["Michael Hunt", "Jamil Dilla"],
"contributors":["Wolfgang Heimenschack"],
"organizations":["University of Zurich"],
"body": "But I must explain to you how all this mistaken idea of denouncing pleasure and praising pain was born and I will give you a complete account of the system, and expound the actual teachings of the great explorer of the truth, the master-builder of human happiness. No one rejects, dislikes, or avoids pleasure itself, because it is pleasure, but because those who do not know how to pursue pleasure rationally encounter consequences that are extremely painful. Nor again is there anyone who loves or pursues or desires to obtain pain of itself, because it is pain, but because occasionally circumstances occur in which toil and pain can procure him some great pleasure. To take a trivial example, which of us ever undertakes laborious physical exercise, except to obtain some advantage from it? But who has any right to find fault with a man who chooses to enjoy a pleasure that has no annoying consequences, or one who avoids a pain that produces no resultant pleasure?"
}

Click "Submit Document". If the submission succeeds, you should see a success message like the one in the screen shot above. Refer to schema.xml for the names of fields.
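
The same document can also be submitted from the command line instead of the admin page. The sketch below assumes a core named test, a file doc.json of your own making, and the hypothetical helper names update_url and post_doc. Note one difference from the Documents screen: when posting JSON to the /update handler directly, the file should hold an array of documents, so wrap the example above in [ and ].

```shell
#!/bin/sh
# SOLR_URL is an assumed variable; point it at your own server.
SOLR_URL="${SOLR_URL:-http://YOUR_IP_HERE:8983/solr}"

update_url() {
    # $1 = core name; commit=true makes the document searchable right away
    printf '%s/%s/update?commit=true\n' "$SOLR_URL" "$1"
}

post_doc() {
    # $1 = core name, $2 = path to a JSON file holding an array of documents
    curl -sS -H 'Content-Type: application/json' \
         --data-binary "@$2" "$(update_url "$1")"
}

# Example usage (not executed here):
#   post_doc test doc.json
```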

Test Query

  1. Select the core you want to work with.

  2. In the options, click "Query".

  3. Replace "*:*" in the q textarea field with your SOLR query string, then click Execute Query. In the screen shot below, I executed a query with the query string collector:mistaken, which searches for the keyword "mistaken" in all of the text fields of each document.

collector is defined in my schema.xml. It's just a catch-all field, filled by the copyField rules, that makes it convenient to search all text fields at query time.
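
For reference, the same query can be run against the /select handler with curl. This is a sketch that assumes a core named test; select_url and run_query are hypothetical helper names, and wt=json simply asks SOLR for a JSON response.

```shell
#!/bin/sh
# SOLR_URL is an assumed variable; point it at your own server.
SOLR_URL="${SOLR_URL:-http://YOUR_IP_HERE:8983/solr}"

select_url() {
    # $1 = core name
    printf '%s/%s/select\n' "$SOLR_URL" "$1"
}

run_query() {
    # $1 = core name, $2 = SOLR query string, e.g. collector:mistaken
    # -G sends the url-encoded parameters as a GET query string.
    curl -sS -G "$(select_url "$1")" \
         --data-urlencode "q=$2" \
         --data-urlencode "wt=json"
}

# Example usage (not executed here):
#   run_query test 'collector:mistaken'
```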

Happy searching!

Sample schema and config

schema.xml

<?xml version="1.0" encoding="UTF-8" ?>

<schema name="gios-asu" version="0.1">

  <fields>
    <field name="id" type="text" indexed="true" stored="true" required="true"/>
    <field name="title" type="textgen" indexed="true" stored="true"/>

    <!-- ATTRIBUTION -->
    <!-- textgen tokenizes at whitespace, splits on intra-word delimiters such as
          hyphens, and catenates the parts at index time. So a query for
          "Augustus De-Morgan" will match Augustus De Morgan. For "Augustus DeMorgan"
          to match as well, set splitOnCaseChange="1" in the filters below; it is
          currently 0.
    -->
    <field name="authors" type="textgen" indexed="true" stored="true" multiValued="true"/>
    <field name="contributors" type="textgen" indexed="true" stored="true" multiValued="true"/>
    <field name="organizations" type="textgen" indexed="true" stored="true" multiValued="true"/>

    <!-- SPATIAL AND TEMPORAL COVERAGE -->
    <field name="dates" type="DateRangeField" indexed="true" stored="true" multiValued="true"/>
    <field name="locations" type="location" indexed="true" stored="true" multiValued="true"/>
    <field name="location_names" type="textgen" indexed="true" stored="true" multiValued="true"/>

    <!-- document body should use a special type of class solr.TextField so analyzers can
       be used -->
    <field name="body" type="document_body" indexed="true" stored="true"/>

    <field name="publication_date" type="DateRangeField" indexed="true" stored="true"/>

    <!-- collection of all fields; not stored. may be useful for querying -->
    <field name="collector" type="textgen" indexed="true" stored="false" multiValued="true"/>
  </fields>

  <!-- may simplify some queries to combine fields into one -->
  <copyField source="title"               dest="collector"/>
  <copyField source="authors"             dest="collector"/>
  <copyField source="contributors"        dest="collector"/>
  <copyField source="body"                dest="collector"/>
  <copyField source="organizations"       dest="collector"/>
  <copyField source="location_names"      dest="collector"/>

  <fieldType name="text" class="solr.TextField" sortMissingLast="true" omitNorms="true"/>

  <!-- SPATIAL AND TEMPORAL COVERAGE -->
  <fieldType name="DateRangeField" class="solr.DateRangeField"/>
  <fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>

  <!-- custom type for document body -->
  <fieldType name="document_body" class="solr.TextField"/>

  <fieldType name="textgen" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.StopFilterFactory"
              ignoreCase="true"
              words="stopwords.txt"
              enablePositionIncrements="true"
              />
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
</schema>

solrconfig.xml

<?xml version="1.0" encoding="UTF-8" ?>
<config>
  <luceneMatchVersion>LUCENE_43</luceneMatchVersion>
  <requestDispatcher handleSelect="false">
    <httpCaching never304="true" />
  </requestDispatcher>
  <requestHandler name="/select" class="solr.SearchHandler" />
  <requestHandler name="/update" class="solr.UpdateRequestHandler" />
  <requestHandler name="/admin" class="solr.admin.AdminHandlers" />
  <requestHandler name="/analysis/field" class="solr.FieldAnalysisRequestHandler" startup="lazy" />

  <updateRequestProcessorChain>
   <processor class="solr.UUIDUpdateProcessorFactory">
     <str name="fieldName">id</str>
   </processor>
   <processor class="solr.LogUpdateProcessorFactory" />
   <processor class="solr.RunUpdateProcessorFactory" />
  </updateRequestProcessorChain>

  <maxDocs>1000</maxDocs>
</config>
