Setting up a SOLR test server
Follow this tutorial, up to and including step 2.
Note: I provided a sample schema.xml and solrconfig.xml at the end of this doc. Use those!
- Navigate to /var/solr
- From here on out, switch to the solr user with su - solr so that any new files you create are owned by solr. At a minimum, make sure every file within /var/solr is owned by solr, since the SOLR service itself runs as solr.
- mkdir corenamehere, where corenamehere is your desired core name.
- In /var/solr/corenamehere, do a mkdir conf
- In the conf directory, create two files, schema.xml and solrconfig.xml. Use these to configure your core and define your schema.
- In your browser, navigate to http://YOUR_IP_HERE:8983/solr.
- Under Core Admin, click 'Add Core'. Pick whatever name you want, but make sure instanceDir matches the corenamehere directory you created in step 3. In the screenshot below, I created a directory test for my core, so I specify test as my instanceDir.
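
The directory steps and the 'Add Core' click can also be scripted. The following is a minimal Python sketch, not part of the original setup: the helper names scaffold_core and core_create_url are made up for illustration, and it assumes the server exposes Solr's CoreAdmin CREATE action at /admin/cores under the base URL. It only creates the conf directory and builds the URL; schema.xml and solrconfig.xml still have to be placed in conf, and the files still need to be owned by solr.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlencode


def scaffold_core(solr_home: Path, core_name: str) -> Path:
    """Create <solr_home>/<core_name>/conf and return the conf directory."""
    conf_dir = solr_home / core_name / "conf"
    conf_dir.mkdir(parents=True, exist_ok=True)
    return conf_dir


def core_create_url(base_url: str, core_name: str, instance_dir: str) -> str:
    """Build a CoreAdmin CREATE URL. instance_dir must match the core's directory name."""
    params = urlencode(
        {"action": "CREATE", "name": core_name, "instanceDir": instance_dir}
    )
    return f"{base_url.rstrip('/')}/admin/cores?{params}"


if __name__ == "__main__":
    # A temporary directory stands in for /var/solr so the sketch is safe to run anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        conf = scaffold_core(Path(tmp), "test")
        print("created:", conf.relative_to(tmp))
    # Opening this URL in a browser (or with curl) would ask Solr to register the core.
    print(core_create_url("http://YOUR_IP_HERE:8983/solr", "test", "test"))
```

Passing the same value for the core name and instanceDir mirrors the test example in the step above.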
If successful, you should see something like this:
The following instructions are for populating test data manually.
- Navigate to the SOLR admin page.
- Select the core you created in the last section using the Core Selector dropdown.
- Click 'Documents' in the options. Here you can create an entry for a document. The input format is JSON, and you can specify any fields from schema.xml. Here's an example input:
{
"title":"Reification of Naught: Repercussions of the Fundamental Singularity",
"authors":["Michael Hunt", "Jamil Dilla"],
"contributors":["Wolfgang Heimenschack"],
"organizations":["University of Zurich"],
"body": "But I must explain to you how all this mistaken idea of denouncing pleasure and praising pain was born and I will give you a complete account of the system, and expound the actual teachings of the great explorer of the truth, the master-builder of human happiness. No one rejects, dislikes, or avoids pleasure itself, because it is pleasure, but because those who do not know how to pursue pleasure rationally encounter consequences that are extremely painful. Nor again is there anyone who loves or pursues or desires to obtain pain of itself, because it is pain, but because occasionally circumstances occur in which toil and pain can procure him some great pleasure. To take a trivial example, which of us ever undertakes laborious physical exercise, except to obtain some advantage from it? But who has any right to find fault with a man who chooses to enjoy a pleasure that has no annoying consequences, or one who avoids a pain that produces no resultant pleasure?"
}
Click "Submit Document". If the submission succeeds, you should see a success message like the one in the screenshot above. Refer to schema.xml for the names of fields.
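
The same submission can be made without the admin page. Below is a rough Python sketch under a few assumptions: the core is named test, the /update handler from the sample solrconfig.xml accepts JSON when the Content-Type is application/json (as solr.UpdateRequestHandler does in Solr 4 and later), and build_update_request and send are hypothetical helper names. The sample document carries no id; the UUIDUpdateProcessorFactory in the sample solrconfig.xml appears intended to fill that in.

```python
import json
from urllib.request import Request, urlopen


def build_update_request(base_url: str, core: str, docs: list) -> Request:
    """Build (but do not send) a POST that adds docs to a core and commits."""
    url = f"{base_url.rstrip('/')}/{core}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: Request) -> str:
    """Send the request and return the raw response body (needs a running server)."""
    with urlopen(request) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    doc = {
        "title": "Reification of Naught: Repercussions of the Fundamental Singularity",
        "authors": ["Michael Hunt", "Jamil Dilla"],
        "contributors": ["Wolfgang Heimenschack"],
        "organizations": ["University of Zurich"],
        "body": "But I must explain to you how all this mistaken idea was born.",
    }
    req = build_update_request("http://YOUR_IP_HERE:8983/solr", "test", [doc])
    print(req.get_method(), req.full_url)
    # Uncomment once the placeholder host points at a real server:
    # print(send(req))
```

Documents are sent as a JSON array, so several can be added in one request.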
- Select the core you want to work with.
- In the options, click "Query".
- Replace "*:*" in the q textarea field with your SOLR query string, then click Execute Query. In the screenshot below, I executed a query with the query string collector:mistaken, which searches for the keyword "mistaken" in all of the text fields of each document.
collector is defined in my schema.xml. The text fields are copied into it at index time, so a single query against collector conveniently searches all of them.
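
For scripted checks, the query above can be expressed as a request to the /select handler. This is only a sketch: the core name test, the helper names select_url and run_query, and the choice of wt=json for a JSON response are assumptions made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def select_url(base_url: str, core: str, query: str, rows: int = 10) -> str:
    """Build a /select URL for a query string such as 'collector:mistaken'."""
    params = urlencode({"q": query, "wt": "json", "rows": rows})
    return f"{base_url.rstrip('/')}/{core}/select?{params}"


def run_query(url: str) -> dict:
    """Fetch and decode the JSON response (needs a running server)."""
    with urlopen(url) as response:
        return json.load(response)


if __name__ == "__main__":
    url = select_url("http://YOUR_IP_HERE:8983/solr", "test", "collector:mistaken")
    print(url)
    # Uncomment once the placeholder host points at a real server:
    # print(run_query(url)["response"]["numFound"])
```

urlencode percent-encodes the colon in the query string, which Solr decodes back to collector:mistaken.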
Happy searching!
Sample schema.xml:
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="gios-asu" version="0.1">
  <fields>
    <field name="id" type="text" indexed="true" stored="true" required="true"/>
    <field name="title" type="textgen" indexed="true" stored="true"/>
    <!-- ATTRIBUTION -->
    <!-- textgen tokenizes at whitespace and splits words on delimiters such as hyphens,
         so a query for "Augustus De-Morgan" should match Augustus De Morgan. Also matching
         "Augustus DeMorgan" would require splitting on case change, which the textgen
         type below turns off (splitOnCaseChange="0").
    -->
    <field name="authors" type="textgen" indexed="true" stored="true" multiValued="true"/>
    <field name="contributors" type="textgen" indexed="true" stored="true" multiValued="true"/>
    <field name="organizations" type="textgen" indexed="true" stored="true" multiValued="true"/>
    <!-- SPATIAL AND TEMPORAL COVERAGE -->
    <field name="dates" type="DateRangeField" indexed="true" stored="true" multiValued="true"/>
    <field name="locations" type="location" indexed="true" stored="true" multiValued="true"/>
    <field name="location_names" type="textgen" indexed="true" stored="true" multiValued="true"/>
    <!-- document body should use a special type of class solr.TextField so analyzers can
         be used -->
    <field name="body" type="document_body" indexed="true" stored="true"/>
    <field name="publication_date" type="DateRangeField" indexed="true" stored="true"/>
    <!-- collection of all fields; not stored. may be useful for querying -->
    <field name="collector" type="textgen" indexed="true" stored="false" multiValued="true"/>
  </fields>
  <!-- may simplify some queries to combine fields into one -->
  <copyField source="title" dest="collector"/>
  <copyField source="authors" dest="collector"/>
  <copyField source="contributors" dest="collector"/>
  <copyField source="body" dest="collector"/>
  <copyField source="organizations" dest="collector"/>
  <copyField source="location_names" dest="collector"/>
  <fieldType name="text" class="solr.TextField" sortMissingLast="true" omitNorms="true"/>
  <!-- SPATIAL AND TEMPORAL COVERAGE -->
  <fieldType name="DateRangeField" class="solr.DateRangeField"/>
  <fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
  <!-- custom type for document body -->
  <fieldType name="document_body" class="solr.TextField"/>
  <fieldType name="textgen" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
</schema>
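
To make the copyField rules in the schema above concrete, here is a toy Python model of which values end up in collector for a given input document. It is an illustration only, not how Solr implements copying: the function name collector_values is made up, and the analysis that textgen applies afterwards (tokenizing, lowercasing, and so on) is left out.

```python
# Source fields listed in the schema's copyField rules, in the order they appear there.
COPY_SOURCES = [
    "title",
    "authors",
    "contributors",
    "body",
    "organizations",
    "location_names",
]


def collector_values(doc: dict) -> list:
    """Return the raw values that the copyField rules would send to 'collector'."""
    values = []
    for field in COPY_SOURCES:
        value = doc.get(field)
        if value is None:
            continue  # a field missing from the document contributes nothing
        if isinstance(value, list):
            values.extend(value)  # multiValued source: every value is copied
        else:
            values.append(value)
    return values


if __name__ == "__main__":
    doc = {
        "title": "Reification of Naught",
        "authors": ["Michael Hunt", "Jamil Dilla"],
        "organizations": ["University of Zurich"],
    }
    for value in collector_values(doc):
        print(value)
```

Because collector is declared with stored="false", these values can be searched but are not returned in query results.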
Sample solrconfig.xml:
<?xml version="1.0" encoding="UTF-8" ?>
<config>
  <luceneMatchVersion>LUCENE_43</luceneMatchVersion>
  <requestDispatcher handleSelect="false">
    <httpCaching never304="true" />
  </requestDispatcher>
  <requestHandler name="/select" class="solr.SearchHandler" />
  <requestHandler name="/update" class="solr.UpdateRequestHandler" />
  <requestHandler name="/admin" class="solr.admin.AdminHandlers" />
  <requestHandler name="/analysis/field" class="solr.FieldAnalysisRequestHandler" startup="lazy" />
  <updateRequestProcessorChain>
    <processor class="solr.UUIDUpdateProcessorFactory">
      <str name="fieldName">id</str>
    </processor>
    <processor class="solr.LogUpdateProcessorFactory" />
    <processor class="solr.RunUpdateProcessorFactory" />
  </updateRequestProcessorChain>
  <maxDocs>1000</maxDocs>
</config>