
Index All Metadata Content

Marten Hogeweg edited this page Dec 14, 2012 · 1 revision

Note: In an out-of-the-box geoportal implementation, all Dublin Core metadata is already completely indexed; you do not have to apply this customization for Dublin Core.

Before undertaking this customization, you should be familiar with the Details of Lucene Indexing in the Geoportal and Add a Custom Profile topics. Indexing matters because it determines which records are returned when a user submits search criteria to the geoportal. By default, the geoportal does not index every metadata element. It indexes only the information defined in the indexables.xml file of the metadata profile that applies to the document.

Indexing only certain pieces of information has advantages. First, the Lucene index stays smaller, which makes searching faster. Second, some information in metadata is not useful for text-based searching. For example, if a metadata record contains a thumbnail, there is no need to index the binary thumbnail section, because users are not going to search for characters within the binary. Third, restricting what is indexed gives you control over the search results. A user searching for "New York" probably wants results with "New York" in the title or abstract, not results that only mention it in the Point of Contact's address information.

However, if your organization needs all information in a metadata document to be searchable, you may want to index all metadata content. To do this, follow the steps below.

  1. Identify the indexables.xml file that applies to the metadata standard of interest. This customization must be applied separately to each metadata standard for which you want to index all content. For example, if your geoportal supports both FGDC and ISO 19115 metadata and you want to index all content from both, apply this customization to the \\geoportal\WEB-INF\classes\gpt\metadata\fgdc\fgdc-indexables.xml file and the \\geoportal\WEB-INF\classes\gpt\metadata\iso\apiso-indexables.xml file, respectively.
  2. Open the applicable indexables.xml file. Scroll to the bottom of the file and find the final closing tag, which closes the file's root element.
  3. Add the following line just before that closing tag: <property meaning="body" xpathType="STRING" xpath="/*"/>. This line tells the geoportal to index whatever is found at all nodes within the document, and to index that content as STRING type.
  4. Save the indexables.xml file.
  5. Repeat for each supported profile in your geoportal for which you want to index all content.
  6. Restart the geoportal web application for your changes to take effect.
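
Steps 2 through 5 can also be scripted when several profiles need the same change. The following Python sketch is one possible approach, not part of the geoportal itself: it inserts the property line before the last closing tag it finds in the file, which it assumes is the root element's closing tag. The function names, the .bak backup convention, and the two-space indentation are illustrative assumptions.

```python
import sys
from pathlib import Path

# The line given in step 3, copied verbatim.
BODY_PROPERTY = '<property meaning="body" xpathType="STRING" xpath="/*"/>'


def add_body_property(xml_text: str) -> str:
    """Return xml_text with BODY_PROPERTY inserted before the last closing tag.

    Assumption: the last "</" in the file starts the root element's closing tag.
    The text is returned unchanged if the line is already present.
    """
    if BODY_PROPERTY in xml_text:
        return xml_text
    idx = xml_text.rfind("</")
    if idx == -1:
        raise ValueError("no closing tag found; is this an indexables.xml file?")
    return xml_text[:idx] + "  " + BODY_PROPERTY + "\n" + xml_text[idx:]


def patch_file(path: str) -> bool:
    """Patch one indexables.xml file in place; return True if it was changed."""
    target = Path(path)
    original = target.read_text(encoding="utf-8")
    patched = add_body_property(original)
    if patched == original:
        return False
    # Keep a copy of the original next to the file (hypothetical convention).
    target.with_name(target.name + ".bak").write_text(original, encoding="utf-8")
    target.write_text(patched, encoding="utf-8")
    return True


if __name__ == "__main__":
    # Pass one or more indexables.xml paths on the command line.
    for file_path in sys.argv[1:]:
        changed = patch_file(file_path)
        print(f"{file_path}: {'updated' if changed else 'already up to date'}")
```

Working on the raw text rather than re-serializing the XML leaves existing comments and formatting untouched. The web application still has to be restarted afterwards, as in step 6.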
IMPORTANT: If records were already published to your geoportal before you applied this change, you must re-index them if you want all of their content indexed according to the new rule. The easiest way to do this is to log in as an administrator, set the records to "Posted", and then set them back to "Approved". This re-approval forces the records to be re-indexed.

If you have too many records to re-approve efficiently, you can rebuild the entire geoportal index instead. There are two ways to do this. You can navigate to the folder defined for the Lucene index (the filepath given in the indexLocation attribute of the <lucene></lucene> element in the \\geoportal\WEB-INF\classes\gpt\config\gpt.xml file) and delete all the files in that folder. Alternatively, you can create a new folder and update the filepath in the <lucene></lucene> element in gpt.xml to point to it. In either case, you must restart your geoportal web application. Note that after you clear out the old index files or change the index folder, it may take a while for the new index to be recreated.
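
As an illustration of the first rebuild option, the Python sketch below reads the indexLocation attribute from gpt.xml and lists, or optionally deletes, the files in that folder. It is a sketch under assumptions: the function names and the dry-run default are hypothetical, the script looks for the first element named lucene anywhere in the file, and it should only be run while the geoportal web application is stopped.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_index_location(gpt_xml_path):
    """Return the indexLocation attribute of the first <lucene> element, or None."""
    root = ET.parse(str(gpt_xml_path)).getroot()
    for elem in root.iter():
        # Drop a namespace prefix of the form "{uri}" if the file uses one.
        if isinstance(elem.tag, str) and elem.tag.rsplit("}", 1)[-1] == "lucene":
            return elem.get("indexLocation")
    return None


def clear_index_folder(index_dir, dry_run=True):
    """List the files directly inside index_dir; delete them unless dry_run is True.

    Subfolders are left alone. Returns the sorted names of the files found.
    """
    names = []
    for entry in sorted(Path(index_dir).iterdir()):
        if entry.is_file():
            names.append(entry.name)
            if not dry_run:
                entry.unlink()
    return names
```

Calling clear_index_folder(find_index_location(path_to_gpt_xml)) first with the default dry_run=True shows which files would be removed; passing dry_run=False performs the deletion. After restarting the web application, the index is recreated as described above.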