
Add an OpenSearch endpoint for Federated Search

Christine White edited this page Jul 22, 2013 · 3 revisions

The geoportal not only supports the OpenSearch protocol (see Add the Geoportal Search to a List of Search Providers); it can also be configured to search another endpoint through the OpenSearch protocol from the geoportal's search page, much the way it can search the ArcGIS.com site or other CS-W endpoints from the Search page (see https://github.com/Esri/geoportal-server/wiki/How-to-Search-for-Resources#geoportal-basic-and-advanced-search).

This topic walks through an example of how to configure an OpenSearch endpoint for federated search on the geoportal's Search page.


Configuration Overview

At a high level, you will add a segment to the geoportal's main configuration file (gpt.xml) to connect the user interface with an OpenSearch configuration. Then, you will update another geoportal configuration file (GptXslSearchProfiles.xml) with a mapping to a request and a response transformation for communication between the geoportal and the OpenSearch endpoint. That mapping references two transformations: an existing one that passes parameters from the geoportal search interface to the OpenSearch endpoint, and one you will create to parse the endpoint's response into geoportal search results. Finally, you will update the geoportal strings file (gpt.properties) with new strings relevant to the OpenSearch endpoint.

These steps are described in the sections below. Here, we use the National Snow and Ice Data Center's OpenSearch endpoint, described here, as an example. You can also use sample files available here to accompany this documentation.

Add OpenSearch endpoint as a Repository node in gpt.xml

Open the \\gpt.xml file and find the section called repositories. This is where the configurations for federated search are kept. Note that one entry already exists for ArcGIS.com, and another defines how CS-W endpoints configured for federated search are displayed. Here you will add a new repository entry and populate it with information about the OpenSearch endpoint to which you want to direct a search. The following parameters should be noted:

  • key: this should be a string unique among the other repository entries in this section; it will identify this particular OpenSearch entry in the gpt.xml.
  • class: the class called to pass geoportal REST-based search parameters to the OpenSearch endpoint. In most cases, you can use com.esri.gpt.catalog.search.SearchEngineRest, as shown below.
  • labelResourceKey: maps to the string in the gpt.properties file that will appear as the endpoint label next to the checkbox in the Search interface. You can create any key here, but it must begin with catalog. and be unique within the gpt.properties file.
  • abstractResourceKey: maps to the string in the gpt.properties file that will appear upon hovering over the entry in the federated search list. You can create any key here, but it must begin with catalog. and be unique within the gpt.properties file.
  • endPointSearchUrl: the value here should be the OpenSearch endpoint template URL that you'll configure. You can get this URL from the endpoint's OpenSearch description document. In our example below, we support not only the search terms but also the startIndex and the itemsPerPage parameters to support pagination of results from the endpoint.
  • profileId: a URN that will map to a value in the GptXslSearchProfiles.xml file in your geoportal in the next step. This file maps endpoints to their request and response transformations. Here, create a new URN value, in keeping with the format shown in the example below: urn:esri:gpt:HTTP:XML:unique-distinguishing-name.
Example:
<repository key="nsidc" class="com.esri.gpt.catalog.search.SearchEngineRest" labelResourceKey="catalog.search.searchSite.nsidc" abstractResourceKey="catalog.search.searchSite.nsidc.abstract">
	<parameter key="endPointSearchUrl" value="http://nsidc.org/api/opensearch/1.1/dataset?searchterms={searchTerms?}&amp;startIndex={startIndex}&amp;itemsPerPage={count}"/>
	<parameter key="profileId" value="urn:esri:gpt:HTTP:XML:NSIDC"/>
</repository>
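
To see what a request built from the endPointSearchUrl template might look like, the short Python sketch below fills in the placeholders by hand. It only illustrates the OpenSearch template syntax and is not how the geoportal performs the substitution; the function name fill_template and the sample values are made up for this example. Note that gpt.xml writes each & as &amp; because the URL sits inside an XML attribute, whereas the expanded URL uses a plain &.

```python
import re
from urllib.parse import quote

def fill_template(template, values):
    """Replace OpenSearch {name} and {name?} placeholders with URL-encoded values.

    Hypothetical helper for illustration only; the geoportal has its own logic.
    """
    def replace(match):
        name = match.group(1)
        optional = match.group(2) is not None  # a trailing "?" marks an optional parameter
        if values.get(name) is not None:
            return quote(str(values[name]), safe="")
        if optional:
            return ""  # optional parameters may be left empty
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{(\w+)(\?)?\}", replace, template)

# Template from the repository example above, with &amp; unescaped to &.
TEMPLATE = (
    "http://nsidc.org/api/opensearch/1.1/dataset"
    "?searchterms={searchTerms?}&startIndex={startIndex}&itemsPerPage={count}"
)

url = fill_template(TEMPLATE, {"searchTerms": "sea ice", "startIndex": 1, "count": 10})
print(url)
```

Pasting the printed URL into a browser is a quick way to confirm that the endpoint accepts the pagination parameters before you wire them into gpt.xml.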

Optional: update the timeout settings in the gpt.xml file

The gpt.xml file has a setting that defines how long federated search waits for a response from a remote server before timing out. This is an attribute of the search element in the gpt.xml file, called distributedSearchTimeoutMillisecs. By default, it is set to 5000, which is five seconds. You may need to increase this value to avoid timeout errors from the remote endpoint.
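
If you want to confirm the value currently in effect, a small Python sketch like the following can read it from the file contents. The function name find_search_timeout is hypothetical, and the sample XML is a stripped-down stand-in rather than the real layout of gpt.xml; the sketch looks for any element named search that carries the attribute, so it does not depend on where that element sits.

```python
import xml.etree.ElementTree as ET

def find_search_timeout(xml_text):
    """Return distributedSearchTimeoutMillisecs from the first matching search element, or None."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        local_name = element.tag.rsplit("}", 1)[-1]  # ignore any XML namespace
        value = element.get("distributedSearchTimeoutMillisecs")
        if local_name == "search" and value is not None:
            return int(value)
    return None

# Simplified stand-in for gpt.xml (assumed structure, for illustration only).
SAMPLE = '<gptConfig><search distributedSearchTimeoutMillisecs="5000"/></gptConfig>'

print(find_search_timeout(SAMPLE))
```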

Update the GptXslSearchProfiles.xml file with request and response mappings

Open the \\geoportal\WEB-INF\classes\gpt\search\profiles\GptXslSearchProfiles.xml file.

If you scroll through this file, you will see a number of entries defined with GptProfile elements. You will create a new GptProfile section and populate it in the same way as the ones already in this file.

Find the final closing /GptProfile element, and paste this section just below it. Note that this section maps the profileId you defined in the new repository section in your gpt.xml file to two xslt transformations (a request and a response), which are covered in the next steps:

<GptProfile>
	<ID>urn:esri:gpt:HTTP:XML:NSIDC</ID>
	<Name>National Snow and Ice Data Center</Name>
	<GetRecords>
		<XSLTransformations>
			<Request expectedGptXmlOutput="FULL_NATIVE_GPTXML">OpenSearchXmlParams_1.0_GetRecords_Request.xslt</Request>
			<Response>NSIDC_GetRecords_Response.xslt</Response>
		</XSLTransformations>
	</GetRecords>
	<GetRecordByID>
		<RequestKVPs><![CDATA[{id}]]></RequestKVPs>
	</GetRecordByID>
	<SupportSpatialQuery>False</SupportSpatialQuery>
	<SupportContentTypeQuery>False</SupportContentTypeQuery>
	<SupportSpatialResponse>False</SupportSpatialResponse>
	<Harvestable>False</Harvestable>
</GptProfile>

Update this example as follows:

  • ID: the same string as the profileId defined in your new repository section in gpt.xml.
  • Name: a name of your choice that describes your endpoint; this is not displayed to the user, but is used as documentation within this file.
  • GetRecords/XSLTransformations/Request: this can be populated with OpenSearchXmlParams_1.0_GetRecords_Request.xslt; this file is already in the \\geoportal\WEB-INF\classes\gpt\search\profiles directory, and will be used for the OpenSearch request. If you don't like how this xslt parses the request, you can make a copy of it, rename it, and then point to the copy in this part of the configuration.
  • GetRecords/XSLTransformations/Response: the name of an xslt that you will author in the next step. Make sure that it has 'response' somewhere in the title (this is not a programmatic dependency, just something for your own good!), and ends with an .xslt file extension.
  • SupportSpatialQuery: set to True if you know the OpenSearch endpoint supports spatial query and you feel confident you can configure this in the request xslt.
  • SupportContentTypeQuery: set to True if you know the OpenSearch endpoint supports the Esri Content Type concept (e.g., other geoportals and ArcGIS software only) and you feel confident you can configure this in the request xslt.
  • SupportSpatialResponse: set to True if you know the OpenSearch endpoint supports spatial query, you configured this in the request, and you feel confident you can configure this in the response xslt.
  • Harvestable: default is False here; harvestable is for other profiles in this file that may support harvesting. For federated search from the geoportal search page, you don't need harvesting.
Save the GptXslSearchProfiles.xml file.
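
Because the profileId in gpt.xml and the ID in GptXslSearchProfiles.xml must be the same string, a mismatch between them is an easy mistake to make. The Python sketch below shows one way to cross-check the two files. The helper names are hypothetical, and the sample documents are reduced stand-ins: the root element names repositories and GptProfiles are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

def local(tag):
    """Strip an XML namespace, if any, from an element tag."""
    return tag.rsplit("}", 1)[-1]

def repository_profile_ids(gpt_xml):
    """Map each repository key to the value of its profileId parameter."""
    ids = {}
    for repo in ET.fromstring(gpt_xml).iter():
        if local(repo.tag) != "repository":
            continue
        for param in repo:
            if local(param.tag) == "parameter" and param.get("key") == "profileId":
                ids[repo.get("key")] = param.get("value")
    return ids

def declared_profile_ids(profiles_xml):
    """Collect the text of every ID element in the profiles file."""
    root = ET.fromstring(profiles_xml)
    return {(el.text or "").strip() for el in root.iter() if local(el.tag) == "ID"}

def unmatched(gpt_xml, profiles_xml):
    """Return repositories whose profileId has no matching GptProfile ID."""
    declared = declared_profile_ids(profiles_xml)
    return {k: v for k, v in repository_profile_ids(gpt_xml).items() if v not in declared}

# Reduced stand-ins for the two configuration files (assumed root elements).
GPT = ('<repositories><repository key="nsidc">'
       '<parameter key="profileId" value="urn:esri:gpt:HTTP:XML:NSIDC"/>'
       '</repository></repositories>')
PROFILES = ('<GptProfiles><GptProfile><ID>urn:esri:gpt:HTTP:XML:NSIDC</ID>'
            '</GptProfile></GptProfiles>')

print(unmatched(GPT, PROFILES))
```

An empty result means every repository that declares a profileId points at a profile that exists.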

Create the response xslt

In the GptXslSearchProfiles.xml file, you reference a file in the GetRecords/XSLTransformations/Response parameter, something like this: NSIDC_GetRecords_Response.xslt. You should modify this xslt to match the OpenSearch template in the endpoint's description document. Open the NSIDC_GetRecords_Response.xslt file from the samples at https://github.com/Esri/geoportal-server/tree/develop/geoportal/profiles/search/opensearch/NSIDC, and you'll see a record section like this:

<Record>
	<ID>
		<xsl:value-of select="substring-after(atom:id,'http://nsidc.org/api/opensearch/1.1/dataset/')"/>
	</ID>
	<Title>
		<xsl:choose>
			<xsl:when test=" ./atom:title/text() != 'null' ">
				<xsl:value-of select="normalize-space(./atom:title)"/>
			</xsl:when>
			<xsl:otherwise>
				<xsl:value-of select="normalize-space(./media:group/media:title)"/>
			</xsl:otherwise>
		</xsl:choose>
	</Title>
	<Abstract>
		<xsl:value-of select="normalize-space(atom:summary)"/>
	</Abstract>
	<Type/>
	<MinX>
		<xsl:value-of select="substring-before(substring-after(georss:box,' '), ' ')"/>
	</MinX>
	<MinY>
		<xsl:value-of select="substring-before(georss:box,' ')"/>
	</MinY>
	<MaxX>
		<xsl:value-of select="substring-after(substring-after(substring-after(georss:box,' '),' '),' ')"/>
	</MaxX>
	<MaxY>
		<xsl:value-of select="substring-before(substring-after(substring-after(georss:box,' '),' '), ' ')"/>
	</MaxY>
	<ModifiedDate>
		<xsl:value-of select="./atom:updated"/>
	</ModifiedDate>
	<References>
		<xsl:value-of select="atom:link[@type='application/vnd.google-earth.kml+xml']/@href"/>
		<xsl:text>&#x2714;</xsl:text>urn:x-esri:specification:ServiceType:ArcIMS:Metadata:Server<xsl:text>&#x2715;</xsl:text>
	</References>
	<Types>
		<xsl:value-of select="normalize-space(./media:group/media:keywords/text())"/>
		<!-- convert remote type to GPT type again -->
		<xsl:text>&#x2714;</xsl:text>Video<xsl:text>&#x2715;</xsl:text>
	</Types>
	<Links>
		<Link gptLinkTag="customLink" show="true"/>
		<Link gptLinkTag="previewInfo" show="false"/>
		<Link label="catalog.search.searchSite.nsidc.html">
			<xsl:value-of select="atom:link[@type='text/html']/@href"/>
		</Link>
		<Link label="catalog.search.searchSite.nsidc.granule">
			<xsl:value-of select="atom:link[@type='application/opensearchdescription+xml']/@href"/>
		</Link>
	</Links>
</Record>
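
The four bounding-box expressions in the Record above peel tokens off the georss:box value with nested substring-before and substring-after calls. A GeoRSS box lists the lower corner and then the upper corner, each as latitude followed by longitude, so the first token becomes MinY, the second MinX, the third MaxY, and the fourth MaxX. The Python sketch below performs the same split, which may help when checking the xslt against a real response; the function name and the sample value are made up for this illustration.

```python
def split_georss_box(box):
    """Split a georss:box string ("south west north east") into geoportal extent fields."""
    south, west, north, east = (float(token) for token in box.split())
    return {"MinX": west, "MinY": south, "MaxX": east, "MaxY": north}

# Hypothetical box value, not taken from a real NSIDC response.
print(split_georss_box("45.0 -180.0 90.0 180.0"))
```

Unlike the xslt expressions, this version raises a ValueError when the value does not contain exactly four numbers, which makes malformed boxes easy to spot.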

This response xslt file maps values in the endpoint's response to concepts the geoportal uses, such as the ID, title, abstract, and bounding box of each record. To adapt this section to your endpoint, you'll need a sample response from it. In our example with the NSIDC, we can obtain one by entering a request based on the template in the NSIDC OpenSearch description document (OSDD).

Here is a template URL from the OSDD: http://nsidc.org/api/opensearch/1.1/dataset?searchterms={searchTerms?}

Replace {searchTerms?} with a search term, for example, "water": http://nsidc.org/api/opensearch/1.1/dataset?searchterms=water

The response shown in the browser (you may have to right-click and view the page source) can be used to map the geoportal concepts to the XML paths in the response.
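
If reading the raw response by eye is tedious, a sketch like the following lists the child elements of the first Atom entry, which are the paths your response xslt will select from. The sample feed is a hand-written stand-in, not an actual NSIDC response, and the helper names and prefix table are assumptions for this illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URIs mapped to the prefixes used in the response xslt.
PREFIXES = {
    "http://www.w3.org/2005/Atom": "atom",
    "http://www.georss.org/georss": "georss",
}

def qualified(tag):
    """Turn ElementTree's {uri}local tag form into prefix:local."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return f"{PREFIXES.get(uri, uri)}:{local}"
    return tag

def entry_child_names(feed_xml):
    """Return the prefixed names of the children of the first atom:entry."""
    entry = ET.fromstring(feed_xml).find("{http://www.w3.org/2005/Atom}entry")
    return [] if entry is None else [qualified(child.tag) for child in entry]

# Hand-written sample feed (hypothetical content).
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:georss="http://www.georss.org/georss">
  <entry>
    <id>http://nsidc.org/api/opensearch/1.1/dataset/EXAMPLE-ID</id>
    <title>Example title</title>
    <summary>Example summary</summary>
    <updated>2013-07-22T00:00:00Z</updated>
    <georss:box>45.0 -180.0 90.0 180.0</georss:box>
    <link type="text/html" href="http://example.org/record.html"/>
  </entry>
</feed>"""

print(entry_child_names(SAMPLE_FEED))
```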

Update this Record section in the response xslt as appropriate to the OpenSearch description document for your OpenSearch endpoint.

Update gpt.properties file with new strings referenced in gpt.xml

In the first step, you referenced two new string keys in your repository section; these must now be defined in the gpt.properties file. Open the \\geoportal\WEB-INF\classes\gpt\resources\gpt.properties file, and scroll to the end. Add the two keys you defined in your gpt.xml file, as in the example below. The text to the right of the equals sign is what appears in the geoportal interface:

catalog.search.searchSite.nsidc = National Snow and Ice Data Center
catalog.search.searchSite.nsidc.abstract = National Snow and Ice Data Center OpenSearch endpoint

Save the gpt.properties file.
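
A quick way to confirm that both keys are in place is sketched below in Python. It uses a deliberately simplified reader for key = value lines (it ignores line continuations and the colon separator that Java properties files also allow), and the function names are hypothetical.

```python
def load_properties(text):
    """Parse simple "key = value" lines, skipping blank lines and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!":
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

def problem_keys(props, keys):
    """Return keys that are missing from props or do not begin with "catalog."."""
    return [k for k in keys if not k.startswith("catalog.") or k not in props]

# The two strings from the example above.
SAMPLE = """
catalog.search.searchSite.nsidc = National Snow and Ice Data Center
catalog.search.searchSite.nsidc.abstract = National Snow and Ice Data Center OpenSearch endpoint
"""

props = load_properties(SAMPLE)
print(problem_keys(props, [
    "catalog.search.searchSite.nsidc",
    "catalog.search.searchSite.nsidc.abstract",
]))
```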

Restart Tomcat and Test

Restart Tomcat so that the geoportal picks up the configuration changes. Then test by opening your geoportal's search page and expanding the "Search In" section. You should see the name of your endpoint in the list. Now check the box next to the endpoint, and enter a search term in the Search page search field. When you click "Search", your search should federate to that OpenSearch endpoint.

If you see an error, check the files you configured in this tutorial. A good way to troubleshoot is to open the files in an XML editor and see if closing tags, quotes, etc. are missing.
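
If no XML editor is at hand, the same well-formedness check can be sketched in a few lines of Python with the standard library parser. The function name is hypothetical; pass it the text of gpt.xml, GptXslSearchProfiles.xml, or your response xslt, and it returns the parser's message (including line and column) when something such as a closing tag is missing.

```python
import xml.etree.ElementTree as ET

def well_formed_error(xml_text):
    """Return None if xml_text parses cleanly, otherwise the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        return str(error)
    return None

print(well_formed_error("<repository><parameter/></repository>"))  # well formed
print(well_formed_error("<repository><parameter></repository>"))   # unclosed parameter tag
```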
