OpenSearch Web Service

Noah Torp-Smith edited this page Nov 26, 2020 · 6 revisions

General Description

Open Search is a web service used for searching and retrieving records from a data repository. Depending on the specific record, there may be metadata, or full text and metadata, as well as relations for the records.

Supported communication methods are SOAP and HTTP POST (XML-requests). Most of Open Search's functionality can also be called with HTTP GET (URL-requests); the exception is user defined ranking. Supported output formats are SOAP, XML, JSON and PHP.

Use of the service requires authentication.

Versions

API version Endpoint Repository version Environment Start of life End of life WSDL
5.2 https://opensearch.addi.dk/b3.5_5.2/ 3.5 production 20190906 WSDL
5.2 https://opensearch.addi.dk/staging_5.2/ 3.5 staging 20190906 WSDL
5.2 https://opensearch.addi.dk/test_5.2/ 3.5 test 20190906 WSDL

Syntax and functionality may differ between the different versions of the Open Search web service.

In case of any discrepancy between the documentation on this page and the WSDL/XSD of a specific version of the Open Search web service, the WSDL/XSD is to be considered the authoritative source.

Release Notes

5.2:

  • searchResponse: objects contain an error-element if record is unavailable.
  • searchRequest: userDefinedBoost – weight must be expressed as a positive float value.

Service Operations

The service has three operations:

search: send a search query and retrieve matching collections. Example of a basic search request: https://opensearch.addi.dk/test_5.2/?action=search&query="danmark"&agency=100200&profile=test&start=1&stepValue=5

getObject: retrieve a specific object in one or more formats by its identifier. Example of a basic getObjectRequest: https://opensearch.addi.dk/test_5.2/?action=getObject&identifier=870970-basis:28692765&agency=100200&profile=test

info: returns service information as well as information about available indexes, search collections etc. for a given agency and profile. Example of an infoRequest: https://opensearch.addi.dk/test_5.2/?action=info&agency=100200&profile=test
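The three URL-requests above share the same shape, so they can be assembled by one small helper. The following is a minimal sketch, not part of the service: the function name build_request_url is hypothetical, and the endpoint and parameter values are copied from the examples on this page. It only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

# Endpoint taken from the Versions table (test environment).
TEST_ENDPOINT = "https://opensearch.addi.dk/test_5.2/"


def build_request_url(endpoint, action, **params):
    """Return a URL-request (HTTP GET) for one of the three operations.

    List or tuple values are emitted as repeated parameters, which is how
    repeatable elements such as objectFormat are passed in URL-requests.
    """
    if action not in ("search", "getObject", "info"):
        raise ValueError(f"unknown action: {action}")
    query = {"action": action, **params}
    # doseq=True expands sequences into repeated key=value pairs;
    # safe=":" keeps identifiers such as 870970-basis:28692765 readable.
    return endpoint + "?" + urlencode(query, doseq=True, safe=":")


search_url = build_request_url(
    TEST_ENDPOINT, "search",
    query='"danmark"', agency=100200, profile="test", start=1, stepValue=5,
)
get_object_url = build_request_url(
    TEST_ENDPOINT, "getObject",
    identifier="870970-basis:28692765", agency=100200, profile="test",
)
info_url = build_request_url(TEST_ENDPOINT, "info", agency=100200, profile="test")

print(search_url)
print(get_object_url)
print(info_url)
```

Quotes in the query are percent-encoded as %22 by the helper; the service should decode them like the literal quotes used in the examples above.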

searchRequest parameters

(*) marks the default value used if parameter is not present in searchRequest

Parameter Must be present Repeatable Sub element of Description
query yes no searchRequest Open Search uses CQL for search queries. See Query language, indexes and facets.
queryLanguage no no searchRequest Specify search language. Possible values: cql*, cqleng (same as cql), bestMatch. See Query language, indexes and facets.
agency yes no searchRequest Identifier of institution using the service.
profile yes yes searchRequest Identifier of profile. The agency's profile-settings determine which sources are being searched. An agency can have one or more associated profiles. See Search collections (sources).
showAgency no no searchRequest Show records as they are seen by the given agency.
allObjects no no searchRequest Possible values: true, false*. If true, all objects in a collection will be returned, no matter if the objects are part of the search result. If false, only objects that are part of the search result will be returned.
authentication no no searchRequest Only used if IP-authentication is not possible for service requester
groupIdAut no no authentication Identifier of the service requester (most often a library number)
passwordAut no no authentication Service requester's password
userIdAut no no authentication Service requester's user name
callBack no no searchRequest JSON-callback. Only used if outputType is set to json.
collectionType no no searchRequest Set how collections in the search result are structured. Possible values: work*, manifestation, work-1. See Work Structure.
facets no no searchRequest Set the facets that will be returned and count for each facet. See Query language, indexes and facets for more details.
numberOfTerms no no facets Specify number of facets to return for each facetName.
facetSort no no facets The sorting order of the facets. Possible values: count*, index. count sorts the constraints by count (highest count first). index returns the constraints sorted in their index order (lexicographic by indexed term). For terms in the ascii range, this will be alphabetically sorted.
facetMinCount no no facets Minimum hit count available for facet, for facet to be included.
facetName no yes facets Name of the facet to return. Element must be repeated for each requested facet.
includeHoldingsCount no no searchRequest Include information about how many agencies own each returned object and how many of these will lend the object. Possible values: true, false*.
collapseHitsThreshold no no searchRequest Possible values: default or a positive integer
objectFormat no yes searchRequest Set in which format(s) objects will be returned. Default value if element is not present is dkabm. See Formats for more details.
outputType no no searchRequest Set format of web service output. Possible values: xml, json, php. Default output format for SOAP and URL-requests is SOAP. Default output format for XML-requests is XML.
relationData no no searchRequest Set detail level for returned relations. Possible values: type, uri, full. See Relations for more details.
repository no no searchRequest Set repository to search.
source no no searchRequest No longer in use.
start no no searchRequest The number of the first collection you want the service to return. Counting from 1*.
stepValue no no searchRequest The number of collections you want the service to return, counted from the value in start. Default if empty or not present is 0.
sort no no searchRequest Set how collections in search result are sorted. Cannot be combined with userDefinedRanking. See Ranking and sorting for more details.
userDefinedRanking no no searchRequest Sub elements: tieValue, rankField. Can only be used in SOAP or XML-requests. Cannot be combined with sort. See Ranking and sorting for more details.
tieValue if userDefinedRanking is present no userDefinedRanking Emphasis of objects that match multiple terms of search query. Possible values: a decimal from 0 to 1.
rankField if userDefinedRanking is present yes userDefinedRanking Set which field to rank by and what weight should be given to each field. Sub elements: fieldName, fieldType, weight.
fieldName if rankField is present no rankField Field to be used for ranking
fieldType if rankField is present no rankField Possible values: word, phrase.
weight if rankField is present no rankField Weight to be given to the field specified in fieldName. Value must be a positive float, lowest possible value is 0.00001. Values between 0 and 1 (exclusive) will weigh the record down in the search result.
userDefinedBoost no yes searchRequest Set field name for which boosting should be applied.
fieldName if userDefinedBoost is present no userDefinedBoost Field to be used for boost.
fieldValue if userDefinedBoost is present no userDefinedBoost Specific value for boosting
weight if userDefinedBoost is present no userDefinedBoost Weight to be given to the field specified in fieldName. Value must be a positive float, lowest possible value is 0.00001. Values between 0 and 1 (exclusive) will weigh the record down in the search result.
queryDebug no no searchRequest Use this parameter to see how the service backend is handling the query. This is especially useful for debugging user defined ranking. Possible values: true, false*.
trackingId no no searchRequest Identifier for tracking a request.
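The start and stepValue parameters, together with the more element of the searchResponse, are enough to page through a result. The loop below is a minimal sketch under stated assumptions: fetch_page is a hypothetical callable standing in for a real searchRequest plus response parsing, and fake_fetch simulates the service with made-up data so the example runs offline.

```python
def iter_collections(fetch_page, step_value=10, max_pages=100):
    """Yield collections page by page until the service reports no more hits.

    fetch_page is a hypothetical callable (start, step_value) -> dict with the
    keys "collections" (a list) and "more" (a bool).
    """
    start = 1  # the service counts collections from 1
    for _ in range(max_pages):
        page = fetch_page(start, step_value)
        yield from page["collections"]
        # Stop when the response says there is nothing more, or a page is empty.
        if not page["more"] or not page["collections"]:
            break
        start += step_value


# Stand-in for the service: 23 fake collections, served in slices.
FAKE_RESULT = [f"collection-{n}" for n in range(1, 24)]


def fake_fetch(start, step_value):
    begin = start - 1  # convert the 1-based start to a 0-based list index
    chunk = FAKE_RESULT[begin:begin + step_value]
    return {"collections": chunk, "more": begin + step_value < len(FAKE_RESULT)}


all_collections = list(iter_collections(fake_fetch, step_value=10))
print(len(all_collections))
```

Since the default for stepValue is 0, a client that wants any collections back has to set it explicitly, which is why the sketch always passes it.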

getObjectRequest parameters

(*) marks the default value used if element is not present in getObjectRequest.

Parameter Must be present Repeatable Sub element of Description
agency yes no getObjectRequest Identifier of institution using the service. Default value if not present is 100200.
profile yes yes getObjectRequest Identifier of profile. The agency's profile-settings determine which sources are being searched. An agency can have one or more associated profiles. See Search collections (sources).
showAgency no no getObjectRequest Show records as they are seen by the given agency.
identifier if localIdentifier is not present yes getObjectRequest Identifier of object requested. Element must be repeated for each requested identifier.
localIdentifier if identifier is not present yes getObjectRequest Identifier of object requested. Element must be repeated for each requested identifier.
objectFormat no yes getObjectRequest Set in which format(s) objects will be returned. Default value if not present is dkabm.
authentication no no getObjectRequest Only used if IP-authentication is not possible for client
groupIdAut no no authentication Identifier of the service requester (most often a library number)
passwordAut no no authentication Service requester's password
userIdAut no no authentication Service requester's user name
callBack no no getObjectRequest JSON-callback. Only used if outputType is set to json.
includeHoldingsCount no no getObjectRequest Include information about how many agencies own each returned object and how many of these will lend the object. Possible values: true, false*.
outputType no no getObjectRequest Set format of web service output. Possible values: xml, json, php. Default output format for SOAP and URL-requests is SOAP. Default output format for XML-requests is XML.
relationData no no getObjectRequest Set detail level for returned relations. Possible values: type, uri, full. See Relations for more details.
repository no no getObjectRequest Set which repository to search.
trackingId no no getObjectRequest Identifier for tracking a request.

searchResponse parameters

searchResponse consists of either the sub element error or the sub element result.

Parameter Always present Repeatable Sub element of Description
error if result is not present no searchResponse Message returned by the service, if an error occurs.
result if error is not present no searchResponse The result from the service including searchResult, facets and statInfo.
hitCount if result is present no result Estimated number of objects in entire search result.
collectionCount if result is present no result Number of searchResults/collections in the retrieved result.
more if result is present no result Indicates whether there are more hits that can be retrieved or the end of the result has been reached. Possible values: true, false.
sortUsed if sort is present in request no result The sort method used in this result.
searchResult no no result Result of the search organised in collection and formattedCollection.
collection if searchResult is present yes searchResult If collectionType in the request is set to manifestation, each collection will consist of exactly one object. If collectionType is set to work, each collection will consist of one or more objects depending on how many objects a specific work consists of. If collectionType is set to work-1 (work minus one) each collection will consist of just one object regardless how many objects the work consists of. See Work structure for more details.
resultPosition if collection is present no collection Position of the specific collection in the entire search result.
numberOfObjects if collection is present no collection Number of objects in a collection.
object if collection is present yes collection Manifestation from the data well. Formatted as specified in objectFormat (if the requested format is available).
identifier if object is present no object Unique identifier of object. Used in subsequent getObjectRequest.
primaryObjectIdentifier no no object Unique identifier of the primary bibliographic object. Useful if a collection consists of multiple objects.
recordStatus if object is present no object The status of the returned record. Possible values: active, delete.
creationDate no no object The creation date of either the original bibliographic record or the object.
holdingsCount if includeHoldingsCount is true in request no object Approximate number of agencies that own the object.
lendingLibraries if includeHoldingsCount is true in request no object Number of agencies that will lend out the object.
relations if relationData is present in request and relations are available for the specific object no object Relations for the specific object. See Relations for more details.
relation if relations is present yes relations Container for specific relation.
relationType if relation is present no relation Type of specific relation. See Relations for list of available relationTypes.
relationUri no no relation Identifier of the relation.
linkObject if relationObject is not present no relation The related object if it is an external resource.
accessType no no linkObject Possible values: download, rest, streaming
access no no linkObject Possible values: free, openurl, onsite, remote, uni-c
linkTo if linkObject is present no linkObject Possible values: file, linkresolver, proxy, webservice, website
linkCollectionIdentifier no yes linkObject Name of the collection containing the link
relationObject if linkObject is not present no relation The related object if it is another object in the repository. Sub element: object.
formatsAvailable if object is present no object Formats in which the object is available.
format if formatsAvailable is present yes formatsAvailable Repeated for each available format.
objectsAvailable no no object List of objects that are matched in the current object
identifier if objectsAvailable is present yes objectsAvailable Identifier of object
queryResultExplanation no no object No longer in use.
formattedCollection no yes searchResult Collection formatted with the Open Format-web service.
facetResult if result is present no result Facet for search result.
facet if facets is present in request yes facetResult Container for specific facet and its related term.
facetName if facet is present no facet Name of the specific facet.
facetTerm no yes facet Container for specific facetTerm and its frequency of occurrence in the search result.
frequence if facetTerm is present no facetTerm Frequency of a specific term in the search result.
term if facetTerm is present no facetTerm The specific facet term.
queryDebugResult if queryDebug is true in searchRequest no result Detailed information about how query is interpreted by SOLR/Lucene. This is especially useful for debugging user defined ranking.
rawQueryString if queryDebugResult is present no queryDebugResult Original search query.
queryString if queryDebugResult is present no queryDebugResult CQL-interpreted search query.
parsedQuery if queryDebugResult is present no queryDebugResult Search query as it's interpreted by SOLR/Lucene.
parsedQueryString if queryDebugResult is present no queryDebugResult Search query as it's interpreted by SOLR/Lucene.
rankFrequency no no queryDebugResult Information about rank_frequency.
statInfo if result is present no result Information used for debugging and statistics.
fedoraRecordsCached if statInfo is present no statInfo Number of objects found in cache.
fedoraRecordsRead if statInfo is present no statInfo Number of objects read from the data well.
time if statInfo is present no statInfo Number of seconds used by the web service.
trackingId if statInfo is present no statInfo Unique ID of the specific request.
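A client typically needs only a few of these elements: the error (if any), hitCount, more, and the object identifiers per collection. The sketch below parses a hand-written sample, not real service output. It assumes an XML response without a SOAP envelope and assumes all elements live in the namespace used in this page's SOAP example; the WSDL/XSD remains the authoritative description of the structure.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the SOAP example on this page.
NS = {"os": "http://oss.dbc.dk/ns/opensearch"}

# Hand-written illustration with made-up identifiers.
SAMPLE = """\
<searchResponse xmlns="http://oss.dbc.dk/ns/opensearch">
  <result>
    <hitCount>2</hitCount>
    <collectionCount>1</collectionCount>
    <more>false</more>
    <searchResult>
      <collection>
        <resultPosition>1</resultPosition>
        <numberOfObjects>2</numberOfObjects>
        <object><identifier>870970-basis:00000001</identifier></object>
        <object><identifier>870970-basis:00000002</identifier></object>
      </collection>
    </searchResult>
  </result>
</searchResponse>
"""

SAMPLE_ERROR = """\
<searchResponse xmlns="http://oss.dbc.dk/ns/opensearch">
  <error>illustrative error message</error>
</searchResponse>
"""


def summarize_response(xml_text):
    """Return the error, or hitCount, more and identifiers per collection."""
    root = ET.fromstring(xml_text)
    # searchResponse holds either an error element or a result element.
    error = root.find("os:error", NS)
    if error is not None:
        return {"error": error.text}
    result = root.find("os:result", NS)
    collections = [
        [obj.findtext("os:identifier", namespaces=NS)
         for obj in coll.findall("os:object", NS)]
        for coll in result.findall("os:searchResult/os:collection", NS)
    ]
    return {
        "hitCount": int(result.findtext("os:hitCount", namespaces=NS)),
        "more": result.findtext("os:more", namespaces=NS) == "true",
        "collections": collections,
    }


print(summarize_response(SAMPLE))
print(summarize_response(SAMPLE_ERROR))
```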

infoRequest parameters

Parameter Must be present Repeatable Sub element of Description
agency yes no infoRequest Identifier of institution using the service.
profile yes no infoRequest Identifier of profile. The agency's profile-settings determine which sources are being searched. An agency can have one or more associated profiles. See Search collections (sources).
callBack no no infoRequest JSON-callback. Only used if outputType is set to json.
outputType no no infoRequest Set format of web service output. Possible values: xml, json, php. Default output format for SOAP and URL-requests is SOAP. Default output format for XML-requests is XML.
trackingId no no infoRequest Identifier for tracking a request.

infoResponse parameters

Parameter Always present Repeatable Sub element of Description
infoGeneral yes no infoResponse General information about the current version of the webservice
defaultRepository yes no infoGeneral Name of the default repository for this version of the web service
infoRepositories yes no infoResponse Information about available repositories
infoRepository no yes infoRepositories Info about a specific repository
repository yes no infoRepository Name of repository
cqlIndexDoc no no infoRepository Name of xml index definition document.
infoCqlIndexDocs yes no infoResponse Information about index definitions
infoCqlIndexDoc no yes infoCqlIndexDocs Information about each index
cqlIndexDoc yes no infoCqlIndexDoc Name of xml index definition document.
cqlIndex no yes infoCqlIndexDoc Index definition.
indexName yes no cqlIndex Exposed name of index
indexSlop no no cqlIndex The specific slop for the index
indexAlias no yes cqlIndex Definition of index alias. Note that an index can have a different slop than that of its alias.
indexName yes no indexAlias Exposed name of indexAlias
indexSlop no no indexAlias The specific slop for the indexAlias
infoObjectFormats yes no infoResponse Information about available object formats
objectFormat no yes infoObjectFormats Name of available object format
infoSearchProfile yes no infoResponse Information about the search profile specified in infoRequest
searchCollection no yes infoSearchProfile Information about search collection
searchCollectionName yes no searchCollection Name of search collection as seen in VIP
searchCollectionIdentifier yes no searchCollection The internal identifier of the search collection
relationType no yes searchCollection Type of relation available for search collection
relationTypes no no infoSearchProfile List of available relation types
relationType no yes relationTypes Type of relation
infoSorts yes no infoResponse Information about available sort and ranking methods
infoSort no no infoSorts Information about sort method
sort yes no infoSort ...
internalType yes no infoSort Type of method
rankDetails yes no infoSorts Details about ranking method (fieldName and weight)
sortDetails yes no infoSorts Details about sort method
infoNameSpaces yes no infoResponse Information about namespaces used
infoNameSpace no yes infoNameSpaces Namespace prefix and URI
prefix yes no infoNameSpaces Namespace prefix
uri yes no infoNameSpaces Namespace URI

Work Structure

Data in the data repository is organised in works and manifestations. A work consists of one or more manifestations.

Data in the output from the service (result) is organised in searchResults. Each searchResult consists of a collection (and possibly a formattedCollection). A collection consists of one or more objects. Each object represents a manifestation of the specific work. If collectionType in the searchRequest is set to work, all manifestations of a work are gathered as individual objects in the same collection.

Example: https://opensearch.addi.dk/test_5.2/?action=search&query="min%20kamp"&agency=100200&profile=test&collectionType=work&start=1&stepValue=5

If collectionType is set to work-1, objects are organised the same way but only one of the objects is shown.

Example: https://opensearch.addi.dk/test_5.2/?action=search&query="min%20kamp"&agency=100200&profile=test&collectionType=work-1&start=1&stepValue=5

If collectionType is set to manifestation, the searchResult will not be organised in works. Instead, each collection will consist of exactly one object representing one manifestation.

Example: https://opensearch.addi.dk/test_5.2/?action=search&query="min%20kamp"&agency=100200&profile=test&collectionType=manifestation&start=1&stepValue=5
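The effect of the three collectionType values can be pictured with a toy model. This is a simplified sketch, not the service's implementation: works are plain lists of made-up manifestation identifiers, and the choice of which object work-1 keeps (here the first) is an assumption, since this page does not say how the service selects it.

```python
def build_collections(works, collection_type="work"):
    """Group manifestations into collections for a given collectionType.

    works is a list of works, each a list of manifestation identifiers.
    Returns a list of collections, each a list of objects.
    """
    if collection_type == "work":
        # One collection per work, holding all of its manifestations.
        return [list(work) for work in works]
    if collection_type == "work-1":
        # One collection per work, but only one object is shown
        # (assumption: the first one).
        return [work[:1] for work in works]
    if collection_type == "manifestation":
        # No grouping: every manifestation is its own collection.
        return [[manifestation] for work in works for manifestation in work]
    raise ValueError(f"unknown collectionType: {collection_type}")


# Two hypothetical works: one with three manifestations, one with a single one.
works = [["book", "ebook", "audiobook"], ["film"]]

for collection_type in ("work", "work-1", "manifestation"):
    print(collection_type, build_collections(works, collection_type))
```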

Search Collections (Sources)

Which sources are searched is determined by each individual institution's sources and by the configuration of search collections in the VIP-base (https://vip.dbc.dk, section J.2. Brøndprofiler - Open Search 3.x). The profile(s) created in VIP are selected using the profile- and agency-parameters. The combination of agency and profile determines which search collections/sources are searched.

Information about which search collections are activated for a given agency/profile-combination can be retrieved using the service's info-operation.

Query language, indexes and facets

Open Search uses CQL for search queries, compliant with level 2 of the CQL 1.2 standard.

The default index is cql.keywords if no index is specified in the search request.

Example of CQL-search request: https://opensearch.addi.dk/test_5.2/?action=search&query=dkcclterm.ti="min%20kamp"%20AND%20term.type=bog&agency=100200&profile=test&start=1&stepValue=1

A complete list of available CQL-indexes including available facets can be retrieved here: https://opensearch.addi.dk/test_5.2/?showCqlFile&repository=external_test&cql=opensearch_cql.xml. An overview of available repositories can be retrieved using the service's info-operation. Example: https://opensearch.addi.dk/test_5.2/?action=info&agency=100200&profile=test

Example of search returning facets for creator and material type: https://opensearch.addi.dk/test_5.2/?action=search&query="danmark"&agency=100200&facetName=facet.creator&facetName=facet.type&numberOfTerms=10&profile=test&start=1&stepValue=0
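The CQL and facet examples above can be composed programmatically. The following is a minimal sketch: the helper names cql_term and cql_and are hypothetical, index names and values are copied from the examples, and values containing double quotes are rejected rather than escaped to keep the sketch simple.

```python
from urllib.parse import quote, urlencode


def cql_term(index, value):
    """Return an index=value clause, quoting values that contain spaces."""
    if '"' in value:
        raise ValueError("values containing double quotes are not handled here")
    return f'{index}="{value}"' if " " in value else f"{index}={value}"


def cql_and(*clauses):
    """Combine clauses with the boolean operator AND."""
    return " AND ".join(clauses)


query = cql_and(cql_term("dkcclterm.ti", "min kamp"), cql_term("term.type", "bog"))

# A list of pairs lets repeatable parameters such as facetName occur twice.
params = [
    ("action", "search"),
    ("query", query),
    ("agency", 100200),
    ("profile", "test"),
    ("facetName", "facet.creator"),
    ("facetName", "facet.type"),
    ("numberOfTerms", 10),
    ("start", 1),
    ("stepValue", 0),  # 0 collections: only the facets are of interest
]
# quote_via=quote encodes spaces as %20, as in the examples on this page.
url = "https://opensearch.addi.dk/test_5.2/?" + urlencode(params, quote_via=quote)

print(query)
print(url)
```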

Filtering based on holdings items (not version 3.0 of the repository)

From version 3.5 of the Open Search repository it is possible to limit search results based on availability, e.g. whether a given material is currently on shelf at a given branch of an agency. Holdings items information is not searchable on its own; it can only be used as a limiter. Example: https://opensearch.addi.dk/test_5.2/?action=search&query=hunde%20AND%20holdingsitem.agencyId=761500%20AND%20holdingsitem.status=OnShelf&agency=761500&profile=opac&start=1&stepValue=5

For a search result to represent the actual holdings of an agency, the holdingsitem.agencyId-index must always be included in the search. Example: https://opensearch.addi.dk/test_5.2/?action=search&query=phrase.creator=%22helle%20helle%22%20and%20holdingsitem.agencyId=761500&agency=761500&profile=opac&start=1&stepValue=5

Limiting on holdings items will only have impact on the two sources "Bibliotekets katalog" ([agencyId]-katalog) and "Folkebiblioteker og Nationalbibliografi (870970-basis)". Results from other sources will always be included in the result set (according to the settings described above).

See https://danbib.dk for more details (only in Danish). Information about available holdings items-indexes can be retrieved using the info-operation of the web service.
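Because holdings items can only act as limiters, they are always appended to an ordinary query with AND. The helper below is a minimal sketch of that pattern; the function name with_holdings_limit is hypothetical, and the index names holdingsitem.agencyId and holdingsitem.status are the ones used in the example URLs in this section.

```python
def with_holdings_limit(query, agency_id, status=None):
    """Append holdings limiters to an existing CQL query.

    The agencyId limiter is always added, since the section above states it
    is needed for the result to reflect the agency's actual holdings.
    """
    # Parenthesise multi-word queries so an OR inside them cannot absorb
    # the limiters that follow.
    base = f"({query})" if " " in query else query
    clauses = [base, f"holdingsitem.agencyId={agency_id}"]
    if status is not None:
        clauses.append(f"holdingsitem.status={status}")
    return " AND ".join(clauses)


on_shelf = with_holdings_limit("hunde", 761500, status="OnShelf")
by_creator = with_holdings_limit('phrase.creator="helle helle"', 761500)

print(on_shelf)
print(by_creator)
```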

Formats

All objects are available in DKABM-format (http://biblstandard.dk/abm/). Bibliographic records are also available as marcXchange (v1.1 http://www.loc.gov/standards/iso25577/). Full text objects are available in Docbook-format (http://docbook.org/).

Example of request for object in DKABM and marcXchange: https://opensearch.addi.dk/test_5.2/?action=getObject&identifier=870970-basis:28692765&objectFormat=dkabm&objectFormat=marcxchange&agency=100200&profile=test

Information about available formats can be retrieved using the service's info-operation.

Ranking and sorting

The order of results in the search response can be set using the sort-parameter (sorting or predefined ranking) or the userDefinedRanking-parameter (user defined ranking). When no sorting or ranking parameters are set, records are returned in the order in which they were indexed, newest first.

Sort and userDefinedRanking will be overruled when queryLanguage is set to bestMatch. With bestMatch each search term is weighed (regardless of boolean operators) and records are sorted based on how many of the search terms they match. This is especially useful with queries where no records matching all the search terms can be found.

Using the sort-parameter, search results can be sorted in ascending or descending order by title, creator or various dates.

Example: https://opensearch.addi.dk/test_5.2/?action=search&query="jonas%20t%20bengtsson"&agency=100200&profile=test&start=1&stepValue=10&sort=date_descending

Results can be ranked based on different criteria. In this example, works where the query terms appear in creator-field are ranked the highest: https://opensearch.addi.dk/test_5.2/?action=search&query="jonas%20t%20bengtsson"&agency=100200&profile=test&start=1&stepValue=10&sort=rank_creator

Information about available sort- and rank-parameters can be retrieved using the service's info-operation.

It is possible to define ranking criteria by specifying which fields to base the rank on and what weight each of these fields should be given (userDefinedRanking). In this example, works where the terms "henrik" and "stangerup" are part of the title will be ranked higher than works where they are part of the subject description. Works where the terms are part of the author name, but not the title or subject description, will be ranked lower still.

Example of user defined ranking:

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns1="http://oss.dbc.dk/ns/opensearch">
  <SOAP-ENV:Body>
    <ns1:searchRequest>
      <ns1:query>"henrik stangerup"</ns1:query>
      <ns1:agency>100200</ns1:agency>
      <ns1:profile>test</ns1:profile>
      <ns1:objectFormat>dkabm</ns1:objectFormat>
      <ns1:objectFormat>marcxchange</ns1:objectFormat>
      <ns1:start>1</ns1:start>
      <ns1:stepValue>6</ns1:stepValue>
      <ns1:userDefinedRanking>
        <ns1:tieValue>0.1</ns1:tieValue>
        <!-- author: lowest weight -->
        <ns1:rankField>
          <ns1:fieldName>dkcclterm.fo</ns1:fieldName>
          <ns1:fieldType>word</ns1:fieldType>
          <ns1:weight>2</ns1:weight>
        </ns1:rankField>
        <!-- subject: medium weight -->
        <ns1:rankField>
          <ns1:fieldName>dkcclterm.em</ns1:fieldName>
          <ns1:fieldType>word</ns1:fieldType>
          <ns1:weight>4</ns1:weight>
        </ns1:rankField>
        <!-- title: highest weight -->
        <ns1:rankField>
          <ns1:fieldName>dkcclterm.ti</ns1:fieldName>
          <ns1:fieldType>word</ns1:fieldType>
          <ns1:weight>8</ns1:weight>
        </ns1:rankField>
      </ns1:userDefinedRanking>
    </ns1:searchRequest>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
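A request like the one above can be generated and validated before it is sent. The sketch below is an illustration under stated assumptions: the function name build_ranked_search is hypothetical, the element order follows the example above, and the checks mirror the limits from the searchRequest parameter table (tieValue from 0 to 1, fieldType word or phrase, weight at least 0.00001). It uses the prefix os instead of ns1 because Python's ElementTree reserves prefixes of the form ns followed by digits; the namespace URI is unchanged.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
OS_NS = "http://oss.dbc.dk/ns/opensearch"

ET.register_namespace("SOAP-ENV", SOAP_NS)
ET.register_namespace("os", OS_NS)


def build_ranked_search(query, agency, profile, tie_value, rank_fields,
                        start=1, step_value=10):
    """Return a SOAP searchRequest with userDefinedRanking as a string.

    rank_fields is a list of (fieldName, fieldType, weight) tuples.
    """
    if not 0 <= tie_value <= 1:
        raise ValueError("tieValue must be a decimal from 0 to 1")
    if not rank_fields:
        raise ValueError("at least one rankField is required")
    for _, field_type, weight in rank_fields:
        if field_type not in ("word", "phrase"):
            raise ValueError(f"invalid fieldType: {field_type}")
        if weight < 0.00001:
            raise ValueError("weight must be at least 0.00001")

    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{OS_NS}}}searchRequest")

    def add(parent, name, value=None):
        # Create a child element in the Open Search namespace.
        child = ET.SubElement(parent, f"{{{OS_NS}}}{name}")
        if value is not None:
            child.text = str(value)
        return child

    add(request, "query", query)
    add(request, "agency", agency)
    add(request, "profile", profile)
    add(request, "start", start)
    add(request, "stepValue", step_value)
    ranking = add(request, "userDefinedRanking")
    add(ranking, "tieValue", tie_value)
    for field_name, field_type, weight in rank_fields:
        rank_field = add(ranking, "rankField")
        add(rank_field, "fieldName", field_name)
        add(rank_field, "fieldType", field_type)
        add(rank_field, "weight", weight)
    return ET.tostring(envelope, encoding="unicode")


soap_request = build_ranked_search(
    '"henrik stangerup"', 100200, "test", 0.1,
    [("dkcclterm.fo", "word", 2), ("dkcclterm.em", "word", 4), ("dkcclterm.ti", "word", 8)],
    step_value=6,
)
print(soap_request)
```

The sort parameter is deliberately absent from the helper, since sort and userDefinedRanking cannot be combined.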

Relations

Relations are used for linking relevant objects. Relations between objects in the repository are mostly two way-relations. A relation can for instance be between a book and a review, e.g.: Book hasReview review. Review isReviewOf book.

Some relations are only one way. For instance relations from articles to journals and relations that point to external sources, e.g.: article isPartOf manifestation, ebook hasOnlineAccess.

Relations are, if available, shown for each object if the element relationData is present in the search request with one of the following values: type (only the type of each available relation is shown), uri (type and uri of each available relation are shown, as well as the linkObject-element for relations to external sources) or full (type and uri are shown, as well as the related object if it is another object in the data repository, or the linkObject if it is an external resource).

The availability of the various relations for a given search collection can be set at https://vip.dbc.dk (section J.2. Brøndprofiler - Open Search 3.x). Information about available relations can be retrieved using the service's info-operation.

Example of a Relation Between two OpenSearch objects

An example of a relation between two objects is the hasReview-relation. Here the element relationType contains information about the type of relation (in this case: hasReview) and relationUri contains the Open Search-identifier of the object that represents the review. The element relationObject shows the actual object that the relation points to.

Often, the review that the relation points to is available as full text. It is available either in the Open Search data well as a docbook-object, if docbook is listed under formatsAvailable, or through an external (hasOnlineAccess) relation. To retrieve the docbook-object, set objectFormat to docbook in the initial search- or getObjectRequest, or in the subsequent getObjectRequest for the related object (e.g. the review). The external case is an example of a relation that carries another relation.

Example request: https://opensearch.addi.dk/b3.5_5.2/?action=getObject&identifier=870970-basis:51799666&agency=100200&profile=test&relationData=full

Example of a subsequent request for an object containing a review as full text: https://opensearch.addi.dk/b3.5_5.2/?action=getObject&identifier=870976-anmeld:31214955&agency=100200&profile=test&relationData=full&objectFormat=dkabm&objectFormat=docbook

In the last example the counterpart of the hasReview-relation, isReviewOf, (pointing from the review-object to the object representing the reviewed item) can also be seen.

Example of a One-way, or External, Relation

The hasOnlineAccess-relation is an example of a relation that points from an object to an external resource. In this case relationType is hasOnlineAccess while the relationUri is the URL of the external resource. Furthermore, the linkObject contains information about how access to the resource may be obtained (for instance access: remote, accessType: streaming and linkTo: website) and which search collection(s) the object carrying the relation comes from (linkCollectionIdentifier).
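The two kinds of relation can be told apart by whether a relation carries a relationObject or a linkObject. The sketch below parses a hand-written object with made-up identifiers and a placeholder URL; it is not real service output. It assumes relationData=full, plain XML without a SOAP envelope, and that all elements share the namespace from this page's SOAP example.

```python
import xml.etree.ElementTree as ET

NS = {"os": "http://oss.dbc.dk/ns/opensearch"}

# Hand-written illustration: one internal and one external relation.
SAMPLE_OBJECT = """\
<object xmlns="http://oss.dbc.dk/ns/opensearch">
  <identifier>870970-basis:00000001</identifier>
  <relations>
    <relation>
      <relationType>hasReview</relationType>
      <relationUri>870976-anmeld:00000002</relationUri>
      <relationObject>
        <object><identifier>870976-anmeld:00000002</identifier></object>
      </relationObject>
    </relation>
    <relation>
      <relationType>hasOnlineAccess</relationType>
      <relationUri>https://example.org/stream/1</relationUri>
      <linkObject>
        <accessType>streaming</accessType>
        <access>remote</access>
        <linkTo>website</linkTo>
        <linkCollectionIdentifier>870970-basis</linkCollectionIdentifier>
      </linkObject>
    </relation>
  </relations>
</object>
"""


def list_relations(object_xml):
    """Return one dict per relation, marked as internal or external."""
    root = ET.fromstring(object_xml)
    relations = []
    # The path is relative to the outer object, so relations nested inside
    # a relationObject are not picked up here.
    for relation in root.findall("os:relations/os:relation", NS):
        entry = {
            "type": relation.findtext("os:relationType", namespaces=NS),
            "uri": relation.findtext("os:relationUri", namespaces=NS),
        }
        link = relation.find("os:linkObject", NS)
        if link is not None:
            # External resource: record how it can be accessed.
            entry["kind"] = "external"
            entry["access"] = link.findtext("os:access", namespaces=NS)
            entry["accessType"] = link.findtext("os:accessType", namespaces=NS)
            entry["linkTo"] = link.findtext("os:linkTo", namespaces=NS)
        elif relation.find("os:relationObject", NS) is not None:
            entry["kind"] = "internal"
        relations.append(entry)
    return relations


for entry in list_relations(SAMPLE_OBJECT):
    print(entry)
```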

Example request: https://opensearch.addi.dk/test_5.2/?action=getObject&identifier=870970-basis:50952614&agency=100200&profile=test&relationData=full

Deprecated Versions

API version Endpoint Repository version Environment Start of life End of life WSDL

License Terms

The web service is published under the GPLv3 license.