---
# Copyright 2017 Yahoo Holdings. Licensed under the terms of the Apache 2.0 license. See LICENSE in the project root.
title: "Search API"
---
<p>
This is the Vespa Search API guide. Refer to the
<a href="reference/search-api-reference.html">search API reference</a> for details.
</p>
<h2 id="http">HTTP</h2>
<ul>
<li>
The <em>host:port</em> endpoint is the <a href="querying-vespa.html">Search Container</a>.
The general form of a search GET-request is:
<pre>
http://host:port/search/?param1=value1&amp;param2=value2&amp;...
</pre>
</li><li>
The general form of a search request with a JSON query is:
<pre>
{
param1 : value1,
param2 : value2,
...
}
</pre>
The parameters are those listed in the <a href="reference/search-api-reference.html">search API reference</a>,
converted from the <em>flat</em> dot notation to a <em>nested</em> JSON structure.
<ul>
<li>
The request-method must be POST and the <em>Content-Type</em> must be <em>"application/json"</em>.
</li>
<li>
The query builder GUI at <a href="http://localhost:8080/querybuilder/">http://localhost:8080/querybuilder/</a> (available when Vespa is running) helps build queries, with e.g. autocompletion of YQL, pasting of already built queries and conversion of JSON queries to URL queries
</li>
</ul>
<br/>
An example of the format:
<pre>
{
"yql" : "select * from sources * where default contains \"bad\";",
"offset" : 5,
"ranking" : {
"matchPhase" : {
"ascending" : true,
"maxHits" : 15
}
},
"presentation" : {
"bolding" : false,
"format" : "json"
},
"nocache" : true
}
</pre>
</li><li>
The only mandatory parameter is <em>yql</em>
</li><li>
Use GET or POST. Parameters can be sent either as GET parameters or posted as JSON; these two requests are equivalent:
<pre>
$ curl -H "Content-Type: application/json" --data '{"yql" : "select * from sources * where default contains \"bad\";"}' http://localhost:8080/search/
$ curl http://localhost:8080/search/?yql=select+%2A+from+sources+%2A+where+default+contains+%22bad%22%3B
</pre>
</li><li>
The Search Container uses Jetty for HTTP.
Configure the <a href="reference/services-http.html#server">http server</a> -
e.g. set <em>requestHeaderSize</em> to configure URL length (including headers):
<pre>
&lt;container version="1.0"&gt;
&lt;http&gt;
&lt;server port="8080" id="myserver"&gt;
&lt;config name="jdisc.http.connector"&gt;
&lt;requestHeaderSize&gt;32768&lt;/requestHeaderSize&gt;
&lt;/config&gt;
&lt;/server&gt;
&lt;/http&gt;
&lt;/container&gt;
</pre>
</li><li>
HTTP keepalive is supported
</li><li>
Values must be encoded according to standard URL encoding.
Thus, space is encoded as +, + as %2B and so on -
see <a href="http://www.ietf.org/rfc/rfc2396.txt">RFC 2396</a> (obsoleted by RFC 3986)
</li>
</ul>
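<p>
The relationship between the nested JSON form and the flat dot notation can be sketched in Python.
This is a minimal illustration, not part of Vespa: the helper name <em>flatten</em> is hypothetical,
rendering booleans as lowercase <em>true</em>/<em>false</em> in the URL is an assumption,
and <em>localhost:8080</em> is the endpoint used in the examples above.
</p>

```python
import json
from urllib.parse import urlencode


def flatten(params, prefix=""):
    """Convert nested JSON-style parameters to flat dot notation (hypothetical helper)."""
    flat = {}
    for key, value in params.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, extending the dotted prefix
            flat.update(flatten(value, name))
        elif isinstance(value, bool):
            # Assumption: booleans are written as lowercase true/false in the URL
            flat[name] = "true" if value else "false"
        else:
            flat[name] = str(value)
    return flat


nested = {
    "yql": 'select * from sources * where default contains "bad";',
    "offset": 5,
    "ranking": {"matchPhase": {"ascending": True, "maxHits": 15}},
    "presentation": {"bolding": False, "format": "json"},
}

flat = flatten(nested)

# GET form: urlencode applies standard URL encoding (space becomes +)
get_url = "http://localhost:8080/search/?" + urlencode(flat)

# POST form: the nested structure is sent as-is with Content-Type application/json
post_body = json.dumps(nested)

print(get_url)
print(post_body)
```

<p>
Building the URL with <em>urlencode</em> rather than by hand avoids encoding mistakes in the YQL string.
</p>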
<h2 id="features">Features</h2>
<p>
The query string contains the specification of which results the search should return,
typically some words which should be present in matching documents.
Queries are formulated in <em>YQL</em>.
See also the legacy
<a href="reference/simple-query-language-reference.html">simple query language</a> reference.
</p><p>
If Vespa cannot generate a valid search expression from the query string,
it will issue the error message <em>Null query</em>.
To troubleshoot, add <a href="reference/search-api-reference.html#tracelevel">&amp;tracelevel=2</a> to the request.
A missing <em>yql</em> parameter will also lead to this error message.
</p><p>
Examples - refer to <a href="query-language.html">YQL</a>, <a href="grouping.html">grouping</a> and the
<a href="reference/sorting.html">sorting reference</a> for details:
<table class="table">
<thead></thead><tbody>
<tr>
<th>Ordering</th>
<td>
<pre>
$ curl -H "Content-Type: application/json" --data '{"yql" : "select * from sources * where default contains \"bad\" <span style="background-color: yellow;">order by year desc</span>;"}' http://localhost:8080/search/
</pre>
</td>
</tr><tr>
<th>Grouping</th>
<td>
<pre>
$ curl -H "Content-Type: application/json" --data '{"yql" : "select * from sources * where default contains \"bad\" <span style="background-color: yellow;">| all(group(year) each(output(sum(duration))))</span>;"}' http://localhost:8080/search/
</pre>
</td>
</tr><tr>
<th>Pagination</th>
<td>
<pre>
$ curl -H "Content-Type: application/json" --data '{"yql" : "select * from sources * where default contains \"bad\" <span style="background-color: yellow;">limit 2 offset 1</span>;"}' http://localhost:8080/search/
</pre>
</td>
</tr><tr>
<th>Numeric</th>
<td>
<pre>
$ curl -H "Content-Type: application/json" --data '{"yql" : "select * from sources * where <span style="background-color: yellow;">year &gt; 2000</span>;"}' http://localhost:8080/search/
</pre>
</td>
</tr><tr>
<th>Timeout</th>
<td>
<pre>
$ curl -H "Content-Type: application/json" --data '{"yql" : "select * from sources * where default contains \"bad\" <span style="background-color: yellow;">timeout 100</span>;"}' http://localhost:8080/search/
</pre>
</td>
</tr><tr>
<th>Regexp</th>
<td>
<pre>
$ curl -H "Content-Type: application/json" --data '{"yql" : "select * from sources * where title <span style="background-color: yellow;">matches \"mado[n]+a\"</span>;"}' http://localhost:8080/search/
</pre>
</td>
</tr>
</tbody>
</table>
</p>
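<p>
When such requests are generated by a program, letting a JSON library do the quoting avoids hand-written <em>\"</em> escapes.
The following is a sketch only: the helper name <em>yql_body</em> is hypothetical,
and the YQL fragments are the ones from the table above.
</p>

```python
import json


def yql_body(where, suffix=""):
    """Build a JSON POST body for a YQL query (hypothetical helper).

    `where` is the filter expression; `suffix` is an optional trailing
    clause such as ordering, pagination, timeout or grouping.
    """
    yql = f"select * from sources * where {where}"
    if suffix:
        yql += " " + suffix
    # json.dumps escapes the double quotes inside the YQL string
    return json.dumps({"yql": yql + ";"})


ordering = yql_body('default contains "bad"', "order by year desc")
pagination = yql_body('default contains "bad"', "limit 2 offset 1")
grouping = yql_body('default contains "bad"',
                    "| all(group(year) each(output(sum(duration))))")
numeric = yql_body("year > 2000")

print(ordering)
```

<p>
Each returned string can be posted to <em>http://localhost:8080/search/</em> with <em>Content-Type: application/json</em>.
</p>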
<h3 id="query-params">Query parameters</h3>
<p>
Below is a list of query parameters to control aspects of queries -
refer to the <a href="reference/search-api-reference.html">search API reference</a>
for the full list.
<table class="table">
<thead></thead><tbody>
<tr>
<th>ranking</th>
<td>
Unless ordering is specified, results are ranked using the default rank profile
<a href="nativerank.html">nativerank</a>.
Vespa has a rich ranking framework, read more in <a href="ranking.html">ranking</a>.
Control result ranking using <a href="reference/search-api-reference.html#ranking.profile">rank profiles</a>
</td>
</tr><tr>
<th>searchChain</th>
<td>
Use <a href="reference/search-api-reference.html#searchChain">search chains</a>
to implement query processing.
Set <em>&amp;tracelevel=2</em> to inspect the search chain components.
Refer to <a href="chained-components.html">chained components</a>
</td>
</tr><tr>
<th>sources</th>
<td>
An application can have multiple content clusters - Vespa searches in all by default.
<a href="federation.html">Federation</a> controls how to query the clusters,
<a href="reference/search-api-reference.html#model.sources">sources</a> names the clusters
</td>
</tr><tr>
<th>pos.ll</th>
<td>
Specify position using latitude and longitude to implement <a href="geo-search.html">geo search</a>
</td>
</tr><tr>
<th>queryProfile</th>
<td>
Use <a href="reference/search-api-reference.html#queryProfile">query profiles</a> to store
query parameters in configuration.
This makes query strings shorter, and makes it easy to modify queries by modifying configuration only.
Use cases are setting query properties for different markets, parameters that do not change, and so on.
Query profiles can be nested, versioned and use inheritance
</td>
</tr><tr>
<th>tracelevel</th>
<td>
Set to a positive integer to see query <a href="reference/search-api-reference.html#tracelevel">tracing</a>.
Higher numbers produce more tracing output
</td>
</tr>
</tbody>
</table>
</p>
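<p>
As a rough illustration, several of these parameters can be combined in one GET request.
The values <em>my_profile</em>, <em>my_chain</em> and <em>my_query_profile</em> are hypothetical placeholders
for names defined by the application; the parameter names are taken from the table above and the linked reference.
</p>

```python
from urllib.parse import urlencode

params = {
    "yql": 'select * from sources * where default contains "bad";',
    "ranking.profile": "my_profile",     # hypothetical rank profile name
    "searchChain": "my_chain",           # hypothetical search chain name
    "queryProfile": "my_query_profile",  # hypothetical query profile name
    "tracelevel": 2,                     # positive integer enables tracing
}

# urlencode keeps the dots in parameter names and encodes the YQL value
url = "http://localhost:8080/search/?" + urlencode(params)
print(url)
```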
<h2 id="default-result-format">Default Result Format</h2>
<p>
The default output format is <a href="./reference/default-result-format.html">JSON</a>.
The basic structure is:
<pre>
{
"root": {
"children": [
objects with same structure as root itself...
],
"fields": {
"document field name": "document field contents",
}
}
}
</pre>
A complete example of the structure:
<pre>
{
"root": {
"children": [
{
"children": [
{
"fields": {
"c": "d",
"uri": "http://localhost/1"
},
"id": "http://localhost/1",
"relevance": 0.9,
"types": [
"summary"
]
}
],
"id": "usual",
"relevance": 1.0
},
{
"fields": {
"e": "f"
},
"id": "type grouphit",
"relevance": 1.0,
"types": [
"grouphit"
]
},
{
"fields": {
"description": "foo",
"uri": "http://localhost/"
},
"id": "http://localhost/",
"relevance": 0.95,
"types": [
"summary"
]
}
],
"coverage": {
"coverage": 100,
"documents": 500,
"full": true,
"nodes": 1,
"results": 1,
"resultsFull": 1
},
"errors": [
{
"code": 18,
"message": "boom",
"summary": "Internal server error."
}
],
"fields": {
"totalCount": 130
},
"id": "toplevel",
"relevance": 1.0
}
}
</pre>
</p>
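<p>
Because every child has the same structure as the root, a result can be traversed recursively.
The sketch below uses an abbreviated copy of the example above; the function name <em>iter_hits</em> is hypothetical,
and treating every node with a <em>fields</em> object as a hit is an assumption made for illustration.
</p>

```python
import json


def iter_hits(node):
    """Yield every descendant of `node` that carries a "fields" object."""
    for child in node.get("children", []):
        if "fields" in child:
            yield child
        # Children may themselves contain children (e.g. hit groups)
        yield from iter_hits(child)


response = json.loads("""
{
  "root": {
    "id": "toplevel",
    "relevance": 1.0,
    "fields": {"totalCount": 130},
    "errors": [
      {"code": 18, "message": "boom", "summary": "Internal server error."}
    ],
    "children": [
      {
        "id": "usual",
        "relevance": 1.0,
        "children": [
          {"id": "http://localhost/1", "relevance": 0.9,
           "types": ["summary"],
           "fields": {"c": "d", "uri": "http://localhost/1"}}
        ]
      },
      {"id": "type grouphit", "relevance": 1.0,
       "types": ["grouphit"], "fields": {"e": "f"}},
      {"id": "http://localhost/", "relevance": 0.95,
       "types": ["summary"],
       "fields": {"description": "foo", "uri": "http://localhost/"}}
    ]
  }
}
""")

root = response["root"]
total_count = root["fields"]["totalCount"]
error_messages = [e["message"] for e in root.get("errors", [])]

# Keep only the hits typed "summary", in document order
summary_ids = [hit["id"] for hit in iter_hits(root)
               if "summary" in hit.get("types", [])]

print(total_count, error_messages, summary_ids)
```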