Skip to content
Permalink
master
Go to file
 
 
Cannot retrieve contributors at this time
128 lines (113 sloc) 4.83 KB
---
# Copyright 2017 Yahoo Holdings. Licensed under the terms of the Apache 2.0 license. See LICENSE in the project root.
title: "Ranking"
---
<p>
Vespa defines Big Data Serving as:
</p><p>
<em>
Selection, organization and machine-learned model inference,
<ul>
<li>over many, constantly changing data items (thousands to billions),</li>
<li>with low latency (~100 ms) and high load (thousands of queries/second)</li>
</ul>
</em>
</p><p>
<em>Ranking</em> enables organization and ML inference,
and <em>multi-phased ranking</em> addresses latency and load:
<pre>
schema myapp {
rank-profile my-rank-profile {
first-phase {
expression: attribute(quality) * freshness(timestamp)
}
second-phase {
expression: sum(onnx("my-model.onnx", "add"))
}
}
}
</pre>
Applications use the <a href="query-api.html">query API</a> to <em>select</em> the documents to evaluate
using a <a href="query-language.html">query language</a>,
and choose a <a href="#rank-profile">rank profile</a> for the ranking.
</p><p>
Ranking is running <a href="ranking-expressions-features.html">ranking expressions</a> using
<a href="ranking-expressions-features.html">rank features</a>
(values / computed values from queries, document and constants).
</p><p>
Note: Vespa also supports <a href="stateless-model-evaluation.html">stateless model evaluation</a> -
making inferences without documents (i.e. query to model).
</p>
<h2 id="performance">Performance</h2>
<p>
Rank profiles can have one or two phases:
<ul>
<li>Phase one should use a computationally inexpensive function to rank candidates.
This phase is about <em>recall</em>, select the best candidates.</li>
<li>Phase two is run on a small candidate set.
This phase is about <em>precision</em> -
use more resources to fine tune ranking for best candidates on top.</li>
</ul>
In short, the query selection and first phase ranking reduces the size of the computation -
then machine-learned models can be used on the second-phase ranking on
<a href="reference/schema-reference.html#rerank-count">rerank-count</a> documents per node.
This makes the ranking scalable (see <a href="performance/sizing-search.html">sizing</a>):
<ul>
<li>Control the second phase candidate set size</li>
<li>Add content nodes to rank less documents per node</li>
</ul>
Ranking is run in the <a href="overview.html">content cluster</a>.
This means, <em>online inference</em> (a.k.a. <em>real time inference</em> or <em>dynamic inference</em>)
is executed for the candidate document set.
Read more about <a href="phased-ranking.html">phased ranking</a>.
</p>
<h2 id="machine-learned-model-inference">Machine-Learned model inference</h2>
<p>
Vespa supports the following ML models:
<ul>
<li><a href="onnx.html">ONNX</a></li>
<li><a href="tensorflow.html">Tensorflow</a></li>
<li><a href="xgboost.html">XGBoost</a></li>
<li><a href="lightgbm.html">LightGBM</a></li>
</ul>
As these are exposed as rank features, it is possible to rank using a <em>model ensemble</em>.
Deploy multiple model instances and write a rank expression that combines the results (max, avg, custom, ...) - example:
<pre>
schema myapp {
rank-profile my-rank-profile {
...
second-phase {
expression: max( sum(onnx("my-model-1.onnx", "add"), sum(onnx("my-model-2.onnx", "add") )
}
}
}
</pre>
</p>
<!-- ToDo: Not exclusive list, we can represent lots of model, logistic regression etc. Also mention PyTorch through ONNX. -->
<h2 id="model-training-and-deployment">Model Training and Deployment</h2>
<p>
To use data in Vespa to train a model, refer to the
<a href="learning-to-rank.html">Learning to Rank</a> guide.
</p><p>
Models are deployed in <a href="cloudconfig/application-packages.html">application packages</a>.
Read more on how to automate training, deployment and re-training in a closed loop using
<a href="https://cloud.vespa.ai/">Vespa Cloud</a>.
</p>
<h2 id="rank-profile">Rank profile</h2>
<p>
Ranking expressions are stored in <a href="reference/schema-reference.html#rank-profile">rank profiles</a>.
An application can have multiple rank profiles -
this can be used for implementing different use cases, or bucket testing ranking variations.
If not specified, the <a href="text-matching-ranking.html">default text ranking</a> profile is used.
</p><p>
A rank profile can <em>inherit</em> another rank profile.
</p><p>
Queries select rank profile using <a href="reference/query-api-reference.html#ranking.profile">ranking.profile</a>,
or in <a href="searcher-development.html">Searcher</a> code:
<pre>
query.getRanking().setProfile("my-rank-profile");
</pre>
Note that some use cases (where hits can be in any order, or explicitly <a href="reference/sorting.html">sorted</a>)
performs better using the
<a href="reference/schema-reference.html#rank-profile">unranked</a> rank profile.
</p>
You can’t perform that action at this time.