# Query models

Python API to define query models

A [QueryModel](reference-api.rst#vespa.query.QueryModel) is an abstraction
that encapsulates all the relevant information controlling how your app matches and ranks documents.
A `QueryModel` can be used for [querying](reference-api.rst#vespa.application.Vespa.query),
[evaluating](reference-api.rst#vespa.application.Vespa.evaluate)
and [collecting data](reference-api.rst#vespa.application.Vespa.collect_training_data) from an app.

Before version `0.5.0`, the only way to build a `QueryModel` was to specify arguments such as `match_phase`
and `ranking` through the pyvespa API,
for example with _match operators_ like [OR](reference-api.rst#vespa.query.OR):

In [None]:
from learntorank.query import QueryModel, Ranking, OR

standard_query_model = QueryModel(
    name="or_bm25",
    match_phase = OR(),
    ranking = Ranking(name="bm25")
)

Starting in version `0.5.0`, we can bypass the pyvespa high-level API and create a `QueryModel` with the full flexibility of the [Vespa Query API](https://docs.vespa.ai/en/reference/query-api-reference.html). This is useful for use cases not covered by the pyvespa API, and for users who are familiar with the Vespa Query API and prefer to work with it directly.

In [None]:
def body_function(query):
    body = {'yql': 'select * from sources * where userQuery();',
            'query': query,
            'type': 'any',
            'ranking': {'profile': 'bm25', 'listFeatures': 'false'}}
    return body

flexible_query_model = QueryModel(body_function = body_function)
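
Because `body_function` is a plain Python function that maps the query string to a request body, it can be inspected without a running app. The sketch below repeats the function above and prints the body it produces for one query. The `json.dumps` call is only there for readable output; how the body is sent to Vespa is left to `QueryModel`.

```python
import json

def body_function(query):
    # Same body as above: match any term in the query and rank with bm25.
    body = {'yql': 'select * from sources * where userQuery();',
            'query': query,
            'type': 'any',
            'ranking': {'profile': 'bm25', 'listFeatures': 'false'}}
    return body

example_body = body_function("this is a test")
print(json.dumps(example_body, indent=2))
```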

The `flexible_query_model` defined above is equivalent to the `standard_query_model`, as we can see when querying the `app`. We will use the [cord19 app](https://cord19.vespa.ai/) in our demonstration.

In [None]:
from vespa.application import Vespa

app = Vespa(url = "https://api.cord19.vespa.ai")

In [None]:
from learntorank.query import send_query

standard_result = send_query(
    app=app, 
    query="this is a test", 
    query_model=standard_query_model
)
standard_result.get_hits().head(3)

In [None]:
flexible_result = send_query(
    app=app, 
    query="this is a test", 
    query_model=flexible_query_model
)
flexible_result.get_hits().head(3)
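
One way to check that the two models return the same ranking is to compare the document ids in order. The helper names `hit_ids` and `same_ranking` below are hypothetical and not part of pyvespa or learntorank. The sketch assumes each hit is a dict with an `id` under `fields`, and it uses small stand-in hit lists instead of real query results.

```python
def hit_ids(hits):
    # Extract the document id from each hit, preserving rank order.
    return [hit["fields"]["id"] for hit in hits]

def same_ranking(hits_a, hits_b, top_k=3):
    # True when the first top_k ids are identical and in the same order.
    return hit_ids(hits_a)[:top_k] == hit_ids(hits_b)[:top_k]

# Stand-in hits for illustration only; real hits would come from the query results.
standard_hits = [{"fields": {"id": 10}}, {"fields": {"id": 20}}, {"fields": {"id": 30}}]
flexible_hits = [{"fields": {"id": 10}}, {"fields": {"id": 20}}, {"fields": {"id": 30}}]

print(same_ranking(standard_hits, flexible_hits))
```

With real results, `standard_result.hits` and `flexible_result.hits` would take the place of the stand-in lists.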

## Specify a query model

### Query + term-matching + rank profile

In [None]:
from learntorank.query import QueryModel, OR, Ranking, send_query

results = send_query(
    app=app,
    query="Is remdesivir an effective treatment for COVID-19?", 
    query_model = QueryModel(
        match_phase=OR(), 
        ranking=Ranking(name="bm25")
    )
)

In [None]:
results.number_documents_retrieved

### Query + term-matching + ANN operator + rank profile

In [None]:
from learntorank.query import QueryModel, QueryRankingFeature, ANN, WeakAnd, Union, Ranking
from random import random

match_phase = Union(
    WeakAnd(hits = 10), 
    ANN(
        doc_vector="specter_embedding", 
        query_vector="specter_vector", 
        hits = 10,
        label="title"
    )
)
ranking = Ranking(name="related-specter", list_features=True)
query_model = QueryModel(
    query_properties=[QueryRankingFeature(
        name="specter_vector", 
        # Random 768-dimensional vector; the query text itself is ignored.
        mapping=lambda query: [random() for _ in range(768)]
    )],
    match_phase=match_phase, ranking=ranking
)
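
The `mapping` above returns a fresh random vector on every call, so repeated queries are not reproducible. If that matters, a deterministic stand-in can be seeded from the query text. The function name `placeholder_vector` and the seeding scheme are assumptions for illustration only; a real application would presumably compute an actual embedding of the query instead.

```python
from random import Random

def placeholder_vector(query, dim=768):
    # Seed a private generator with the query text so the same query
    # always maps to the same vector.
    rng = Random(query)
    return [rng.random() for _ in range(dim)]

vector = placeholder_vector("Is remdesivir an effective treatment for COVID-19?")
print(len(vector))
```

Assuming `mapping` is called with the query string, this function could be passed as `mapping=placeholder_vector` in `QueryRankingFeature`.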

In [None]:
results = send_query(
    app=app,
    query="Is remdesivir an effective treatment for COVID-19?", 
    query_model=query_model
)

In [None]:
results.number_documents_retrieved

## Recall specific documents

Let's take a look at the top 3 ids from the last query.

In [None]:
top_ids = [hit["fields"]["id"] for hit in results.hits[0:3]]
top_ids

Assume that we now want to retrieve the second and third ids above. We can do so with the `recall` argument.

In [None]:
results_with_recall = send_query(
    app=app,
    query="Is remdesivir an effective treatment for COVID-19?", 
    query_model=query_model,
    recall = ("id", top_ids[1:3])
)

Only documents whose Vespa field `id` matches one of the values in the list inside the tuple are retrieved.

In [None]:
id_recalled = [hit["fields"]["id"] for hit in results_with_recall.hits]
id_recalled
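
For readers who work with the Vespa Query API directly, the `recall` tuple can be thought of as a filter string for the `recall` query parameter. The sketch below builds such a string. The function name `recall_clause` is hypothetical, and the exact string that the library sends is an assumption here, not something this page confirms.

```python
def recall_clause(field, values):
    # Build "+(field:v1 field:v2 ...)" from a field name and a list of values.
    terms = " ".join("{}:{}".format(field, value) for value in values)
    return "+({})".format(terms)

# Illustrative ids only; real values would come from top_ids[1:3].
print(recall_clause("id", [101, 202]))
```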