<picture>
  <source media="(prefers-color-scheme: dark)" srcset="https://vespa.ai/assets/vespa-ai-logo-heather.svg">
  <source media="(prefers-color-scheme: light)" srcset="https://vespa.ai/assets/vespa-ai-logo-rock.svg">
  <img alt="#Vespa" width="200" src="https://vespa.ai/assets/vespa-ai-logo-rock.svg" style="margin-bottom: 25px;">
</picture>

# Querying Vespa

This guide shows how to query a Vespa instance using the Query API,
with the https://cord19.vespa.ai/ app as an example.

<div class="alert alert-info">
    Refer to <a href="https://pyvespa.readthedocs.io/en/latest/troubleshooting.html">troubleshooting</a>
    if you run into problems when running this guide.
</div>

You can run this tutorial in Google Colab:

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vespa-engine/pyvespa/blob/master/docs/sphinx/source/query.ipynb)

In [None]:
!pip3 install pyvespa

Connect to a running Vespa instance. 

In [None]:
from vespa.application import Vespa
from vespa.io import VespaQueryResponse
from vespa.exceptions import VespaError

app = Vespa(url="https://api.cord19.vespa.ai")

See the [Vespa Query API reference](https://docs.vespa.ai/en/reference/query-api-reference.html)
for the supported request parameters.

The YQL [userQuery()](https://docs.vespa.ai/en/reference/query-language-reference.html#userquery)
operator uses the query text read from the `query` parameter. The query also specifies the app-specific [bm25 rank profile](https://docs.vespa.ai/en/reference/bm25.html). The code
uses a [context manager](https://realpython.com/python-with-statement/) (the `with ... as session` statement) to make sure that connection pools are released. This
matters when you make multiple queries: queries issued inside the same session reuse its connections instead of setting up new ones.

In [None]:
with app.syncio() as session:
    response: VespaQueryResponse = session.query(
        yql="select documentid, cord_uid, title, abstract from sources * where userQuery()",
        hits=1,
        query="Is remdesivir an effective treatment for COVID-19?",
        ranking="bm25",
    )
    print(response.is_successful())
    print(response.url)
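
To illustrate connection reuse, here is a minimal sketch that sends several queries through one session. `run_queries` is a hypothetical helper, not part of pyvespa; it only assumes that the session object has a `query(...)` method accepting the keyword arguments used above.

```python
from typing import Any, Iterable, List

YQL = "select documentid, cord_uid, title, abstract from sources * where userQuery()"


def run_queries(session: Any, queries: Iterable[str], hits: int = 1) -> List[Any]:
    """Send every query text through the same session so connections are reused.

    Hypothetical helper for illustration; not a pyvespa API.
    """
    return [
        session.query(yql=YQL, query=text, hits=hits, ranking="bm25")
        for text in queries
    ]


# Intended use with the `app` object created earlier (requires network access):
#
# with app.syncio() as session:
#     responses = run_queries(session, ["remdesivir efficacy", "covid-19 vaccines"])
```

Opening the session once and looping inside the `with` block keeps all requests on the same connection pool, whereas opening a new session per query would repeat the connection setup each time.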

Alternatively, if a native [Vespa query parameter](https://docs.vespa.ai/en/reference/query-api-reference.html)
contains a ".", it cannot be passed as a keyword argument, so the parameters can be sent in an HTTP POST with
the `body` argument. In this case `ranking` is an alias of `ranking.profile`, but `ranking.profile` is not a valid keyword argument name in Python. The request below
combines HTTP parameters with an HTTP POST body.

In [None]:
with app.syncio() as session:
    response: VespaQueryResponse = session.query(
        hits=1,
        body={
            "yql": "select documentid, cord_uid, title, abstract from sources * where userQuery()",
            "query": "Is remdesivir an effective treatment for COVID-19?",
            "ranking.profile": "bm25",
            "presentation.timing": True,
        },
    )
    print(response.is_successful())
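
The restriction on dotted names comes from Python syntax: a keyword argument name must be a valid identifier. The sketch below is independent of pyvespa and sorts parameter names into those that can be written as keyword arguments and those that need to go in `body`. `split_params` is a hypothetical helper introduced only for illustration.

```python
from typing import Any, Dict, Tuple


def split_params(params: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split parameters into (usable as keyword arguments, must go in body)."""
    kwargs: Dict[str, Any] = {}
    body: Dict[str, Any] = {}
    for name, value in params.items():
        # str.isidentifier() is False for dotted names such as "ranking.profile".
        if name.isidentifier():
            kwargs[name] = value
        else:
            body[name] = value
    return kwargs, body


kwargs, body = split_params(
    {"hits": 1, "ranking.profile": "bm25", "presentation.timing": True}
)
print(kwargs)  # {'hits': 1}
print(body)    # {'ranking.profile': 'bm25', 'presentation.timing': True}
```

The two dictionaries map onto the call style shown above: `session.query(body=body, **kwargs)`.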

The query specified that we wanted one hit:

In [None]:
response.hits

Example of iterating over the hits returned in `response.hits`, extracting the `cord_uid` field:

In [None]:
[hit["fields"]["cord_uid"] for hit in response.hits]

Access the full JSON response in the Vespa
[default JSON result format](https://docs.vespa.ai/en/reference/default-result-format.html):

In [None]:
response.json
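
To make the structure concrete, the following sketch walks a hand-written dictionary shaped like the default result format, where hits are listed under `root.children`. All sample values (IDs, titles, counts) are made up for illustration and are not real output from the CORD-19 app. `extract_field` is a hypothetical helper.

```python
from typing import Any, Dict, List

# Hand-written sample shaped like the default JSON result format.
# Every value is an illustrative placeholder, not real query output.
sample: Dict[str, Any] = {
    "root": {
        "id": "toplevel",
        "relevance": 1.0,
        "fields": {"totalCount": 2},
        "children": [
            {"id": "doc-a", "relevance": 0.9, "fields": {"cord_uid": "uid-a", "title": "A"}},
            {"id": "doc-b", "relevance": 0.5, "fields": {"title": "B"}},
        ],
    }
}


def extract_field(result: Dict[str, Any], name: str) -> List[Any]:
    """Return `name` from every hit, using None where a hit lacks the field."""
    hits = result.get("root", {}).get("children", [])
    return [hit.get("fields", {}).get(name) for hit in hits]


print(sample["root"]["fields"]["totalCount"])  # 2
print(extract_field(sample, "cord_uid"))       # ['uid-a', None]
```

Using `.get` with defaults is a defensive choice here, so that a result without a `children` list or a hit without the requested field does not raise a `KeyError`.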

## Query Performance

There are several things that impact end-to-end query performance:

- HTTP layer performance: connection handling, mutual TLS handshake, and network round-trip latency.
  - Make sure to reuse connections using the context manager `with app.syncio():` to avoid setting up new connections
  for every unique query. See [HTTP best practices](https://cloud.vespa.ai/en/http-best-practices).
  - The size of the fields and the number of hits requested also greatly impact network performance; a larger payload means higher latency.
  - By adding `"presentation.timing": True` as a request parameter, the Vespa response includes the server-side processing time (including reading the query
  from the network, but not delivering the result over the network). This can be handy for debugging latency.
- Vespa performance: the features used inside the Vespa instance.



In [None]:
with app.syncio(connections=12) as session:
    response: VespaQueryResponse = session.query(
        hits=1,
        body={
            "yql": "select documentid, cord_uid, title, abstract from sources * where userQuery()",
            "query": "Is remdesivir an effective treatment for COVID-19?",
            "ranking.profile": "bm25",
            "presentation.timing": True,
        },
    )
    print(response.is_successful())
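
One way to use the timing information is to compare it with the latency measured on the client. The sketch below assumes the JSON response carries a top-level `timing` object with a `searchtime` value in seconds, as described in the default result format documentation. The helper names and the sample numbers are assumptions for illustration, not measurements from the CORD-19 app.

```python
from typing import Any, Dict, Optional


def server_search_time(result: Dict[str, Any]) -> Optional[float]:
    """Return the server-side `searchtime` in seconds, or None if timing is absent."""
    return result.get("timing", {}).get("searchtime")


def client_overhead(client_seconds: float, result: Dict[str, Any]) -> Optional[float]:
    """Estimate time spent outside Vespa's reported search time.

    This covers network transfer, TLS, and client-side processing.
    """
    server = server_search_time(result)
    return None if server is None else client_seconds - server


# Illustrative values only.
sample = {"timing": {"querytime": 0.004, "summaryfetchtime": 0.001, "searchtime": 0.006}}
print(client_overhead(0.056, sample))

# With a live session, the client-side time could be measured like this:
#
# import time
# start = time.perf_counter()
# response = session.query(body={..., "presentation.timing": True})
# elapsed = time.perf_counter() - start
# print(client_overhead(elapsed, response.json))
```

A large gap between the two numbers points at the HTTP layer (connection setup, payload size, round trips) rather than at the ranking work inside Vespa.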

## Error handling

Vespa's default query timeout is 500ms. PyVespa will by default retry up to 3 times for queries
that return response codes like 429, 500, 503, and 504. A `VespaError` is raised if the retries do not succeed. In the following
example we set a very low [timeout](https://docs.vespa.ai/en/reference/query-api-reference.html#timeout) of `1ms`, which causes
Vespa to time out the request and return a 504 HTTP error code. The underlying error is wrapped in a `VespaError` with
the payload error message returned from Vespa:


In [None]:
with app.syncio(connections=12) as session:
    try:
        response: VespaQueryResponse = session.query(
            hits=1,
            body={
                "yql": "select * from sources * where userQuery()",
                "query": "Is remdesivir an effective treatment for COVID-19?",
                "timeout": "1ms",
            },
        )
        print(response.is_successful())
    except VespaError as e:
        print(str(e))
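
The retry behaviour described above can be pictured with a generic loop. This is a sketch of the general technique, not pyvespa's actual implementation: `call_with_retries`, the backoff scheme, and the fake responses are all assumptions made for illustration.

```python
import time
from typing import Callable, Tuple

# Status codes treated as transient, matching the ones listed above.
RETRYABLE = {429, 500, 503, 504}


def call_with_retries(
    send: Callable[[], Tuple[int, str]],
    max_retries: int = 3,
    backoff_seconds: float = 0.0,
) -> Tuple[int, str]:
    """Call `send` until it returns a non-retryable status or retries run out."""
    status, payload = send()
    for attempt in range(max_retries):
        if status not in RETRYABLE:
            break
        # Exponential backoff between attempts (no waiting when backoff is 0).
        time.sleep(backoff_seconds * (2 ** attempt))
        status, payload = send()
    if status in RETRYABLE:
        raise RuntimeError(f"giving up after {max_retries} retries: {status} {payload}")
    return status, payload


# Fake transport: two transient failures followed by a success.
responses = iter([(503, "busy"), (504, "timeout"), (200, "ok")])
print(call_with_retries(lambda: next(responses)))  # (200, 'ok')
```

Non-retryable client errors such as 400 fall straight through the loop, which is why retrying does not help when the request itself is malformed.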

In the following example we forgot to include the `query` parameter but still reference it in the YQL through `userQuery()`; this causes a bad client request response (400):

In [None]:
with app.syncio(connections=12) as session:
    try:
        response: VespaQueryResponse = session.query(
            hits=1, body={"yql": "select * from sources * where userQuery()"}
        )
        print(response.is_successful())
    except VespaError as e:
        print(str(e))
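
A mistake like this can also be caught on the client before the request is sent. The check below is a minimal sketch under the assumption that `userQuery()` always needs a non-empty `query` parameter. `missing_query_text` is a hypothetical helper, not part of pyvespa.

```python
from typing import Any, Dict


def missing_query_text(body: Dict[str, Any]) -> bool:
    """Return True when the YQL calls userQuery() but no `query` text is set."""
    yql = str(body.get("yql", ""))
    return "userQuery()" in yql and not body.get("query")


bad = {"yql": "select * from sources * where userQuery()"}
good = {**bad, "query": "Is remdesivir an effective treatment for COVID-19?"}

print(missing_query_text(bad))   # True
print(missing_query_text(good))  # False
```

Such a check only covers this one case; the `VespaError` handling shown above is still needed for everything the server rejects.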