![Vespa logo](https://vespa.ai/assets/vespa-logo-color.png)

# Multi-vector indexing

This notebook is the pyvespa version of the
[multi-vector-indexing](https://github.com/vespa-engine/sample-apps/tree/master/multi-vector-indexing)
sample application.
See the sample application for a full description and more details.
The steps below reproduce the example using pyvespa. Highlighted features:

* Using a Component to configure the Hugging Face embedder.
* Using synthetic fields with auto-generated
  [embeddings](https://docs.vespa.ai/en/embedding.html) in data and query flow;
  see `is_document_field=False` in the `paragraph_embeddings` field definition.
* Exporting the application package to files, adding model files to it, and deploying from files.
* Controlling text search result highlighting.

This notebook requires pyvespa >= 0.37.0.

## Create the application

In [None]:
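from vespa.package import (ApplicationPackage, Component, Parameter, Field, FieldSet,
                           RankProfile, Function, FirstPhaseRanking, SecondPhaseRanking,
                           DocumentSummary, Summary)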
from pathlib import Path

app_package = ApplicationPackage(
    name="wiki",
    components=[
        # Hugging Face embedder; the paths are relative to the application package root
        Component(id="e5-small-q", type="hugging-face-embedder",
                  parameters=[
                      Parameter("transformer-model", {"path": "model/e5-small-v2-int8.onnx"}),
                      Parameter("tokenizer-model", {"path": "model/tokenizer.json"})
                  ])
    ])

app_package.schema.add_fields(
    Field(name="id", type="int", indexing=["attribute", "summary"]),
    Field(name="title", type="string", indexing=["index", "summary"], index="enable-bm25"),
    Field(name="url", type="string", indexing=["index", "summary"], index="enable-bm25"),
    Field(name="paragraphs", type="array<string>", indexing=["index", "summary"],
          index="enable-bm25", bolding=True),
    Field(name="paragraph_embeddings", type="tensor<float>(p{},x[384])",
          indexing=["input paragraphs", "embed", "index", "attribute"],
          attribute=["distance-metric: angular"],
          is_document_field=False)
)

app_package.schema.add_field_set(FieldSet(name="default", fields=["title", "url", "paragraphs"]))

app_package.schema.add_rank_profile(RankProfile(
    name="semantic",
    inputs=[("query(q)", "tensor<float>(x[384])")],
    inherits="default",
    first_phase="cos(distance(field,paragraph_embeddings))",
    match_features=["closest(paragraph_embeddings)"])
)

app_package.schema.add_rank_profile(RankProfile(
    name="bm25",
    first_phase="2*bm25(title) + bm25(paragraphs)")
)

app_package.schema.add_rank_profile(RankProfile(
    name="hybrid",
    inherits="semantic",
    functions=[
        Function(name="avg_paragraph_similarity",
            expression="""reduce(
                              sum(l2_normalize(query(q),x) * l2_normalize(attribute(paragraph_embeddings),x),x),
                              avg,
                              p
                          )"""),
        Function(name="max_paragraph_similarity",
            expression="""reduce(
                              sum(l2_normalize(query(q),x) * l2_normalize(attribute(paragraph_embeddings),x),x),
                              max,
                              p
                          )"""),
        Function(name="all_paragraph_similarities",
            expression="sum(l2_normalize(query(q),x) * l2_normalize(attribute(paragraph_embeddings),x),x)")
    ],
    first_phase=FirstPhaseRanking(expression="cos(distance(field,paragraph_embeddings))"),
    second_phase=SecondPhaseRanking(expression="firstPhase + avg_paragraph_similarity() + log( bm25(title) + bm25(paragraphs) + bm25(url))"),
    match_features=["closest(paragraph_embeddings)",
                    "firstPhase",
                    "bm25(title)",
                    "bm25(paragraphs)",
                    "avg_paragraph_similarity",
                    "max_paragraph_similarity",
                    "all_paragraph_similarities"])
)

app_package.schema.add_document_summary(DocumentSummary(name="minimal",
                                                        summary_fields=[Summary("id", "int"),
                                                                        Summary("title", "string")]))

Path("pkg").mkdir(parents=True, exist_ok=True)
app_package.to_files("pkg")

## Download the embedding model files

In [None]:
! mkdir -p pkg/model
! curl -L -o pkg/model/tokenizer.json \
  https://raw.githubusercontent.com/vespa-engine/sample-apps/master/simple-semantic-search/model/tokenizer.json

! curl -L -o pkg/model/e5-small-v2-int8.onnx \
  https://github.com/vespa-engine/sample-apps/raw/master/simple-semantic-search/model/e5-small-v2-int8.onnx
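
Before deploying, it can help to confirm that both downloads succeeded, since the embedder component refers to these files by path. The following is a minimal sketch using only the standard library; `missing_model_files` is a hypothetical helper and not part of pyvespa, and the file names are the ones used in the component definition and `curl` commands above.

```python
from pathlib import Path

# File names taken from the Component parameters and curl commands above.
MODEL_FILES = ["model/e5-small-v2-int8.onnx", "model/tokenizer.json"]


def missing_model_files(package_root, relative_paths=MODEL_FILES):
    """Return the relative paths that are absent or empty under package_root."""
    root = Path(package_root)
    missing = []
    for rel in relative_paths:
        candidate = root / rel
        # A failed download may leave no file at all, or a zero-byte file.
        if not candidate.is_file() or candidate.stat().st_size == 0:
            missing.append(rel)
    return missing


# An empty list means both model files are in place.
print(missing_model_files("pkg"))
```

The check only looks at existence and size; it does not validate that the ONNX model or tokenizer file is well-formed.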

## Deploy the application

In [None]:
from vespa.deployment import VespaDocker

vespa_docker = VespaDocker()
app = vespa_docker.deploy_from_disk(application_name="wiki", application_root="pkg")

## Feed documents

Download documents:

In [None]:
! curl -s -H "Accept:application/vnd.github.v3.raw" \
  https://api.github.com/repos/vespa-engine/sample-apps/contents/multi-vector-indexing/ext/articles.jsonl.zst | \
  zstdcat - > articles.jsonl

Feed the documents using the [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html):

In [None]:
! vespa config set target local
! vespa feed articles.jsonl
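
To get a feel for the feed file, the sketch below counts documents and paragraphs in it. It assumes that each line of `articles.jsonl` is a JSON feed operation with a `fields` object containing a `paragraphs` array, matching the schema above; `summarize_feed` is a hypothetical helper, not part of pyvespa or the Vespa CLI.

```python
import json
from pathlib import Path


def summarize_feed(lines):
    """Count feed operations and paragraphs in an iterable of JSON lines."""
    docs = 0
    paragraphs = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        operation = json.loads(line)
        # Assumption: each operation carries its document in a "fields" object.
        fields = operation.get("fields", {})
        docs += 1
        paragraphs += len(fields.get("paragraphs", []))
    return docs, paragraphs


feed_path = Path("articles.jsonl")
if feed_path.exists():
    with feed_path.open(encoding="utf-8") as feed:
        print(summarize_feed(feed))
```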

## Retrieve all articles with undefined ranking

In [None]:
result = app.query(body={
  'yql': 'select * from wiki where true',
  'ranking.profile': 'unranked'
})
result.hits

## Traditional keyword search with BM25 ranking on the article level

In [None]:
result = app.query(body={
  'yql': 'select * from wiki where userQuery()',
  'query': '24',
  'ranking.profile': 'bm25'
})
result.hits

Notice the relevance, which is assigned by the rank-profile. Also note that matched keywords are highlighted in the `paragraphs` field, because the field is defined with `bolding=True`.
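
To compare relevance scores at a glance, the hits can be reduced to (relevance, title) pairs. This is a small sketch that assumes each entry of `result.hits` is a dictionary with a `relevance` value and a `fields` object; `summarize_hits` is a hypothetical helper and the example hit is made up for illustration.

```python
def summarize_hits(hits):
    """Return (relevance, title) pairs for a list of Vespa hit dictionaries."""
    rows = []
    for hit in hits:
        fields = hit.get("fields", {})
        rows.append((hit.get("relevance"), fields.get("title")))
    return rows


# Hypothetical hit shaped like an entry of result.hits; the values are made up.
example_hits = [{"relevance": 1.5, "fields": {"title": "Example title"}}]
for relevance, title in summarize_hits(example_hits):
    print(relevance, title)
```

In the notebook, the same function would be called as `summarize_hits(result.hits)`.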

## Semantic vector search on the paragraph level

In [None]:
result = app.query(body={
  'yql': 'select * from wiki where {targetHits:1}nearestNeighbor(paragraph_embeddings,q)',
  'input.query(q)': 'embed(what does 24 mean in the context of railways)',
  'ranking.profile': 'semantic'
})
result.hits

The closest (best semantic matching) paragraph has index 4.

    "matchfeatures": {
        "closest(paragraph_embeddings)": {"4": 1.0}
    }

This index corresponds to the following paragraph:

"In railway timetables 24:00 means the \"end\" of the day. For example, a train due to arrive at a station during the last minute of a day arrives at 24:00; but trains which depart during the first minute of the day go at 00:00."

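The lookup from match feature to paragraph text can be sketched as follows. It assumes the short tensor form shown above, where the single key of `closest(paragraph_embeddings)` is the paragraph index; other tensor rendering formats would need different parsing. `closest_paragraph` is a hypothetical helper and the example hit uses placeholder paragraphs.

```python
def closest_paragraph(hit, feature="closest(paragraph_embeddings)"):
    """Return (index, paragraph) for the best-matching paragraph of a hit."""
    fields = hit["fields"]
    cells = fields["matchfeatures"][feature]
    # Assumption: the short form {"<paragraph index>": 1.0} with a single entry.
    label = next(iter(cells))
    index = int(label)
    return index, fields["paragraphs"][index]


# Hypothetical hit with placeholder paragraphs; a real hit comes from result.hits.
example_hit = {
    "fields": {
        "paragraphs": ["p0", "p1", "p2", "p3", "p4"],
        "matchfeatures": {"closest(paragraph_embeddings)": {"4": 1.0}},
    }
}
print(closest_paragraph(example_hit))
```
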
## Hybrid search and ranking

Hybrid search combines keyword search on the article level with vector search in the paragraph index:

In [None]:
result = app.query(body={
  'yql': 'select * from wiki where userQuery() or ({targetHits:1}nearestNeighbor(paragraph_embeddings,q))',
  'input.query(q)': 'embed(what does 24 mean in the context of railways)',
  'query': 'what does 24 mean in the context of railways',
  'ranking.profile': 'hybrid',
  'hits': 1
})
result.hits

This query combines exact keyword search with `nearestNeighbor` search. The `hybrid` rank-profile also calculates several additional features using [tensor expressions](https://docs.vespa.ai/en/tensor-user-guide.html):

* `firstPhase` is the score of the first ranking phase, configured in the `hybrid` profile as `cos(distance(field,paragraph_embeddings))`.
* `all_paragraph_similarities` returns the similarity scores for all paragraphs.
* `avg_paragraph_similarity` is the average similarity score across all paragraphs.
* `max_paragraph_similarity` is the same as `firstPhase`, but computed using a tensor expression.

See the `hybrid` rank-profile in the schema for details.
The [Vespa Tensor Playground](https://docs.vespa.ai/playground/) is useful for experimenting with tensor expressions.

These additional features are calculated during [second-phase](https://docs.vespa.ai/en/phased-ranking.html)
ranking to limit the number of vector computations.
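
To make the tensor expressions in the `hybrid` profile more concrete, the sketch below mirrors them in plain Python. A mapped dimension `p` is modeled as a dictionary from paragraph label to vector, and the toy vectors have two dimensions instead of 384. This only illustrates the arithmetic under those assumptions; it is not how Vespa evaluates the expressions, and it assumes non-zero vectors.

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector]


def all_paragraph_similarities(query, paragraphs):
    """Cosine similarity per paragraph: the sum over x of the normalized products."""
    q = l2_normalize(query)
    return {
        label: sum(a * b for a, b in zip(q, l2_normalize(vector)))
        for label, vector in paragraphs.items()
    }


def avg_paragraph_similarity(query, paragraphs):
    """Counterpart of reduce(..., avg, p): the mean over the paragraph dimension."""
    sims = all_paragraph_similarities(query, paragraphs)
    return sum(sims.values()) / len(sims)


def max_paragraph_similarity(query, paragraphs):
    """Counterpart of reduce(..., max, p): the best paragraph similarity."""
    return max(all_paragraph_similarities(query, paragraphs).values())


# Toy 2-dimensional vectors; the real tensors use x[384].
query = [1.0, 0.0]
paragraphs = {"0": [2.0, 0.0], "1": [0.0, 3.0]}
print(all_paragraph_similarities(query, paragraphs))
print(avg_paragraph_similarity(query, paragraphs))
print(max_paragraph_similarity(query, paragraphs))
```

Paragraph `0` points in the same direction as the query and paragraph `1` is orthogonal to it, so the maximum picks out paragraph `0` while the average is pulled down by the non-matching paragraph.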

## Hybrid search and filter

Filtering is also supported. This example also disables bolding:

In [None]:
result = app.query(body={
  'yql': 'select * from wiki where url contains "9985" and (userQuery() or ({targetHits:1}nearestNeighbor(paragraph_embeddings,q)))',
  'input.query(q)': 'embed(what does 24 mean in the context of railways)',
  'query': 'what does 24 mean in the context of railways',
  'ranking.profile': 'hybrid',
  'bolding': False
})
result.hits
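
Because the hybrid queries pass the same text twice, once as `query` and once inside `embed(...)`, a small helper can keep the two in sync. The sketch below is a hypothetical convenience function, not part of pyvespa. It groups the keyword and vector branches in parentheses so that the optional URL filter applies to both; that grouping is a choice made in this sketch. It also assumes the text and the URL term need no escaping.

```python
def hybrid_query_body(text, url_term=None, target_hits=1, hits=10, bolding=True):
    """Build a request body for the `hybrid` rank profile."""
    retrieval = (
        "userQuery() or "
        f"({{targetHits:{target_hits}}}nearestNeighbor(paragraph_embeddings,q))"
    )
    where = retrieval
    if url_term is not None:
        # Assumption: url_term contains no double quotes; it is not escaped here.
        where = f'url contains "{url_term}" and ({retrieval})'
    return {
        "yql": f"select * from wiki where {where}",
        "query": text,
        "input.query(q)": f"embed({text})",
        "ranking.profile": "hybrid",
        "hits": hits,
        "bolding": bolding,
    }


body = hybrid_query_body(
    "what does 24 mean in the context of railways", url_term="9985", bolding=False
)
print(body["yql"])
```

The resulting dictionary can be passed to `app.query(body=body)` in the same way as the hand-written bodies above.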

## Cleanup

In [None]:
vespa_docker.container.stop()
vespa_docker.container.remove()