## Create an application package

The [application package](https://pyvespa.readthedocs.io/en/latest/reference-api.html#vespa.package.ApplicationPackage)
contains all the Vespa configuration files.
Create one from scratch:

In [14]:
from vespa.package import ApplicationPackage, Field, Schema, Document, RankProfile, HNSW, Component, Parameter, FieldSet, GlobalPhaseRanking, Function

package = ApplicationPackage(
        name="findmypasta",
        schema=[Schema(
            name="doc",
            document=Document(
                fields=[
                    Field(name="id", type="string", indexing=["summary"]),
                    Field(name="title", type="string", indexing=["index", "summary"], index="enable-bm25"),
                    Field(name="body", type="string", indexing=["index", "summary"], index="enable-bm25", bolding=True),
                    Field(name="embedding", type="tensor<float>(x[384])",
                        indexing=["input title . \" \" . input body", "embed", "index", "attribute"],
                        ann=HNSW(distance_metric="angular"),
                        is_document_field=False
                    )
                ]
            ),
            fieldsets=[
                FieldSet(name = "default", fields = ["title", "body"])
            ],
            rank_profiles=[
                RankProfile(
                    name="bm25", 
                    inputs=[("query(q)", "tensor<float>(x[384])")],
                    functions=[Function(
                        name="bm25sum", expression="bm25(title) + bm25(body)"
                    )],
                    first_phase="bm25sum"
                ),
                RankProfile(
                    name="semantic", 
                    inputs=[("query(q)", "tensor<float>(x[384])")],
                    first_phase="closeness(field, embedding)"
                ),
                RankProfile(
                    name="fusion", 
                    inherits="bm25",
                    inputs=[("query(q)", "tensor<float>(x[384])")],
                    first_phase="closeness(field, embedding)",
                    global_phase=GlobalPhaseRanking(
                        expression="reciprocal_rank_fusion(bm25sum, closeness(field, embedding))",
                        rerank_count=1000
                    )
                )                
            ]
        )
        ],
        components=[Component(id="e5", type="hugging-face-embedder",
            parameters=[
                Parameter("transformer-model", {"url": "https://github.com/vespa-engine/sample-apps/raw/master/simple-semantic-search/model/e5-small-v2-int8.onnx"}),
                Parameter("tokenizer-model", {"url": "https://raw.githubusercontent.com/vespa-engine/sample-apps/master/simple-semantic-search/model/tokenizer.json"})
            ]
        )]
    ) 

Note that the application name cannot contain `-` or `_`.
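
The naming rule can be checked before a package is built. This is a minimal sketch that tests only the rule stated here; `has_forbidden_chars` is a hypothetical helper, not part of pyvespa, and Vespa may enforce further constraints that it does not cover.

```python
import re

# Assumption: only the rule stated in the text is checked (no "-" or "_").
# Vespa or pyvespa may enforce additional constraints not covered here.
FORBIDDEN = re.compile(r"[-_]")


def has_forbidden_chars(name: str) -> bool:
    """Return True if the name contains a hyphen or an underscore."""
    return FORBIDDEN.search(name) is not None


for candidate in ["findmypasta", "find-my-pasta", "find_my_pasta"]:
    verdict = "rejected" if has_forbidden_chars(candidate) else "ok"
    print(candidate, "->", verdict)
```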

## Deploy the Vespa application 

Deploy `package` on the local machine using Docker,
without leaving the notebook, by creating an instance of
[VespaDocker](https://pyvespa.readthedocs.io/en/latest/reference-api.html#vespa.deployment.VespaDocker). `VespaDocker` connects
to the local Docker daemon socket and starts the [Vespa docker image](https://hub.docker.com/r/vespaengine/vespa/). 

If this step fails, check that the Docker daemon is running and that the
Docker daemon socket can be used by clients (configurable under advanced settings in Docker Desktop).

In [15]:
from vespa.deployment import VespaDocker

vespa_docker = VespaDocker()
app = vespa_docker.deploy(application_package=package)

Waiting for configuration server, 0/300 seconds...
Waiting for configuration server, 5/300 seconds...
Using plain http against endpoint http://localhost:8080/ApplicationStatus
Waiting for application status, 0/300 seconds...
Using plain http against endpoint http://localhost:8080/ApplicationStatus
Waiting for application status, 5/300 seconds...
Using plain http against endpoint http://localhost:8080/ApplicationStatus
Waiting for application status, 10/300 seconds...
Using plain http against endpoint http://localhost:8080/ApplicationStatus
Waiting for application status, 15/300 seconds...
Using plain http against endpoint http://localhost:8080/ApplicationStatus
Waiting for application status, 20/300 seconds...
Using plain http against endpoint http://localhost:8080/ApplicationStatus
Waiting for application status, 25/300 seconds...
Using plain http against endpoint http://localhost:8080/ApplicationStatus
Waiting for application status, 30/300 seconds...
Using plain http against endpoin

`app` now holds a reference to a [Vespa](https://pyvespa.readthedocs.io/en/latest/reference-api.html#vespa.application.Vespa) instance.

## Feeding documents to Vespa

In this example we use pandas to load a recipe dataset from `archive/RAW_recipes.csv` and index it in our newly
deployed Vespa instance.

The recipe columns are concatenated into a single `body` field, and the recipe name is used as the `title`.
Each row is then converted into the feed format that `pyvespa` expects, which is a dict with the keys `id` and `fields`:

` { "id": "vespa-document-id", "fields": {"vespa_field": "vespa-field-value"}} `
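
The feed format can be illustrated without pandas. This is a minimal sketch using only the standard library; the two CSV rows are made up for illustration, and `to_feed_operation` and `feed_operations` are hypothetical helper names.

```python
import csv
import io

# Two made-up rows standing in for a real CSV file (illustrative data only).
SAMPLE_CSV = """id,name,description
1,plain pasta,boil and serve
2,tomato soup,simmer tomatoes
"""


def to_feed_operation(row: dict) -> dict:
    """Map one CSV row to the {"id": ..., "fields": {...}} shape."""
    return {
        "id": row["id"],
        "fields": {"id": row["id"], "title": row["name"], "body": row["description"]},
    }


def feed_operations(csv_text: str):
    """Yield feed operations lazily, one per CSV row."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        yield to_feed_operation(row)


operations = list(feed_operations(SAMPLE_CSV))
print(operations[0])
```

Because `feed_operations` is a generator, rows are converted one at a time instead of being materialized up front.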

In [16]:
# load the recipes CSV from the archive folder, reading the id columns as strings
import pandas as pd

types = {
    "contributor_id": "string",
    "name": "string",
    "id": "string",
    "minutes": "int",
    "tags": "string",
    "nutrition": "string",
    "n_steps": "int",
    "n_ingredients": "int",
    "steps": "string",
    "description": "string",
    "ingredients": "string",
    "submitted": "string"
}
df = pd.read_csv('archive/RAW_recipes.csv', dtype=types)
df = df.dropna()
df = df.reset_index(drop=True)

# creating a body field that is the concatenation of the fields
df['body'] = df['name']
df['body'] = df['body'] + ',\nminutes to cook: ' + df['minutes'].astype(str) 
df['body'] = df['body'] + ', submitted in ' + df["submitted"]
df['body'] = df['body'] + " by " + df["contributor_id"]
df['body'] = df['body'] + ", \n" + df["tags"]
df['body'] = df['body'] + " \n " + df['ingredients']
df['body'] = df['body'] + '\n' + df['steps']
df['body'] = df['body'] + '\n' + df['description']
df['body'] = df['body'] + '\n' + df['n_steps'].astype(str) + ' steps'
df['body'] = df['body'] + '\n' + df['n_ingredients'].astype(str) + ' ingredients'
df['body'] = df['body'] + '\n - nutrition: ' + df['nutrition']

df['title'] = df['name']

# keep only the columns that are fed to Vespa
df = df[['id', 'title', 'body']]
df = df.dropna()
df = df.reset_index(drop=True)

# convert each row to the feed format that pyvespa expects
def to_vespa_format(x):
    return {"id": x["id"], "fields": { "title": x["title"], "body": x["body"], "id": x["id"]}}

# build the feed: a pandas Series holding one feed operation (dict) per row
vespa_feed = df.apply(to_vespa_format, axis=1)

Now we can feed to Vespa using `feed_iterable`, which accepts any `Iterable` and an optional callback function where we can
check the outcome of each operation. The application is configured to use [embedding](https://docs.vespa.ai/en/embedding.html)
functionality, which produces a vector embedding from a concatenation of the title and the body input fields. This step is computationally expensive. Read more
about embedding inference in Vespa in the blog post [Accelerating Transformer-based Embedding Retrieval with Vespa](https://blog.vespa.ai/accelerating-transformer-based-embedding-retrieval-with-vespa/).
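
A callback can also collect failures for later inspection instead of only printing them. This is a minimal sketch under stated assumptions: `FakeResponse` is a hypothetical stand-in that exposes only the two methods the callback in this notebook relies on (`is_successful` and `get_json`), and the status codes and payload are made up.

```python
from dataclasses import dataclass, field


@dataclass
class FakeResponse:
    """Hypothetical stand-in exposing the two methods the callback uses."""
    status_code: int
    payload: dict = field(default_factory=dict)

    def is_successful(self) -> bool:
        return self.status_code == 200

    def get_json(self) -> dict:
        return self.payload


failures = []


def collecting_callback(response, id: str) -> None:
    """Record failed operations so they can be retried or inspected later."""
    if not response.is_successful():
        failures.append((id, response.get_json()))


# Simulate three feed outcomes; a real run would pass the callback to feed_iterable.
collecting_callback(FakeResponse(200), "1")
collecting_callback(FakeResponse(500, {"message": "simulated failure"}), "2")
collecting_callback(FakeResponse(200), "3")
print(f"{len(failures)} failed:", failures)
```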

In [None]:
vespa_feed_slice = vespa_feed[0:6000]

In [42]:
from ipywidgets import IntProgress, VBox, Label, Layout  # Import Layout for styling
from IPython.display import display
from vespa.io import VespaResponse  # needed by the callback's type annotation
import time

def callback(response: VespaResponse, id: str):
    if not response.is_successful():
        print(f"Error when feeding document {id}: {response.get_json()}")
    
    # Update the progress bar value
    progress_bar.value += 1
    progress_label.value = f"Feeding documents: {progress_bar.value}/{progress_bar.max} ({progress_bar.value * 100 / progress_bar.max:.2f}%)"
    update_estimated_time()

def update_estimated_time():
    if progress_bar.value > 0:
        progress_bar.bar_style = 'info'
        progress_bar.style.bar_color = '#00AA00'
        progress_bar.style.description_width = 'initial'
        remaining_documents = progress_bar.max - progress_bar.value
        time_per_document = (time.time() - start_time) / progress_bar.value
        estimated_remaining_time = remaining_documents * time_per_document
        progress_bar.description = f'Progress: (ETA: {format_time(estimated_remaining_time)})'


def format_time(time_in_seconds: float) -> str:
    hours = int(time_in_seconds // 3600)
    time_in_seconds = time_in_seconds - (hours * 3600)
    minutes = int(time_in_seconds // 60)
    seconds = int(time_in_seconds - (minutes * 60))
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

# Create a progress bar widget
progress_bar = IntProgress(min=0, max=len(vespa_feed_slice), description='Progress:', layout=Layout(width='50%'))
progress_label = Label(value="Feeding documents: 0/{}".format(len(vespa_feed_slice)))
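
The ETA arithmetic used by the progress bar can be checked in isolation. This is a minimal sketch that assumes the average time per document stays constant; `eta_seconds` and `hms` are hypothetical names, and the numbers are illustrative.

```python
def eta_seconds(done: int, total: int, elapsed: float) -> float:
    """Remaining time, assuming the average per-document rate stays constant."""
    if done <= 0:
        raise ValueError("need at least one completed document")
    return (total - done) * (elapsed / done)


def hms(seconds: float) -> str:
    """Format a duration in seconds as HH:MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


# 1500 of 6000 documents fed after 300 seconds: 4500 documents remain.
remaining = eta_seconds(done=1500, total=6000, elapsed=300.0)
print(hms(remaining))
```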


In [43]:
from vespa.io import VespaResponse

display(VBox([progress_bar, progress_label]))
start_time = time.time()

# Call the feeding function
app.feed_iterable(vespa_feed_slice, schema="doc", namespace="findmypasta", callback=callback)


VBox(children=(IntProgress(value=0, description='Progress:', layout=Layout(width='50%'), max=400), Label(value…

## Querying Vespa

Using the [Vespa Query language](https://docs.vespa.ai/en/query-language.html) we can query the indexed data. 

- Use a context manager, `with app.syncio() as session`, to handle connection pooling ([best practices](https://cloud.vespa.ai/en/http-best-practices))
- The query method accepts any valid Vespa [query api parameter](https://docs.vespa.ai/en/reference/query-api-reference.html) in `**kwargs`
- Vespa api parameter names that contain `.` must be sent as `dict` parameters in the `body` method argument

The following queries use different retrieval and [ranking](https://docs.vespa.ai/en/ranking.html) strategies.
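
The reason dotted parameter names go in `body` is that they cannot be written as Python keyword arguments. This is a minimal sketch of that split; `split_query_params` is a hypothetical helper and does not reflect how pyvespa handles parameters internally.

```python
def split_query_params(params: dict) -> tuple[dict, dict]:
    """Separate names usable as Python keyword arguments from the rest."""
    as_kwargs, as_body = {}, {}
    for name, value in params.items():
        # "ranking" is a valid identifier; "input.query(q)" is not.
        target = as_kwargs if name.isidentifier() else as_body
        target[name] = value
    return as_kwargs, as_body


kwargs, body = split_query_params({
    "yql": "select * from sources * where userQuery() limit 5",
    "ranking": "fusion",
    "input.query(q)": "embed(pasta)",
    "presentation.timing": True,
})
print(sorted(kwargs))
print(sorted(body))
```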

In [44]:
import pandas as pd
from vespa.io import VespaQueryResponse

def display_hits_as_df(response:VespaQueryResponse, fields) -> pd.DataFrame:
    records = []
    for hit in response.hits:
        record = {}
        for field in fields:
            record[field] = hit['fields'][field]
        records.append(record)
    return pd.DataFrame(records)    

### Plain Keyword search 
The following uses plain keyword search with [bm25](https://docs.vespa.ai/en/reference/bm25.html) ranking. The `bm25` rank-profile was configured in the
application package to sum the bm25 scores of the query terms against the title and the body fields.


In [45]:
with app.syncio(connections=1) as session:
  query = "What is the best way to prepare pasta?"
  response:VespaQueryResponse = session.query(
    yql="select * from sources * where userQuery() limit 5", 
    query=query, 
    ranking="bm25"
  )
  assert(response.is_successful())
  print(display_hits_as_df(response, ["id", "title"]))

       id                                              title
0  474258     you want me to do what to the buttered noodles
1  383339                    world s best  macaroni   cheese
2  103009                                   30 minute dinner
3  228627  a quartet of english and french cheese flavour...
4  356359                   absolutely delicious pasta sauce


In [46]:
with app.syncio(connections=1) as session:
  query = "How do I cook spaghetti?"
  response:VespaQueryResponse = session.query(
    yql="select * from sources * where userQuery() limit 5", 
    query=query, 
    ranking="bm25"
  )
  assert(response.is_successful())
  print(display_hits_as_df(response, ["id", "title"]))

       id                                 title
0  422277                     spaghetti squares
1  100870            leftovers  spaghetti sauce
2  121107           fooled ya   spaghetti sauce
3   16356  amazing spaghetti with seafood sauce
4  128830               t w a   spaghetti sauce


In [47]:
with app.syncio(connections=1) as session:
  query = "What's the carbonara recipe?"
  response:VespaQueryResponse = session.query(
    yql="select * from sources * where userQuery() limit 5", 
    query=query, 
    ranking="bm25"
  )
  assert(response.is_successful())
  print(display_hits_as_df(response, ["id", "title"]))

       id                                        title
0  160696                   ally style pasta carbonara
1   25400              alexander s spaghetti carbonara
2  405737                           2 minute carbonara
3  140173        15 minute shrimp carbonara fettuccine
4  167435  amy s creamy jalapeo pimiento cheese spread


### Plain Semantic Search 
The following uses dense vector representations of the query and the document and matching is performed and accelerated by Vespa's support for
[approximate nearest neighbor search](https://docs.vespa.ai/en/approximate-nn-hnsw.html). 
The vector embedding representation of the text is obtained using Vespa's [embedder functionality](https://docs.vespa.ai/en/embedding.html#embedding-a-query-text).
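
What the `semantic` profile computes can be sketched with an exact, brute-force scan. Two assumptions are made here, based on a reading of the Vespa documentation: the `angular` distance is the angle between the two vectors, and `closeness` is `1 / (1 + distance)`. The vectors are made up and have 3 dimensions rather than 384, and HNSW returns approximate rather than exact neighbors.

```python
import math


def angular_distance(a: list[float], b: list[float]) -> float:
    """Angle in radians between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def closeness(distance: float) -> float:
    """Map a distance to (0, 1]; smaller distances score higher."""
    return 1.0 / (1.0 + distance)


# Made-up 3-dimensional vectors; the schema in this notebook uses 384 dimensions.
docs = {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0], "c": [0.7, 0.7, 0.0]}
query = [1.0, 0.1, 0.0]

ranked = sorted(
    docs,
    key=lambda d: closeness(angular_distance(query, docs[d])),
    reverse=True,
)
print(ranked)
```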


In [48]:
with app.syncio(connections=1) as session:
  query = "Whats the best fast meal to a breakfest?"
  response:VespaQueryResponse = session.query(
    yql="select * from sources * where ({targetHits:1000}nearestNeighbor(embedding,q)) limit 5", 
    query=query, 
    ranking="semantic", 
    body = {
      "input.query(q)": f"embed({query})"
    }
  )
  assert(response.is_successful())
  print(display_hits_as_df(response, ["id", "title"]))

       id                       title
0  339108       anne s nuts and bolts
1  324047                 3  2  1 dip
2  472363  amazingly perfect popovers
3  114033              anne s tabouli
4  232880                       amlou


### Hybrid Search

One approach to combining the two retrieval strategies is to use Vespa's support for
[cross-hit feature normalization and reciprocal rank fusion](https://docs.vespa.ai/en/phased-ranking.html#cross-hit-normalization-including-reciprocal-rank-fusion). This
functionality is exposed in the context of `global` re-ranking, which runs after the distributed query retrieval execution that might span thousands of nodes.
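
Reciprocal rank fusion can be sketched in a few lines. This is an illustration rather than Vespa's implementation: it assumes each hit contributes `1 / (k + rank)` per ranking, with ranks starting at 1 and `k = 60` (understood to be Vespa's default). The scores are made up.

```python
def reciprocal_rank_fusion(score_lists: list[dict], k: float = 60.0) -> dict:
    """Sum 1 / (k + rank) over every ranking a hit appears in (rank starts at 1)."""
    fused: dict = {}
    for scores in score_lists:
        ordered = sorted(scores, key=scores.get, reverse=True)
        for rank, hit in enumerate(ordered, start=1):
            fused[hit] = fused.get(hit, 0.0) + 1.0 / (k + rank)
    return fused


# Made-up scores for three hits; the two score scales are not comparable.
bm25sum = {"doc1": 12.4, "doc2": 3.1, "doc3": 7.9}
semantic = {"doc1": 0.52, "doc2": 0.78, "doc3": 0.81}

fused = reciprocal_rank_fusion([bm25sum, semantic])
ranking = sorted(fused, key=fused.get, reverse=True)
print(ranking)
```

Only ranks enter the fused score, so the different scales of BM25 and closeness do not need to be normalized. In this toy data `doc3` wins because it ranks well in both lists, although `doc1` has the highest BM25 score.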

#### Hybrid search with the OR query operator

This combines the two methods using logical disjunction (OR). Note that the first-phase expression in our `fusion` rank profile uses only the semantic score,
because semantic search usually provides better recall than sparse keyword search alone.



In [49]:
with app.syncio(connections=1) as session:
  query = "I want to eat something gourmet, what do you suggest?"
  response:VespaQueryResponse = session.query(
    yql="select * from sources * where userQuery() or ({targetHits:1000}nearestNeighbor(embedding,q)) limit 5", 
    query=query, 
    ranking="fusion", 
    body = {
      "input.query(q)": f"embed({query})"
    }
  )
  assert(response.is_successful())
  print(display_hits_as_df(response, ["id", "title"]))

       id                                            title
0   14912                   5 star gourmet sauce for steak
1  217201             i thought i had nothing to eat  rice
2  400587  amazing cabbage  you ll want to eat a whole one
3   59952                   global gourmet  taco casserole
4  476655                      7 day soup diet  my version


#### Hybrid search with the RANK query operator

This combines the two methods using the [rank](https://docs.vespa.ai/en/reference/query-language-reference.html#rank) query operator. In this case
we express that we want to retrieve the top-1000 documents using vector search, and then have sparse features like BM25 calculated as well (second operand
of the rank operator). Finally, the hits are re-ranked using reciprocal rank fusion.
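
The difference between the OR and rank operators can be illustrated with plain sets. This is a toy model of the retrieval semantics described in this section, not Vespa code; the document ids and function names are made up.

```python
# Made-up document ids matched by each operand.
keyword_matches = {"d1", "d2", "d3"}
nearest_neighbors = {"d3", "d4"}


def retrieved_with_or(keyword: set, neighbors: set) -> set:
    """OR: a document is retrieved if either operand matches it."""
    return keyword | neighbors


def retrieved_with_rank(neighbors: set, keyword: set) -> set:
    """rank: only the first operand decides which documents are retrieved."""
    return set(neighbors)


def has_keyword_features(doc: str, keyword: set) -> bool:
    """Keyword features such as BM25 are non-zero only for keyword matches."""
    return doc in keyword


print(sorted(retrieved_with_or(keyword_matches, nearest_neighbors)))
for doc in sorted(retrieved_with_rank(nearest_neighbors, keyword_matches)):
    print(doc, has_keyword_features(doc, keyword_matches))
```

With OR, documents that match only the keyword query are also retrieved. With rank, they are not, but retrieved documents that do match the keyword query still receive keyword features.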


In [50]:
with app.syncio(connections=1) as session:
  query = "Give me a recipe for a dinner with fish"
  response:VespaQueryResponse = session.query(
    yql="select * from sources * where rank({targetHits:1000}nearestNeighbor(embedding,q), userQuery()) limit 5", 
    query=query, 
    ranking="fusion", 
    body = {
      "input.query(q)": f"embed({query})"
    }
  )
  assert(response.is_successful())
  print(display_hits_as_df(response, ["id", "title"]))

       id                         title
0  361646        alyssa s favorite fish
1  454944      never fail  fish   chips
2   35774   amazing tuna fish casserole
3  495344               fish herb crust
4  165856  chicken fried   fish fingers


#### Hybrid search with filters

In this example we add another query term to the YQL, restricting the nearest neighbor search to documents that have `vegetable` in the title.
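
When the filter term comes from user input, the YQL string can be assembled by a small helper. This is a minimal sketch; `hybrid_yql` is a hypothetical function, and the backslash escaping of quotes is an assumption about YQL string literals that should be checked against the query language reference.

```python
def hybrid_yql(title_term: str, target_hits: int = 1000, limit: int = 5) -> str:
    """Build a filtered hybrid query string like the one used in this section."""
    # Escape backslashes and double quotes so the term stays inside its literal.
    escaped = title_term.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f'select * from sources * where title contains "{escaped}" '
        f"and rank({{targetHits:{target_hits}}}nearestNeighbor(embedding,q), userQuery()) "
        f"limit {limit}"
    )


print(hybrid_yql("vegetable"))
```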

In [51]:
with app.syncio(connections=1) as session:
  query = "I want to eat something with vegetables that is healthy and tasty"
  response:VespaQueryResponse = session.query(
    yql="select * from sources * where title contains \"vegetable\" and rank({targetHits:1000}nearestNeighbor(embedding,q), userQuery()) limit 5", 
    query=query, 
    ranking="fusion", 
    body = {
      "input.query(q)": f"embed({query})"
    }
  )
  assert(response.is_successful())
  print(display_hits_as_df(response, ["id", "title"]))

       id                                    title
0  150851         almost vegetarian vegetable soup
1  315721  angel hair pasta with garden vegetables
2   71550           african spiced vegetable salad
3  175922                 another vegetable paella
4   31683                    3 vegetable casserole


## Cleanup

In [52]:
vespa_docker.container.stop()
vespa_docker.container.remove()

## Next steps

This is just an introduction to the capabilities of Vespa and pyvespa.
Browse the site to learn more about schemas, feeding and queries, and
find more complex applications in
[examples](https://pyvespa.readthedocs.io/en/latest/examples.html).