![Vespa logo](https://vespa.ai/assets/vespa-logo-color.png)

# Application packages

Vespa is configured using an [application package](https://docs.vespa.ai/en/application-packages.html).
Pyvespa provides an API to generate a deployable application package.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vespa-engine/pyvespa/blob/master/docs/sphinx/source/application-packages.ipynb)

An application package contains, at a minimum, a [schema](https://docs.vespa.ai/en/schemas.html)
and [services.xml](https://docs.vespa.ai/en/reference/services.html).
Example - create an empty application package:

In [None]:
from vespa.package import ApplicationPackage

app_package = ApplicationPackage(name="myschema")

To inspect an application package, dump it to disk using
[to_files](https://pyvespa.readthedocs.io/en/latest/reference-api.html#vespa.package.ApplicationPackage.to_files):

In [None]:
import tempfile, os

temp_dir = tempfile.TemporaryDirectory()
os.environ["TMP_APP_DIR"] = temp_dir.name
app_package.to_files(temp_dir.name)
print(temp_dir.name)

In [None]:
!cd $TMP_APP_DIR && find . -type f

./services.xml
./schemas/myschema.sd
./search/query-profiles/types/root.xml
./search/query-profiles/default.xml


Ignore these files for now:

    ./search/query-profiles/types/root.xml
    ./search/query-profiles/default.xml
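
The `find` command above only works in a notebook on a Unix-like system. As a rough equivalent in plain Python, the sketch below lists files with `pathlib`. The helper `list_files` and the stand-in directory are assumptions made for illustration so that the block runs on its own; with pyvespa, pass `temp_dir.name` instead.

```python
import tempfile
from pathlib import Path


def list_files(root):
    """Return all regular files below root as sorted, POSIX-style relative paths."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())


# Stand-in directory (hypothetical content), so the sketch does not need pyvespa
with tempfile.TemporaryDirectory() as demo_dir:
    (Path(demo_dir) / "schemas").mkdir()
    (Path(demo_dir) / "schemas" / "myschema.sd").write_text("schema myschema {}\n")
    (Path(demo_dir) / "services.xml").write_text("<services/>\n")
    demo_files = list_files(demo_dir)

print(demo_files)
```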

## Schema

Use a schema to define fields, fieldsets and a ranking function. An empty schema with the same name as the application package is created by default. Dump it:

In [None]:
!cat $TMP_APP_DIR/schemas/myschema.sd

schema myschema {
    document myschema {
    }
}

Add fields, a fieldset and a ranking function:

In [None]:
from vespa.package import Field, FieldSet, RankProfile

app_package.schema.add_fields(
    Field(name = "id",    type = "string", indexing = ["attribute", "summary"]),
    Field(name = "title", type = "string", indexing = ["index", "summary"], index = "enable-bm25"),
    Field(name = "body",  type = "string", indexing = ["index", "summary"], index = "enable-bm25")
)

app_package.schema.add_field_set(
    FieldSet(name = "default", fields = ["title", "body"])
)

app_package.schema.add_rank_profile(
    RankProfile(name = "default", first_phase = "bm25(title) + bm25(body)")
)

Dump the application package again and show the schema:

In [None]:
app_package.to_files(temp_dir.name)
!cat $TMP_APP_DIR/schemas/myschema.sd

schema myschema {
    document myschema {
        field id type string {
            indexing: attribute | summary
        }
        field title type string {
            indexing: index | summary
            index: enable-bm25
        }
        field body type string {
            indexing: index | summary
            index: enable-bm25
        }
    }
    fieldset default {
        fields: title, body
    }
    rank-profile default {
        first-phase {
            expression {
                bm25(title) + bm25(body)
            }
        }
    }
}

Note how the indexing settings are written to the schema.
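
To picture how a `Field` definition maps to the schema text, here is a small stand-alone sketch. The helper `render_field` is hypothetical and is not pyvespa's implementation; it only mimics the output shown above, assuming the items of the `indexing` list are joined with ` | `.

```python
def render_field(name, type, indexing, index=None):
    """Hypothetical helper: mimic the schema text for one field (not pyvespa code)."""
    lines = [f"field {name} type {type} {{"]
    # The indexing list becomes one pipe-separated indexing statement
    lines.append(f"    indexing: {' | '.join(indexing)}")
    # The optional index setting is written on its own line
    if index is not None:
        lines.append(f"    index: {index}")
    lines.append("}")
    return "\n".join(lines)


title_text = render_field("title", "string", ["index", "summary"], index="enable-bm25")
print(title_text)
```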

> **_NOTE: pyvespa does not support all indexing options in Vespa - it is made for easy experimentation._**
  **_To use an unsupported indexing option (or any other unsupported option),_**
  **_dump the application package, modify the schema file_**
  **_and deploy the application package from the directory, or as a zipped file._**
  **_[Read more](https://pyvespa.readthedocs.io/en/latest/deploy-docker.html)._**
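
The dump, modify and zip steps from the note can be sketched with the standard library alone. This is a minimal sketch under stated assumptions: the directory below is a stand-in for one written by `to_files`, and `rank: filter` is only an illustrative setting to add by hand (whether your pyvespa version can express it directly is not checked here). Deploying the result is covered by the linked guide.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

# Stand-in for a directory written by to_files(); schema text shortened from the dump above
app_dir = Path(tempfile.mkdtemp())
schema_file = app_dir / "schemas" / "myschema.sd"
schema_file.parent.mkdir(parents=True)
schema_file.write_text(
    "schema myschema {\n"
    "    document myschema {\n"
    "        field id type string {\n"
    "            indexing: attribute | summary\n"
    "        }\n"
    "    }\n"
    "}\n"
)

# Hand-edit the schema: add a line after the indexing statement of the id field
# (rank: filter is an illustrative choice, not something the original text prescribes)
old = "            indexing: attribute | summary\n"
new = old + "            rank: filter\n"
schema_file.write_text(schema_file.read_text().replace(old, new))

# Zip the directory so that services.xml and schemas/ would sit at the archive root
zip_path = shutil.make_archive(str(app_dir) + "_package", "zip", root_dir=str(app_dir))
with zipfile.ZipFile(zip_path) as zf:
    archive_names = zf.namelist()
print(archive_names)
```

The archive is written next to the directory, not inside it, so the zip file does not end up containing itself.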

At this point, review the Vespa documentation:

* [field](https://docs.vespa.ai/en/schemas.html#field)
* [fieldset](https://docs.vespa.ai/en/schemas.html#fieldset)
* [rank-profile](https://docs.vespa.ai/en/ranking.html#rank-profiles)

## Services

`services.xml` defines a container cluster and a content cluster -
see the [Vespa Overview](https://docs.vespa.ai/en/overview.html).
You will normally not need to change this file or know much about it. Dump the default file:

In [None]:
!cat $TMP_APP_DIR/services.xml

<?xml version="1.0" encoding="UTF-8"?>
<services version="1.0">
    <container id="myschema_container" version="1.0">
        <search></search>
        <document-api></document-api>
    </container>
    <content id="myschema_content" version="1.0">
        <redundancy reply-after="1">1</redundancy>
        <documents>
            <document type="myschema" mode="index"></document>
        </documents>
        <nodes>
            <node distribution-key="0" hostalias="node1"></node>
        </nodes>
    </content>
</services>

Observe:
* A content cluster called `myschema_content` is created; this is where the index is stored.
  The cluster name is normally not needed, unless using
  [delete_all_docs](https://pyvespa.readthedocs.io/en/latest/reference-api.html#vespa.application.Vespa.delete_all_docs)
  to quickly remove all documents from a schema.
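
If you prefer not to hard-code the cluster name, it can be read from `services.xml`. The sketch below parses the file content shown above with the standard library; the variable names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# services.xml content copied from the dump above
services_xml = """<?xml version="1.0" encoding="UTF-8"?>
<services version="1.0">
    <container id="myschema_container" version="1.0">
        <search></search>
        <document-api></document-api>
    </container>
    <content id="myschema_content" version="1.0">
        <redundancy reply-after="1">1</redundancy>
        <documents>
            <document type="myschema" mode="index"></document>
        </documents>
        <nodes>
            <node distribution-key="0" hostalias="node1"></node>
        </nodes>
    </content>
</services>
"""

# Parse from bytes so the encoding declaration in the XML header is honored
root = ET.fromstring(services_xml.encode("utf-8"))

# The id attribute of <content> is the content cluster name
content_cluster_name = root.find("content").get("id")

# Each <document> element names a schema served by the content cluster
document_types = [doc.get("type") for doc in root.iter("document")]

print(content_cluster_name, document_types)
```

These two values are what the linked `delete_all_docs` reference asks for as the content cluster name and the schema. The call itself is left out here because it needs a deployed, running application.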

Remove the temporary application package file dump:

In [None]:
temp_dir.cleanup()

## Next step: Deploy, feed and query

Once the schema is ready for deployment, choose a deployment option and deploy the application package:
* [Deploy to local container](https://pyvespa.readthedocs.io/en/latest/deploy-docker.html)
* [Deploy to Vespa Cloud](https://pyvespa.readthedocs.io/en/latest/deploy-vespa-cloud.html)

Use the guides on the pyvespa site to feed and query data.