![Vespa logo](https://vespa.ai/assets/vespa-logo-color.png)

# Application packages

Vespa is configured using an [application package](https://docs.vespa.ai/en/application-packages.html).
Pyvespa provides an API to generate a deployable application package.

**Note:** Pyvespa does not support all Vespa features.
See the end of this notebook for how to export files to modify the schema and deploy.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vespa-engine/pyvespa/blob/master/docs/sphinx/source/application-packages.ipynb)

An application package contains at least a [schema](https://docs.vespa.ai/en/schemas.html)
and [services.xml](https://docs.vespa.ai/en/reference/services.html).
Example - create an empty application package:

In [None]:
from vespa.package import ApplicationPackage

app_package = ApplicationPackage(name="myschema", create_query_profile_by_default=False)

This notebook exports the application package to disk for inspection. For example:

In [None]:
import os, tempfile
from pathlib import Path

temp_dir = tempfile.TemporaryDirectory()
os.environ["TMP_APP_DIR"] = temp_dir.name
app_package.to_files(temp_dir.name)

for p in Path(temp_dir.name).rglob('*'):
    if p.is_file():
        print(p)

> **_NOTE: pyvespa does not support all indexing options in Vespa; it is made for easy experimentation._**
  **_To set an unsupported indexing option (or any other unsupported option),_**
  **_export the application package as above, modify the schema or other files,_**
  **_and deploy the application package from the directory or as a zipped file._**
  **_Find more details at the end of this notebook._**

## Schema

Use a schema to define fields, fieldsets and a ranking function. An empty schema with the same name as the application package is created by default. Show the exported empty schema:

In [None]:
!cat $TMP_APP_DIR/schemas/myschema.sd

Add fields, a fieldset and a ranking function:

In [None]:
from vespa.package import Field, FieldSet, RankProfile

app_package.schema.add_fields(
    Field(name = "id",    type = "string", indexing = ["attribute", "summary"]),
    Field(name = "title", type = "string", indexing = ["index", "summary"], index = "enable-bm25"),
    Field(name = "body",  type = "string", indexing = ["index", "summary"], index = "enable-bm25")
)

app_package.schema.add_field_set(
    FieldSet(name = "default", fields = ["title", "body"])
)

app_package.schema.add_rank_profile(
    RankProfile(name = "default", first_phase = "bm25(title) + bm25(body)")
)

Export the application package again and show the schema:

In [None]:
app_package.to_files(temp_dir.name)
!cat $TMP_APP_DIR/schemas/myschema.sd

Note how the indexing settings are written to the schema. At this point, review the Vespa documentation:

* [field](https://docs.vespa.ai/en/schemas.html#field)
* [fieldset](https://docs.vespa.ai/en/schemas.html#fieldset)
* [rank-profile](https://docs.vespa.ai/en/ranking.html#rank-profiles)

## Services

`services.xml` defines a container cluster and a content cluster,
see the [Vespa Overview](https://docs.vespa.ai/en/overview.html).
You will normally not need to change this file or know much about it. Print the default file:

In [None]:
!cat $TMP_APP_DIR/services.xml

Observe:

* A content cluster called `myschema_content` is created; this is where the index is stored.
  The cluster name is normally not needed, except when using
  [delete_all_docs](https://pyvespa.readthedocs.io/en/latest/reference-api.html#vespa.application.Vespa.delete_all_docs)
  to quickly remove all documents from a schema.
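
The cluster id can also be read programmatically instead of by eye. The following is a minimal sketch using only the Python standard library. It assumes a `services.xml` with `content` elements directly under `services`; the embedded XML string is a stand-in for the exported file, and the helper `content_cluster_ids` is hypothetical, not part of pyvespa:

```python
import xml.etree.ElementTree as ET

# Stand-in for the exported services.xml (assumption: same layout as the
# exported file; in the notebook, read the file from the export directory).
SERVICES_XML = """<?xml version="1.0" encoding="UTF-8"?>
<services version="1.0">
    <container id="myschema_container" version="1.0">
        <search></search>
        <document-api></document-api>
    </container>
    <content id="myschema_content" version="1.0">
        <redundancy reply-after="1">1</redundancy>
        <documents>
            <document type="myschema" mode="index"></document>
        </documents>
        <nodes>
            <node distribution-key="0" hostalias="node1"></node>
        </nodes>
    </content>
</services>
"""


def content_cluster_ids(xml_text):
    """Return the id attribute of every <content> element under <services>."""
    # Parse from bytes so the encoding declaration is handled by the parser.
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [content.get("id") for content in root.findall("content")]


cluster_ids = content_cluster_ids(SERVICES_XML)
print(cluster_ids)
```

The resulting id is the kind of value one would pass as the content cluster name when calling `delete_all_docs`.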

## Deploy from modified files

This example adds custom configuration to the `services.xml` file above and deploys it:

In [None]:
%%sh
cat << EOF > $TMP_APP_DIR/services.xml
<?xml version="1.0" encoding="UTF-8"?>
<services version="1.0">
    <container id="myschema_container" version="1.0">
        <search></search>
        <document-api></document-api>
    </container>
    <content id="myschema_content" version="1.0">
        <redundancy reply-after="1">1</redundancy>
        <documents>
            <document type="myschema" mode="index"></document>
        </documents>
        <nodes>
            <node distribution-key="0" hostalias="node1"></node>
        </nodes>
        <tuning>
            <resource-limits>
                <disk>0.90</disk>
            </resource-limits>
        </tuning>
    </content>
</services>
EOF

The `tuning/resource-limits/disk` setting, documented under [resource-limits](https://docs.vespa.ai/en/reference/services-content.html#resource-limits), is set to 0.90 here to allow higher disk usage.
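
The same kind of edit can be made from Python rather than with a shell heredoc, which keeps the rest of the file intact. This is a minimal sketch using only the standard library; the helper `set_disk_limit` is hypothetical (not a pyvespa function), and the short inline XML is a stand-in for the exported `services.xml`:

```python
import xml.etree.ElementTree as ET


def set_disk_limit(services_xml, limit):
    """Return services.xml text with tuning/resource-limits/disk set
    to `limit` on every <content> cluster."""
    root = ET.fromstring(services_xml.encode("utf-8"))
    for content in root.findall("content"):
        node = content
        # Walk down the path, creating a level only when it is missing,
        # so any existing tuning settings are left untouched.
        for tag in ("tuning", "resource-limits", "disk"):
            child = node.find(tag)
            if child is None:
                child = ET.SubElement(node, tag)
            node = child
        node.text = limit
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


# Stand-in input (assumption: the real file has more elements than this).
minimal = """<services version="1.0">
    <content id="myschema_content" version="1.0">
        <redundancy>1</redundancy>
    </content>
</services>"""

updated = set_disk_limit(minimal, "0.90")
print(updated)
```

In the notebook, the returned text would be written back to `services.xml` in the export directory before deploying. Calling the helper again with a different value updates the existing `disk` element instead of adding a second one.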

Deploy using the exported files:

In [None]:
from vespa.deployment import VespaDocker

vespa_container = VespaDocker()
vespa_connection = vespa_container.deploy_from_disk(application_name="myapp", application_root=temp_dir.name)

One can also export a deployable zip-file, which can be deployed using the Vespa Cloud Console:

In [None]:
(Path(temp_dir.name) / "zip").mkdir(parents=True, exist_ok=True)
app_package.to_zipfile(temp_dir.name + "/zip/application.zip")

! find "$TMP_APP_DIR/zip" -type f
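
To check what a zip file contains without leaving Python, the standard `zipfile` module can list its entries. This is a minimal sketch; the archive built here is a stand-in with placeholder file contents so the example runs on its own, whereas in the notebook the path would point at the `application.zip` written above:

```python
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as workdir:
    archive = Path(workdir) / "application.zip"

    # Stand-in archive with placeholder contents (assumption: the layout
    # mirrors the exported directory: services.xml plus schemas/<name>.sd).
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("services.xml", "<services version='1.0'/>")
        zf.writestr("schemas/myschema.sd", "schema myschema {}")

    # List the archive entries, sorted for stable output.
    with zipfile.ZipFile(archive) as zf:
        entries = sorted(zf.namelist())

print(entries)
```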

### Cleanup

Remove the container resources and the temporary application package export:

In [None]:
temp_dir.cleanup()
vespa_container.container.stop()
vespa_container.container.remove()

## Next step: Deploy, feed and query

Once the schema is ready for deployment, choose a deployment option and deploy the application package:

* [Deploy to local container](https://pyvespa.readthedocs.io/en/latest/getting-started-pyvespa.html)
* [Deploy to Vespa Cloud](https://pyvespa.readthedocs.io/en/latest/getting-started-pyvespa-cloud.html)

Use the guides on the pyvespa site to feed and query data.