Vespa sample application - text search tutorial

This sample application contains the code for the text search tutorial. Refer to that tutorial for a full explanation of each step.

Executable example:

$ git clone --depth 1 https://github.com/vespa-engine/sample-apps.git
$ VESPA_SAMPLE_APPS="$(pwd)/sample-apps"
$ cd "$VESPA_SAMPLE_APPS/text-search" && mvn clean package
$ docker run --detach --name vespa --hostname vespa-container --privileged \
  --volume "$VESPA_SAMPLE_APPS:/apps" --publish 8080:8080 vespaengine/vespa

Wait for the configserver to start:

$ docker exec vespa bash -c 'curl -s --head http://localhost:19071/ApplicationStatus'
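
The status check may need to be repeated until the service answers with HTTP 200. A minimal sketch of a retry loop follows; `wait_until_ok` is a hypothetical helper name, not part of Vespa or this sample application, and it assumes the wrapped command prints an HTTP status line first, as `curl --head` does.

```shell
# Hypothetical helper (not part of Vespa): rerun a status command until the
# first line of its output reports HTTP 200, or give up after MAX_TRIES.
# Usage: wait_until_ok MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
wait_until_ok() {
  local tries=$1 delay=$2 i=1 status
  shift 2
  while [ "$i" -le "$tries" ]; do
    # Keep only the status line, e.g. "HTTP/1.1 200 OK"
    status=$("$@" 2>/dev/null | head -n 1)
    case $status in
      HTTP/*" 200"*) return 0 ;;
    esac
    sleep "$delay"
    i=$((i + 1))
  done
  echo "gave up after $tries attempts: $*" >&2
  return 1
}
```

With the function defined in the current shell, the check above could be wrapped as `wait_until_ok 30 2 docker exec vespa bash -c 'curl -s --head http://localhost:19071/ApplicationStatus'`. The retry count and delay are arbitrary choices.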

Deploy the application:

$ docker exec vespa bash -c '/opt/vespa/bin/vespa-deploy prepare /apps/text-search/target/application.zip && \
    /opt/vespa/bin/vespa-deploy activate'

Wait for the application to start:

$ curl -s --head http://localhost:8080/ApplicationStatus

Create data feed:

To use the entire MS MARCO data set, use the download script. Here we use the sample data.

$ ./bin/convert-msmarco.sh
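
The feed step reads `/apps/text-search/msmarco/vespa.json` inside the container, which under the `--volume` mapping used earlier corresponds to `$VESPA_SAMPLE_APPS/text-search/msmarco/vespa.json` on the host. A small sanity check before feeding might look like the sketch below; `check_feed_file` is a hypothetical helper, and it only verifies that the file exists and is non-empty, not that its contents are valid.

```shell
# Hypothetical helper: fail early if the feed file is missing or empty.
# Usage: check_feed_file PATH
check_feed_file() {
  if [ ! -s "$1" ]; then
    echo "feed file missing or empty: $1" >&2
    return 1
  fi
  # Report the size in bytes so an unexpectedly small file stands out
  echo "feed file ready: $1 ($(wc -c < "$1" | tr -d ' ') bytes)"
}
```

For example: `check_feed_file "$VESPA_SAMPLE_APPS/text-search/msmarco/vespa.json"`.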

Feed data:

$ docker exec vespa bash -c 'java -jar /opt/vespa/lib/jars/vespa-http-client-jar-with-dependencies.jar \
    --file /apps/text-search/msmarco/vespa.json --host localhost --port 8080'

Test the application:

$ curl -s 'http://localhost:8080/search/?query=what+is+dad+bod'
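
The query above is URL-encoded by hand (`+` for spaces). As a sketch, curl can do the encoding itself; `search` below is a hypothetical wrapper name, and the endpoint is the one used in the command above.

```shell
# Hypothetical wrapper: send a free-text query to the local search endpoint.
# -G makes curl append the --data-urlencode payload to the URL as a query
# string and issue a GET request instead of a POST.
search() {
  curl -s -G --data-urlencode "query=$1" 'http://localhost:8080/search/'
}
```

For readable output, the result can be piped through Python's built-in JSON formatter, for example `search 'what is dad bod' | python3 -m json.tool`.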

Browse the site:

http://localhost:8080/site

Install Python dependencies:

$ pip3 install -qqq --upgrade pip
$ pip3 install -qqq -r src/python/requirements.txt

Collect training data:

$ ./src/python/collect_training_data.py msmarco/sample collect_rank_features 99

Train TF-Ranking models:

$ ./src/python/tfrank.py msmarco/sample

Shut down and remove the container:

$ docker rm -f vespa