Vespa sample applications - album recommendations

See getting started for troubleshooting.

Vespa is used for online Big Data serving, which means ranking (large) data sets using query data. Below is an example of how to rank music albums using a user profile: each album has a score for each of a set of categories, and these scores are matched against the user's preference for the same categories:

User profile

{
  { cat:pop  }: 0.8,
  { cat:rock }: 0.2,
  { cat:jazz }: 0.1
}
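As a minimal sketch, a profile like the one above can be produced from a plain dictionary. The helper name `to_tensor_literal` is hypothetical (it is not part of Vespa or of this sample application); it only formats the compact literal string that the query step later in this guide sends URL-encoded.

```python
def to_tensor_literal(profile, dimension="cat"):
    """Format {label: value} as a one-dimensional tensor literal string.

    Hypothetical helper for illustration; the output mirrors the user
    profile literal shown above, without the extra whitespace.
    """
    cells = ",".join(
        f"{{{dimension}:{label}}}:{value}" for label, value in profile.items()
    )
    return "{" + cells + "}"


user_profile = {"pop": 0.8, "rock": 0.2, "jazz": 0.1}
literal = to_tensor_literal(user_profile)
print(literal)
```

Keeping the profile as a dictionary until the last moment makes it easy to compute or update the weights in application code before formatting them for a query.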

Albums

{
  "fields": {
    "album": "A Head Full of Dreams",
    "artist": "Coldplay",
    "year": 2015,
    "category_scores": {
      "cells": [
        { "address": { "cat": "pop"},  "value": 1  },
        { "address": { "cat": "rock"}, "value": 0.2},
        { "address": { "cat": "jazz"}, "value": 0  }
      ]
    }
  }
}

{
  "fields": {
    "album": "Love Is Here To Stay",
    "artist": "Diana Krall",
    "year": 2018,
    "category_scores": {
      "cells": [
        { "address": { "cat": "pop" },  "value": 0.4 },
        { "address": { "cat": "rock" }, "value": 0   },
        { "address": { "cat": "jazz" }, "value": 0.8 }
      ]
    }
  }
}

{
  "fields": {
    "album": "Hardwired...To Self-Destruct",
    "artist": "Metallica",
    "year": 2016,
    "category_scores": {
      "cells": [
        { "address": { "cat": "pop" },  "value": 0 },
        { "address": { "cat": "rock" }, "value": 1 },
        { "address": { "cat": "jazz" }, "value": 0 }
      ]
    }
  }
}
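The three documents share one shape, so they can be generated rather than written by hand. The sketch below assumes only the field names visible in the examples above; `album_document` is a hypothetical helper, not something shipped with the sample application.

```python
import json


def album_document(album, artist, year, category_scores):
    """Build a feed document in the same shape as the examples above.

    category_scores is a plain {category: score} dict; it is expanded
    into the "cells" list of address/value pairs.
    """
    cells = [
        {"address": {"cat": cat}, "value": value}
        for cat, value in category_scores.items()
    ]
    return {
        "fields": {
            "album": album,
            "artist": artist,
            "year": year,
            "category_scores": {"cells": cells},
        }
    }


doc = album_document(
    "A Head Full of Dreams", "Coldplay", 2015,
    {"pop": 1, "rock": 0.2, "jazz": 0},
)
print(json.dumps(doc, indent=2))
```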

Rank profile

A rank profile calculates a relevance score per document. This is defined by the application author - in this case, it is the sum over the tensor product. The data above is represented using tensors. As each tensor is one-dimensional (the cat dimension), it is a vector, so the score is the dot product of the user profile and the album's category scores:

rank-profile rank_albums inherits default {
    first-phase {
        expression: sum(query(user_profile) * attribute(category_scores))
    }
}

Hence, the expected scores are:

Album                          pop       rock      jazz      total
A Head Full of Dreams          0.8*1.0   0.2*0.2   0.1*0.0   0.84
Love Is Here To Stay           0.8*0.4   0.2*0.0   0.1*0.8   0.4
Hardwired...To Self-Destruct   0.8*0.0   0.2*1.0   0.1*0.0   0.2

Build and test the application, and validate that each document's relevance is the expected value and that the results are returned in descending relevance order.
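The expected values can be reproduced offline before comparing them with what the application returns. This is a sketch in plain Python that mimics the first-phase expression as a dot product over the cat dimension; it does not call Vespa, and the dictionaries simply restate the sample data above.

```python
import math

user_profile = {"pop": 0.8, "rock": 0.2, "jazz": 0.1}

albums = {
    "A Head Full of Dreams": {"pop": 1, "rock": 0.2, "jazz": 0},
    "Love Is Here To Stay": {"pop": 0.4, "rock": 0, "jazz": 0.8},
    "Hardwired...To Self-Destruct": {"pop": 0, "rock": 1, "jazz": 0},
}


def dot(profile, scores):
    """Multiply values with matching category labels and sum the products."""
    return sum(weight * scores.get(cat, 0.0) for cat, weight in profile.items())


# Highest relevance first, matching the order the results should come back in.
ranked = sorted(
    ((dot(user_profile, scores), name) for name, scores in albums.items()),
    reverse=True,
)

for score, name in ranked:
    print(f"{name}: {score:.2f}")
```

Floating-point sums are rarely exact, so compare a returned relevance against the expected value with a tolerance (for example `math.isclose`) rather than with `==`.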

Executable example

  1. Get an X.509 certificate. To create a self-signed certificate (more details in Data Plane, see Client certificate), run:

    $ openssl req -x509 -nodes -days 14 -newkey rsa:4096 \
      -subj "/C=NO/ST=Trondheim/L=Trondheim/O=My Company/OU=My Department/CN=example.com" \
      -keyout data-plane-private-key.pem -out data-plane-public-cert.pem
  2. Go to http://console.vespa.ai/, click "Create application"

  3. Download sample app:

    $ git clone https://github.com/vespa-engine/sample-apps.git && cd sample-apps/album-recommendation
  4. Create the application package

    $ mkdir -p src/main/application/security && cp data-plane-public-cert.pem src/main/application/security/clients.pem
    $ cd src/main/application && zip -r ../../../application.zip . && cd ../../..
  5. In the Vespa console, click Deploy on the application created at the start of this guide. In the "Deploy to dev" section, upload application.zip and click Deploy. Now is a good time to read http://cloud.vespa.ai/automated-deployments, as first-time deployments take a few minutes. Seeing CERTIFICATE_NOT_READY / PARENT_HOST_NOT_READY / LOAD_BALANCER_NOT_READY is normal. The endpoint URL is printed in the Install application section when the deployment is successful - copy this for the next step.

  6. Click "Instances" at the top, then "endpoints". Try the endpoint to validate it is up:

    $ ENDPOINT=https://end.point.name
    $ curl --cert data-plane-public-cert.pem --key data-plane-private-key.pem $ENDPOINT
  7. Feed documents

    $ curl --cert data-plane-public-cert.pem --key data-plane-private-key.pem \
      -H "Content-Type:application/json" --data-binary @src/test/resources/A-Head-Full-of-Dreams.json \
      $ENDPOINT/document/v1/mynamespace/music/docid/1
    $ curl --cert data-plane-public-cert.pem --key data-plane-private-key.pem \
      -H "Content-Type:application/json" --data-binary @src/test/resources/Love-Is-Here-To-Stay.json \
      $ENDPOINT/document/v1/mynamespace/music/docid/2
    $ curl --cert data-plane-public-cert.pem --key data-plane-private-key.pem \
      -H "Content-Type:application/json" --data-binary @src/test/resources/Hardwired...To-Self-Destruct.json \
      $ENDPOINT/document/v1/mynamespace/music/docid/3
  8. Recommend albums by sending the user profile in the query

    $ curl --cert data-plane-public-cert.pem --key data-plane-private-key.pem \
      "$ENDPOINT/search/?ranking=rank_albums&yql=select%20%2A%20from%20sources%20%2A%20where%20sddocname%20contains%20%22music%22%3B&ranking.features.query(user_profile)=%7B%7Bcat%3Apop%7D%3A0.8%2C%7Bcat%3Arock%7D%3A0.2%2C%7Bcat%3Ajazz%7D%3A0.1%7D"
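The percent-encoded query string in the last step is hard to read and easy to mistype. As a sketch, it can be assembled from its readable parts with Python's standard `urllib.parse.quote`. The function name `search_url` is hypothetical, and `https://end.point.name` is the same placeholder endpoint used in the steps above.

```python
from urllib.parse import quote


def search_url(endpoint, profile_literal):
    """Assemble the search URL from the last step from its readable parts."""
    yql = 'select * from sources * where sddocname contains "music";'
    params = [
        ("ranking", "rank_albums"),
        ("yql", yql),
        ("ranking.features.query(user_profile)", profile_literal),
    ]
    # Percent-encode the values only; the parameter names are left as
    # written, which is how they appear in the curl command above.
    query = "&".join(f"{key}={quote(value, safe='')}" for key, value in params)
    return f"{endpoint}/search/?{query}"


url = search_url(
    "https://end.point.name",
    "{{cat:pop}:0.8,{cat:rock}:0.2,{cat:jazz}:0.1}",
)
print(url)
```

The resulting string can be passed to curl in place of the hand-encoded URL, together with the same `--cert` and `--key` options.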