In [part 2](https://blog.vespa.ai/build-news-recommendation-app-from-python-with-vespa/) of this tutorial series, we built and deployed a news recommendation system based on user and news embeddings. The embeddings were trained using the model depicted in the figure below and described in detail [here](https://docs.vespa.ai/en/tutorials/news-4-embeddings.html). In this tutorial, we show how to check whether the embeddings deployed in the recommendation system are working as expected. Checking evaluation metrics of the final search application is at least as important as checking evaluation metrics when training a model, and we want `pyvespa` to make this process as easy as possible.

![Embedding model](data/2021-04-26-evaluation-news-recommendation/embeddings.png)

When training the model we observed the following training and evaluation metrics over the course of 10 epochs.

In [None]:
$ python3 src/python/train_cold_start.py mind 10
Total loss after epoch 1: 920.5855102539062 (0.703811526298523 avg)
{'auc': 0.5391, 'mrr': 0.2367, 'ndcg@5': 0.2464, 'ndcg@10': 0.3059}
{'auc': 0.5131, 'mrr': 0.2239, 'ndcg@5': 0.2296, 'ndcg@10': 0.2933}
Total loss after epoch 2: 761.7719116210938 (0.5823944211006165 avg)
{'auc': 0.647, 'mrr': 0.2992, 'ndcg@5': 0.3246, 'ndcg@10': 0.3829}
{'auc': 0.5656, 'mrr': 0.2447, 'ndcg@5': 0.2604, 'ndcg@10': 0.3255}
...
Total loss after epoch 10: 517.16748046875 (0.3953879773616791 avg)
{'auc': 0.8758, 'mrr': 0.5074, 'ndcg@5': 0.5818, 'ndcg@10': 0.6316}
{'auc': 0.6249, 'mrr': 0.2842, 'ndcg@5': 0.3114, 'ndcg@10': 0.3733}

Once we have deployed the embeddings in our recommendation system, we should check if we can recover similar evaluation metrics when sending the appropriate queries to the application.
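
For reference, the `ndcg@k` values in the log can be read roughly as follows. This is a minimal sketch of one common nDCG formulation (linear gain, log2 discount). The exact variant used by the training script or by `pyvespa` is not confirmed here, and the names `dcg_at_k` and `ndcg_at_k` are illustrative.

```python
import math


def dcg_at_k(labels, k):
    """Discounted cumulative gain of the first k labels, given in ranked order."""
    return sum(label / math.log2(rank + 2) for rank, label in enumerate(labels[:k]))


def ndcg_at_k(ranked_labels, k):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(ranked_labels, reverse=True), k)
    return dcg_at_k(ranked_labels, k) / ideal if ideal > 0 else 0.0


# Hypothetical impression with two candidates where the clicked item is ranked second
print(ndcg_at_k([0, 1], k=5))
```

A ranking that places the clicked item first scores 1.0 under this definition, and an impression without any clicked item scores 0.0.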

In [1]:
import requests, json

validation_impressions = json.loads(
    requests.get("https://data.vespa.oath.cloud/blog/news/valid_impressions_parsed.json").text
)
validation_impressions[11]

{'query_id': 11,
 'query': 'U2505',
 'relevant_docs': [{'id': 'N26508', 'score': 0}, {'id': 'N20150', 'score': 1}]}

In [None]:
max(len(d["relevant_docs"]) for d in validation_impressions)

In [3]:
from vespa.application import Vespa

app = Vespa(url = "http://localhost", port = 8080)

In [4]:
def parse_embedding(hit_json):
    embedding_json = hit_json["fields"]["embedding"]["cells"]
    embedding_vector = [0.0] * len(embedding_json)
    for val in embedding_json:
        embedding_vector[int(val["address"]["d0"])] = val["value"]
    return embedding_vector

def query_user_embedding(query):
    result = app.query(body={"yql": "select * from sources user where user_id contains '{}';".format(query)})
    embedding = parse_embedding(result.hits[0])
    return embedding

def create_relevant_docs_per_query(data):
    return {(x["query"]["query_id"], x["query"]["user_id"]):x["relevant_docs"] for x in data}
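
`parse_embedding` assumes that the tensor comes back in a "cells" layout, where each cell carries its index under `address.d0` and its number under `value`. The sketch below applies the same logic to a hand-made hit. The three-dimensional embedding and its values are made up for illustration, and `parse_cells` is a hypothetical name.

```python
def parse_cells(cells):
    """Turn a list of {"address": {"d0": "<index>"}, "value": <float>} cells into a dense list."""
    vector = [0.0] * len(cells)
    for cell in cells:
        vector[int(cell["address"]["d0"])] = cell["value"]
    return vector


# Made-up hit with a 3-dimensional embedding; the cells are out of order on purpose
sample_hit = {
    "fields": {
        "embedding": {
            "cells": [
                {"address": {"d0": "2"}, "value": 0.3},
                {"address": {"d0": "0"}, "value": 0.1},
                {"address": {"d0": "1"}, "value": 0.2},
            ]
        }
    }
}

dense = parse_cells(sample_hit["fields"]["embedding"]["cells"])
print(dense)  # [0.1, 0.2, 0.3]
```

Because each value is written to the position given by its address, the order in which the cells arrive does not matter.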

In [5]:
def body_function(query, relevant_docs_per_query):
    relevant_docs = relevant_docs_per_query[(query["query_id"], query["user_id"])]
    user_embedding = query_user_embedding(query["user_id"]) 
    hits = len(relevant_docs)
    nn_annotations = [
        '"targetHits":{}'.format(hits)
    ]
    nn_annotations = "{" + ",".join(nn_annotations) + "}"
    nn_search = "([{}]nearestNeighbor(embedding, user_embedding))".format(nn_annotations)

    news_id_filter = [ 'news_id contains "{}"'.format(i["id"]) for i in relevant_docs ]
    news_id_filter = " OR ".join(news_id_filter)

    data = {
        "hits": hits,
        "yql": 'select * from sources news where {} AND ({});'.format(nn_search, news_id_filter),
        "ranking.features.query(user_embedding)": str(user_embedding),
        "ranking.profile": "recommendation",
        "timeout": 10
    }
    return data
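
To see what `body_function` sends, it can help to assemble the YQL string on its own, without contacting Vespa. The sketch below mirrors the string handling above for the impression shown earlier. `build_yql` is a hypothetical helper, not part of `pyvespa`.

```python
def build_yql(relevant_docs):
    """Mirror the string assembly in body_function, without contacting Vespa."""
    hits = len(relevant_docs)
    nn_search = '([{{"targetHits":{}}}]nearestNeighbor(embedding, user_embedding))'.format(hits)
    news_id_filter = " OR ".join(
        'news_id contains "{}"'.format(doc["id"]) for doc in relevant_docs
    )
    return "select * from sources news where {} AND ({});".format(nn_search, news_id_filter)


relevant_docs = [{"id": "N26508", "score": 0}, {"id": "N20150", "score": 1}]
yql = build_yql(relevant_docs)
print(yql)
```

Judging from the code, the `news_id` filter restricts the nearest-neighbor search to the candidates of the impression, so the application ranks only the documents for which labels exist.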

In [6]:
from vespa.evaluation import ReciprocalRank, NormalizedDiscountedCumulativeGain

eval_metrics = [NormalizedDiscountedCumulativeGain(at=5), NormalizedDiscountedCumulativeGain(at=10)]

In [7]:
validation_impressions2 = [{"query_id": d["query_id"], "query": {"query_id": d["query_id"], "user_id": d["query"]}, "relevant_docs": d["relevant_docs"]} for d in validation_impressions]
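
The reshaping exists because `body_function` looks up the relevant documents by the pair `(query_id, user_id)`, so the `query` field has to carry both values instead of the plain user id. A small sketch on the record shown earlier, where `to_labeled_query` is a hypothetical name:

```python
def to_labeled_query(impression):
    """Wrap the user id and query id into a dict under the "query" key."""
    return {
        "query_id": impression["query_id"],
        "query": {"query_id": impression["query_id"], "user_id": impression["query"]},
        "relevant_docs": impression["relevant_docs"],
    }


sample = {
    "query_id": 11,
    "query": "U2505",
    "relevant_docs": [{"id": "N26508", "score": 0}, {"id": "N20150", "score": 1}],
}

reshaped = to_labeled_query(sample)
# Key under which create_relevant_docs_per_query stores this impression
lookup_key = (reshaped["query"]["query_id"], reshaped["query"]["user_id"])
print(reshaped["query"], lookup_key)
```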

In [None]:
validation_impressions2[:2]

In [8]:
from vespa.query import QueryModel

# Build the (query_id, user_id) -> relevant_docs lookup once instead of on every query
relevant_docs_per_query = create_relevant_docs_per_query(validation_impressions2)

evaluation = app.evaluate(
    labeled_data=validation_impressions2,
    eval_metrics=eval_metrics,
    query_model=QueryModel(
        body_function=lambda query: body_function(query, relevant_docs_per_query)
    ),
    id_field="news_id",
    per_query=False
)

In [9]:
evaluation

| metric | statistic | default_name (model) |
| --- | --- | --- |
| ndcg_5 | mean | 0.311162 |
| ndcg_5 | median | 0.237198 |
| ndcg_5 | std | 0.347625 |
| ndcg_10 | mean | 0.373119 |
| ndcg_10 | median | 0.356207 |
| ndcg_10 | std | 0.320826 |
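
The mean values can be set against the metrics logged after epoch 10 of training. The sketch below assumes that the second dictionary printed per epoch holds the validation metrics. The tolerance is an arbitrary choice for this illustration, not a recommended threshold.

```python
# Metrics logged after epoch 10 (second dictionary, assumed to be the validation set)
training_validation = {"ndcg@5": 0.3114, "ndcg@10": 0.3733}
# Mean metrics reported by the deployed application in the output above
deployed_mean = {"ndcg@5": 0.311162, "ndcg@10": 0.373119}

TOLERANCE = 1e-3  # arbitrary threshold chosen for this illustration

differences = {
    metric: abs(training_validation[metric] - deployed_mean[metric])
    for metric in training_validation
}
for metric, diff in differences.items():
    status = "ok" if diff < TOLERANCE else "mismatch"
    print("{}: difference {:.6f} ({})".format(metric, diff, status))
```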


In [None]:
evaluation[evaluation.ndcg_5 > 0]

In [None]:
# Use the reshaped data: body_function expects query to be a dict with query_id and user_id
query_data = validation_impressions2[2]
evaluation_query = app.evaluate_query(
    eval_metrics=eval_metrics,
    query_model=QueryModel(
        body_function = lambda query: body_function(query, create_relevant_docs_per_query(validation_impressions2))
    ),
    query_id=query_data["query_id"],
    query=query_data["query"],
    id_field="news_id",
    relevant_docs=query_data["relevant_docs"],
    default_score=0,
    detailed_metrics=False,
)


In [None]:
evaluation_query

In [None]:
validation_impressions[2]

In [None]:
for d in validation_impressions:
    if d["query"] == "U28498":
        print(d)
        print("\n")

In [None]:
!sample-apps/news/src/python/evaluate.py data/2021-03-02-news/mind 0

In [None]:
from pandas import read_csv

lesters_ndcg = read_csv("data/2021-03-02-news/mind/lesters_ndcg.txt", names=["query_id", "query", "ndcg5"])

In [None]:
lesters_ndcg.head()

In [None]:
validation_impressions[:3]

In [None]:
evaluation[:3]

In [None]:
validation_impressions2[14]

In [None]:
from pandas import merge

test = merge(left=lesters_ndcg, right=evaluation, how="left", on=["query_id"])
test[test.ndcg5 != test.ndcg_5]
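
Comparing floats with `!=` also flags differences that come only from rounding. A minimal sketch of a tolerance-based comparison with plain dictionaries follows. The per-query values are made up, and the tolerance of `1e-4` is an arbitrary choice.

```python
import math

# Made-up per-query nDCG@5 values from two evaluation runs, keyed by query_id
reference_ndcg5 = {0: 0.5, 1: 0.6309297535714575, 2: 1.0}
pyvespa_ndcg5 = {0: 0.5000000001, 1: 0.63093, 2: 0.0}

# A query counts as a mismatch if it is missing or differs by more than the tolerance
mismatches = [
    query_id
    for query_id, expected in reference_ndcg5.items()
    if query_id not in pyvespa_ndcg5
    or not math.isclose(expected, pyvespa_ndcg5[query_id], abs_tol=1e-4)
]
print(mismatches)  # [2]
```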

In [None]:
import csv, io, os, sys

def read_impressions_file(file_name):
    impressions = []
    if not os.path.exists(file_name):
        print("{} not found.".format(file_name))
        sys.exit(1)
    print("Reading impressions data from " + file_name)

    with io.open(file_name, "r", encoding="utf-8") as f:
        field_list = ["id", "user_id", "timestamp", "history", "impressions"]
        reader = csv.DictReader(f, delimiter="\t", fieldnames=field_list)
        for line in reader:
            user_id = line["user_id"]
            impression = {
                "user_id": user_id,
                "news_ids": [],
                "labels": []
            }
            for i in line["impressions"].split(" "):
                news_id, label = i.split("-")
                impression["news_ids"].append(news_id)
                impression["labels"].append(int(label))
            impressions.append(impression)
    return impressions
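
The parsing step can be tried on a single line without the dataset on disk. The sketch below feeds one made-up line, in the five-column, tab-separated layout that `read_impressions_file` expects, through the same `csv.DictReader` setup.

```python
import csv
import io

# Made-up line: id, user_id, timestamp, history, impressions (tab-separated)
sample_tsv = "1\tU2505\t11/15/2019 8:00:00 AM\tN1 N2\tN26508-0 N20150-1\n"

field_list = ["id", "user_id", "timestamp", "history", "impressions"]
reader = csv.DictReader(io.StringIO(sample_tsv), delimiter="\t", fieldnames=field_list)

impressions = []
for line in reader:
    news_ids, labels = [], []
    # Each impression item looks like "<news_id>-<label>", with label 1 for a click
    for item in line["impressions"].split(" "):
        news_id, label = item.split("-")
        news_ids.append(news_id)
        labels.append(int(label))
    impressions.append({"user_id": line["user_id"], "news_ids": news_ids, "labels": labels})

print(impressions)
```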

In [None]:
import os

data_dir = "data/2021-03-02-news/mind/"
train_impressions_file = os.path.join(data_dir, "train", "behaviors.tsv")
valid_impressions_file = os.path.join(data_dir, "dev", "behaviors.tsv")

In [None]:
train_impressions = read_impressions_file(train_impressions_file)
valid_impressions = read_impressions_file(valid_impressions_file)

In [None]:
for d in valid_impressions:
    if d["user_id"] == "U28498":
        print(d)

In [None]:
parsed_train_impressions = [{"query_id": idx, "query": d["user_id"], "relevant_docs": [{"id": news_id, "score": label} for news_id, label in zip(d["news_ids"], d["labels"])]} for idx, d in enumerate(train_impressions)]

In [None]:
parsed_valid_impressions = [{"query_id": idx, "query": d["user_id"], "relevant_docs": [{"id": news_id, "score": label} for news_id, label in zip(d["news_ids"], d["labels"])]} for idx, d in enumerate(valid_impressions)]

In [None]:
len(parsed_train_impressions)

In [None]:
len(parsed_valid_impressions)

In [None]:
import json

with open("./data/2021-04-26-evaluation-news-recommendation/train_impressions_parsed.json", "w") as f:
    f.write(json.dumps(parsed_train_impressions))

In [None]:
import json

with open("./data/2021-04-26-evaluation-news-recommendation/valid_impressions_parsed.json", "w") as f:
    f.write(json.dumps(parsed_valid_impressions))

In [None]:
with open("./data/2021-04-26-evaluation-news-recommendation/train_impressions_parsed.json", "r") as f:
    x = json.load(f)
len(x)

In [None]:
with open("./data/2021-04-26-evaluation-news-recommendation/valid_impressions_parsed.json", "r") as f:
    x = json.load(f)
len(x)