# "TREC-COVID round 1 submission"
> "Reproducing the submission with the Vespa Python API"

- toc: true 
- badges: true
- comments: true
- categories: [COVID-19, vespa]
#- image: images/diagram.png # display image on social networks when sharing the URL


In [1]:
# hide
%load_ext autoreload
%autoreload 2

## Vespa

Connect to the [CORD-19 Vespa](https://cord19.vespa.ai/) API.

In [2]:
from vespa.application import Vespa

app = Vespa(url = "https://api.cord19.vespa.ai")

Define the query model used for the submission.

In [3]:
from vespa.query import Query, OR, RankProfile

query_model = Query(
    match_phase = OR(),
    rank_profile = RankProfile(name="bm25t5")
)

## Submission

Load the topics provided by the organizers.

In [4]:
import requests
import json

topics = json.loads(requests.get("https://thigm85.github.io/data/covid19/topics-annotated.json").text)
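
Judging by the keys read in the submission loop further down, each topic is a dict with at least `id`, `question`, `query` and `narrative`. The sketch below shows how those three text fields are joined into one query string. The helper name `build_query_text` and the example topic are hypothetical and used only for illustration; the topic is not taken from the official file.

```python
def build_query_text(topic):
    """Join the question, keyword query and narrative into one query string."""
    parts = [topic["question"], topic["query"], topic["narrative"]]
    return " ".join(parts)


# Hypothetical topic for illustration only; not taken from the official topic file.
example_topic = {
    "id": 0,
    "question": "what is the origin of the virus?",
    "query": "virus origin",
    "narrative": "Looking for studies on where the virus came from.",
}

print(build_query_text(example_topic))
```

Sending all three fields makes the query long, which fits an OR match phase: a document only needs to contain some of the terms to be retrieved, and the rank profile decides the order.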

Generate the submission by querying the Vespa application once per topic.

**TODO**:

- Include `ranking.softtimeout.enable = 'false'` on `RankProfile`?
- Do we need all the arguments currently passed to `app.query`?
- Set `hits` to 1000
- Find out why no `cord_uid` is returned
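
One possible explanation for the last item: in Vespa's default JSON result format, document fields sit under a `fields` key inside each hit, next to top-level keys such as `id` and `relevance`. A lookup on the hit itself would then return `None`. The sketch below illustrates this with a hypothetical hit; the helper `get_hit_field` and all values are made up, and whether the `default` summary class of this application actually returns `cord_uid` is an assumption that still needs checking.

```python
def get_hit_field(hit, name, default=None):
    """Look up a document field on a hit, falling back to the top level."""
    fields = hit.get("fields", {})
    if name in fields:
        return fields[name]
    return hit.get(name, default)


# Hypothetical hit shaped like Vespa's default JSON result format; values are made up.
example_hit = {
    "id": "index:content/0/example",
    "relevance": 69.748002,
    "source": "content",
    "fields": {"cord_uid": "abc123xy", "title": "An example title"},
}

print(example_hit.get("cord_uid"))             # None: the key is not at the top level
print(get_hit_field(example_hit, "cord_uid"))  # abc123xy
```

If this is the cause, reading the field through `fields` in the loop would fill the `docid` column.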

In [10]:
from pandas import DataFrame

submission = []
for t in topics:
    topic_id = t['id']  # avoid shadowing the built-in id()

    # Concatenate the three topic fields into a single query string.
    query = t['question'] + ' ' + t['query'] + ' ' + t['narrative']
    result = app.query(
        query=query,
        query_model=query_model,
        hits=2,  # see TODO: set to 1000
        model={'defaultIndex': 'allt5'},
        summary='default',
        timeout='15s',
        collapsefield='cord_uid',
        bolding='false'
    )

    # Ranks are 1-based and follow the order in which Vespa returns the hits.
    for rank, h in enumerate(result['root']['children'], start=1):
        submission.append(
            {"topicid": topic_id,
             "Q0": "Q0",
             "docid": h.get('cord_uid'),  # currently None for every hit, see TODO
             "rank": rank,
             "score": h['relevance'],
             "run-tag": query_model.rank_profile.name
            })

submission = DataFrame.from_records(submission)
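
The record keys follow the usual TREC run layout: topic id, the literal `Q0`, document id, rank, score and run tag, separated by whitespace with one hit per line. A minimal sketch of turning such records into run lines is shown below. The helper `to_trec_run_lines` and the example record are hypothetical, and the exact file requirements of the TREC-COVID organizers are not covered here.

```python
def to_trec_run_lines(records):
    """Format submission records as whitespace-separated TREC run lines."""
    columns = ("topicid", "Q0", "docid", "rank", "score", "run-tag")
    return [
        " ".join(str(record[column]) for column in columns)
        for record in records
    ]


# Hypothetical record for illustration; the docid is made up.
example_records = [
    {"topicid": 1, "Q0": "Q0", "docid": "abc123xy",
     "rank": 1, "score": 69.748002, "run-tag": "bm25t5"},
]

for line in to_trec_run_lines(example_records):
    print(line)
```

With the DataFrame built above, `submission.to_dict("records")` would provide the input list.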

In [11]:
submission

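| | topicid | Q0 | docid | rank | score | run-tag |
|---|---|---|---|---|---|---|
| 0 | 1 | Q0 | | 1 | 69.748002 | bm25t5 |
| 1 | 1 | Q0 | | 2 | 65.733267 | bm25t5 |
| 2 | 2 | Q0 | | 1 | 90.335492 | bm25t5 |
| 3 | 2 | Q0 | | 2 | 85.744186 | bm25t5 |
| 4 | 3 | Q0 | | 1 | 80.218932 | bm25t5 |
| ... | ... | ... | ... | ... | ... | ... |
| 65 | 33 | Q0 | | 2 | 93.861975 | bm25t5 |
| 66 | 34 | Q0 | | 1 | 80.500815 | bm25t5 |
| 67 | 34 | Q0 | | 2 | 77.159716 | bm25t5 |
| 68 | 35 | Q0 | | 1 | 108.192795 | bm25t5 |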
