# PSS SciCatLive 4: scoring
## PaNOSC Search Scoring Workshop, Part 4
## SciCatLive PaNOSC Search Scoring: scoring entries against a query

This notebook shows how to score the items against a query.  
It assumes that you have SciCatLive running on your machine and that you have already run the previous three notebooks: 
- **PSS SciCatLive 1 populating**
- **PSS SciCatLive 2 weight computing**
- **PSS SciCatLive 3 managing**

This notebook leverages the *score* endpoint.


**Important**: all the current items and weights already present in the database will be deleted.

**Disclaimer**:  
This notebook has been prepared within the context of the PaNOSC Scoring Workshop.  
It is provided as is, although you are free to re-use it for other purposes and modify it as you need.   
By using this notebook, you are releasing ESS and its team from any responsibility.

In [None]:
%run PSS-SciCatLive-common.ipynb

## Query 1

The relevancy score is computed against a specific query.

We would like to find the documents that are most relevant to the query:  
**tomographic extent five**

This is the score endpoint of the SciCat backend in SciCatLive:

In [None]:
pss_score_url

We would like all the items scored regardless of the group they belong to, so the request does not specify a group.

In [None]:
res = requests.post(
    pss_score_url,
    json={
        "query" : "tomographic and extent five"
    }
)

In [None]:
res

In [None]:
response = res.json()

The response has the following structure:
```json
{
    'request' : {
        'query': 'tomographic extent five',
        'itemIds': [],
        'group': '',
        'limit': -1
    },
    'query': {
        'query': 'tomographic extent five',
        'terms': ['tomograph', 'extent', 'five']
    },
    'scores': [ 
        {
            'itemId': ...,
            'score': ...,
            'group': ...
        },
        ...
    ],
    'dimension': ...,
    'computeInProgress': True/False,
    'started': '2021-12-13T14:08:38.479001',
    'ended': '2021-12-13T14:08:38.496696'
}
```
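
The `started` and `ended` fields look like ISO 8601 timestamps, so the time spent scoring can be derived from them. The following is a minimal sketch under that assumption. The helper name `scoring_duration_seconds` and the `timing_sample` dictionary are hypothetical and not part of the notebook; only the two timestamp values are copied from the structure above.

```python
from datetime import datetime

# Hypothetical fragment of a response: only the two timestamps are taken
# from the structure shown above, everything else is omitted.
timing_sample = {
    "started": "2021-12-13T14:08:38.479001",
    "ended": "2021-12-13T14:08:38.496696",
}


def scoring_duration_seconds(response):
    """Return the seconds elapsed between 'started' and 'ended'.

    Assumes both fields are ISO 8601 strings, as in the example above.
    """
    started = datetime.fromisoformat(response["started"])
    ended = datetime.fromisoformat(response["ended"])
    return (ended - started).total_seconds()


duration = scoring_duration_seconds(timing_sample)
print("Scoring took {:.6f} s".format(duration))
```

The same helper can be applied to the `response` dictionary obtained from the live endpoint, provided the timestamps keep this format.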

In [None]:
response

Number of items scored

In [None]:
response['dimension']

Whether a weights computation is currently in progress

In [None]:
response['computeInProgress']
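
If you prefer to use scores only when no weights computation is running, one option is to repeat the request until `computeInProgress` is `False`. This is a sketch, not a documented recommendation: the notebook does not state how a running computation affects the scores. The function `score_when_idle` and its `fetch` argument are hypothetical; `fetch` stands for any zero-argument callable that returns the decoded response, for example `lambda: requests.post(pss_score_url, json={...}).json()`. The demonstration below uses canned responses instead of the live service.

```python
import time


def score_when_idle(fetch, max_attempts=5, delay_seconds=1.0):
    """Call fetch() until the response reports no computation in progress.

    Returns the last response obtained, even if the computation is still
    running after max_attempts calls.
    """
    response = fetch()
    for _ in range(max_attempts - 1):
        if not response["computeInProgress"]:
            break
        time.sleep(delay_seconds)
        response = fetch()
    return response


# Canned responses standing in for the live endpoint (hypothetical data).
fake_responses = iter([
    {"computeInProgress": True, "scores": []},
    {"computeInProgress": False,
     "scores": [{"itemId": "item-a", "score": 0.5, "group": "datasets"}]},
])

final = score_when_idle(lambda: next(fake_responses),
                        max_attempts=3, delay_seconds=0)
print(final["computeInProgress"], len(final["scores"]))
```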

Items returned and associated scores

In [None]:
for item in response['scores']:
    print("{:10} {:30} : {}".format(item["group"],item["itemId"],item["score"]))
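
The loop above prints the entries in the order in which they are returned. Whether that order is already ranked is not stated here, so the sketch below sorts explicitly, assuming that a higher score means a more relevant item. The helper `top_scores` and the `ranking_sample` data, including the item ids and the `documents` group name, are hypothetical.

```python
def top_scores(response, n=3):
    """Return the n entries of response['scores'] with the highest score."""
    ranked = sorted(response["scores"],
                    key=lambda item: item["score"],
                    reverse=True)
    return ranked[:n]


# Hypothetical response fragment with the same keys as the 'scores' entries.
ranking_sample = {
    "scores": [
        {"itemId": "item-a", "score": 0.12, "group": "datasets"},
        {"itemId": "item-b", "score": 0.87, "group": "documents"},
        {"itemId": "item-c", "score": 0.45, "group": "datasets"},
    ]
}

for item in top_scores(ranking_sample, n=2):
    print("{:10} {:30} : {}".format(item["group"], item["itemId"], item["score"]))
```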

## Query 2

We would like to find the documents that are most relevant to the query:  
**cheetah beamenergy post**

We would like all the items scored regardless of the group they belong to, so the request does not specify a group.

In [None]:
res = requests.post(
    pss_score_url,
    json={
        "query" : "Cheetah and beamenergy and post"
    }
)

In [None]:
res

In [None]:
response = res.json()

Number of items scored

In [None]:
response['dimension']

Items returned and associated scores

In [None]:
for item in response['scores']:
    print("{:10} {:30} : {}".format(item["group"],item["itemId"],item["score"]))
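
Because this query is scored across all groups, it can be useful to see how the returned items are distributed among them. The sketch below counts entries per group with `collections.Counter`. The helper `count_by_group` and the `group_sample` data are hypothetical; only the `group` key comes from the response structure.

```python
from collections import Counter


def count_by_group(response):
    """Count how many scored items belong to each group."""
    return Counter(item["group"] for item in response["scores"])


# Hypothetical response fragment; group names other than 'datasets'
# are invented for illustration.
group_sample = {
    "scores": [
        {"itemId": "item-a", "score": 0.12, "group": "datasets"},
        {"itemId": "item-b", "score": 0.87, "group": "documents"},
        {"itemId": "item-c", "score": 0.45, "group": "datasets"},
    ]
}

counts = count_by_group(group_sample)
for group, count in counts.most_common():
    print("{:10} : {}".format(group, count))
```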

## Query 3

We would like to run the same query as in **Query 2**, but limit the results to the group **datasets** and retrieve only 4 results.

Here is the request with the corresponding JSON structure:

In [None]:
res = requests.post(
    pss_score_url,
    json={
        "query" : "Cheetah beamenergy post",
        "group" : "datasets",
        "limit" : 4
    }
)
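
The three queries differ only in the optional fields of the request body. A small helper can make that explicit by adding `group` and `limit` only when they are set. This is a sketch: `build_score_payload` is a hypothetical name, and leaving a field out is assumed to be equivalent to the defaults echoed in the `request` part of the response (`'group': ''` and `'limit': -1`).

```python
def build_score_payload(query, group=None, limit=None):
    """Build the JSON body for the score endpoint.

    Only 'query' is always present; 'group' and 'limit' are added
    when the caller provides them.
    """
    payload = {"query": query}
    if group is not None:
        payload["group"] = group
    if limit is not None:
        payload["limit"] = limit
    return payload


# Body equivalent to the Query 3 request above.
query_3_payload = build_score_payload("Cheetah beamenergy post",
                                      group="datasets", limit=4)
print(query_3_payload)
```

The resulting dictionary can be passed as the `json` argument of `requests.post`, exactly like the literal dictionary used in the cell above.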

In [None]:
res

In [None]:
response = res.json()

Number of items scored

In [None]:
response['dimension']

Items returned and associated scores

In [None]:
for item in response['scores']:
    print("{:10} {:30} : {}".format(item["group"],item["itemId"],item["score"]))