## Evaluate Progress of Labellers

Use this notebook to evaluate the progress of labellers and to inspect their labels against imagery. 

## Set-up 

In [None]:
import os
import sys
import pandas as pd
# Make the local `src` directory importable so that `labelreview` can be found
module_path = os.path.abspath("src")
sys.path.insert(0, module_path)

from labelreview import labelReview

## Get assignment data

In [None]:
lr = labelReview(config="config-db.yaml")
query = \
    "SELECT name,hit_id,assignment_id,worker_id,email,score,status,kml_type "\
    "FROM assignment_data "\
    "LEFT JOIN hit_data USING (hit_id) "\
    "LEFT JOIN kml_data USING (name) "\
    "LEFT JOIN users ON assignment_data.worker_id = users.id"
assignments = lr.get_data(query)

### Summarize assignment counts and score

In [None]:
counts = assignments[["worker_id", "email", "kml_type"]]\
    .groupby(["worker_id", "email", "kml_type"], as_index=False)\
    .value_counts()

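# Mean score per labeller. `mean()` takes no column name; the first positional
# argument of the groupby `mean` is `numeric_only`, so "score" is not passed here.
scores = assignments[["worker_id", "score"]]\
    .groupby("worker_id", as_index=False)\
    .mean()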
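
To make the shape of the two summaries concrete, here is a minimal sketch on a small made-up table. The values, the e-mail addresses, and the assumption that F assignments carry no score are illustrative only, and `size()` is used here as a plain alternative way to count rows per group.

```python
import pandas as pd

# Hypothetical stand-in for the `assignments` table returned by the query.
# Assumption: F-type assignments have no score (NaN).
toy = pd.DataFrame({
    "worker_id": [10, 10, 10, 13, 13],
    "email": ["a@example.com"] * 3 + ["b@example.com"] * 2,
    "kml_type": ["Q", "Q", "F", "Q", "F"],
    "score": [0.8, 0.6, None, 0.9, None],
})

# Number of assignments per labeller and task type.
toy_counts = (
    toy.groupby(["worker_id", "email", "kml_type"])
    .size()
    .reset_index(name="count")
)

# Mean score per labeller; missing scores are skipped by default.
toy_scores = toy.groupby("worker_id", as_index=False)["score"].mean()

print(toy_counts)
print(toy_scores)
```

Because missing values are skipped, the mean in this sketch reflects only the scored (Q) assignments. Whether that holds for the real table depends on how F assignments are stored.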

Assignment counts

In [None]:
counts.pivot_table(values="count", index=["worker_id", "email"], 
                   columns="kml_type")

Assignment counts plotted

In [None]:
counts.pivot_table(values="count", index="worker_id", columns="kml_type")\
    .plot(subplots=True, kind="bar")
None

Mean score against Q sites for each labeller

In [None]:
scores

Distribution of scores for each labeller, as box plots

In [None]:
# assignments.query("worker_id==13 & kml_type=='Q'")["score"]
assignments.query("kml_type=='Q'")[["worker_id", "score"]]\
    .boxplot(by='worker_id', column='score', grid=False)
None
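
If numbers are easier to compare than box plots, the same distributions can be summarised in a table. The sketch below uses made-up Q scores rather than the real `assignments` data; `describe()` reports the count, mean, standard deviation, minimum, quartiles, and maximum for each labeller.

```python
import pandas as pd

# Made-up Q-type scores for two labellers (illustrative values only).
toy_q = pd.DataFrame({
    "worker_id": [10, 10, 10, 13, 13],
    "score": [0.6, 0.7, 0.8, 0.9, 0.5],
})

# One row per labeller with summary statistics of the score.
summary = toy_q.groupby("worker_id")["score"].describe()
print(summary[["count", "min", "50%", "max"]])
```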

Add custom queries of the retrieved data as needed, for example:

- Q scores for labeller (worker) 20

    ```python
    assignments.query("worker_id==20 & kml_type=='Q'")["score"]
    ```

- The first 10 F assignments (all columns)

    ```python
    assignments.query("kml_type=='F'").iloc[0:10]
    ```
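
Queries can also be chained with other pandas methods. As a sketch, the following finds the two lowest-scoring Q assignments for one labeller, which may be good candidates for review against imagery. The table and site names here are made up so that the example runs on its own; with real data, replace `toy` with `assignments`.

```python
import pandas as pd

# Made-up assignments; site names and scores are illustrative only.
toy = pd.DataFrame({
    "name": ["ET0000001", "ET0000002", "ET0000003", "ET0000004"],
    "worker_id": [20, 20, 20, 21],
    "kml_type": ["Q", "Q", "Q", "Q"],
    "score": [0.9, 0.4, 0.7, 0.2],
})

# Filter to labeller 20's Q assignments, then keep the 2 lowest scores.
lowest = toy.query("worker_id==20 & kml_type=='Q'").nsmallest(2, "score")
print(lowest[["name", "score"]])
```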

## Review labels against imagery

Select and evaluate a specific labeller's work at selected sites.

Labels can be selected using the `get_labels` method as follows:

- By type of task, F or Q. If you select "Q" the labeller's maps will be shown against the expert maps for the same site.
- Through random choice, or for a particular site name.

For example, the following call will get one randomly selected Q type site for labeller 10. 

```python
labels = lr.get_labels(assignments, id=10, type="Q")
```

This one will get a specifically named Q type site completed by labeller 13:

```python
labels = lr.get_labels(assignments, id=13, type="Q", name="ET0472958")
```
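
The selection rules described above (filter by labeller and task type, then pick a named site or a random one) can be sketched with plain pandas. The helper `pick_site` below is hypothetical and is not the `get_labels` implementation; it only illustrates the selection logic on a made-up table, and all site names other than the one used above are invented.

```python
import pandas as pd

def pick_site(assignments, worker_id, kml_type, name=None, seed=None):
    """Hypothetical helper: return one assignment row for a labeller."""
    subset = assignments[
        (assignments["worker_id"] == worker_id)
        & (assignments["kml_type"] == kml_type)
    ]
    if name is not None:
        # A particular site was requested.
        subset = subset[subset["name"] == name]
    if subset.empty:
        raise ValueError("no matching assignment found")
    if name is not None:
        return subset.iloc[0]
    # Otherwise choose one matching row at random.
    return subset.sample(n=1, random_state=seed).iloc[0]

# Made-up assignments table for illustration.
toy = pd.DataFrame({
    "name": ["ET0000001", "ET0000002", "ET0472958", "ET0000003"],
    "worker_id": [10, 10, 13, 13],
    "kml_type": ["Q", "Q", "Q", "F"],
})

random_site = pick_site(toy, worker_id=10, kml_type="Q", seed=0)
named_site = pick_site(toy, worker_id=13, kml_type="Q", name="ET0472958")
print(random_site["name"], named_site["name"])
```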


In [None]:
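# Labeller (worker) to review; named `worker_id` to avoid shadowing the built-in `id`
worker_id = 14
labels = lr.get_labels(assignments, id=int(worker_id), type="F")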

### Show map

In [None]:
lr.plot_labels(labels)

### Record review

In [None]:
lr.record_review(sample=labels["point"]["name"].loc[0], id=labels["id"],
                 expert_labels=labels["type"] == "Q")