# Kaunt Performance Evaluation

Kaunt delivers coding proposals on dimensions.
The output comes in two formats:
1. Up to three coding proposals for each dimension type
2. Up to three coding proposals on the combination of dimensions

Furthermore, each coding proposal comes with a confidence level corresponding to a minimum precision (based on a cross-validation evaluation set).
Kaunt normally sets the precision thresholds based on the customer type, but a common setting is:

* HIGH = Minimum 90 % precision
* MEDIUM = Minimum 75 % precision
* LOW = Minimum 50 % precision
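
As a rough illustration of how such thresholds partition proposals, the sketch below maps an estimated precision to a confidence level. The names `THRESHOLDS` and `confidence_for_precision` are hypothetical, and the values are the common setting listed above; how Kaunt assigns confidence levels internally is not described in this notebook.

```python
from typing import Optional

# Assumed thresholds, taken from the "common setting" listed above.
# Actual thresholds are set per customer type and may differ.
THRESHOLDS = [("HIGH", 0.90), ("MEDIUM", 0.75), ("LOW", 0.50)]


def confidence_for_precision(precision: float) -> Optional[str]:
    """Return the highest confidence level whose minimum precision is met."""
    for level, minimum in THRESHOLDS:
        if precision >= minimum:
            return level
    return None  # below the LOW threshold: no confidence level applies


print(confidence_for_precision(0.93))  # HIGH
print(confidence_for_precision(0.80))  # MEDIUM
print(confidence_for_precision(0.40))  # None
```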

This notebook only considers one proposal per dimension per invoice line, which represents the performance in an automation scenario. In the case of providing proposals to accountants, all coding proposals should be considered in an evaluation.
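
For the accountant scenario, one possible way to take all proposals into account is a top-k hit rate: a line counts as correct if any of its first k proposals equals the label. The sketch below is only an assumption about how such an evaluation could look; the `Rank` column, the example data and the function `hit_rate_at_k` are hypothetical and not part of the Kaunt API or of this notebook.

```python
import pandas as pd

# Hypothetical data: up to three ranked proposals per invoice line and dimension.
proposals = pd.DataFrame({
    "ExternalInvoiceId": ["123", "123", "123", "321", "321"],
    "InvoiceLineNumber": ["1", "1", "1", "2", "2"],
    "ExternalDimensionId": ["CostCenter"] * 3 + ["GlAccount"] * 2,
    "Rank": [1, 2, 3, 1, 2],
    "ExternalDimensionValueId": ["1000", "1100", "1200", "4100", "4000"],
})
labels = pd.DataFrame({
    "ExternalInvoiceId": ["123", "321"],
    "InvoiceLineNumber": ["1", "2"],
    "ExternalDimensionId": ["CostCenter", "GlAccount"],
    "Label": ["1000", "4000"],
})

keys = ["ExternalInvoiceId", "InvoiceLineNumber", "ExternalDimensionId"]
merged = proposals.merge(labels, on=keys, how="inner")
merged["Hit"] = merged["ExternalDimensionValueId"] == merged["Label"]


def hit_rate_at_k(df: pd.DataFrame, k: int) -> float:
    """Share of labeled lines where any of the top-k proposals equals the label."""
    top_k = df[df["Rank"] <= k]
    return float(top_k.groupby(keys)["Hit"].any().mean())


print(hit_rate_at_k(merged, 1))  # 0.5: only the first proposal counts
print(hit_rate_at_k(merged, 3))  # 1.0: any of the three proposals counts
```

With k = 1 this reduces to the automation scenario evaluated in the rest of the notebook.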

To evaluate the performance of Kaunt, follow these steps:
1. Install dependencies
2. Extract proposals
3. Apply label to proposals
4. Run evaluation

## 1. Install dependencies

The following packages are needed to run the notebook:

In [None]:
dnspython==2.3.0
jupyter==1.0.0
pandas==1.5.1
requests

## 2. Extract proposals

Proposals are extracted through the API with the following code.


Output format:

- 1 row per invoice per invoice line per proposal

| ExternalInvoiceId | InvoiceLineNumber | Confidence | ExternalDimensionId | ExternalDimensionValueId |
| --- | --- | --- | --- | --- |
| 123 | 1 | High | CostCenter | 1000 |
| ... | ... | ... | ... | ... |
| 321 | 2 | Medium | GlAccount | 4000 |
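
Because only one proposal per dimension per invoice line is kept, each combination of invoice, line and dimension should appear at most once in the extract. A quick sanity check along these lines might look as follows; the dataframe `example_df` is hypothetical sample data in the format above, and in practice the check would be run on the extracted proposals.

```python
import pandas as pd

# Hypothetical extract in the row format shown above.
example_df = pd.DataFrame({
    "ExternalInvoiceId": ["123", "123", "321"],
    "InvoiceLineNumber": ["1", "1", "2"],
    "Confidence": ["High", "Low", "Medium"],
    "ExternalDimensionId": ["CostCenter", "GlAccount", "GlAccount"],
    "ExternalDimensionValueId": ["1000", "4100", "4000"],
})

key_columns = ["ExternalInvoiceId", "InvoiceLineNumber", "ExternalDimensionId"]

# keep=False marks every member of a duplicated group, not just the repeats.
duplicate_rows = example_df[example_df.duplicated(subset=key_columns, keep=False)]
print(len(duplicate_rows))  # 0: every key combination is unique
```

Duplicates would typically point to the same invoice ID being listed twice in the input.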

In [None]:
# TO-DO: Fill in the invoice_ids

# THE FOLLOWING INVOICE IDs SERVE AS EXAMPLES. REPLACE WITH YOUR INVOICE IDs
invoice_ids = [
    "e75c5187-7a65-4697-9580-763f9c6a50da",
    "9572695e-9f60-4cb6-ac4d-f6f0cead6be2",
]


In [None]:
# TO-DO: Fill in customer settings

customer_dict = {
    'base_url': 'https://api.kaunt.com',
    'company_name': '',
    'tenant_id': '',
    'client_id': '',
    'client_secret': '',
    'auth_base_address': 'https://api.kaunt.com/v1/oauth2/token'
}

In [None]:
import json
import logging

import requests


def get_token(part_dict: dict) -> str:
    """Request an OAuth2 access token using the client-credentials grant."""
    token_req_payload = {'grant_type': 'client_credentials'}
    # NOTE: verify=False disables TLS certificate verification;
    # remove it unless your environment requires it.
    token_response = requests.post(
        url=part_dict['auth_base_address'],
        data=token_req_payload, verify=False, allow_redirects=False,
        auth=(part_dict['client_id'], part_dict['client_secret'])
    )
    if token_response.status_code != 200:
        logging.error('Failed to obtain token (HTTP %s)', token_response.status_code)
        raise Exception("Failed to obtain access token")
    tokens = json.loads(token_response.text)
    return tokens['access_token']

def get_proposals(retriever: dict, invoice_ids: list) -> list:
    """Fetch the coding proposals (including the invoice) for each invoice ID."""
    invoice_coding_proposals = []
    for idx, invoice_id in enumerate(invoice_ids):
        # Refresh the access token every 1000 requests
        if idx % 1000 == 0:
            api_call_headers = {'Authorization': 'Bearer ' + get_token(retriever)}

        url = "/".join([
            retriever['base_url'], 'v1', 'tenants', retriever['tenant_id'],
            'companies', retriever['company_name'], 'invoicecodingproposals', invoice_id
        ])
        url += '?includeInvoice=true'
        response = requests.get(url=url, headers=api_call_headers)
        try:
            invoice_coding_proposals.append(json.loads(response.text))
        except ValueError:
            # Skip responses whose body is not valid JSON
            continue
    return invoice_coding_proposals

In [None]:
from collections import defaultdict
import json
import pandas as pd

pd_source = defaultdict(list)

def map_json_output(invoice):
    external_invoice_id = invoice['result']['invoice']['externalInvoiceId']
    dimension_lines = invoice['result']['invoiceCodingProposal']['invoiceLineCodingDimensionProposals']
    combination_lines = invoice['result']['invoiceCodingProposal']['invoiceLineCodingDimensionCombinationProposals']
    for dimension_line in dimension_lines:
        for dimension_proposal in dimension_line['dimensionProposals']:
            if len(dimension_proposal['dimensionValueProposals']) > 0:
                dimension_value_proposal = dimension_proposal['dimensionValueProposals'][0]
                pd_source['ExternalInvoiceId'].append(external_invoice_id)
                pd_source['InvoiceLineNumber'].append(dimension_line['invoiceLineNumber'])
                pd_source['Confidence'].append(dimension_value_proposal['confidence'])
                pd_source['ExternalDimensionId'].append(dimension_proposal['externalDimensionId'])
                pd_source['ExternalDimensionValueId'].append(dimension_value_proposal['externalDimensionValueId'])
    for combination_line in combination_lines:
        if len(combination_line['dimensionCombinationProposals']) > 0:
            combination_proposal = combination_line['dimensionCombinationProposals'][0]
            
            dimension_combination_str_val = "|".join(
                [
                    "{0}:{1}".format(d['externalDimensionId'], d['externalDimensionValueId']) for d in sorted(combination_proposal['dimensionsWithValues'], key=lambda x: x['externalDimensionId'])
                ]
            )
            pd_source['ExternalInvoiceId'].append(external_invoice_id)
            pd_source['InvoiceLineNumber'].append(combination_line['invoiceLineNumber'])
            pd_source['Confidence'].append(combination_proposal['confidence'])
            pd_source['ExternalDimensionId'].append("invoiceLineSuggestion") # Placeholder value
            pd_source['ExternalDimensionValueId'].append(dimension_combination_str_val)

api_output = get_proposals(customer_dict, invoice_ids)
for proposal in api_output:
    map_json_output(proposal)
proposals_df = pd.DataFrame.from_dict(pd_source)

In [None]:
proposals_df

## 3. Apply label to proposals

As Kaunt does not have the labels for the proposals, you need to apply the labels yourself.
This is done by defining a label dataframe, which is then joined with the proposal dataframe.

Proposal dataframe format:

- 1 row per invoice per invoice line per prediction

| ExternalInvoiceId | InvoiceLineNumber | Confidence | ExternalDimensionId | ExternalDimensionValueId |
| --- | --- | --- | --- | --- |
| 123 | 1 | High | CostCenter | 1000 |
| ... | ... | ... | ... | ... |
| 321 | 2 | Medium | GlAccount | 4000 |


Label dataframe format:

- 1 row per invoice per invoice line per dimension (GlAccount is a dimension here)

| ExternalInvoiceId | InvoiceLineNumber | ExternalDimensionId | Label |
| --- | --- | --- | --- |
| 123 | 1 | CostCenter | 1000 |
| ... | ... | ... | ... |
| 321 | 2 | GlAccount | 4000 |
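
If your ledger export is in wide format (one row per invoice line and one column per dimension), it can be reshaped into the long label format with `DataFrame.melt`. The sketch below assumes a hypothetical export `wide_df` whose dimension column names match the `ExternalDimensionId` values returned by the API; the result is named `label_df_example` so that it does not overwrite the notebook's own `label_df`.

```python
import pandas as pd

# Hypothetical wide export: one row per invoice line, one column per dimension.
wide_df = pd.DataFrame({
    "ExternalInvoiceId": ["123", "321"],
    "InvoiceLineNumber": ["1", "2"],
    "CostCenter": ["1000", None],
    "GlAccount": ["4100", "4000"],
})

# Turn each dimension column into its own row and drop dimensions without a label.
label_df_example = (
    wide_df.melt(
        id_vars=["ExternalInvoiceId", "InvoiceLineNumber"],
        var_name="ExternalDimensionId",
        value_name="Label",
    )
    .dropna(subset=["Label"])
    .reset_index(drop=True)
)

print(label_df_example)
```

Dropping the empty cells is a design choice: a proposal for a dimension without a label then receives no label in the join and is counted as wrong in the evaluation.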

In [None]:
# TO-DO: Define label dataframe (label_df)

# Format: label_df[["ExternalInvoiceId", "InvoiceLineNumber", "ExternalDimensionId", "Label"]]
label_df = pd.DataFrame({
    "ExternalInvoiceId": ["8624260e-0875-4cc2-881a-a94fdc4d0dca", "8624260e-0875-4cc2-881a-a94fdc4d0dca", "8624260e-0875-4cc2-881a-a94fdc4d0dca"],
    "InvoiceLineNumber": ['0', '0', '0'],
    "ExternalDimensionId": ["10001","10002","10003"],
    "Label": ["111", "1", "3"]
}) # To be replaced

label_df


In [None]:
# Calculate combination labels

unique_dimensions = proposals_df['ExternalDimensionId'].unique()

# Collect one combination-label row per invoice line
combination_rows = []
grouped = label_df.groupby(by=['ExternalInvoiceId', 'InvoiceLineNumber'])
for _, group in grouped:
    external_dims = []
    for _, row in group.iterrows():
        external_dims.append({
            "ExternalDimensionId": row["ExternalDimensionId"],
            "Label": row["Label"]
        })
    # Dimensions that occur in the proposals but have no label get the label 'None'
    for dimension in unique_dimensions:
        if not any(str(d['ExternalDimensionId']) == str(dimension) for d in external_dims) and str(dimension) != 'invoiceLineSuggestion':
            external_dims.append({
                "ExternalDimensionId": str(dimension),
                "Label": 'None'
            })
    value_id_str = "|".join(
        "{0}:{1}".format(str(d['ExternalDimensionId']), str(d['Label']))
        for d in sorted(external_dims, key=lambda a: str(a['ExternalDimensionId']))
    )
    combination_rows.append({
        "ExternalInvoiceId": group.iloc[0]["ExternalInvoiceId"],
        "InvoiceLineNumber": group.iloc[0]["InvoiceLineNumber"],
        "ExternalDimensionId": "invoiceLineSuggestion",
        "Label": value_id_str,
    })

# DataFrame.append is deprecated since pandas 1.4 and removed in 2.0; use pd.concat instead
label_df = pd.concat([label_df, pd.DataFrame(combination_rows)], ignore_index=True)

In [None]:
# Joining
label_df['InvoiceLineNumber']=label_df['InvoiceLineNumber'].astype(str)
label_df['ExternalInvoiceId']=label_df['ExternalInvoiceId'].astype(str)
label_df['ExternalDimensionId']=label_df['ExternalDimensionId'].astype(str)
label_df['Label']=label_df['Label'].astype(str)

proposals_df['InvoiceLineNumber']=proposals_df['InvoiceLineNumber'].astype(str)
proposals_df['ExternalInvoiceId']=proposals_df['ExternalInvoiceId'].astype(str)
proposals_df['ExternalDimensionId']=proposals_df['ExternalDimensionId'].astype(str)
proposals_df['ExternalDimensionValueId']=proposals_df['ExternalDimensionValueId'].astype(str)

labels_distinct_invoices_df = label_df[['ExternalInvoiceId', 'InvoiceLineNumber']].value_counts().reset_index(name='count')

# First we inner join to only evaluate the proposals where we have a label
proposals_with_labels_df_pre = pd.merge(proposals_df, labels_distinct_invoices_df[["ExternalInvoiceId", "InvoiceLineNumber"]], how="inner", left_on=["ExternalInvoiceId", "InvoiceLineNumber"], right_on=["ExternalInvoiceId", "InvoiceLineNumber"])

# Apply label to proposals
proposals_with_labels_df = pd.merge(proposals_with_labels_df_pre, label_df, how="left", left_on=["ExternalInvoiceId", "InvoiceLineNumber", "ExternalDimensionId"], right_on=["ExternalInvoiceId", "InvoiceLineNumber", "ExternalDimensionId"])

## 4. Run evaluation

The code below evaluates the prediction output.

Input format:

- 1 row per invoice per invoice line per prediction

| ExternalInvoiceId | InvoiceLineNumber | Confidence | ExternalDimensionId | ExternalDimensionValueId | Label |
| --- | --- | --- | --- | --- | --- |
| 123 | 1 | High | CostCenter | 1000 | 1000 |
| ... | ... | ... | ... | ... | ... |
| 321 | 2 | Medium | GlAccount | 4000 | 4000 |


Output format:

- 1 row per confidence per unique ExternalDimensionId

| Confidence | ExternalDimensionId | N_Samples | N_Predictions | CorrectPredictions | WrongPredictions | Precision | Recall | PredictionRate (PPCR) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HIGH | 1000 | 10 | 8 | 7 | 1 | 87.5 | 70 | 80 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| LOW | 4000 | 10 | 10 | 5 | 5 | 50 | 50 | 100 |
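
The three measures in the table follow directly from the counts. As a minimal sketch, the hypothetical helper `evaluation_metrics` below reproduces the example rows: precision is the share of predictions that are correct, recall is the share of all invoice lines that received a correct prediction, and the prediction rate (PPCR) is the share of invoice lines that received any prediction.

```python
def evaluation_metrics(n_samples: int, correct: int, wrong: int) -> dict:
    """Compute precision, recall and prediction rate (PPCR) in percent."""
    n_predictions = correct + wrong
    return {
        # Precision is undefined when no predictions were made
        "Precision": 100 * correct / n_predictions if n_predictions else float("nan"),
        "Recall": 100 * correct / n_samples,
        "PredictionRate (PPCR)": 100 * n_predictions / n_samples,
    }


# HIGH row of the example table: 10 invoice lines, 7 correct and 1 wrong prediction
print(evaluation_metrics(n_samples=10, correct=7, wrong=1))
# LOW row of the example table: 10 invoice lines, 5 correct and 5 wrong predictions
print(evaluation_metrics(n_samples=10, correct=5, wrong=5))
```

Note that the notebook accumulates counts from higher confidence levels, so the MEDIUM row includes the HIGH predictions and the LOW row includes all predictions.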

In the code section below, you can comment in a specific dataframe from an existing Kaunt customer:

In [None]:
import pandas as pd
from IPython.display import display, HTML

input_df = proposals_with_labels_df
input_df.fillna('None', inplace=True)


unique_confidence = ['HIGH', 'MEDIUM', 'LOW']
unique_dimensions = input_df['ExternalDimensionId'].unique()

# The number of invoice lines sent through prediction service
number_of_invoice_lines = len(input_df[['ExternalInvoiceId', 'InvoiceLineNumber']].value_counts())

# The final output format:
# 1 line pr. confidence pr. dimension
validation_output = []  # list of dicts, one per confidence level and dimension

for dim in unique_dimensions:
    # Restrict to the current dimension and flag correct predictions
    dim_df = input_df.loc[input_df['ExternalDimensionId'] == dim]
    is_correct = dim_df['ExternalDimensionValueId'] == dim_df['Label']

    is_high = dim_df['Confidence'] == 'High'
    is_medium = dim_df['Confidence'] == 'Medium'
    is_low = dim_df['Confidence'] == 'Low'

    correct_high = int((is_high & is_correct).sum())
    wrong_high = int((is_high & ~is_correct).sum())

    correct_medium = int((is_medium & is_correct).sum())
    wrong_medium = int((is_medium & ~is_correct).sum())

    correct_low = int((is_low & is_correct).sum())
    wrong_low = int((is_low & ~is_correct).sum())

    # We add up correct/wrong predictions from higher prioritized confidence levels
    for confidence in unique_confidence:
        if confidence == 'HIGH':
            correct = correct_high
            wrong = wrong_high
        elif confidence == 'MEDIUM':
            correct = correct_high + correct_medium
            wrong = wrong_high + wrong_medium
        elif confidence == 'LOW':
            correct = correct_high + correct_medium + correct_low
            wrong = wrong_high + wrong_medium + wrong_low
        else:
            continue

        n_predictions = correct + wrong

        # Calculate output measures (precision is undefined without predictions)
        precision = correct / n_predictions * 100 if n_predictions > 0 else float('nan')
        recall = correct / number_of_invoice_lines * 100
        ppcr = n_predictions / number_of_invoice_lines * 100

        # Add output entry
        validation_output.append({
            'Confidence': confidence,
            'ExternalDimensionId': dim,
            'N_Samples': number_of_invoice_lines,
            'N_Predictions': n_predictions,
            'CorrectPredictions': correct,
            'WrongPredictions': wrong,
            'Precision': precision,
            'Recall': recall,
            'PredictionRate (PPCR)': ppcr
        })

        
confidence_order = pd.CategoricalDtype(['HIGH', 'MEDIUM', 'LOW'], ordered=True)
output_df = pd.DataFrame.from_dict(validation_output)
output_df['Confidence'] = output_df['Confidence'].astype(confidence_order)

# print(output_df.sort_values('Confidence'))

display(output_df.sort_values(['Confidence', 'ExternalDimensionId']))

