In [13]:
%load_ext autoreload
%autoreload 2


The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


# Using Vulkan

This notebook will take you through all the steps in using Vulkan to backtest a policy.\
Backtests are useful in estimating the performance of your policies.\
You can simulate different changes in the rules and compare their results before going to production.

By the end of this tutorial, you will have:

1. Created a new policy, which can be used for online and batch evaluation,
2. Created a backtest using the batch interface,
3. Downloaded and analysed the results.

Let's dive in!

In [14]:
import pandas as pd
from pprint import pprint

import vulkan_public.cli.client as vulkan
from vulkan_public.cli.context import Context

In [15]:
ctx = Context()

## Creating a Policy

We'll start by creating a simple policy, with no dependencies.\
In the "create-policy" notebook, we go through the details of this.\
The important part is: you can use the exact same code to create a policy for 1-by-1 runs and for backtesting.

In [16]:
policy_id = vulkan.policy.create_policy(
    ctx,
    name="Test Policy",
    description="Test Policy Description",
)

2024-12-06 08:42:21 Antonios-MacBook-Air.local urllib3.connectionpool[23207] DEBUG Starting new HTTP connection (1): localhost:6001
2024-12-06 08:42:22 Antonios-MacBook-Air.local urllib3.connectionpool[23207] DEBUG http://localhost:6001 "POST /policies HTTP/11" 401 14


ValueError: Failed to create policy: b'"Unauthorized"'

In [9]:
policy_version_id = vulkan.policy.create_policy_version(
    ctx,
    policy_id=policy_id,
    version_name="v0.0.1",
    repository_path="../examples/policies/simple/",
)

2024-11-28 15:28:53 DESKTOP-GLELM79 vulkan_public.cli.context[81340] INFO Creating workspace v0.0.1. This may take a while...
2024-11-28 15:28:53 DESKTOP-GLELM79 urllib3.connectionpool[81340] DEBUG Resetting dropped connection: localhost
2024-11-28 15:29:55 DESKTOP-GLELM79 urllib3.connectionpool[81340] DEBUG http://localhost:6001 "POST /policies/e2411e6b-daf8-4b4a-b5f3-7615dcca9308/versions HTTP/1.1" 200 145
2024-11-28 15:29:55 DESKTOP-GLELM79 vulkan_public.cli.context[81340] INFO Created workspace v0.0.1 with policy version 5267f325-8de4-4662-aded-87be0ebc5d3e


## Running the Backtest

Now that our policy is ready, we can create a backtest.\
To do that, we just need some data. The file below has some samples in the format our policy expects:

In [10]:
df = pd.read_csv("../example_data/simple_bkt.csv")
df.head()

Unnamed: 0,tax_id,score,default
0,1,100,1
1,2,350,0
2,3,700,1
3,4,400,1
4,5,850,0


Now we just pass that data to the Vulkan Engine.\
A job will be created to evaluate the policy on each row of your data.

The first time we backtest a policy, it'll take a few minutes to prepare the environment.\
After that, creating a new backtest is almost instant.


In [11]:
file_info = vulkan.backtest.upload_backtest_file(
    ctx,
    policy_version_id=policy_version_id,
    file_path="../example_data/simple_bkt.csv",
    file_format="CSV",
    schema={"tax_id": "str", "score": "int", "default": "int"},
)
file_id = file_info["uploaded_file_id"]

2024-11-28 15:30:29 DESKTOP-GLELM79 urllib3.connectionpool[81340] DEBUG Resetting dropped connection: localhost
2024-11-28 15:30:31 DESKTOP-GLELM79 urllib3.connectionpool[81340] DEBUG http://localhost:6001 "POST /backtests/files HTTP/1.1" 200 163


In [12]:
vulkan.policy_version.create_backtest_workspace(ctx, policy_version_id)

2024-11-28 15:30:38 DESKTOP-GLELM79 urllib3.connectionpool[81340] DEBUG Resetting dropped connection: localhost
2024-11-28 15:34:32 DESKTOP-GLELM79 urllib3.connectionpool[81340] DEBUG http://localhost:6001 "POST /policy-versions/5267f325-8de4-4662-aded-87be0ebc5d3e/backtest-workspace HTTP/1.1" 200 74


{'policy_version_id': '5267f325-8de4-4662-aded-87be0ebc5d3e', 'status': 'OK'}

In [13]:
backtest_info = vulkan.backtest.create_backtest(
    ctx,
    policy_version_id=policy_version_id,
    input_file_id=file_id,
    config_variables=[
        {"SCORE_CUTOFF": 500},
        {"SCORE_CUTOFF": 700},
    ],
    metrics_config={
        "target_column": "default",
    }
)

backtest_id = backtest_info["backtest_id"]

2024-11-28 15:34:32 DESKTOP-GLELM79 vulkan_public.cli.context[81340] INFO Creating backtest. This may take a while...
2024-11-28 15:34:36 DESKTOP-GLELM79 urllib3.connectionpool[81340] DEBUG http://localhost:6001 "POST /backtests/ HTTP/1.1" 200 431
2024-11-28 15:34:36 DESKTOP-GLELM79 vulkan_public.cli.context[81340] INFO Created backtest with id 51bfe95a-6a37-4da6-bbde-131d6a69598e
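
Each entry in `config_variables` produces one run of the policy over the whole file. To build intuition for what the two runs compare, here is a minimal local sketch in plain pandas. It does not call Vulkan. The decision rule in `simulate_decision` (a hypothetical helper) is an assumption inferred from the results this notebook reports, not the actual code of the example policy.

```python
import pandas as pd

# The five sample rows shown by df.head() earlier in this notebook.
samples = pd.DataFrame(
    {
        "tax_id": ["1", "2", "3", "4", "5"],
        "score": [100, 350, 700, 400, 850],
        "default": [1, 0, 1, 1, 0],
    }
)


def simulate_decision(score: int, cutoff: int) -> str:
    # ASSUMPTION: approve only when the score is strictly above the cutoff.
    return "APPROVED" if score > cutoff else "DENIED"


config_variables = [{"SCORE_CUTOFF": 500}, {"SCORE_CUTOFF": 700}]

# One copy of the data per configuration, each with its own decisions.
runs = []
for config in config_variables:
    cutoff = config["SCORE_CUTOFF"]
    run = samples.copy()
    run["SCORE_CUTOFF"] = cutoff
    run["status"] = [simulate_decision(score, cutoff) for score in run["score"]]
    runs.append(run)

simulated = pd.concat(runs, ignore_index=True)
print(simulated.groupby(["SCORE_CUTOFF", "status"]).size())
```

A local check like this is only a sanity test on a handful of rows; the backtest itself runs the real policy code.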


In [14]:
vulkan.backtest.poll_backtest_status(ctx, backtest_id)

2024-11-28 15:34:41 DESKTOP-GLELM79 urllib3.connectionpool[81340] DEBUG http://localhost:6001 "GET /backtests/51bfe95a-6a37-4da6-bbde-131d6a69598e/status HTTP/1.1" 200 315
2024-11-28 15:34:41 DESKTOP-GLELM79 vulkan_public.cli.context[81340] INFO {'backtest_id': '51bfe95a-6a37-4da6-bbde-131d6a69598e', 'status': 'PENDING', 'backfills': [{'backfill_id': 'e294c820-3f80-4904-99ce-32ab5f759844', 'status': 'PENDING', 'config_variables': {'SCORE_CUTOFF': 500}}, {'backfill_id': '3e7e8823-a6e5-4874-8b4e-38d3a48e95ba', 'status': 'PENDING', 'config_variables': {'SCORE_CUTOFF': 700}}]}
2024-11-28 15:35:11 DESKTOP-GLELM79 urllib3.connectionpool[81340] DEBUG Resetting dropped connection: localhost
2024-11-28 15:35:12 DESKTOP-GLELM79 urllib3.connectionpool[81340] DEBUG http://localhost:6001 "GET /backtests/51bfe95a-6a37-4da6-bbde-131d6a69598e/status HTTP/1.1" 200 315
2024-11-28 15:35:12 DESKTOP-GLELM79 vulkan_public.cli.context[81340] INFO {'backtest_id': '51bfe95a-6a37-4da6-bbde-131d6a69598e', 'statu

{'backtest_id': '51bfe95a-6a37-4da6-bbde-131d6a69598e',
 'status': 'PENDING',
 'backfills': [{'backfill_id': 'e294c820-3f80-4904-99ce-32ab5f759844',
   'status': 'PENDING',
   'config_variables': {'SCORE_CUTOFF': 500}},
  {'backfill_id': '3e7e8823-a6e5-4874-8b4e-38d3a48e95ba',
   'status': 'PENDING',
   'config_variables': {'SCORE_CUTOFF': 700}}]}

## Getting the Results

Backtest jobs are optimized for scalability.\
This means that they don't finish instantly, but they can handle large volumes of data.\
To make this easier to work with, we have a function that waits until a job is finished and gets its results.
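
For readers curious about the general pattern behind such a wait, here is a minimal sketch of a polling loop. It is not Vulkan's implementation: `wait_until_done`, its parameters, and the fake status source are all hypothetical names used for illustration. The only status values taken from this notebook are `PENDING` and `SUCCESS`; any other terminal states (such as failures) would need to be added to `done_states`.

```python
import time
from typing import Callable, Iterable


def wait_until_done(
    fetch_status: Callable[[], dict],
    done_states: Iterable[str],
    interval_s: float = 30.0,
    timeout_s: float = 600.0,
) -> dict:
    """Call fetch_status() until its "status" is in done_states or the timeout passes."""
    done = set(done_states)
    deadline = time.monotonic() + timeout_s
    while True:
        info = fetch_status()
        if info["status"] in done:
            return info
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {info['status']!r} after {timeout_s} seconds")
        time.sleep(interval_s)


# Stand-in for a real status call: two PENDING responses, then SUCCESS.
fake_responses = iter(
    [{"status": "PENDING"}, {"status": "PENDING"}, {"status": "SUCCESS"}]
)
final = wait_until_done(
    lambda: next(fake_responses), done_states=["SUCCESS"], interval_s=0.0
)
print(final)
```

Passing the status call in as a function keeps the loop independent of any particular client, and the timeout prevents a stuck job from blocking the notebook forever.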

To get the outputs, we can query Vulkan using the Backtest ID.\
This will give us the results for all runs of this individual backtest.

In [15]:
output = vulkan.backtest.get_results(ctx, backtest_id)
output_data = pd.DataFrame(output)
output_data.head(10)

2024-11-28 15:39:47 DESKTOP-GLELM79 urllib3.connectionpool[81340] DEBUG Resetting dropped connection: localhost
2024-11-28 15:39:50 DESKTOP-GLELM79 urllib3.connectionpool[81340] DEBUG http://localhost:6001 "GET /backtests/51bfe95a-6a37-4da6-bbde-131d6a69598e/results HTTP/1.1" 200 1547


Unnamed: 0,backfill_id,key,status,input_node
0,e294c820-3f80-4904-99ce-32ab5f759844,36f5db8887fabefe46b32f5a4235e65c,APPROVED,"{'tax_id': '3', 'score': 700}"
1,e294c820-3f80-4904-99ce-32ab5f759844,0cef52d3c995b51c8c968a91a0a0b98e,DENIED,"{'tax_id': '2', 'score': 350}"
2,e294c820-3f80-4904-99ce-32ab5f759844,fd5ba68fbc2f614a24c4c3b16e1fc3b7,DENIED,"{'tax_id': '4', 'score': 400}"
3,e294c820-3f80-4904-99ce-32ab5f759844,05a5698ec8a4387005f6df33423853bb,DENIED,"{'tax_id': '1', 'score': 100}"
4,e294c820-3f80-4904-99ce-32ab5f759844,c4a8a0b0001f84851349dd25c0fbd4ac,APPROVED,"{'tax_id': '5', 'score': 850}"
5,3e7e8823-a6e5-4874-8b4e-38d3a48e95ba,36f5db8887fabefe46b32f5a4235e65c,DENIED,"{'tax_id': '3', 'score': 700}"
6,3e7e8823-a6e5-4874-8b4e-38d3a48e95ba,0cef52d3c995b51c8c968a91a0a0b98e,DENIED,"{'tax_id': '2', 'score': 350}"
7,3e7e8823-a6e5-4874-8b4e-38d3a48e95ba,fd5ba68fbc2f614a24c4c3b16e1fc3b7,DENIED,"{'tax_id': '4', 'score': 400}"
8,3e7e8823-a6e5-4874-8b4e-38d3a48e95ba,05a5698ec8a4387005f6df33423853bb,DENIED,"{'tax_id': '1', 'score': 100}"
9,3e7e8823-a6e5-4874-8b4e-38d3a48e95ba,c4a8a0b0001f84851349dd25c0fbd4ac,APPROVED,"{'tax_id': '5', 'score': 850}"
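
Once the results are in a DataFrame, ordinary pandas is enough to compare the runs. The sketch below rebuilds the table above by hand (without the `key` column) so that it runs on its own, expands `input_node` into separate columns, and computes the approval rate per `backfill_id`. It assumes `input_node` holds Python dicts; if it arrives as strings in your environment, it would need to be parsed first.

```python
import pandas as pd

# Inputs and decisions copied from the results table above.
inputs = [
    {"tax_id": "3", "score": 700},
    {"tax_id": "2", "score": 350},
    {"tax_id": "4", "score": 400},
    {"tax_id": "1", "score": 100},
    {"tax_id": "5", "score": 850},
]
statuses = {
    "e294c820-3f80-4904-99ce-32ab5f759844": [
        "APPROVED", "DENIED", "DENIED", "DENIED", "APPROVED",
    ],
    "3e7e8823-a6e5-4874-8b4e-38d3a48e95ba": [
        "DENIED", "DENIED", "DENIED", "DENIED", "APPROVED",
    ],
}
results = pd.DataFrame(
    [
        {"backfill_id": backfill_id, "status": status, "input_node": node}
        for backfill_id, column in statuses.items()
        for status, node in zip(column, inputs)
    ]
)

# Expand the dict-valued column into one column per input field.
flat = pd.concat(
    [
        results.drop(columns="input_node"),
        pd.DataFrame(results["input_node"].tolist()),
    ],
    axis=1,
)

# Share of rows approved under each configuration.
approval_rate = flat["status"].eq("APPROVED").groupby(flat["backfill_id"]).mean()
print(approval_rate)
```

With the real output, the same two steps can be applied to `output_data` directly.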


## Automated Metrics for Backtests

Vulkan can calculate a bunch of useful metrics about your backtests.\
This will happen automatically for each backtest, and can be useful to analyze your results.

To start, you just need to tell Vulkan to calculate some metrics for the backtest you're creating.\
In each backtest you can specify:

- A target variable: a reference value for each row. For now, we only support binary targets (0 or 1).
- A time variable: a column that identifies the reference time for your data. For instance, this can be a "month of entry". This will be used to group results by time, allowing you to see how the results would have evolved.
- Any number of columns to group by, which can be used to have more granular analyses.
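
Since only binary targets are supported, it can help to check the target column before uploading a file. The following is a small sketch of such a check in pandas; `non_binary_rows` is a hypothetical helper, not part of Vulkan, and how Vulkan itself treats invalid or missing target values is not covered here.

```python
import pandas as pd


def non_binary_rows(df: pd.DataFrame, target_column: str) -> pd.DataFrame:
    """Return the rows whose target is anything other than 0 or 1 (including missing)."""
    return df[~df[target_column].isin([0, 1])]


# The sample rows from this notebook: every target is 0 or 1.
good = pd.DataFrame({"score": [100, 350, 700, 400, 850], "default": [1, 0, 1, 1, 0]})
# A made-up frame with one invalid value, for illustration only.
bad = pd.DataFrame({"score": [100, 350], "default": [1, 2]})

print(len(non_binary_rows(good, "default")))
print(non_binary_rows(bad, "default"))
```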

The metrics calculated depend on how you configure the backtest, and on what data you have.\
Let's look at an example where we only configured a target column (`default`, in our data).\
Here, for each configuration in our backtest (identified by `backfill_id`) and for each different result (`status`), we can see the distribution of outcomes.

In [17]:
metrics_job = vulkan.backtest.poll_backtest_metrics_job_status(ctx, backtest_id)
metrics_df = pd.DataFrame(metrics_job["metrics"])
metrics_df.head()

2024-11-28 15:46:02 DESKTOP-GLELM79 urllib3.connectionpool[81340] DEBUG Resetting dropped connection: localhost
2024-11-28 15:46:02 DESKTOP-GLELM79 urllib3.connectionpool[81340] DEBUG http://localhost:6001 "GET /backtests/51bfe95a-6a37-4da6-bbde-131d6a69598e/metrics HTTP/1.1" 200 497
2024-11-28 15:46:02 DESKTOP-GLELM79 vulkan_public.cli.context[81340] INFO {'backtest_id': '51bfe95a-6a37-4da6-bbde-131d6a69598e', 'status': 'SUCCESS', 'metrics': [{'backfill_id': 'e294c820-3f80-4904-99ce-32ab5f759844', 'count': 3, 'ones': 2, 'status': 'DENIED', 'zeros': 1}, {'backfill_id': 'e294c820-3f80-4904-99ce-32ab5f759844', 'count': 2, 'ones': 1, 'status': 'APPROVED', 'zeros': 1}, {'backfill_id': '3e7e8823-a6e5-4874-8b4e-38d3a48e95ba', 'count': 4, 'ones': 3, 'status': 'DENIED', 'zeros': 1}, {'backfill_id': '3e7e8823-a6e5-4874-8b4e-38d3a48e95ba', 'count': 1, 'ones': 0, 'status': 'APPROVED', 'zeros': 1}]}


Unnamed: 0,backfill_id,count,ones,status,zeros
0,e294c820-3f80-4904-99ce-32ab5f759844,3,2,DENIED,1
1,e294c820-3f80-4904-99ce-32ab5f759844,2,1,APPROVED,1
2,3e7e8823-a6e5-4874-8b4e-38d3a48e95ba,4,3,DENIED,1
3,3e7e8823-a6e5-4874-8b4e-38d3a48e95ba,1,0,APPROVED,1
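
These counts are easier to compare as rates. As a sketch, the snippet below rebuilds the metrics table above by hand and computes, for each configuration and decision, the share of rows whose target is 1. Because the target column here is `default`, that share can be read as a default rate; `default_rate` is a name introduced for this example, not a field returned by Vulkan.

```python
import pandas as pd

# Rows copied from the metrics table above.
metrics = pd.DataFrame(
    [
        {"backfill_id": "e294c820-3f80-4904-99ce-32ab5f759844", "status": "DENIED", "count": 3, "ones": 2, "zeros": 1},
        {"backfill_id": "e294c820-3f80-4904-99ce-32ab5f759844", "status": "APPROVED", "count": 2, "ones": 1, "zeros": 1},
        {"backfill_id": "3e7e8823-a6e5-4874-8b4e-38d3a48e95ba", "status": "DENIED", "count": 4, "ones": 3, "zeros": 1},
        {"backfill_id": "3e7e8823-a6e5-4874-8b4e-38d3a48e95ba", "status": "APPROVED", "count": 1, "ones": 0, "zeros": 1},
    ]
)

# Share of rows with target == 1 within each (backfill, status) group.
metrics["default_rate"] = metrics["ones"] / metrics["count"]

# One row per configuration, one column per decision.
comparison = metrics.pivot(index="backfill_id", columns="status", values="default_rate")
print(comparison)
```

With only five rows these rates are not meaningful on their own, but the same calculation applies unchanged to `metrics_df` from a larger backtest.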
