# Local API Emulator is now a module

This notebook shows you how to use the new `local_api` module. The module emulates the `gresearch_crypto` timeseries API locally. It gives you the flexibility to:
+ feed different slices of `train.csv` into it,
+ create as many instances as you want within one session,
+ calculate your LB score (weighted correlation) locally.

It enforces similar constraints to the real API and produces realistic error messages. This code extends and updates a previous [notebook version](https://www.kaggle.com/jagofc/local-api-emulator).

For a quick introduction to importing utility scripts see [this intro video](https://www.youtube.com/watch?v=C4h88PfN5jA&ab_channel=Kaggle).


# Demo

To use this in your notebook you need to:
1. In the notebook menu, select File > Add Utility Script.
2. Search for "local_api" and click Add.
3. Import `local_api` as you would any module.

In [None]:
import time
import numpy as np
import pandas as pd
import gresearch_crypto
from tqdm import tqdm

import local_api as la

`local_api` has some utility functions, constants and the main `API` class. Check out its contents using tab completion. Here we use the utility function `read_csv_slice`.

In [None]:
train_df = la.read_csv_slice('../input/g-research-crypto-forecasting/train.csv')

The approximate public LB timestamp window is available as a constant:

In [None]:
la.LB_WINDOW

Create an example API instance using the public LB window:

In [None]:
api = la.API(train_df, use_window=la.LB_WINDOW)

Get the first batch of data.

In [None]:
(data_df, pred_df) = next(api)
data_df.head()

We'll get an error if we try to move on to the next batch without first making predictions for the current one. *The call is commented out so that the notebook doesn't fail.*

In [None]:
# next(api)
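
To make the constraint concrete, here is a minimal toy sketch of an iterator that refuses to serve a new batch until the current one has been predicted. `ToyAPI`, its error messages, and the `row_id` column are hypothetical stand-ins for illustration; this is not the `local_api` implementation, which may check more conditions and raise different errors.

```python
import pandas as pd


class ToyAPI:
    """Hypothetical stand-in for an emulator; not the local_api implementation."""

    def __init__(self, df):
        # One batch per timestamp, served in ascending timestamp order.
        self._batches = [group for _, group in df.groupby("timestamp")]
        self._position = 0
        self._awaiting_prediction = False
        self.predictions = []

    def __iter__(self):
        return self

    def __len__(self):
        # Number of timestamps still to be served.
        return len(self._batches) - self._position

    def __next__(self):
        if self._awaiting_prediction:
            raise RuntimeError("Call predict() before requesting the next batch.")
        if self._position >= len(self._batches):
            raise StopIteration
        data_df = self._batches[self._position]
        self._position += 1
        self._awaiting_prediction = True
        # Prediction template with a default Target of 0 for every row.
        pred_df = pd.DataFrame({"row_id": data_df["row_id"].to_numpy(), "Target": 0.0})
        return data_df, pred_df

    def predict(self, pred_df):
        if not self._awaiting_prediction:
            raise RuntimeError("No batch is awaiting a prediction.")
        self.predictions.append(pred_df.copy())
        self._awaiting_prediction = False


# Tiny made-up frame: two timestamps, two assets each.
toy_df = pd.DataFrame({
    "timestamp": [60, 60, 120, 120],
    "Asset_ID": [0, 1, 0, 1],
    "row_id": [0, 1, 2, 3],
})
toy_api = ToyAPI(toy_df)

toy_data, toy_pred = next(toy_api)
try:
    next(toy_api)  # predict() was skipped, so the toy raises here
    skipped_error = None
except RuntimeError as err:
    skipped_error = str(err)

toy_api.predict(toy_pred)  # after predicting, iteration can continue
remaining = len(toy_api)
```

Wrapping the second `next` call in `try`/`except` is one way to demonstrate the error without stopping a notebook run.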

Let's make a dummy prediction using `pred_df`.

In [None]:
api.predict(pred_df)

Now you can continue to iterate. Let's get another slice of data and make another dummy prediction:

In [None]:
(data_df, pred_df) = next(api)
api.predict(pred_df)

Your predictions are stored by the API. Let's just look at the first two prediction batches we made:

In [None]:
api.predictions

The API also supports `len()`, which returns the number of timestamps still to be served:

In [None]:
len(api)

Note that, unlike with the `gresearch_crypto` env, you don't need to restart the notebook kernel to make a new emulator (or to refresh the current one). Here is an instance on a different date window:

In [None]:
example_window = (la.datestring_to_timestamp("2020-05-01T00:00"),
                  la.datestring_to_timestamp("2020-05-14T00:00"))

api2 = la.API(train_df, use_window=example_window)
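
If you want to check a window by hand, the sketch below converts the same date strings to Unix timestamps using only the standard library. `iso_to_unix_seconds` is a hypothetical helper, and it assumes the strings are in UTC and that timestamps are in seconds; `la.datestring_to_timestamp` may be implemented differently.

```python
from datetime import datetime, timezone


def iso_to_unix_seconds(datestring):
    """Hypothetical helper: parse 'YYYY-MM-DDTHH:MM' as UTC and return Unix seconds."""
    parsed = datetime.strptime(datestring, "%Y-%m-%dT%H:%M")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


sketch_window = (iso_to_unix_seconds("2020-05-01T00:00"),
                 iso_to_unix_seconds("2020-05-14T00:00"))

# Length of the window in days (86400 seconds per day).
window_days = (sketch_window[1] - sketch_window[0]) / 86400
```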

## Example main loop

An example loop on `example_window` that makes dummy predictions (Target=0). The measured iteration speed is then used to estimate how long 100 days' worth of data would take.

In [None]:
api = la.API(train_df, use_window=example_window)

start_time = time.time()

for (data_df, pred_df) in tqdm(api):
    pred_df['Target'] = 0.
    api.predict(pred_df)
    
finish_time = time.time()

total_time = finish_time - start_time
iter_speed = api.init_num_times/total_time

print(f"Iterations/s = {round(iter_speed, 2)}.")
test_iters = 60 * 24 * 100
print(f"Expected number of iterations in test set is approx. {test_iters}",
      f"which will take {round(test_iters / (iter_speed * 3600), 2)} hours",
      "using this API emulator while making dummy predictions.")

The API has a `score` method. This returns:
+ a dataframe containing your predictions, the targets, and weights,
+ the LB score: weighted correlation between predictions and targets.
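
For reference, a weighted correlation can be sketched in a few lines of NumPy. This assumes the score is a weighted Pearson correlation over all rows; `weighted_correlation` is a hypothetical helper and the example arrays are made up, so treat it as an illustration and not as the code behind `api.score()`.

```python
import numpy as np


def weighted_correlation(predictions, targets, weights):
    """Weighted Pearson correlation between two 1-D arrays."""
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    weights = np.asarray(weights, dtype=float)

    weight_sum = weights.sum()
    mean_pred = (weights * predictions).sum() / weight_sum
    mean_target = (weights * targets).sum() / weight_sum

    # Weighted covariance and variances around the weighted means.
    cov = (weights * (predictions - mean_pred) * (targets - mean_target)).sum() / weight_sum
    var_pred = (weights * (predictions - mean_pred) ** 2).sum() / weight_sum
    var_target = (weights * (targets - mean_target) ** 2).sum() / weight_sum
    return cov / np.sqrt(var_pred * var_target)


# Made-up values for illustration only.
example_targets = np.array([0.1, -0.2, 0.3, 0.0])
example_weights = np.array([4.0, 2.0, 1.0, 1.0])

# A positive linear transform of the targets correlates perfectly with them.
perfect = weighted_correlation(2 * example_targets + 1, example_targets, example_weights)
# Sign-flipped targets are perfectly anti-correlated.
inverted = weighted_correlation(-example_targets, example_targets, example_weights)
```

Under this sketch, constant predictions have zero variance, so the ratio is undefined and evaluates to NaN.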

In [None]:
df, score = api.score()
print(f"Your LB score is {round(score, 4)}")

## TL;DR example loop with random predictions

In [None]:
api = la.API(train_df, use_window=la.LB_WINDOW)

for (data_df, pred_df) in tqdm(api):
    # Draw one standard-normal prediction per row as a 1-D array.
    pred_df['Target'] = np.random.randn(len(pred_df))
    api.predict(pred_df)
    
df, score = api.score()

print(f"Your LB score is {round(score, 4)}")