# TRINE Tutorial

This notebook walks through how TRINE can be used to predict the date of occurrence of a recurrent event, given partial and noisy information about the event.

In [1]:
# rename this
from marginal_inference_list import marginal_inference as trine_infer
from marginal_inference_list import get_dists

from collections import OrderedDict
from IPython.display import Image

We will illustrate the usage with the example of the Super Bowl, which is shown in Fig. 3 of the paper (reproduced below):

<center>
<img src="superbowl.png" width="400">
</center>

The schedule extractor has extracted the broad schedule "FEB" for the event. The instance extractor has two instances for Super Bowl 2016 with differing confidences, and one extraction for Super Bowl 2015. In our knowledge base, we have the actual occurrence date of Super Bowl 2014. The date "June 23 2016" is clearly an erroneous extraction from the instance extractor.

Let us use this data to get a ranked list of predictions for the year 2018.

In [2]:
# all instances with confidences
instances = [ [(2014, 2, 2)], [(2015, 1, 25)], [(2016, 6, 23), (2016, 2, 7)] ]
confidences = [ [1.0], [0.7], [0.6, 0.9] ]

# extracted schedule
sched_ext = 'm02' # FEB

query_year = 2018
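
Since `instances` and `confidences` are parallel nested lists, a quick sanity check before inference can catch misaligned inputs. The sketch below is not part of TRINE: the helper names `check_inputs` and `to_day_of_year` are hypothetical, and the requirement that confidences lie in [0, 1] is an assumption inferred from the values above. It also converts each extraction to a day of the year, which makes the June outlier easy to spot.

```python
import datetime

# Same inputs as in the cell above, repeated so this sketch runs on its own.
instances = [[(2014, 2, 2)], [(2015, 1, 25)], [(2016, 6, 23), (2016, 2, 7)]]
confidences = [[1.0], [0.7], [0.6, 0.9]]

def check_inputs(instances, confidences):
    """Hypothetical helper: verify that the two nested lists line up."""
    if len(instances) != len(confidences):
        raise ValueError("instances and confidences differ in length")
    for dates, confs in zip(instances, confidences):
        if len(dates) != len(confs):
            raise ValueError("mismatched inner lists: %r vs %r" % (dates, confs))
        for (year, month, day), conf in zip(dates, confs):
            datetime.date(year, month, day)  # raises ValueError for an invalid date
            if not 0.0 <= conf <= 1.0:       # assumed range for confidences
                raise ValueError("confidence out of range: %r" % conf)
    return True

def to_day_of_year(year, month, day):
    """Hypothetical helper: 1-based day of the year for a calendar date."""
    return datetime.date(year, month, day).timetuple().tm_yday

inputs_ok = check_inputs(instances, confidences)
days = [[to_day_of_year(*date) for date in dates] for dates in instances]
print(days)  # [[33], [25], [175, 38]]
```

Day 175 (23 June 2016) stands well apart from the other extractions, which all fall between days 25 and 38.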

Notice that we folded the knowledge base entry (2 Feb 2014) into our list of extractions, while keeping its confidence equal to 1. We now perform inference to return a ranked list of predictions for 2018.

In [3]:
dists = get_dists(instances, query_year) # loading distributions for all relevant years

pred = trine_infer(instances, [sched_ext], dists, query_year, confidences)

In [6]:
pred[:5] # day of year (35 = 4th Feb)

[35, 33, 38, 34, 36]

TRINE returns a ranked list of predictions, with 4 Feb 2018 (the 35th day of the year) being the top prediction. This matches the actual date of the event (https://en.wikipedia.org/wiki/Super_Bowl_LII). We can use the following code to convert a day of the year to a date.

In [7]:
import datetime

def day_to_date(day, year):
    # day is 1-based: day 1 maps to 1 January of the given year
    return datetime.datetime(year, 1, 1) + datetime.timedelta(days=day - 1)

In [8]:
pred_dates = [day_to_date(day, query_year) for day in pred]
pred_dates[:5]

[datetime.datetime(2018, 2, 4, 0, 0),
 datetime.datetime(2018, 2, 2, 0, 0),
 datetime.datetime(2018, 2, 7, 0, 0),
 datetime.datetime(2018, 2, 3, 0, 0),
 datetime.datetime(2018, 2, 5, 0, 0)]
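
To check a prediction against a known date, the conversion can also be run in the other direction. The following is a minimal sketch, assuming only the top-five output shown above (hard-coded here as `pred_top5`, a name introduced for illustration) and the actual Super Bowl LII date of 4 Feb 2018. It finds the rank of the true date in the list.

```python
import datetime

# Top five predicted days of the year, copied from the output above.
pred_top5 = [35, 33, 38, 34, 36]

# Actual date of the event, converted to a 1-based day of the year.
actual = datetime.date(2018, 2, 4)
actual_day = actual.timetuple().tm_yday

# Rank 1 means the true date was the top prediction.
rank = pred_top5.index(actual_day) + 1
print(actual_day, rank)  # 35 1
```

Note that `list.index` raises `ValueError` if the true day is not in the list, so a longer prediction list (or a guard) would be needed for events that rank lower.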