# Traces

Taking measurements at irregular intervals is common, but most tools are primarily designed for evenly-spaced measurements. Real-world time series also tend to have missing observations, and you may have multiple series recorded at different frequencies: it can be useful to model all of these as unevenly-spaced.

Traces was designed by the team at [Datascope](https://datascopeanalytics.com/) based on several practical applications in different domains, because it turns out unevenly-spaced data is actually pretty great, particularly for sensor data analysis.

<img src="./traces.png" width="700" alt="traces">

In [41]:
import sys
import glob

from datetime import datetime, timedelta

import traces
from traces.utils import datetime_range

# Basic Data Structure: `traces.TimeSeries`

To see a basic use of traces, let's look at these data from a light switch, also known as Big Data from the Internet of Things.

<img src="./trace.svg" width="700" alt="traces">

The main object in traces is a TimeSeries, which you create just like a dictionary, adding the five measurements at 6:00am, 7:45:56am, etc.

In [42]:
time_series = traces.TimeSeries()
time_series[datetime(2042, 2, 1,  6,  0,  0)] = 0 #  6:00:00am
time_series[datetime(2042, 2, 1,  7, 45, 56)] = 1 #  7:45:56am
time_series[datetime(2042, 2, 1,  8, 51, 42)] = 0 #  8:51:42am
time_series[datetime(2042, 2, 1, 12,  3, 56)] = 1 # 12:03:56pm
time_series[datetime(2042, 2, 1, 12,  7, 13)] = 0 # 12:07:13pm


In [47]:
time_series

<TimeSeries>
{datetime.datetime(2042, 2, 1, 6, 0): 0,
 datetime.datetime(2042, 2, 1, 7, 45, 56): 1,
 datetime.datetime(2042, 2, 1, 8, 51, 42): 0,
 datetime.datetime(2042, 2, 1, 12, 3, 56): 1,
 datetime.datetime(2042, 2, 1, 12, 7, 13): 0}
</TimeSeries>

# Lookups

What if you want to know whether the light was on at 11am? Unlike a Python dictionary, a `TimeSeries` lets you look up the value at any time, even if it's not one of the measurement times.

In [48]:
time_series[datetime(2042, 2, 1, 11,  0, 0)] # 11:00am

0

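Above we've looked up the 11:00am timestamp and determined that the light was in fact not on, even though that timestamp didn't exist in the time series. The lookup returns the most recent measurement at or before the requested time, which here is the 0 recorded at 8:51:42am.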
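To make the lookup rule concrete, here is a minimal pure-Python sketch of a "previous value" lookup over the same five measurements. It does not use traces and is not meant to describe how traces works internally; the names `measurements` and `previous_value` are hypothetical, chosen only for this illustration.

```python
from bisect import bisect_right
from datetime import datetime

# The light switch measurements from above, sorted by time.
measurements = [
    (datetime(2042, 2, 1, 6, 0, 0), 0),
    (datetime(2042, 2, 1, 7, 45, 56), 1),
    (datetime(2042, 2, 1, 8, 51, 42), 0),
    (datetime(2042, 2, 1, 12, 3, 56), 1),
    (datetime(2042, 2, 1, 12, 7, 13), 0),
]


def previous_value(measurements, when, default=None):
    """Return the value of the latest measurement at or before `when`.

    Hypothetical helper for illustration; assumes `measurements` is
    sorted by time. Returns `default` if `when` precedes every measurement.
    """
    times = [t for t, _ in measurements]
    # Number of measurement times that are <= `when`.
    index = bisect_right(times, when)
    if index == 0:
        return default
    return measurements[index - 1][1]


light_at_11am = previous_value(measurements, datetime(2042, 2, 1, 11, 0, 0))
print(light_at_11am)  # 0, the value recorded at 8:51:42am
```

The binary search is why this kind of lookup stays cheap even when there are many measurements: no resampling onto a regular grid is needed.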

# Distribution

The `distribution` method gives you the fraction of time that the `TimeSeries` spends in each state between `start` and `end`.

In [46]:
time_series.distribution(start=datetime(2042, 2, 1,  6,  0,  0), # 6:00am
                         end=datetime(2042, 2, 1, 13,  0,  0))   # 1:00pm

Histogram({0: 0.8355952380952381, 1: 0.16440476190476191})

The light was on about 16% of the time between 6am and 1pm.
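As a sanity check on that number, the same fractions can be worked out by hand: hold each value until the next measurement, add up the seconds spent in each state, and divide by the length of the window. The sketch below does this in plain Python without traces; `time_fractions` is a hypothetical helper written for this example, and it assumes there is a measurement at or before `start`.

```python
from collections import defaultdict
from datetime import datetime

# The light switch measurements from above, sorted by time.
measurements = [
    (datetime(2042, 2, 1, 6, 0, 0), 0),
    (datetime(2042, 2, 1, 7, 45, 56), 1),
    (datetime(2042, 2, 1, 8, 51, 42), 0),
    (datetime(2042, 2, 1, 12, 3, 56), 1),
    (datetime(2042, 2, 1, 12, 7, 13), 0),
]


def time_fractions(measurements, start, end):
    """Fraction of the window [start, end) spent in each state.

    Hypothetical helper for illustration. Each value is held until the
    next measurement, and the first measurement must be at or before `start`.
    """
    if not measurements or measurements[0][0] > start:
        raise ValueError("need a measurement at or before start")

    seconds_in_state = defaultdict(float)
    cursor, state = start, None
    for when, value in measurements:
        if when <= start:
            # Still before the window: just track the state entering it.
            state = value
        elif when < end:
            # State changes inside the window: close out the previous stretch.
            seconds_in_state[state] += (when - cursor).total_seconds()
            cursor, state = when, value
    # Close out the final stretch up to the end of the window.
    seconds_in_state[state] += (end - cursor).total_seconds()

    total = (end - start).total_seconds()
    return {s: secs / total for s, secs in seconds_in_state.items()}


fractions = time_fractions(
    measurements,
    start=datetime(2042, 2, 1, 6, 0, 0),   # 6:00am
    end=datetime(2042, 2, 1, 13, 0, 0),    # 1:00pm
)
print(fractions)
```

The light is on for 3,946 seconds in the morning and 197 seconds around noon, so 4,143 of the 25,200 seconds in the window, which matches the 0.1644 reported above.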