# tRacket - Noise Analysis Starter Notebook

This notebook is meant to serve as a starting point to analyse noise data collected by the tRacket project and is set up to run on Google Colab. It shows you how to fetch data from the API and use some of our plotting functions to look at the data.

Happy exploring!

PS: If you wish to run the notebook locally, you will need to change the Setup code below to clone the Git repo and install the required packages. We recommend creating a new Python environment for this or, even better, using a Docker container. You can find more info on this approach [here](https://github.com/CivicTechTO/tRacket-dashboard/blob/main/DEV-README.md). Alternatively, the API endpoints that serve the data are listed further down, so you can write your own functions to request the data and format it into dataframes.


----


## Setup

We'll first fetch the tRacket dashboard repo, which comes with a number of useful utilities for interacting with the tRacket API.

In [1]:
!git clone https://github.com/CivicTechTO/tRacket-dashboard.git

fatal: destination path 'tRacket-dashboard' already exists and is not an empty directory.


In [2]:
!pip install -r /content/tRacket-dashboard/requirements.txt



In [3]:
%cd /content/tRacket-dashboard/app/

/content/tRacket-dashboard/app


Once the packages are set up, we are ready to request the data from the API.

## Data Loading

We'll start by importing some utilities.

In [4]:
from src.data_loading.main import AppDataManager
from src.data_loading.models import Granularity
from src.utils import COLUMN



The following object will be our main tool to issue API calls:

In [5]:
data_manager = AppDataManager()

The call below fetches the location data and stores it in an attribute of the `data_manager` object:

In [6]:
data_manager.load_and_format_locations()

The locations are being pulled from this endpoint: `https://api.tracket.info/v1/locations/`
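
If you would rather not depend on the repo utilities, the endpoint can be queried directly. The snippet below is a minimal sketch using only the standard library. `fetch_json` and `fake_opener` are hypothetical helpers (they are not part of the tRacket repo), and the shape of the real response is not shown in this notebook, so the demonstration uses a made-up payload instead of a live call.

```python
import json
from io import BytesIO
from urllib.request import urlopen

LOCATIONS_URL = "https://api.tracket.info/v1/locations/"


def fetch_json(url, opener=urlopen, timeout=30):
    """Request `url` and return the decoded JSON payload.

    `opener` is injectable so the function can be tried out offline.
    """
    with opener(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def fake_opener(url, timeout=None):
    """Stand-in for urlopen; the payload is invented for illustration."""
    return BytesIO(b'{"example": [1, 2, 3]}')


# Offline demonstration with the stand-in opener
payload = fetch_json(LOCATIONS_URL, opener=fake_opener)
print(payload)

# Live call (needs network access); inspect the result before formatting it
# locations = fetch_json(LOCATIONS_URL)
```

Once you have inspected the real payload, you can convert it to a dataframe in whatever way suits its structure.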

In [7]:
data_manager.locations

Unnamed: 0,COLUMN.DEVICEID,COLUMN.LABEL,COLUMN.LAT,COLUMN.LON,COLUMN.ACTIVE,COLUMN.RADIUS,COLUMN.LATEST_TIMESTAMP,COLUMN.SENDING_DATA
0,572227,Scott St. & Wellington St.,43.648156,-79.376049,True,75.452213,2024-07-06 13:03:00-05:00,False
1,572234,Ossington Ave & Dupont St,43.669501,-79.428834,True,84.156117,2024-08-08 14:04:30-05:00,False
2,572250,Kingsley Ave & Symington Ave,43.6685,-79.4524,True,50.0,2024-03-27 16:54:30-05:00,False
3,637773,Scott St. & Wellington St.,43.648,-79.376,False,50.0,1892-01-03 00:11:00-05:00,False
4,664429,Bayview Ave & Sheppard Ave E,43.7666,-79.388,True,50.0,2024-08-17 09:39:32-05:00,False
5,747559,Ossington & Dupont,43.669626,-79.428974,True,67.131135,2025-02-14 05:38:43-05:00,False
6,747580,Ossington & Dupont (9),43.673224,-79.436632,False,15.0,1892-01-03 00:11:00-05:00,False
7,753346,Cosburn & Pape,43.689612,-79.34787,False,220.0,1892-01-03 00:11:00-05:00,False
8,753385,The Queensway and South Kingsway,43.635518,-79.473455,False,200.0,1892-01-03 00:11:00-05:00,False
9,756264,Dupont and Spadina,43.674757,-79.407054,True,50.0,2025-02-13 10:00:04-05:00,False


Given the `DEVICEID`, we can issue a query for the location stats that will tell us the first and last recording date:

In [45]:
device_id = '896325'

In [46]:
data_manager.load_and_format_location_stats(device_id)

0   2024-08-11 13:25:56
Name: COLUMN.START, dtype: datetime64[ns]
0   2025-02-16 10:13:45
Name: COLUMN.END, dtype: datetime64[ns]


Unnamed: 0,COLUMN.MIN,COLUMN.MAX,COLUMN.MEAN,COLUMN.COUNT,COLUMN.START,COLUMN.END
0,52.0,104.0,60.945,47716,2024-08-11 13:25:56-05:00,2025-02-16 10:13:45-05:00


Finally, we can query noise data between two given dates. The data can be fetched at different granularity levels:
- `raw` fetches the raw data, recorded at 5-minute intervals,
- `hourly` aggregates the data to 1-hour intervals,
- `life_time` aggregates the statistics over the whole specified period.

If the `start` and `end` parameters for the timeframe are left empty, the last 7 days of recorded data are fetched (whenever that falls for the sensor). If specified, they need to be `datetime` objects.

The results are stored as an attribute of the `data_manager` object, as shown below.

In [47]:
# options for Granularity
list(Granularity)

[<Granularity.raw: 'raw'>,
 <Granularity.hourly: 'hourly'>,
 <Granularity.life_time: 'life-time'>]

In [48]:
data_manager.load_and_format_location_noise(
    device_id,
    granularity=Granularity.raw,
)

We can also specify the dates manually. If `start` is omitted, the function fetches the 7 days leading up to the `end` date.



In [49]:
from datetime import datetime, timedelta

data_manager.load_and_format_location_noise(
    device_id,
    granularity=Granularity.raw,
    end=datetime(2025, 2, 15),
)

# data_manager.load_and_format_location_noise(
#     device_id,
#     granularity=Granularity.raw,
#     start=datetime(2025, 2, 8),
#     end=datetime(2025, 2, 15),
# )
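
The default-window behaviour described above can be pictured with a small sketch. This is not the repo's implementation: `resolve_window` and its `latest` parameter (the timestamp of the sensor's last recording) are hypothetical names introduced only to illustrate the 7-day fallback.

```python
from datetime import datetime, timedelta

DEFAULT_WINDOW = timedelta(days=7)


def resolve_window(start=None, end=None, latest=None):
    """Return a (start, end) pair, filling in missing bounds.

    - a missing `end` falls back to `latest` (hypothetical: the sensor's
      last recording) or, failing that, to the current time;
    - a missing `start` falls back to 7 days before `end`.
    """
    if end is None:
        end = latest if latest is not None else datetime.now()
    if start is None:
        start = end - DEFAULT_WINDOW
    if start >= end:
        raise ValueError("start must be earlier than end")
    return start, end


window_start, window_end = resolve_window(end=datetime(2025, 2, 15))
print(window_start, window_end)  # 2025-02-08 00:00:00 2025-02-15 00:00:00
```

The resolved start matches the `start=datetime(2025, 2, 8)` used in the commented-out call above. Note that the first row returned below is stamped 19:00 local time on 2025-02-07, which is midnight UTC on 2025-02-08. This suggests that naive `datetime` objects are interpreted as UTC, although that is an inference from the output rather than documented behaviour.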

In [50]:
noise_data_raw = data_manager.location_noise[Granularity.raw].copy()
noise_data_raw.head(10)

Unnamed: 0,COLUMN.MIN,COLUMN.MAX,COLUMN.MEAN,COLUMN.TIMESTAMP
0,58.0,67.0,60.0,2025-02-07 19:00:55-05:00
1,58.0,61.0,59.0,2025-02-07 19:05:55-05:00
2,58.0,61.0,59.0,2025-02-07 19:10:55-05:00
3,58.0,65.0,59.0,2025-02-07 19:15:55-05:00
4,58.0,63.0,59.0,2025-02-07 19:20:55-05:00
5,58.0,62.0,59.0,2025-02-07 19:25:55-05:00
6,58.0,62.0,60.0,2025-02-07 19:30:55-05:00
7,58.0,69.0,59.0,2025-02-07 19:35:55-05:00
8,58.0,61.0,59.0,2025-02-07 19:40:55-05:00
9,58.0,65.0,59.0,2025-02-07 19:45:55-05:00


For reference, here is a sample API endpoint with parameters for this type of call (the dates differ from the ones used above):

`https://api.tracket.info/v1/locations/896325/noise?granularity=raw&start=2024-11-03%2000%3A00%3A00&end=2024-11-11T00%3A00%3A00-04%3A00`
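
A URL like the one above can be assembled with the standard library. The sketch below is one way to do it. `build_noise_url` is a hypothetical helper, and the parameter names (`granularity`, `start`, `end`) are taken from the sample endpoint.

```python
from urllib.parse import quote, urlencode

API_BASE = "https://api.tracket.info/v1"


def build_noise_url(device_id, granularity, start=None, end=None):
    """Build the noise endpoint URL for one device.

    `start` and `end` are passed as already-formatted strings;
    quote_via=quote encodes spaces as %20 (rather than '+') and
    colons as %3A, as in the sample endpoint.
    """
    params = {"granularity": granularity}
    if start is not None:
        params["start"] = start
    if end is not None:
        params["end"] = end
    query = urlencode(params, quote_via=quote)
    return f"{API_BASE}/locations/{device_id}/noise?{query}"


url = build_noise_url(
    "896325",
    "raw",
    start="2024-11-03 00:00:00",
    end="2024-11-11T00:00:00-04:00",
)
print(url)
```

With these inputs the function reproduces the sample endpoint shown above.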

Let's grab some hourly data as well for a longer period.

In [51]:
data_manager.load_and_format_location_noise(
    device_id,
    granularity=Granularity.hourly,
    start=datetime(2024, 12, 15),
    end=datetime(2025, 2, 15),
)

In [52]:
noise_data_hourly = data_manager.location_noise[Granularity.hourly].copy()
noise_data_hourly.head(10)

Unnamed: 0,COLUMN.MIN,COLUMN.MAX,COLUMN.MEAN,COLUMN.TIMESTAMP
0,58.0,83.0,60.333,2024-12-14 19:00:00-05:00
1,58.0,84.0,60.083,2024-12-14 20:00:00-05:00
2,58.0,83.0,59.917,2024-12-14 21:00:00-05:00
3,58.0,80.0,60.0,2024-12-14 22:00:00-05:00
4,58.0,81.0,60.0,2024-12-14 23:00:00-05:00
5,58.0,83.0,60.5,2024-12-15 00:00:00-05:00
6,58.0,85.0,59.667,2024-12-15 01:00:00-05:00
7,58.0,83.0,59.5,2024-12-15 02:00:00-05:00
8,58.0,66.0,59.25,2024-12-15 03:00:00-05:00
9,58.0,64.0,58.833,2024-12-15 04:00:00-05:00
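
To build some intuition for what hourly granularity means, here is a sketch that rolls 5-minute rows up into hourly buckets locally. It assumes the hourly minimum is the smallest raw minimum, the hourly maximum is the largest raw maximum, and the hourly mean is the average of the raw means. The API's actual aggregation may differ. `aggregate_hourly` is a hypothetical helper, and time zones are dropped for brevity.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def aggregate_hourly(rows):
    """Aggregate (timestamp, min, max, mean) rows into hourly buckets."""
    buckets = defaultdict(list)
    for ts, lo, hi, avg in rows:
        # Truncate the timestamp to the start of its hour
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append((lo, hi, avg))

    result = []
    for hour in sorted(buckets):
        los, his, avgs = zip(*buckets[hour])
        result.append((hour, min(los), max(his), round(mean(avgs), 3)))
    return result


raw_rows = [
    # First three raw rows shown earlier in this notebook
    (datetime(2025, 2, 7, 19, 0, 55), 58.0, 67.0, 60.0),
    (datetime(2025, 2, 7, 19, 5, 55), 58.0, 61.0, 59.0),
    (datetime(2025, 2, 7, 19, 10, 55), 58.0, 61.0, 59.0),
    # Invented row, only here to create a second bucket
    (datetime(2025, 2, 7, 20, 0, 55), 57.0, 70.0, 61.0),
]

for row in aggregate_hourly(raw_rows):
    print(row)
```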


Now, we are all set up to start exploring the data.

## Data Visualization

The `src.plotting` module has a number of useful functions to visualize the data in different ways.

In [53]:
from src.plotting import TimeseriesPlotter, HeatmapPlotter
from src.utils import HEATMAP_VALUE

In [54]:
plotter = TimeseriesPlotter(noise_data_raw)

In [55]:
plotter.plot()

We need to add a few columns to the hourly data so the heatmap utility function will work properly:

In [56]:
import pandas as pd

noise_data_hourly[COLUMN.HOUR] = noise_data_hourly[COLUMN.TIMESTAMP].dt.hour
noise_data_hourly[COLUMN.DATE] = pd.to_datetime(noise_data_hourly[COLUMN.TIMESTAMP].dt.date)
noise_data_hourly[COLUMN.MINNOISE] = noise_data_hourly[COLUMN.MIN]
noise_data_hourly[COLUMN.MAXNOISE] = noise_data_hourly[COLUMN.MAX]
noise_data_hourly.head()

Unnamed: 0,COLUMN.MIN,COLUMN.MAX,COLUMN.MEAN,COLUMN.TIMESTAMP,COLUMN.HOUR,COLUMN.DATE,COLUMN.MINNOISE,COLUMN.MAXNOISE
0,58.0,83.0,60.333,2024-12-14 19:00:00-05:00,19,2024-12-14,58.0,83.0
1,58.0,84.0,60.083,2024-12-14 20:00:00-05:00,20,2024-12-14,58.0,84.0
2,58.0,83.0,59.917,2024-12-14 21:00:00-05:00,21,2024-12-14,58.0,83.0
3,58.0,80.0,60.0,2024-12-14 22:00:00-05:00,22,2024-12-14,58.0,80.0
4,58.0,81.0,60.0,2024-12-14 23:00:00-05:00,23,2024-12-14,58.0,81.0
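
The date and hour columns matter because a heatmap of this kind is a grid with one row per date and one column per hour of the day. The internals of `HeatmapPlotter` are not shown in this notebook, so the snippet below is only a conceptual sketch of such a grid in plain Python. `to_grid` is a hypothetical helper, and the values are the hourly maxima from the table above.

```python
from datetime import datetime


def to_grid(rows):
    """Arrange (timestamp, value) pairs into {date: [value per hour 0..23]}.

    Hours without a reading stay None.
    """
    grid = {}
    for ts, value in rows:
        grid.setdefault(ts.date(), [None] * 24)[ts.hour] = value
    return grid


rows = [
    (datetime(2024, 12, 14, 22), 80.0),
    (datetime(2024, 12, 14, 23), 81.0),
    (datetime(2024, 12, 15, 0), 83.0),
]

grid = to_grid(rows)
for day in sorted(grid):
    # Print only the hours that have a reading
    readings = {hour: v for hour, v in enumerate(grid[day]) if v is not None}
    print(day, readings)
```

Each date becomes one row of the heatmap, and the hour selects the cell within that row.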


In [57]:
heatmap_plotter = HeatmapPlotter(noise_data_hourly)

You can call the `plot` method with either `HEATMAP_VALUE.MIN` or `HEATMAP_VALUE.MAX` as the `pivot_value`:

In [58]:
heatmap_plotter.plot(pivot_value=HEATMAP_VALUE.MAX)

In [59]:
heatmap_plotter.plot(pivot_value=HEATMAP_VALUE.MIN)

Feel free to re-use and modify any functions you find in the utilities and share your work with us.