This notebook contains code for several analyses of the ski camera detections, both spatial and temporal. In particular:

- generate heatmaps for the detections, based on the bounding boxes;
- generate heatmaps for a partitioned map of the image;
- generate a temporal analysis of the average detections for each day of the week, and the respective heatmaps.

Every section has a SAVE_IMAGES flag that controls whether the output images are saved to disk or only visualized, and some also have a NORMALIZE_HEATMAPS flag that normalizes the heatmap output (meaning that its values sum to 1).
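As a small illustration of what the normalization does, the sketch below scales a count matrix so that its entries sum to 1. The helper name `normalize_heatmap` and the guard for an all-zero matrix are assumptions added for this example; the cells in this notebook divide by the sum directly.

```python
import numpy as np

def normalize_heatmap(heatmap):
    """Scale a non-negative heatmap so that its entries sum to 1 (hypothetical helper)."""
    total = heatmap.sum()
    if total == 0:
        # No detections at all: return zeros instead of dividing by zero
        return heatmap.astype(float)
    return heatmap / total

counts = np.array([[2, 0], [1, 1]])
normalized = normalize_heatmap(counts)
```

Here the four counts sum to 4, so the cell holding 2 detections ends up with a weight of 0.5.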

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

cams = ('jervskogen_1', 'jervskogen_2', 'nilsbyen_2', 'nilsbyen_3', 'skistua')

data = dict()
for cam in cams:
    data[cam] = pd.read_csv('../data/datasets/'+cam+'.csv')

# Spatial analysis

## Detection heatmap (whole image)

This section generates a heatmap of detections over the whole image: it starts with an all-zero matrix and iterates over all detections, incrementing the region of the matrix inside each detection's bounding box. Any resolution can be used for the matrix, because the bounding-box coordinates in the dataset are normalized.
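The accumulation step can also be written with array slices instead of nested loops. The following is a minimal sketch under stated assumptions: the helper `add_box` is hypothetical, and it uses half-open index ranges clamped to the matrix bounds, which may differ by one pixel from the loop in the cell below.

```python
import numpy as np

def add_box(heatmap, p1x, p1y, p2x, p2y):
    """Increment every cell covered by a box given in normalized coordinates."""
    height, width = heatmap.shape
    # Scale to matrix indices and clamp so that slices never wrap around
    x1 = int(np.clip(round(p1x * width), 0, width))
    x2 = int(np.clip(round(p2x * width), 0, width))
    y1 = int(np.clip(round(p1y * height), 0, height))
    y2 = int(np.clip(round(p2y * height), 0, height))
    heatmap[y1:y2, x1:x2] += 1

heatmap = np.zeros((4, 4))
add_box(heatmap, 0.25, 0.25, 0.75, 0.75)  # covers rows 1-2, columns 1-2
add_box(heatmap, 0.0, 0.0, 0.5, 0.5)      # covers rows 0-1, columns 0-1
```

The two boxes overlap only in cell (1, 1), which therefore ends up with a count of 2.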

In [None]:
SAVE_IMAGES = True

min_conf = 0.5
max_conf = 1

width = 640
height = 480

#cams = ('jervskogen_1', 'jervskogen_2', 'nilsbyen_2', 'nilsbyen_3', 'skistua')

for cam in cams:
    df = data[cam]
    detect_heatmap_bbox = np.zeros([height, width])

    for index, row in df.iterrows():
        if row['class']=='person' and min_conf < row['conf'] <= max_conf:
            p1x = round(row.p1x*width)
            p1y = round(row.p1y*height)
            p2x = round(row.p2x*width)
            p2y = round(row.p2y*height)
    
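            # Clamp the start indices at 0: a negative index would wrap around to the opposite edge
            for x in range(max(p1x - 1, 0), p2x):
                for y in range(max(p1y - 1, 0), p2y):
                    detect_heatmap_bbox[y, x] += 1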
                
    ax = sns.heatmap(detect_heatmap_bbox, xticklabels=False, yticklabels=False, cmap='Reds')
    plt.title(cam)
    #plt.show()

    if SAVE_IMAGES:
        plt.savefig('outputs/' + 'global_heatmap_' + cam + '.jpg', dpi=300, format='jpg')
        
    plt.clf()

## Detection heatmap (partitioned; incremented with all detections; "fast")

Similar to the previous section, but here we assume a partition of the image into a small number of sections and increment all the sections which are touched by the bounding box.

Note: With a large number of partitions this method gives results similar to the previous one, but with a small number of partitions the results differ. I consider this method more accurate, because it partitions the space into equal-sized bins, while the other rounds the position in the image. I am keeping both for now, but will probably remove the first method and stick to this one.
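To make the binning concrete, the sketch below maps a few normalized x coordinates to partition indices with `np.digitize`, using the same bin edges as the cell that follows. With `right=True`, a coordinate of exactly 0 lands in bin 0, so subtracting 1 gives -1; the `np.clip` step is an assumption added here to keep such points in the first cell and is not part of the notebook's code.

```python
import numpy as np

width = 11
# width + 1 equally spaced edges between 0 and 1 define width bins
bins_x = np.linspace(0, 1, num=width + 1, endpoint=True)

xs = np.array([0.0, 0.5, 1.0])
raw = np.digitize(xs, bins_x, right=True) - 1  # -> [-1, 5, 10]
cells = np.clip(raw, 0, width - 1)             # -> [0, 5, 10]
```

Without the clipping, the index -1 would address the last row or column of the matrix rather than the first.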

In [5]:
SAVE_IMAGES = True
NORMALIZE_HEATMAPS = False

min_conf = 0.5
max_conf = 1

width = 11
height = 6

#cams = ('jervskogen_1', 'jervskogen_2', 'nilsbyen_2', 'nilsbyen_3', 'skistua')

for cam in cams:

    df = data[cam]
    # Copy the filtered rows so the new columns below are added to an independent DataFrame
    df = df[(df['class']=='person') & (df['conf']>min_conf)].copy()

    detect_heatmap_partition = np.zeros([height, width])

    # Create bins between 0 and 1 to use normalized detection points
    bins_x = np.linspace(0, 1, num=width+1, endpoint=True)
    bins_y = np.linspace(0, 1, num=height+1, endpoint=True)
    
    df['pd1x'] = np.digitize(df.p1x,bins_x,right=True)-1
    df['pd1y'] = np.digitize(df.p1y,bins_y,right=True)-1
    df['pd2x'] = np.digitize(df.p2x,bins_x,right=True)-1
    df['pd2y'] = np.digitize(df.p2y,bins_y,right=True)-1
    
    for index, row in df.iterrows():
        detect_heatmap_partition[np.ix_(np.arange(row['pd1y'],row['pd2y']+1),np.arange(row['pd1x'],row['pd2x']+1))] += 1

    if NORMALIZE_HEATMAPS:
        detect_heatmap_partition = detect_heatmap_partition/detect_heatmap_partition.sum()
        
    ax = sns.heatmap(detect_heatmap_partition, xticklabels=False, yticklabels=False, cmap='Reds')
    plt.title(cam)

    if SAVE_IMAGES:
        plt.savefig('outputs/' + 'partition_heatmap_fast_' + cam + '.jpg', dpi=300, format='jpg')
    
    plt.clf()

<Figure size 432x288 with 0 Axes>

## Detection heatmap (partitioned; incremented with all detections; "slow")

In [None]:
SAVE_IMAGES = True
NORMALIZE_HEATMAPS = True

min_conf = 0.5
max_conf = 1

width = 11
height = 6

#cams = ('jervskogen_1', 'jervskogen_2', 'nilsbyen_2', 'nilsbyen_3', 'skistua')

for cam in cams:

    df = data[cam]

    detect_heatmap_partition = np.zeros([height, width])

    # Create bins between 0 and 1 to use normalized detection points
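    # linspace guarantees exactly width+1 (height+1) edges; arange with a float step may add an extra one
    bins_x = np.linspace(0, 1, num=width+1, endpoint=True)
    bins_y = np.linspace(0, 1, num=height+1, endpoint=True)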

    for index, row in df.iterrows():
        if row['class'] == 'person' and min_conf < row['conf'] <= max_conf:
            # Top left point
            p1 = [row.p1x, row.p1y]

            pd1 = [np.digitize(p1[0],bins_x, right=True) - 1, np.digitize(p1[1],bins_y,right=True) - 1]
            
            # Bottom right point
            
            p2 = [row.p2x, row.p2y]

            pd2 = [np.digitize(p2[0],bins_x,right=True) - 1, np.digitize(p2[1],bins_y,right=True) - 1]
            
            # Increment heatmap matrix
            detect_heatmap_partition[np.ix_(np.arange(pd1[1],pd2[1]+1),np.arange(pd1[0],pd2[0]+1))] += 1

    if NORMALIZE_HEATMAPS:
        detect_heatmap_partition = detect_heatmap_partition/detect_heatmap_partition.sum()
        
    ax = sns.heatmap(detect_heatmap_partition, xticklabels=False, yticklabels=False, cmap='Reds')
    plt.title(cam)

    if SAVE_IMAGES:
        plt.savefig('outputs/' + 'partition_heatmap_' + cam + '.jpg', dpi=300, format='jpg')
    
    plt.clf()

## Detection heatmap (partitioned; incremented with yes/no)

This is much the same as the previous partitioned heatmap, but here the heatmap measures whether each partition had a detection at a given timestamp, regardless of how many people were actually detected.
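The idea can be sketched by grouping detections by timestamp and marking each touched partition once per group. The data below is made up for illustration, and the column names `pd1x`, `pd1y`, `pd2x`, `pd2y` are assumed to hold already-binned corner indices, as in the cell that follows.

```python
import numpy as np
import pandas as pd

height, width = 2, 3

# Hypothetical, already-binned detections (partition indices of the box corners)
df = pd.DataFrame({
    'timestamp': ['t0', 't0', 't1'],
    'pd1x': [0, 0, 1], 'pd1y': [0, 0, 0],
    'pd2x': [1, 0, 2], 'pd2y': [0, 1, 1],
})

presence = np.zeros((height, width))
for _, group in df.groupby('timestamp'):
    # Boolean mask of the partitions touched by any detection at this timestamp
    frame = np.zeros((height, width), dtype=bool)
    for row in group.itertuples(index=False):
        frame[row.pd1y:row.pd2y + 1, row.pd1x:row.pd2x + 1] = True
    # Each touched partition counts once, however many boxes overlap it
    presence += frame
```

Partition (0, 0) is covered by two boxes at `t0` but still contributes only 1, while partition (0, 1) is touched at both timestamps and ends up at 2.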

In [4]:
SAVE_IMAGES = True
NORMALIZE_HEATMAPS = False

min_conf = 0.5
max_conf = 1

width = 11
height = 6

#cams = ('jervskogen_1', 'jervskogen_2', 'nilsbyen_2', 'nilsbyen_3', 'skistua')

for cam in cams:

    df = data[cam]
    # Copy the filtered rows so the new columns below are added to an independent DataFrame
    df = df[(df['class']=='person') & (df['conf']>min_conf)].copy()

    detect_heatmap_partition = np.zeros([height, width])

    # Create bins between 0 and 1 to use normalized detection points
    bins_x = np.linspace(0, 1, num=width+1, endpoint=True)
    bins_y = np.linspace(0, 1, num=height+1, endpoint=True)
    
    df['pd1x'] = np.digitize(df.p1x,bins_x,right=True)-1
    df['pd1y'] = np.digitize(df.p1y,bins_y,right=True)-1
    df['pd2x'] = np.digitize(df.p2x,bins_x,right=True)-1
    df['pd2y'] = np.digitize(df.p2y,bins_y,right=True)-1
    
    current_time = df.iloc[0]['timestamp']
    time_detections = np.zeros([height, width])
    for index, row in df.iterrows():
        if row['timestamp'] != current_time:
            current_time = row['timestamp']
            detect_heatmap_partition += time_detections
            time_detections = np.zeros([height, width])
        
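        time_detections[np.ix_(np.arange(row['pd1y'],row['pd2y']+1),np.arange(row['pd1x'],row['pd2x']+1))] += 1
        time_detections[time_detections>0]=1

    # Add the partitions of the last timestamp, which the loop above never flushes
    detect_heatmap_partition += time_detections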

    if NORMALIZE_HEATMAPS:
        detect_heatmap_partition = detect_heatmap_partition/detect_heatmap_partition.sum()
        
    ax = sns.heatmap(detect_heatmap_partition, xticklabels=False, yticklabels=False, cmap='Reds')
    plt.title(cam)

    if SAVE_IMAGES:
        plt.savefig('outputs/' + 'partition_heatmap_single_' + cam + '.jpg', dpi=300, format='jpg')
    
    plt.clf()

<Figure size 432x288 with 0 Axes>

# Temporal analysis

This section contains code to analyse the detections of skiers over time. It starts by resampling the dataframe to obtain the number of detections per time period (for now, 1 hour) and plots the average number of detections per hour for every weekday. This information is then summarized in a temporal heatmap, with hour and weekday on the axes.
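The resampling and averaging steps can be tried on a tiny example first. The timestamps below are invented for illustration, and the sketch labels each hourly bin by its left edge (the pandas default), whereas the cell below uses `label='right'`.

```python
import pandas as pd

# Four hypothetical detections on two consecutive Mondays
timestamps = pd.to_datetime([
    '2022-01-03 10:05:00',
    '2022-01-03 10:05:00',
    '2022-01-03 10:40:00',
    '2022-01-10 10:15:00',
])

# One row per detection, summed into 60-minute bins (empty hours become 0)
hourly = pd.Series(1, index=timestamps).resample('60min').sum()

# Average count for every (weekday, hour) pair; weekday 0 is Monday
means = hourly.groupby([hourly.index.weekday, hourly.index.hour]).mean()

# Hours as rows, weekdays as columns: the layout used for the temporal heatmap
table = means.unstack(level=0)
```

Monday at 10:00 has 3 detections in the first week and 1 in the second, so its average is 2.0, and every other cell of the table is 0.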

In [None]:
SAVE_IMAGES = True
NORMALIZE_HEATMAPS = True

# cams = ('jervskogen_1', 'jervskogen_2', 'nilsbyen_2', 'nilsbyen_3', 'skistua')

weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

for cam in cams:
    df_cam = data[cam]
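    # Copy the filtered rows so the timestamp conversion below does not act on a slice of the original
    df_cam = df_cam[(df_cam['class']=='person') & (df_cam['conf']>0.5)].copy()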
    df_cam['timestamp'] = pd.to_datetime(df_cam['timestamp'])

    df_tmp = pd.DataFrame()
    df_tmp['value'] = df_cam['timestamp'].sort_values().value_counts(sort=False).resample('H',label='right').sum()
    df_tmp = df_tmp.groupby([df_tmp.index.weekday,df_tmp.index.hour]).mean()
    df_tmp.index = df_tmp.index.set_levels(levels=['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'], level=0)
    df_tmp = df_tmp.unstack(level=0)
    
    df_agg = pd.DataFrame()
    df_agg['value'] = df_cam['timestamp'].sort_values().value_counts(sort=False).resample('H',label='right').sum()
    df_stats = pd.DataFrame()
    df_stats['mean'] = df_agg['value'].groupby([df_agg.index.weekday,df_agg.index.hour]).mean()
    df_stats['std'] = df_agg['value'].groupby([df_agg.index.weekday,df_agg.index.hour]).std()
    df_stats['median'] = df_agg['value'].groupby([df_agg.index.weekday,df_agg.index.hour]).median()
    df_stats.index = df_stats.index.set_levels(levels=['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'], level=0)
    df_stats = df_stats.unstack(level=0)

    ax = df_stats['mean'].plot(kind='bar', subplots=True, rot=0, figsize=(20, 10), layout=(3, 3), sharey=True,  yerr=df_stats['std'], capsize=4)

    if SAVE_IMAGES:
        plt.savefig('outputs/' + 'temporal_analysis_' + cam + '.jpg', dpi=300, format='jpg')
        
    plt.clf()
    
    plt.figure()
    
    if NORMALIZE_HEATMAPS:
        df_stats['mean'] = df_stats['mean']/df_stats['mean'].sum()
        
    sns.heatmap(df_stats['mean'], xticklabels=True, yticklabels=True, cmap='Reds')
    if SAVE_IMAGES:
        plt.savefig('outputs/' +'temporal_heatmap_' + cam + '.jpg', dpi=300, format='jpg')
    
    plt.clf()
    
    df_agg['weekday'] = df_agg.index.weekday
    df_agg['weekday'] = df_agg['weekday'].replace({0: 'Monday',
                                                   1: 'Tuesday',
                                                   2: 'Wednesday',
                                                   3: 'Thursday',
                                                   4: 'Friday',
                                                   5: 'Saturday',
                                                   6: 'Sunday'})
    df_agg['hour'] = df_agg.index.hour
    sns.catplot(x='hour',y='value',data=df_agg, col='weekday', orient="v", kind='box', color='r',  col_wrap=3, height=5, aspect=1.5, col_order=['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'])
    if SAVE_IMAGES:
        plt.savefig('outputs/' +'temporal_boxplot_' + cam + '.jpg', dpi=300, format='jpg')
    
    plt.clf()

# Time plot of weekday/hour

In [None]:
cam='jervskogen_1'

df_cam = data[cam]
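# Copy the filtered rows so the timestamp conversion below does not act on a slice of the original
df_cam = df_cam[(df_cam['class']=='person') & (df_cam['conf']>0.5)].copy()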
df_cam['timestamp'] = pd.to_datetime(df_cam['timestamp'])
#df_cam = df_cam[df_cam['timestamp'] < '2022-02-17']
    
df_agg = pd.DataFrame()
df_agg['value'] = df_cam['timestamp'].sort_values().value_counts(sort=False).resample('H',label='right').sum()
df_agg['weekday'] = df_agg.index.weekday
df_agg['weekday'] = df_agg['weekday'].replace({0: 'Monday',
                                                1: 'Tuesday',
                                                2: 'Wednesday',
                                                3: 'Thursday',
                                                4: 'Friday',
                                                5: 'Saturday',
                                                6: 'Sunday'})
df_agg['hour'] = df_agg.index.hour

In [None]:
day = 'Monday'
hour = 5

df_agg[(df_agg['weekday'] == day) & (df_agg['hour'] == hour)]['value'].plot()

# Playground

In [15]:
cam='jervskogen_1'

df_cam = data[cam]
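# Copy the filtered rows so the timestamp conversion below does not act on a slice of the original
df_cam = df_cam[(df_cam['class']=='person') & (df_cam['conf']>0.5)].copy()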
df_cam['timestamp'] = pd.to_datetime(df_cam['timestamp'])

In [24]:
df_cam['timestamp'].sort_values().value_counts(sort=False).resample('10T',label='right').sum()

2021-12-11 11:50:00    2
2021-12-11 12:00:00    0
2021-12-11 12:10:00    2
2021-12-11 12:20:00    2
2021-12-11 12:30:00    2
                      ..
2022-03-02 20:30:00    0
2022-03-02 20:40:00    0
2022-03-02 20:50:00    0
2022-03-02 21:00:00    0
2022-03-02 21:10:00    1
Freq: 10T, Name: timestamp, Length: 11721, dtype: int64

In [19]:
df_cam

Unnamed: 0,timestamp,p1x,p1y,p2x,p2y,conf,class
59,2021-12-11 11:40:03,0.541712,0.177989,0.557701,0.252529,0.686277,person
60,2021-12-11 11:40:03,0.598070,0.206958,0.611124,0.252961,0.684787,person
71,2021-12-11 12:00:03,0.280408,0.151343,0.291415,0.202466,0.562436,person
75,2021-12-11 12:00:03,0.313848,0.134304,0.328462,0.197118,0.600085,person
78,2021-12-11 12:10:03,0.462418,0.125656,0.471604,0.154197,0.648706,person
...,...,...,...,...,...,...,...
15818,2022-03-01 19:40:04,0.497933,0.101128,0.507380,0.148893,0.541385,person
15824,2022-03-01 19:50:03,0.492877,0.092347,0.505089,0.149029,0.513668,person
15835,2022-03-01 20:10:04,0.154104,0.185068,0.171353,0.256689,0.588613,person
15910,2022-03-02 20:10:03,0.301589,0.127657,0.314845,0.167388,0.658516,person
