# Site and Device Detections Exploration

This notebook explores detection patterns across AudioMoth devices and sites.
We summarise detection counts at the device and site level, examine temporal patterns (week, month, hour of day), and compare device-level detections with aggregated site-level activity.

## What this notebook covers

- Number of detections by device, with overall % and site-level %
- Number of detections by week, combined
- Number of detections by week, per device
- Number of detections by month, combined
- Number of detections by month, per device
- Overall daily detection patterns
- Daily detection patterns per device


## Set up system path and load data

In [None]:
import sys
import os
from pathlib import Path
import pandas as pd


# Go up one level to .../audiomoth
PROJECT_ROOT = Path(os.getcwd()).resolve().parent

# Add project root to sys.path so `src` is importable
sys.path.insert(0, str(PROJECT_ROOT))

PROCESSED_DATA_PATH = Path(PROJECT_ROOT) / "data_processed" / "analysis_df.parquet"
analysis_df = pd.read_parquet(PROCESSED_DATA_PATH)

# Make pandas show more columns/rows while exploring
pd.set_option("display.max_columns", 50)
pd.set_option("display.width", 120)

## Detections by device

In [None]:
DEVICE_COL = "device"

# Count detections for each device, keeping its site alongside
device_detections_summary = (
    analysis_df.groupby([DEVICE_COL, "site"])
    .size()
    .rename("detections")
    .reset_index()
    .sort_values(["site", DEVICE_COL])
    .reset_index(drop=True)
)

# Share of all detections contributed by each device
total = len(analysis_df)
device_detections_summary["%"] = (
    device_detections_summary["detections"] / total * 100
).round(2)

# Share of the site's detections contributed by each device
device_detections_summary["% within site"] = (
    device_detections_summary["detections"]
    / device_detections_summary.groupby("site")["detections"].transform("sum")
    * 100
).round(2)

device_detections_summary

In [None]:
import src.data_store as data_store

data_store.save_dataframe_to_csv(
    device_detections_summary,
    Path(PROJECT_ROOT) / "outputs",
    "device_detections_summary",
)

## Detections by week

### Overall

In [None]:
weekly_detections_summary = (
    analysis_df.groupby("week").size().rename("detections").reset_index()
)

In [None]:
# Total detections and number of distinct devices with detections, per week
weekly_summary = (
    analysis_df.groupby("week")
    .agg(detections=("device", "size"), active_devices=("device", "nunique"))
    .reset_index()
)

# Effort-adjusted rate: detections per active device
weekly_summary["detections_per_device"] = (
    weekly_summary["detections"] / weekly_summary["active_devices"]
).round(1)

weekly_summary

In [None]:
# Daily detection counts for a single week, for comparison against
# known battery loss/change dates. Change the week number as needed.
analysis_df.loc[analysis_df["week"] == 15, "date"].value_counts().sort_index()

Weekly detection totals varied substantially across the study period. When normalised by the number of active recording devices, however, detections per device increased clearly from late February into late March, peaking in weeks 13–14. This pattern is consistent with increased vocal activity during the spring breeding period. Weeks 10, 11, and 15 likely represent partial recording periods with reduced device availability, most likely caused by battery drop-outs.
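
One way to check the partial-week interpretation is to count how many distinct dates each device produced detections on in each week. The sketch below uses a small hypothetical frame with the same `device`, `week`, and `date` columns as `analysis_df`. The toy values, the helper name `days_with_detections`, and the 7-day threshold are assumptions for illustration only. A day without detections is not necessarily a day without recording, so the result should be read alongside the known battery change dates.

```python
import pandas as pd

# Hypothetical toy data standing in for analysis_df (values are made up)
toy_df = pd.DataFrame(
    {
        "device": ["A", "A", "A", "B", "B", "A", "B"],
        "week": [14, 14, 14, 14, 14, 15, 15],
        "date": pd.to_datetime(
            [
                "2024-04-01", "2024-04-02", "2024-04-03",
                "2024-04-01", "2024-04-01",
                "2024-04-08", "2024-04-09",
            ]
        ),
    }
)


def days_with_detections(df: pd.DataFrame) -> pd.DataFrame:
    """Count distinct dates with at least one detection per week and device."""
    return (
        df.groupby(["week", "device"])["date"]
        .nunique()
        .rename("days_with_detections")
        .reset_index()
    )


coverage = days_with_detections(toy_df)

# Flag device-weeks with fewer than 7 detection days (assumed threshold)
coverage["partial_week"] = coverage["days_with_detections"] < 7
print(coverage)
```

Applied to `analysis_df`, device-weeks with few detection days would be the candidates to compare against battery drop-out records.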

### Per Device

In [None]:
weekly_detections_per_device = (
    analysis_df.groupby(["site", "device", "week"])
    .size()
    .rename("detections")
    .reset_index()
    .sort_values(["site", "device", "week"])
    .reset_index(drop=True)
)

# Scale each device's weekly counts by that device's maximum weekly count,
# so the busiest week for every device equals 1.0
weekly_detections_per_device["detections_relative"] = (
    weekly_detections_per_device.groupby("device")["detections"]
    .transform(lambda x: x / x.max())
)

weekly_detections_per_device

In [None]:
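# Pivot to wide format (one row per device, one column per week) for easier reading.
# Each device/week pair has a single row, so "sum" just carries the count through.
weekly_detections_wide = weekly_detections_per_device.pivot_table(
    index="device", columns="week", values="detections", aggfunc="sum", fill_value=0
)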

weekly_detections_wide

In [None]:
# Detections per device within each site and week
weekly_device_activity = (
    analysis_df.groupby(["site", "week", "device"])
    .size()
    .rename("detections")
    .reset_index()
)

# Site-level totals plus the number of devices with detections that week
weekly_site_effort = (
    weekly_device_activity.groupby(["site", "week"])
    .agg(
        total_detections=("detections", "sum"),
        active_devices=("device", "nunique"),
    )
    .reset_index()
)

# Effort-adjusted rate: detections per active device at each site
weekly_site_effort["detections_per_device"] = (
    weekly_site_effort["total_detections"] / weekly_site_effort["active_devices"]
)

In [None]:
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 6))

for site, g in weekly_site_effort.groupby("site"):
    plt.plot(g["week"], g["detections_per_device"], label=site)

plt.xlabel("Week")
plt.ylabel("Detections per active device")
plt.title("Weekly detections per active device (effort-adjusted)")
plt.legend(bbox_to_anchor=(1.05, 1), loc="upper left")
plt.tight_layout()
plt.show()