feat: add synthetic surveillance event generator utility #139

Merged
Devnil434 merged 2 commits into Devnil434:main from upasana-2006:synthetic-event-generator on Jun 2, 2026
Conversation

upasana-2006 (Contributor) commented Jun 1, 2026

Description

This PR introduces a synthetic surveillance event generator utility to help developers create realistic test data for reasoning, analytics, and future AI workflows without requiring real surveillance footage.

Changes Made

  • Added utils/synthetic_event_generator.py

  • Added support for generating synthetic surveillance events:

    • Restricted Zone Intrusion
    • Loitering
    • Object Abandonment
    • Crowd Formation
    • Normal Movement
  • Added configurable event generation with timestamps and metadata

  • Added JSON export functionality for generated events

  • Added unit tests for event creation, validation, and export behavior

  • Added package initialization file (utils/__init__.py)
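
The JSON export step listed above could look roughly like the following. This is a minimal sketch rather than the PR's actual code: the name `export_events_to_json` is taken from the review summary, while its exact signature, the `indent=2` value, the `ensure_ascii=False` choice, and the sample event fields are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List


def export_events_to_json(events: List[Dict[str, Any]], path: str) -> None:
    """Write events to a UTF-8 encoded, indented JSON file (assumed signature)."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(events, handle, indent=2, ensure_ascii=False)


# Hypothetical sample payloads; the real schema may carry more fields.
sample_events = [
    {
        "person_id": 7,
        "event_type": "loitering",
        "timestamp": "2026-06-01T12:00:00+00:00",
        "location": {"x": 640, "y": 360},
    },
    {
        "person_id": 12,
        "event_type": "normal_movement",
        "timestamp": "2026-06-01T12:00:30+00:00",
        "location": {"x": 100, "y": 900},
    },
]

# Round-trip through a temporary file to confirm the export is valid JSON.
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = Path(tmp_dir) / "events.json"
    export_events_to_json(sample_events, str(out_path))
    loaded = json.loads(out_path.read_text(encoding="utf-8"))

print(len(loaded))
```

Storing timestamps as ISO 8601 strings, as assumed here, keeps the payload serializable without a custom JSON encoder.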

Why This Change?

Currently, testing surveillance-related workflows requires manually creating event payloads or relying on external datasets. This utility provides a reusable way to generate realistic surveillance events for development, testing, demonstrations, and future benchmarking.

Testing

pytest tests/test_synthetic_event_generator.py -v

All tests pass successfully.

Related Issue

Fixes #138

Summary by CodeRabbit

  • New Features

    • Added synthetic event generation utilities for creating test data, including individual event creation, batch generation with configurable intervals, and JSON export functionality.
  • Tests

    • Added comprehensive test suite validating event generation, invalid input handling, and export functionality.


coderabbitai Bot commented Jun 1, 2026

Warning

Review limit reached

@upasana-2006, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 53 minutes. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e0d7d470-7251-4957-993e-900bb728a198

📥 Commits

Reviewing files that changed from the base of the PR and between 11caa1e and 73020de.

📒 Files selected for processing (1)
  • tests/test_synthetic_event_generator.py
📝 Walkthrough

Walkthrough

This PR adds a synthetic event generator utility module (utils/synthetic_event_generator.py) that creates realistic surveillance event payloads for testing and development, along with comprehensive pytest test coverage. The module provides functions to generate single events, batch events with configurable intervals, and export events to JSON.

Changes

Synthetic Event Generation Utility

| Layer / File(s) | Summary |
| --- | --- |
| Event generator implementation (utils/synthetic_event_generator.py) | Defines EVENT_TYPES constant listing supported event types. generate_event() creates a single event dict with randomized person_id, location, confidence, and UTC timestamp, plus event-type-specific metadata. _build_metadata() helper returns conditional metadata based on event type. generate_events() creates a list of events spaced by a configurable interval in seconds. export_events_to_json() writes the events list to a JSON file with UTF-8 encoding and indentation. |
| Event generator test coverage (tests/test_synthetic_event_generator.py) | Validates that generate_event() returns the required keys and rejects invalid event types with ValueError. Verifies that generate_events() returns the exact count, rejects zero/negative counts with ValueError, and confirms all returned events have types in EVENT_TYPES. Tests that export_events_to_json() writes a valid JSON file with the correct event count. |
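
To make the summary above concrete, here is a rough sketch of how single-event generation with type-specific metadata might be structured. It is not the merged implementation: the string values in `EVENT_TYPES`, the `duration_seconds` metadata field, and the ISO 8601 timestamp format are assumptions, while the value ranges for `person_id`, `location`, and `confidence` and the `zone_id`/`severity` choices follow the diff quoted in the review.

```python
import random
from datetime import datetime, timezone
from typing import Any, Dict, Optional

# Assumed identifiers; the PR description lists the types only by display name.
EVENT_TYPES = [
    "restricted_zone_intrusion",
    "loitering",
    "object_abandonment",
    "crowd_formation",
    "normal_movement",
]


def _build_metadata(event_type: str) -> Dict[str, Any]:
    """Return metadata that depends on the event type."""
    if event_type == "restricted_zone_intrusion":
        return {
            "zone_id": random.choice(["restricted_lab", "server_room", "staff_only"]),
            "severity": random.choice(["medium", "high"]),
        }
    if event_type == "loitering":
        # Hypothetical field, shown only to illustrate the conditional shape.
        return {"duration_seconds": random.randint(60, 600)}
    return {}


def generate_event(
    event_type: Optional[str] = None,
    person_id: Optional[int] = None,
    timestamp: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Build one synthetic event dict, filling unspecified fields randomly."""
    if event_type is None:
        event_type = random.choice(EVENT_TYPES)
    if event_type not in EVENT_TYPES:
        raise ValueError(f"unsupported event_type: {event_type!r}")
    if person_id is None:
        person_id = random.randint(1, 50)
    if timestamp is None:
        timestamp = datetime.now(timezone.utc)
    return {
        "person_id": person_id,
        "event_type": event_type,
        "timestamp": timestamp.isoformat(),  # assumed serialization format
        "location": {"x": random.randint(0, 1920), "y": random.randint(0, 1080)},
        "confidence": round(random.uniform(0.65, 0.99), 2),
        "metadata": _build_metadata(event_type),
    }


event = generate_event(event_type="loitering", person_id=3)
print(sorted(event))
```

Validating `event_type` before building the payload matches the tested behavior described in the table, where an invalid type raises ValueError.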

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 A generator springs to life, so spry,
Creating events that never lie,
With metadata bright and timestamps true,
Test data flows like morning dew,
Surveillance dreams now within reach! 🎯

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding a synthetic surveillance event generator utility. It is clear, concise, and directly related to the changeset. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue #138 requirements: generates realistic surveillance events for loitering, restricted-zone intrusion, crowd formation, and object abandonment; provides configurable event payloads with required schema (person_id, event_type, timestamp, location); exports to JSON; and includes comprehensive unit tests. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to the stated objectives. The new files (utils/synthetic_event_generator.py and tests/test_synthetic_event_generator.py) implement the requested event generator with configuration, validation, and JSON export; no out-of-scope modifications detected. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
tests/test_synthetic_event_generator.py (1)

29-53: ⚡ Quick win

Add tests for deterministic output and interval validation.

Given the feature goal, include a deterministic-generation test (same seed ⇒ same payload sequence) and a negative/zero interval_seconds validation test to prevent regressions.

Suggested test additions
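+from datetime import datetime, timezone
+
+def test_generate_events_is_reproducible_with_seed():
+    # Pin start_time: the default is datetime.now(), which differs between calls.
+    start = datetime(2026, 1, 1, tzinfo=timezone.utc)
+    events_a = generate_events(count=3, start_time=start, seed=123)
+    events_b = generate_events(count=3, start_time=start, seed=123)
+    assert events_a == events_b
+
+def test_generate_events_rejects_invalid_interval():
+    with pytest.raises(ValueError):
+        generate_events(count=3, interval_seconds=0)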
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_synthetic_event_generator.py` around lines 29 - 53, Add two tests
to tests/test_synthetic_event_generator.py: one that verifies deterministic
output by calling generate_events with a fixed seed (e.g., seed=123) twice and
asserting the two returned event lists are identical (use generate_events
function and EVENT_TYPES to validate payloads), and another that asserts
generate_events raises ValueError when called with interval_seconds=0 or a
negative value to enforce interval validation (test the function generate_events
for invalid interval_seconds). Ensure the tests use the same helpers already
present (generate_events, EVENT_TYPES) and follow the existing pytest patterns
(with pytest.raises for the negative case).
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 59fb0757-95ff-4ff7-a8d9-7815e657d223

📥 Commits

Reviewing files that changed from the base of the PR and between aaad7cc and 11caa1e.

📒 Files selected for processing (3)
  • tests/test_synthetic_event_generator.py
  • utils/__init__.py
  • utils/synthetic_event_generator.py

Comment on lines +17 to +24
def generate_event(
    event_type: Optional[str] = None,
    person_id: Optional[int] = None,
    timestamp: Optional[datetime] = None,
) -> Dict[str, Any]:
    if event_type is None:
        event_type = random.choice(EVENT_TYPES)

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add deterministic generation support to meet the reproducibility objective.

Event generation always uses global randomness, so callers cannot reproduce the same dataset across runs. Add an optional seed or RNG parameter and thread it through batch generation.

Proposed fix
-def generate_event(
+def generate_event(
     event_type: Optional[str] = None,
     person_id: Optional[int] = None,
     timestamp: Optional[datetime] = None,
+    rng: Optional[random.Random] = None,
 ) -> Dict[str, Any]:
+    rng = rng or random
     if event_type is None:
-        event_type = random.choice(EVENT_TYPES)
+        event_type = rng.choice(EVENT_TYPES)
@@
     if person_id is None:
-        person_id = random.randint(1, 50)
+        person_id = rng.randint(1, 50)
@@
-            "x": random.randint(0, 1920),
-            "y": random.randint(0, 1080),
+            "x": rng.randint(0, 1920),
+            "y": rng.randint(0, 1080),
         },
-        "confidence": round(random.uniform(0.65, 0.99), 2),
-        "metadata": _build_metadata(event_type),
+        "confidence": round(rng.uniform(0.65, 0.99), 2),
+        "metadata": _build_metadata(event_type, rng),
     }
 
-def _build_metadata(event_type: str) -> Dict[str, Any]:
+def _build_metadata(event_type: str, rng: random.Random) -> Dict[str, Any]:
@@
-            "zone_id": random.choice(["restricted_lab", "server_room", "staff_only"]),
-            "severity": random.choice(["medium", "high"]),
+            "zone_id": rng.choice(["restricted_lab", "server_room", "staff_only"]),
+            "severity": rng.choice(["medium", "high"]),
@@
-def generate_events(
+def generate_events(
     count: int = 10,
     start_time: Optional[datetime] = None,
     interval_seconds: int = 30,
+    seed: Optional[int] = None,
 ) -> List[Dict[str, Any]]:
+    rng = random.Random(seed) if seed is not None else random
@@
     return [
-        generate_event(timestamp=start_time + timedelta(seconds=i * interval_seconds))
+        generate_event(
+            timestamp=start_time + timedelta(seconds=i * interval_seconds),
+            rng=rng,
+        )
         for i in range(count)
     ]

Also applies to: 28-44, 81-95
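
The reproducibility argument rests on a standard-library behavior: separate `random.Random` instances created with the same seed yield the same sequence, regardless of what happens to the module-level generator. The sketch below illustrates only that property; `draw_sample` is a hypothetical helper and not part of the PR.

```python
import random
from typing import Any, Dict


def draw_sample(rng: random.Random) -> Dict[str, Any]:
    """Draw a few event-like fields from the supplied RNG (hypothetical helper)."""
    return {
        "person_id": rng.randint(1, 50),
        "x": rng.randint(0, 1920),
        "confidence": round(rng.uniform(0.65, 0.99), 2),
    }


rng_a = random.Random(123)
rng_b = random.Random(123)

run_a = [draw_sample(rng_a) for _ in range(3)]

# Consuming values from the global generator does not disturb local instances.
random.random()

run_b = [draw_sample(rng_b) for _ in range(3)]

print(run_a == run_b)  # True
```

Passing an explicit RNG instance, rather than calling `random.seed()` globally, keeps the generator from interfering with other code that relies on the module-level random state.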

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@utils/synthetic_event_generator.py` around lines 17 - 24, generate_event
currently uses global randomness which prevents reproducible datasets; add an
optional rng: Optional[random.Random] = None (or seed: Optional[int]) parameter
to generate_event and use rng.choice/rng.random instead of random.*; update any
batch functions in this file (e.g., the batch generation functions that call
generate_event in the 28-44 and 81-95 regions) to accept/propagate the same rng
or seed and construct a local random.Random when a seed is provided so callers
can pass a deterministic RNG instance for reproducible output.

Comment on lines +84 to +95
    interval_seconds: int = 30,
) -> List[Dict[str, Any]]:
    if count <= 0:
        raise ValueError("count must be greater than 0")

    if start_time is None:
        start_time = datetime.now(timezone.utc)

    return [
        generate_event(timestamp=start_time + timedelta(seconds=i * interval_seconds))
        for i in range(count)
    ]
⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate interval_seconds to prevent invalid timestamp sequences.

interval_seconds currently accepts 0 or negative values, which can create duplicate or reverse-ordered timelines.
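
A small standalone sketch shows the effect. It reproduces only the timestamp arithmetic from the quoted lines, with no validation; `build_timestamps` is a hypothetical helper and the start time is an arbitrary fixed value.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def build_timestamps(
    count: int, start_time: datetime, interval_seconds: int
) -> List[datetime]:
    """Compute start_time + i * interval_seconds for each index, unvalidated."""
    return [
        start_time + timedelta(seconds=i * interval_seconds) for i in range(count)
    ]


start = datetime(2026, 6, 1, 12, 0, 0, tzinfo=timezone.utc)

zero_gap = build_timestamps(3, start, 0)
negative_gap = build_timestamps(3, start, -30)
positive_gap = build_timestamps(3, start, 30)

print(len(set(zero_gap)))  # 1: every event shares the same timestamp
print(negative_gap[0] > negative_gap[-1])  # True: the timeline runs backwards
print(positive_gap == sorted(positive_gap))  # True: strictly increasing
```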

Proposed fix
 def generate_events(
@@
 ) -> List[Dict[str, Any]]:
     if count <= 0:
         raise ValueError("count must be greater than 0")
+    if interval_seconds <= 0:
+        raise ValueError("interval_seconds must be greater than 0")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    interval_seconds: int = 30,
-) -> List[Dict[str, Any]]:
-    if count <= 0:
-        raise ValueError("count must be greater than 0")
-    if start_time is None:
-        start_time = datetime.now(timezone.utc)
-    return [
-        generate_event(timestamp=start_time + timedelta(seconds=i * interval_seconds))
-        for i in range(count)
-    ]
+    interval_seconds: int = 30,
+) -> List[Dict[str, Any]]:
+    if count <= 0:
+        raise ValueError("count must be greater than 0")
+    if interval_seconds <= 0:
+        raise ValueError("interval_seconds must be greater than 0")
+    if start_time is None:
+        start_time = datetime.now(timezone.utc)
+    return [
+        generate_event(timestamp=start_time + timedelta(seconds=i * interval_seconds))
+        for i in range(count)
+    ]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@utils/synthetic_event_generator.py` around lines 84 - 95, The generator
currently allows interval_seconds <= 0 which can produce duplicate or reversed
timestamps; add a validation at the start of the function that checks
interval_seconds is an int > 0 and raise ValueError("interval_seconds must be
greater than 0") (or similar) if not; place this check alongside the existing
count check in the same function (the one returning the list comprehension that
calls generate_event with timestamp=start_time + timedelta(seconds=i *
interval_seconds)) and leave the rest of the logic (start_time defaulting and
list comprehension) unchanged.

@upasana-2006 (Contributor, Author)

Thanks for the review and suggestions.

I've addressed the requested changes, including:

  • Added validation for interval_seconds
  • Improved deterministic test coverage
  • Updated the implementation and tests accordingly

All tests are now passing. Please let me know if any further changes are needed.

@Devnil434 Devnil434 merged commit 34e5879 into Devnil434:main Jun 2, 2026
1 check passed
Development

Successfully merging this pull request may close these issues.

[Feature]: Add Google Colab Notebook for Scene Graph Surveillance Reasoning

2 participants