Bug Description
In netsecgame/game/coordinator.py:563, the defender's false positive penalty has an inverted sign, causing false positives to increase the defender's reward instead of decreasing it.
The reward configuration defines false_positive as a negative value (e.g., -5 in the example config). The code then subtracts the product of that negative value:
```python
# Line 563 — BUGGY
self._agent_rewards[agent] -= self._agent_false_positives[agent] * self._rewards["false_positive"]
```

With `false_positive = -5` and 2 false positives:

```python
rewards -= 2 * (-5)
rewards -= (-10)
rewards += 10  # False positives INCREASE the reward!
```
The comment on line 561 says "dicrease the reward for false positives" — confirming the intent is to penalize, but the math does the opposite.
This silently corrupts the reward signal for every defender agent that produces false positive detections, which directly undermines RL training quality. Defenders learn that false positives are beneficial rather than costly.
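The sign inversion can be demonstrated in isolation with plain arithmetic; the variable names below are illustrative and do not come from the codebase:

```python
# Standalone illustration of the sign inversion; not the real coordinator code.
false_positive_penalty = -5   # matches the example config value
num_false_positives = 2

# Buggy behavior (line 563): subtracting a negative product ADDS to the reward.
buggy_reward = 0
buggy_reward -= num_false_positives * false_positive_penalty
print(buggy_reward)   # 10 — the defender is rewarded for false positives

# Intended behavior: adding the negative product penalizes the defender.
fixed_reward = 0
fixed_reward += num_false_positives * false_positive_penalty
print(fixed_reward)   # -10
```

The magnitude of the error per episode is `2 * |false_positive| * num_false_positives`, since the reward moves the wrong way by the full penalty amount.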
Steps to Reproduce
- Start a game with the example task configuration (`false_positive: -5` in rewards)
- Register an Attacker and a Defender agent
- Have the Defender perform `BlockIP` actions that result in false positives (blocking IPs that are not actually attackers)
- Complete the episode (e.g., the Attacker times out)
- Observe the Defender's final reward: each false positive adds +5 instead of subtracting 5
Expected Behavior
False positives should decrease the defender's reward. The fix is to use `+=` instead of `-=` (since the config value is already negative), consistent with how all other rewards are applied on lines 493, 547, 550, 556, and 559:
```python
# Option A: use += with the negative config value (consistent with rest of codebase)
self._agent_rewards[agent] += self._agent_false_positives[agent] * self._rewards["false_positive"]
```

Or alternatively, keep `-=` but take the absolute value:

```python
# Option B: subtract the absolute penalty
self._agent_rewards[agent] -= self._agent_false_positives[agent] * abs(self._rewards["false_positive"])
```

Option A is preferred as it matches how step, success, and fail rewards are all applied using `+=`.
Version
1.1
Installation / Deployment Method
Running locally from source