feat: Platform Improvements v3 - Clustering, Ownership, Correlation & More#68
TerrifiedBug merged 17 commits into main from
Conversation
Navigate to edit mode instead of the list after creating a new correlation rule. This matches the existing behavior of the regular rule editor.
- Block creation of exact duplicate exceptions (same field+value+operator) - Warn when new exception overlaps with existing pattern - Add warning field to RuleExceptionResponse schema - Fix test database enum type creation in conftest.py - Fix auth test to expect 401 instead of 403 Closes: Platform Improvements v3 - Task 2
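The duplicate check described above can be sketched as follows. This is illustrative only, assuming exceptions carry field/value/operator keys; the project's actual schema and helper names may differ:

```python
def is_duplicate_exception(existing: list[dict], field: str, value: str, operator: str) -> bool:
    """An exception is an exact duplicate when field, value and operator all match."""
    return any(
        e["field"] == field and e["value"] == value and e["operator"] == operator
        for e in existing
    )

existing = [{"field": "host.name", "value": "web-01", "operator": "equals"}]
print(is_duplicate_exception(existing, "host.name", "web-01", "equals"))  # True
print(is_duplicate_exception(existing, "host.name", "web-02", "equals"))  # False
```

A near-duplicate (same field, different value) would fall through to the overlap warning path rather than being blocked outright.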
- Add optional alert_id to exception create request - Auto-set alert status to false_positive when exception created from it - Store exception reference on alert document in OpenSearch - Show exception badge with tooltip in alert detail view - Reload alert after exception creation to show updated status
- New AlertComment model with soft delete support - API endpoints: list, create, delete (admin only) - Comments section in alert detail page - Immutable for users, admins can delete
- Assign/unassign endpoints for alert ownership - Owner filter in alerts list (use 'me' for current user) - Take Ownership/Release buttons in alert detail - 'Assigned to me' filter checkbox on alerts page - 'My Alerts' quick link in sidebar navigation
- Fix ownership endpoints to use correct OpenSearch document ID (hit["_id"]) - Add owner_id, owner_username, owned_at to AlertResponse schema - Add exception_created and ti_enrichment to AlertResponse schema - Remove "My Alerts" nav link from Header (use filter on Alerts page instead) - Add Owner column to alerts table showing "Unassigned" for unassigned alerts
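The document-ID fix above amounts to reading the ID from the hit envelope rather than the `_source` body. A hypothetical sketch (the response shape is standard OpenSearch; the helper name is made up):

```python
def extract_doc_ids(search_response: dict) -> list[str]:
    """The OpenSearch document ID lives on the hit envelope (hit["_id"]),
    not inside the _source payload, so it must be read from the hit itself."""
    return [hit["_id"] for hit in search_response["hits"]["hits"]]

response = {"hits": {"hits": [{"_id": "abc123", "_source": {"rule_id": 7}}]}}
print(extract_doc_ids(response))  # ['abc123']
```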
Dynamic mapping creates text+keyword multifield for strings. Term queries need to use .keyword suffix for exact matching.
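For example, an exact match on a dynamically mapped string field has to target the keyword sub-field. A minimal sketch (field and value are illustrative):

```python
def term_query(field: str, value: str) -> dict:
    """Build an exact-match term query against the .keyword sub-field;
    the bare field is analyzed text and will not match exactly."""
    return {"query": {"term": {f"{field}.keyword": value}}}

print(term_query("host.name", "web-01"))
# {'query': {'term': {'host.name.keyword': 'web-01'}}}
```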
Implement alert clustering feature that groups related alerts by rule_id and entity (host, user, IP) within a configurable time window. This reduces alert fatigue by consolidating repetitive alerts. Backend changes: - Add cluster_alerts() function with time-window based grouping - Add extract_entity_value() helper for entity field extraction - Add GET/PUT /settings/alert-clustering endpoints - Update alerts list endpoint with clustering support - Add AlertCluster and ClusteredAlertListResponse schemas Frontend changes: - Add Alerts tab in Settings with clustering configuration - Update Alerts page with expandable cluster rows - Add cluster selection and bulk actions support - Display cluster count badges and time ranges Tests: - Add 20 test cases covering clustering logic and edge cases
Adds PDF export functionality to the ATT&CK Matrix page: Backend (reports.py): - New POST /api/reports/attack-coverage endpoint - Generates PDF report with: - Summary statistics (total techniques, covered, coverage %) - Coverage breakdown by tactic - Top 10 detection gaps (uncovered techniques) - Top 10 best covered techniques - Uses landscape letter format with styled tables Frontend (AttackMatrix.tsx): - Export PDF button in header with loading state - Downloads PDF file with date-stamped filename - Toast notifications for success/error feedback Tests (test_attack_export.py): - Tests for PDF generation (magic bytes verification) - Authentication requirement test - Test with actual technique/rule data - Content-Disposition header verification - Default format test
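The magic-bytes verification mentioned in the tests can be as simple as checking the PDF header; a sketch, not the project's actual test code:

```python
def looks_like_pdf(data: bytes) -> bool:
    """Every PDF file begins with the ASCII magic bytes %PDF-."""
    return data.startswith(b"%PDF-")

print(looks_like_pdf(b"%PDF-1.7\n..."))  # True
print(looks_like_pdf(b"<html>oops"))     # False
```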
Remove entity_fields configuration from alert clustering feature. Alerts are now clustered solely by rule_id within the time window, making the feature simpler to configure and understand. Changes: - Remove entity_fields from AlertClusteringSettings schema - Simplify cluster_alerts() to group by rule_id only - Remove extract_entity_value() helper function (no longer needed) - Update Settings UI to show only enable toggle and time window - Update tests to reflect simplified grouping logic
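Under the simplified scheme, the grouping can be sketched roughly like this. This is illustrative only; the field names, window semantics, and function shape are assumptions, not the project's actual `cluster_alerts()`:

```python
from datetime import datetime, timedelta

def cluster_alerts(alerts: list[dict], window_minutes: int = 60) -> list[dict]:
    """Group alerts by rule_id; an alert within window_minutes of its
    cluster's first alert joins it, a later one starts a new cluster."""
    window = timedelta(minutes=window_minutes)
    clusters: list[dict] = []
    open_clusters: dict = {}  # rule_id -> most recent cluster for that rule
    for alert in sorted(alerts, key=lambda a: a["timestamp"]):
        current = open_clusters.get(alert["rule_id"])
        if current and alert["timestamp"] - current["start"] <= window:
            current["alerts"].append(alert)
        else:
            current = {"rule_id": alert["rule_id"], "start": alert["timestamp"], "alerts": [alert]}
            clusters.append(current)
            open_clusters[alert["rule_id"]] = current
    return clusters

t0 = datetime(2024, 1, 1, 12, 0)
alerts = [
    {"rule_id": "A", "timestamp": t0},
    {"rule_id": "A", "timestamp": t0 + timedelta(minutes=30)},
    {"rule_id": "A", "timestamp": t0 + timedelta(minutes=90)},  # outside the window
    {"rule_id": "B", "timestamp": t0 + timedelta(minutes=10)},
]
print([(c["rule_id"], len(c["alerts"])) for c in cluster_alerts(alerts)])
# [('A', 2), ('B', 1), ('A', 1)]
```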
Add flag_modified() call when updating Setting.value to explicitly mark the JSONB column as modified. Without this, SQLAlchemy may not detect changes to JSON columns and skip the UPDATE statement.
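The pattern can be demonstrated end to end with an in-memory database. The `Setting` model below is a stand-in for illustration, not the project's actual model:

```python
from sqlalchemy import create_engine, Column, Integer, JSON
from sqlalchemy.orm import declarative_base, Session
from sqlalchemy.orm.attributes import flag_modified

Base = declarative_base()

class Setting(Base):
    __tablename__ = "settings"
    id = Column(Integer, primary_key=True)
    value = Column(JSON)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    s = Setting(id=1, value={"enabled": False})
    session.add(s)
    session.commit()

    # In-place mutation of a JSON column is invisible to SQLAlchemy's
    # change tracking, so without flag_modified() the UPDATE may be skipped.
    s.value["enabled"] = True
    flag_modified(s, "value")
    session.commit()

with Session(engine) as session:
    persisted = session.get(Setting, 1).value
    print(persisted["enabled"])  # True, because the change was flagged
```

An alternative is declaring the column with `MutableDict.as_mutable(JSON)` so mutations are tracked automatically; the explicit `flag_modified()` call is the smaller change.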
The /{key} catch-all route was matching /alert-clustering requests before the specific endpoints could handle them, causing saves to return {"success": true} instead of the proper response and bypassing the correct schema validation.
Update alert clustering to include complete alert objects in the response, not just IDs. This allows the frontend to display full alert details (rule title, severity, status, owner, tags, timestamp) when expanding a cluster, matching the non-clustered view. Changes: - Add alerts field to AlertCluster schema (backend and frontend) - Include full alert data in _create_cluster function - Update Alerts page to render full details in expanded cluster rows - Add test assertion for alerts field in cluster output
When clustering is enabled, override the client's page size (typically 25) and fetch up to 1000 alerts. This provides a much better clustering view since pagination doesn't make sense with clustering - we need to see the full picture to properly group alerts by time window.
- Strip .keyword and .text OpenSearch field type suffixes when extracting entity values from log documents (these are query hints, not document paths) - Fix correlation alert payload to include all correlation data (first_alert_id, second_alert_id, rule IDs, timestamps) instead of missing source_alerts key - Fix SQLAlchemy query to use scalars().first() instead of first() to get model object instead of Row - Rewrite correlation tests to properly test the state machine logic with mocked field resolution
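The suffix-stripping fix can be sketched like this (a minimal illustration; the real helper's name and location may differ):

```python
def strip_field_suffix(field: str) -> str:
    """Strip OpenSearch multifield suffixes (.keyword/.text): they are
    query-time hints, not paths into the source document."""
    for suffix in (".keyword", ".text"):
        if field.endswith(suffix):
            return field[: -len(suffix)]
    return field

print(strip_field_suffix("host.name.keyword"))  # host.name
print(strip_field_suffix("message.text"))       # message
print(strip_field_suffix("source.ip"))          # source.ip (unchanged)
```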
- Add clickable "View Alert" link to Discord/Slack webhook notifications - Configure nginx to only log 4xx/5xx responses (reduce log noise) - Wrap Health page Indexes section in Card for design consistency - Persist "Assigned to Me" checkbox preference via localStorage - Add "Take Ownership" bulk action button to Alerts table
```diff
@@ -0,0 +1,314 @@
+"""Tests for alert clustering functionality."""
+
+from datetime import UTC, datetime, timedelta
```
Check notice
Code scanning / CodeQL
Unused import (note, test)
Copilot Autofix:
In general, the way to fix unused imports is to remove the identifiers that are not used anywhere in the file, or to delete the entire import statement if none of its symbols are used. This keeps the codebase cleaner and avoids unnecessary dependencies.

For this specific file backend/tests/services/test_alert_clustering.py, the best fix is to delete the `from datetime import UTC, datetime, timedelta` line entirely, because none of these three imported names are referenced in the shown tests, and they are likely not needed elsewhere in the file. No additional methods, imports, or definitions are required: this is a pure deletion that does not change any existing functionality, since the imported names are unused.

Concretely:

- In backend/tests/services/test_alert_clustering.py, remove line 3: `from datetime import UTC, datetime, timedelta`.
- Leave all the remaining imports (`import pytest` and `from app.services.alerts import cluster_alerts`) and the test code unchanged.
```diff
@@ -1,7 +1,5 @@
 """Tests for alert clustering functionality."""
 
-from datetime import UTC, datetime, timedelta
-
 import pytest
 
 from app.services.alerts import cluster_alerts
```
Check notice
Code scanning / CodeQL
Unused import (note, test)
Copilot Autofix:
In general, an unused-import issue is fixed by deleting the import statement for the unused symbol. This reduces unnecessary dependencies and noise in the file.

Here, the best minimal fix is to remove the line `import pytest` from backend/tests/services/test_alert_clustering.py, leaving the rest of the file unchanged. No other code changes or additional imports are required, since the tests use only built-in `assert` and `cluster_alerts` from app.services.alerts. The change is confined to line 5 in the provided snippet.
```diff
@@ -2,8 +2,6 @@
 
 from datetime import UTC, datetime, timedelta
 
-import pytest
-
 from app.services.alerts import cluster_alerts
 
 
```

```python
from app.services.correlation import (
    check_correlation,
    cleanup_expired_states,
    resolve_entity_field,
    get_nested_value,
)
```
Check notice
Code scanning / CodeQL
Unused import (note, test)
Copilot Autofix:
To fix an unused import in Python, you remove the unused symbol from the import statement (or, if none of the imported symbols are used, remove the entire import). This eliminates an unnecessary dependency and resolves the static analysis warning.

In this file (backend/tests/services/test_correlation.py), the single best fix is to edit the multi-line import on lines 10-15 so that it no longer imports `resolve_entity_field`. The other imported functions (`check_correlation`, `cleanup_expired_states`, `get_nested_value`) should remain untouched because they appear to be used in the tests. No new methods, imports, or definitions are required; this is purely a cleanup change to the existing import statement.

Concretely, in backend/tests/services/test_correlation.py, adjust the `from app.services.correlation import (...)` block to remove the `resolve_entity_field` line while keeping the rest exactly the same.
```diff
@@ -10,7 +10,6 @@
 from app.services.correlation import (
     check_correlation,
     cleanup_expired_states,
-    resolve_entity_field,
     get_nested_value,
 )
 from app.models.correlation_rule import CorrelationRule, CorrelationRuleVersion
```
```tsx
  onClick={loadHealth}
  disabled={isLoading}
>
  <RefreshCw className={`h-4 w-4 ${isLoading ? 'animate-spin' : ''}`} />
```
Check warning
Code scanning / CodeQL
Useless conditional (warning)
Copilot Autofix:
In general, to fix a "useless conditional" where a boolean is always false, you either (1) correctly wire the boolean so it reflects real state changes, or (2) remove or replace it with a meaningful condition. Since we can't safely infer or modify logic outside the shown snippet, the safest fix inside this file is to stop depending on a boolean that is effectively constant and instead derive a clear loading condition from data we know is present: the health array.

In this component, the refresh button for "Indexes" is always enabled in practice (since `isLoading` is always false), and the icon never spins. For the "External Services" card, the author already uses a proper loading flag, `serviceHealthLoading`, to disable the button and show a spinner. We can mirror that pattern by computing a local, meaningful loading condition for the Indexes section based on the known data: for example, treating the "loading" state as "we haven't received any health data yet." Specifically:

- Replace `disabled={isLoading}` with a check that disables the button while no health data has been loaded, e.g. `disabled={health.length === 0}`.
- Replace the spinner condition `isLoading ? 'animate-spin' : ''` with the same condition so that the icon spins while we're in that initial "no data" state.
- This change is fully contained within the shown snippet in frontend/src/pages/Health.tsx and requires no new imports or additional methods.

This preserves and clarifies functionality: the button is disabled and shows a spinner until health data exists, and once health data is present, the button is enabled and the icon is static. It also removes reliance on the dead `isLoading` flag, eliminating the useless conditional.
```diff
@@ -374,9 +374,9 @@
           variant="ghost"
           size="sm"
           onClick={loadHealth}
-          disabled={isLoading}
+          disabled={health.length === 0}
         >
-          <RefreshCw className={`h-4 w-4 ${isLoading ? 'animate-spin' : ''}`} />
+          <RefreshCw className={`h-4 w-4 ${health.length === 0 ? 'animate-spin' : ''}`} />
         </Button>
       </div>
     </CardHeader>
```
Summary
This PR implements Platform Improvements v3, adding several major features and bug fixes:
New Features
Bug Fixes
UI/UX Improvements
Technical Changes
Test plan