Engine - Research Feasibility of Using OpenSearch SQL Plugin for Event Correlation and Frequency Rules #23332
Comments
Assessing the Feasibility of Conducting Various Types of Correlations Using the OpenSearch SQL Plugin
Introduction
We need to determine whether we can perform different types of correlations using the SQL plugin in OpenSearch, specifically:
For frequency correlations, we need to correlate events that share common attributes, with the time span or total time as a parameter. SQL lets us perform a 'GROUP BY' operation; however, the grouped result cannot return additional data from the original events, which also prevents us from calculating time spans or other metrics.
Example Query:
opensearchsql> SELECT b.event.type, b.event.code, b.event.start, b.event.reason
FROM wazuh-alerts-5.x-* b
WHERE b.event.type = 'corr_test';
Result:
+--------------+--------------+-------------------------+----------------+
| event.type | event.code | event.start | event.reason |
|--------------+--------------+-------------------------+----------------|
| corr_test | A | 2024-05-07 07:48:20.279 | a |
| corr_test | A | 2024-05-07 07:48:20.279 | a |
| corr_test | B | 2024-05-07 07:49:20.279 | a |
| corr_test | A | 2024-05-07 07:50:20.279 | b |
| corr_test | B | 2024-05-07 07:51:20.279 | a |
| corr_test | A | 2024-05-07 07:52:20.279 | a |
| corr_test | B | 2024-05-07 07:53:20.279 | b |
| corr_test | A | 2024-05-07 07:54:20.279 | a |
| corr_test | B | 2024-05-07 07:55:20.279 | a |
| corr_test | A | 2024-05-07 07:56:20.279 | b |
| corr_test | C | 2024-05-07 07:57:20.279 | a |
+--------------+--------------+-------------------------+----------------+
Challenges in SQL Implementation
The challenge begins when attempting to calculate occurrences of each event.code.
Subquery:
SELECT b.event.code, COUNT(*) AS count
FROM wazuh-alerts-5.x-* b
WHERE b.event.type = 'corr_test'
GROUP BY b.event.code;
Result:
However, the main issue arises when attempting to include additional fields to calculate the span or apply other conditions:
SELECT a.event.code, a.event.start, a.event.reason, b.count
FROM wazuh-alerts-5.x-* a
INNER JOIN (
SELECT b.event.code, COUNT(*) AS count
FROM wazuh-alerts-5.x-* b
WHERE b.event.type = 'corr_test'
GROUP BY b.event.code
) AS b ON a.event.code = b.event.code;
Result:
Exploring Known Limitations in OpenSearch SQL Plugin
Further investigation showed that the OpenSearch SQL plugin has known limitations that are relevant to our issues. These limitations stem from a fundamental restriction: you can only join two indexes. This implies:
Additional constraints also limit the operational scope significantly:
Reference:
Conclusion: Limitations of Event Correlation Capabilities in OpenSearch SQL
Given the current state of the SQL plugin in OpenSearch and the documented limitations, it is not feasible to implement sophisticated event correlation directly using this tool.
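To make the frequency requirement concrete, here is a minimal client-side sketch in Python. It is an illustration only, not part of the engine: it takes the rows from the corr_test result above, groups them by event.code, and keeps both the matching events and the time span, which is exactly what the GROUP BY query could not return. The function name `frequency_matches` and its parameters are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Rows copied from the corr_test result above: (event.code, time of day, event.reason).
ROWS = [
    ("A", "07:48:20.279", "a"), ("A", "07:48:20.279", "a"), ("B", "07:49:20.279", "a"),
    ("A", "07:50:20.279", "b"), ("B", "07:51:20.279", "a"), ("A", "07:52:20.279", "a"),
    ("B", "07:53:20.279", "b"), ("A", "07:54:20.279", "a"), ("B", "07:55:20.279", "a"),
    ("A", "07:56:20.279", "b"), ("C", "07:57:20.279", "a"),
]
EVENTS = [
    {"event.code": code, "event.start": f"2024-05-07 {time}", "event.reason": reason}
    for code, time, reason in ROWS
]


def parse_ts(value):
    """Parse the timestamp format shown in the SQL result."""
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S.%f")


def frequency_matches(events, group_field, min_hits, timeframe):
    """Return, per group value, the first window of at least `min_hits`
    events whose first and last timestamps are within `timeframe`."""
    groups = defaultdict(list)
    for event in events:
        groups[event[group_field]].append(event)

    matches = {}
    for key, rows in groups.items():
        rows.sort(key=lambda r: parse_ts(r["event.start"]))
        start = 0
        for end in range(len(rows)):
            # Shrink the window from the left until it fits in the timeframe.
            while (parse_ts(rows[end]["event.start"])
                   - parse_ts(rows[start]["event.start"])) > timeframe:
                start += 1
            if end - start + 1 >= min_hits:
                window = rows[start:end + 1]
                span = (parse_ts(window[-1]["event.start"])
                        - parse_ts(window[0]["event.start"]))
                matches[key] = {"events": window, "span": span}
                break
    return matches
```

With `min_hits=3` and a four-minute timeframe, codes A and B match and C does not. The trade-off is that every candidate event has to be fetched from the indexer first.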
Correlation
A first viable proposal for local correlation:
timeframe: 30 # Timeframe in seconds
shared_field: # [optional] Shared field between all events (Static value)
  field_a: static value
same_field: # [optional] List of fields that must be the same in all events
  - src.ip
  - agent.id
sequence:
  - pre_filter: # Pre-filter to fetch the events of the sequence
      - category = login-failed
      - rule_id = 1001
    check: # [optional] Condition to match the event (expression or list), with helper functions
      - Expression or list condition
    frequency: 3 # Hits needed to advance to the next sequence
    eq_field: # [optional] List of fields that must be the same value in all events
      - user.name
  - pre_filter:
      - category = login-ok
    check:
      - Expression or list condition
    frequency: 1
    eq_field:
      - client.name # Same value as the user.name in the previous event
    negate: true
  - pre_filter:
      - rule_id = 2020
    check:
      - Expression or list condition
    frequency: 3
    eq_field:
      - user.name # Same value as the user.name in the first event
Future improvements:
Algorithm:
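As an illustration only (this is not the engine's algorithm), the following Python sketch shows one way a rule shaped like the proposal above could be evaluated locally over events that have already been fetched. It assumes a numeric `timestamp` field in seconds and handles only `timeframe`, `same_field`, `pre_filter` and `frequency`; `check`, `eq_field`, `shared_field` and `negate` are deliberately left out. The function names are hypothetical.

```python
def matches_filter(event, pre_filter):
    """True if the event satisfies every 'field = value' condition."""
    for condition in pre_filter:
        field, value = condition.split(" = ", 1)
        if str(event.get(field)) != value:
            return False
    return True


def detect_sequences(events, rule):
    """Walk events in time order and return (key, start, end) for each
    completed sequence. One state is kept per same_field key."""
    steps = rule["sequence"]
    states = {}  # key -> {"step": index, "hits": count, "start": timestamp}
    alerts = []
    for event in sorted(events, key=lambda e: e["timestamp"]):
        key = tuple(event.get(field) for field in rule["same_field"])
        state = states.get(key)
        # Drop a partial sequence once it has outlived the timeframe.
        if state and event["timestamp"] - state["start"] > rule["timeframe"]:
            del states[key]
            state = None
        step_index = state["step"] if state else 0
        if not matches_filter(event, steps[step_index]["pre_filter"]):
            continue
        if state is None:
            state = states[key] = {"step": 0, "hits": 0,
                                   "start": event["timestamp"]}
        state["hits"] += 1
        if state["hits"] >= steps[state["step"]]["frequency"]:
            # Enough hits for this step: advance to the next one.
            state["step"] += 1
            state["hits"] = 0
            if state["step"] == len(steps):
                alerts.append((key, state["start"], event["timestamp"]))
                del states[key]
    return alerts


# Hypothetical two-step rule: three failed logins, then one successful login.
RULE = {
    "timeframe": 30,
    "same_field": ["src.ip"],
    "sequence": [
        {"pre_filter": ["category = login-failed"], "frequency": 3},
        {"pre_filter": ["category = login-ok"], "frequency": 1},
    ],
}
```

Events that do not match the current step of their key are simply ignored in this sketch; a real implementation would need to decide how `negate` and `eq_field` interact with that choice.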
Second Approach: Optimized Query and State Management
In our previous strategy, we focused on retrieving events and processing sequence detection locally. While this method is practical and ensures efficient sequence detection, it places a considerable burden on network resources due to the high volume of data transfers involved.
In this revised approach, we explore an alternative strategy aimed at minimizing data retrieval and reducing network load. By maintaining the state locally and optimizing our queries, we aim to retrieve the least amount of data necessary while shifting more computational responsibilities to the indexer.
Compromises
This approach to sequence detection emphasizes efficiency in network usage at the expense of certain other factors:
Algorithm Overview
Steps
The sequence detection process is initiated by a query triggered with an initial timestamp
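A rough sketch of what a per-step query could look like when the counting is pushed to the indexer and only the state comes back. This is an assumption-laden illustration in Python, not the queries the engine builds: it uses the standard OpenSearch query DSL (`bool` filter, `range`, `terms` aggregation with `min_doc_count`, `min`/`max` sub-aggregations), while the function names, the `@timestamp` field and the `by_key` aggregation name are hypothetical.

```python
def build_step_query(pre_filter, group_field, frequency, since_ms,
                     timeframe_s, ts_field="@timestamp"):
    """Build a search body that returns no documents (size 0), only the
    group values with at least `frequency` hits inside the time window."""
    filters = [{"term": {field: value}} for field, value in pre_filter.items()]
    filters.append({
        "range": {
            ts_field: {
                "gt": since_ms,
                "lte": since_ms + timeframe_s * 1000,
                "format": "epoch_millis",
            }
        }
    })
    return {
        "size": 0,
        "query": {"bool": {"filter": filters}},
        "aggs": {
            "by_key": {
                "terms": {"field": group_field, "min_doc_count": frequency},
                "aggs": {
                    "first_seen": {"min": {"field": ts_field}},
                    "last_seen": {"max": {"field": ts_field}},
                },
            }
        },
    }


def advance_state(response):
    """Map each group value that satisfied the step to its last timestamp,
    which would become the starting timestamp of the next step's query."""
    buckets = response["aggregations"]["by_key"]["buckets"]
    return {bucket["key"]: bucket["last_seen"]["value"] for bucket in buckets}
```

Only the bucket keys and two timestamps per key cross the network, which is the point of this approach; the cost is one query per step and per state advance.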
Computational costs
To calculate the theoretical total number of queries Where:
Average Number of Expected Queries
While Where:
Complexity
Further simplifying the average queries, we have: Where:
As |
After a comprehensive investigation of various methods to perform rule correlation within the Wazuh engine, we have decided to develop a custom syntax for correlation rules. This approach provides users with clear guidance in creating correlation rules. With this new syntax, the Wazuh engine will construct the necessary queries to OpenSearch on the back end, ensuring better control over the query types. Additionally, helper functions will be available to assist users in creating correlations based on query results. The next step is to develop a proof of concept (PoC) to define this syntax and conduct benchmarks to evaluate the cost of queries and subsequent processing.
Description
This issue is focused on exploring the potential integration of the OpenSearch SQL plugin to enhance Wazuh's event correlation and frequency rule capabilities. The goal is to determine if this plugin can be effectively utilized to correlate events processed by the wazuh-engine and stored in OpenSearch indices.
Objective
Tasks
Expected Outcomes
Notes
This research is crucial for advancing Wazuh's capabilities in handling complex event correlations efficiently and could lead to significant improvements in how security events are processed and analyzed.