dataset: cleanup datasets that hit the memcap while loading #10155
Conversation
Datasets that hit the memcap limit need to be discarded when the memcap is hit; otherwise the dataset stays loaded with partial data while the signature is not loaded due to the memcap error. Ticket: OISF#6678
Codecov Report

```
@@           Coverage Diff            @@
##           master   #10155    +/-  ##
==========================================
- Coverage   82.19%   82.08%   -0.12%
==========================================
  Files         974      974
  Lines      271825   271825
==========================================
- Hits       223416   223115    -301
- Misses      48409    48710    +301
```

Flags with carried-forward coverage won't be shown.
Information: ERROR: QA failed on SURI_TLPW2_autofp_suri_time.
Pipeline 17495
How did you test this?
I generated a dataset with random HTTP URLs and added entries to the list until I triggered the warning, but I was still able to check for those URLs via … I will add the example set file to redmine.
Your patch makes sense to me. Thanks! :)
Do you have an idea about how to have such stuff tested in our test suite?
We could start with what we have on redmine: one single rule and the dataset file, let it start, check for the error message, and afterwards use the socket to check whether an entry is available (which it should not be). Not sure if that is fully covered by suricata-verify options.
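A hedged sketch of what such a setup could look like on the Suricata side (the file names, dataset name, and memcap value are all invented; the exact memcap needed depends on the set file):

```yaml
# suricata.yaml fragment: register the dataset with a deliberately tiny
# memcap so that loading the set file is guaranteed to hit the limit.
datasets:
  bad-urls:
    type: string
    load: bad-urls.lst
    memcap: 1kb
```

A single rule such as `alert http any any -> any any (msg:"uri in set"; http.uri; dataset:isset,bad-urls; sid:1;)` would then reference the set; the test would check the log for the memcap error and afterwards use the unix socket to verify that an entry is not available.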
Thanks for the replies. Can it be tested with a suricata-verify test ? with a rule using a dataset with a small memcap..? |
My guess is no, because we will get error messages in both cases (before and after this patch), and the only real test is checking whether the dataset still exists after there has been an error (i.e. the hash cleanup wasn't performed). Since we don't integrate …
Is there a memory leak?
```c
@@ -406,10 +406,6 @@ int DetectDatasetSetup (DetectEngineCtx *de_ctx, Signature *s, const char *rawst
        SCLogError("failed to set up dataset '%s'.", name);
        return -1;
    }
    if (set->hash && SC_ATOMIC_GET(set->hash->memcap_reached)) {
```
What about detect-datarep.c?
That should be covered as well, since detect-datarep.c also uses the DatasetGet function and would now benefit from the same handling within src/datasets.c, so this is even a good additional argument to move it like we plan to :)
So in detect-datarep.c, the check for memcap reached was missing completely until now.
Looks good indeed.
Should that be reflected in the commit message?
I think it's already quite generic to datasets
> I think it's already quite generic to datasets
From what I understand, it is different:
- this commit adds the check for datarep, it did not exist before (it should be testable by suricata-verify)
- this commit moves the check for dataset to avoid dangling memory leaking before final cleanup
If this is correct, I think the commit message should reflect the behavior change for the datarep keyword...
Not exactly. See this code path here in master: https://github.com/OISF/suricata/blob/master/src/datasets.c#L752
Thanks for the work
CI : ✅
Code : cool
Commits segmentation : ok
Commit messages : 🟠 This fixes more than the commit message says, as it also completely fixes memcap handling for the datarep keyword
Git ID set : looks fine for me
CLA : ok
Doc update : not needed
Redmine ticket : ok
Rustfmt : not needed
Tests : 🟠 maybe some test can be added (SV test with datarep?)
Dependencies added: none
Could you rebase with an improved commit message, Andreas?
Done via #10858
Link to redmine ticket:
https://redmine.openinfosecfoundation.org/issues/6678
Describe changes:
- move the memcap-reached handling into the DatasetGet function to ensure a proper cleanup

Provide values to any of the below to override the defaults.
To use a pull request, use a branch name like pr/N where N is the pull request number.
Alternatively, SV_BRANCH may also be a link to an OISF/suricata-verify pull request.