config_file: fix quadratic behaviour when adding config multivars #4799
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
In case where we add multiple configuration entries with the same key to
a diskfile backend, we always need to iterate the list of this key to
find the last entry due to the list being a singly-linked list. This
is obviously quadratic behaviour, and this has sure enough been found by
oss-fuzz by generating a configuration file with 50k lines, where most
of them have the same key. While the issue will not arise with "sane"
configuration files, an adversary may trigger it by providing a crafted
".gitmodules" file, which is delivered as part of the repo and also
parsed by the configuration parser.
The fix is trivial: store a pointer to the last entry of the list in its
head. As there are only two locations now where we append to this data
structure, mainting this pointer is trivial, too. We can also optimize
retrieval of a single value via
config_get
, where we previously had tochase the
next
pointer to find the last entry that was added.Using our configuration file fozzur with a corpus that has a single file
with 50000 "-=" lines previously took around 21s. With this optimization
the same file scans in about 0.053s, which is a nearly 400-fold
improvement. But in most cases with a "normal" amount of same-named keys
it's not going to matter anyway.