Skip to content

Conversation

@Anilm3
Copy link
Collaborator

@Anilm3 Anilm3 commented Mar 30, 2025

This PR updates the ddwaf_object layout with the following goals:

  • Reduce the memory footprint of a single object as well as the footprint of containers.
  • Split keys and values so they can be used independently as objects without the need for intermediate.
  • Update all code using object internals to account for the new changes.

This is a first iteration of the layout changes, attempting to keep the change small and reviewable.

This PR excludes the following future changes:

  • Constant strings (i.e. strings that shouldn't be freed).
  • Small string optimisation support.
  • Rewrite of the ddwaf_object API.
  • Allocator support, which requires the point above.

@Anilm3 Anilm3 changed the title First iteration of object layout changes [v2] First iteration of object layout changes Mar 30, 2025
@pr-commenter
Copy link

pr-commenter bot commented Mar 30, 2025

Benchmarks clang

Benchmark execution time: 2025-05-23 16:50:32

Comparing candidate commit ce3d8d6 in PR branch anilm3/object-layout-first-iteration with baseline commit 91c7b91 in branch anilm3/v2.

Found 1 performance improvements and 0 performance regressions! Performance is the same for 0 metrics, 0 unstable metrics.

scenario:global-benchmark.random.clang

  • 🟩 execution_time [-18.752ms; -18.703ms] or [-7.223%; -7.204%]

@pr-commenter
Copy link

pr-commenter bot commented Mar 30, 2025

Benchmarks gcc

Benchmark execution time: 2025-05-23 16:52:25

Comparing candidate commit ce3d8d6 in PR branch anilm3/object-layout-first-iteration with baseline commit 91c7b91 in branch anilm3/v2.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 1 metrics, 0 unstable metrics.

@pr-commenter
Copy link

pr-commenter bot commented Mar 30, 2025

Benchmarks clang-pgo

Benchmark execution time: 2025-05-23 10:25:27

Comparing candidate commit cc1a813 in PR branch anilm3/object-layout-first-iteration with baseline commit 91c7b91 in branch anilm3/v2.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 1 metrics, 0 unstable metrics.

@codecov-commenter
Copy link

codecov-commenter commented Mar 30, 2025

Codecov Report

Attention: Patch coverage is 80.82192% with 28 lines in your changes missing coverage. Please review.

Project coverage is 85.52%. Comparing base (91c7b91) to head (ce3d8d6).

Files with missing lines Patch % Lines
src/object.hpp 73.86% 1 Missing and 22 partials ⚠️
src/iterator.cpp 72.72% 0 Missing and 3 partials ⚠️
src/iterator.hpp 60.00% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##           anilm3/v2     #394      +/-   ##
=============================================
+ Coverage      85.31%   85.52%   +0.21%     
=============================================
  Files            174      174              
  Lines           8967     8920      -47     
  Branches        3808     3801       -7     
=============================================
- Hits            7650     7629      -21     
+ Misses           517      503      -14     
+ Partials         800      788      -12     
Flag Coverage Δ
waf_test 85.52% <80.82%> (+0.21%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Anilm3 Anilm3 requested a review from Copilot April 16, 2025 19:08
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 46 out of 49 changed files in this pull request and generated no comments.

Files not reviewed (3)
  • fuzzer/CMakeLists.txt: Language not supported
  • fuzzer/global/run.sh: Language not supported
  • libddwaf.def: Language not supported

@Anilm3 Anilm3 marked this pull request as ready for review May 23, 2025 10:22
@Anilm3 Anilm3 requested a review from a team as a code owner May 23, 2025 10:22
case DDWAF_OBJ_STRING:
HSTRING_APPEND_CONST(str, "<STRING> ");
_hstring_append(str, pwargs->stringValue, pwargs->nbEntries);
_hstring_append(str, pwargs->via.str, pwargs->size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if it's a small string? are they not supported yet? In general, I think directly accessing via directly (or, for that matter, any other fields) is not a good idea for forward compatibility. You can add some inline functions or, if you must, macros.

I also think you should split the concepts of the union tag and the logical type. This is because there are two types of string and you might need to add large strings in the future too, since you took the option of keeping capacity and reduce the size to 16bit.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't add small / large string support yet, that'll come in one of the next PRs as I wanted to keep this one reasonable in size.

In general, I think directly accessing via directly (or, for that matter, any other fields) is not a good idea for forward compatibility. You can add some inline functions or, if you must, macros.

Internally, within the object abstraction, I think it's perfectly fine. Externally, there's an API, but I haven't updated the smoke test to use it.

I also think you should split the concepts of the union tag and the logical type. This is because there are two types of string and you might need to add large strings in the future too, since you took the option of keeping capacity and reduce the size to 16bit.

Yes, indeed, in fact I think the small string won't even be within the union, I'm not sure about the large one though.

realloc(static_cast<void *>(obj.array), size * sizeof(object)));
if (new_array == nullptr) [[unlikely]] {
// NOLINTNEXTLINE(hicpp-no-malloc)
auto *data = static_cast<T *>(calloc(size, sizeof(T)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might as well use posix_memalign since you know T and therefore its alignment

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could but since allocators are coming soon (when I change the API) I don't want to bother with these functions.

friend Derived;
};

// Temporary abstraction, this will be removed once the keys and values are
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a big fan of the removal of this abstraction. It could be useful to encode that certain objects are keys in the type system; additionally, from what I can tell, they do have the restriction that they must be strings, so it may make sense to keep the type to enforce this or future restrictions too

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not to worry, I plan to make the dynamic_string type be an object_string type for more convenience.

@Anilm3 Anilm3 merged commit 59f701d into anilm3/v2 May 23, 2025
50 of 51 checks passed
@Anilm3 Anilm3 deleted the anilm3/object-layout-first-iteration branch May 23, 2025 17:02
Anilm3 added a commit that referenced this pull request Oct 1, 2025
* Object view: read only abstraction to ddwaf_object (#341)
* [v2] Remove mingw builds (#381)
* Writable objects: owned and borrowed object and object limits removal (#378, #382)
* [v2] Refactor and improve object types (#387)
* [v2] Update unit tests to use new abstractions (#389)
* [v2] Update `raw_configuration` type to use `object_view` (#390)
* [v2] Remove remaining uses of ddwaf_object in `src` and `tests/unit` (#391)
* [v2] JWT Decoding Processor (#401)
* [v2] First iteration of object layout changes (#394)
* Split context data insertion from evaluation (#407)
* [v2] Second iteration of object layout changes (#408)
* [v2] Add new fingerprint and object view tests (#414)
* Reenable attribute collector unit test (#415)
* [v2] Container view types (#413)
* Exclude assertions from coverage (#416)
* [v2] Use allocators internally instead of malloc/free and stop generating zero-terminated strings (#418)
* Add memory resource to owned and borrowed objects (#428)
* [v2] Propagate allocators from context (#420)
* [v2] Update interface and expose allocators (#427)
* Refactor evaluation stages out of the context (#442)
* Subcontext: replace ephemerals with a new scope with user-defined lifetime derived from the context (#443)
* Validator: Add support for testing subcontexts and attributes (#451)
* [v2] Pass allocator to context and subcontext eval and add new allocators (#452)
* Update logger to avoid dependencies on ddwaf.h (#453)
* [v2] Return DDWAF_MATCH when there are events, attributes or actions (#455)
* [v2] Cleanup: remove exclusion namespace and some redundant references (#456)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants