Skip to content

Conversation

cataphract
Copy link
Contributor

@cataphract cataphract commented Mar 12, 2024

Adds WAF support and changes several aspects of the build process. The "build images" are gone; there is only one build image for each arch. The build image is musl/libc++ based and works on both musl and glibc. Increases the compiler requirements to C++20, since libddwaf requires that.

@cataphract cataphract force-pushed the glopes/waf branch 2 times, most recently from 492443d to cb7d395 Compare March 27, 2024 08:42
Copy link
Contributor

@dmehala dmehala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started to review 2-3 days ago under the assumption that the code was feature-complete. However, as I delved deeper, I found missing features, like HTTP/JSON template. It appears that new features are still being introduced, giving the impression that this should be treated as a draft PR.

I have not tested it but good job. However, I've notice a fundamental issue with the usage of single_trace, but that's probably because you are still actively working on the WAF implementation.

Here is a comprehensive list of areas we should address before delivering:

  • Dependency on rapidjson.
  • Clear separation between Tracing and WAF: Comment related to RequestTracing.
  • Thread Pool usage: Unless I'm mistaken, we should avoid created a thread pool per datadog_waf_thread_pool_name directives.
  • Thorough configuration testing: More extensive testing is necessary.
  • OWNERSHIP file.
  • Build
    • Introduce build argument to build either Tracing or Tracing + ASM.
    • Generate a single .so for ease of use for the customer.
  • Enhanced observability: Add a span on on_access including details on the config file used, the rule executed, client ip and all WAF tags?

I refrained from reviewing anything under securty/ because, ownership and potential extensive discussions on C++. Nevertheless, I recommend using comprehensible variable and function names and to stay consistent with the existing code.

Lastly, it's worth noting that while you've successfully generalized Remote Config, its usage here introduces an avoidable dependency. We should consider delivering first WAF without Remote Config, then, adding this feature later.

@cataphract
Copy link
Contributor Author

I started to review 2-3 days ago under the assumption that the code was feature-complete. However, as I delved deeper, I found missing features, like HTTP/JSON template. It appears that new features are still being introduced, giving the impression that this should be treated as a draft PR.

Yes, I was doing some work and refactoring, but it's feature complete at this point (maybe I'll add the metrics, idk).

  • Thorough configuration testing: More extensive testing is necessary.

Yes, I'm not finished writing the tests.

  • OWNERSHIP file.

Will do.

  • Build

    • Introduce build argument to build either Tracing or Tracing + ASM.

Will do. See also my comment about updating the toolchain and the minimum glibc version.

  • Generate a single .so for ease of use for the customer.
  • Enhanced observability: Add a span on on_access including details on the config file used, the rule executed, client ip and all WAF tags?

We have some standard metrics that track the time spent in the WAF, but I'm not sure I'll implement them in this first version. We also have reporting of client_ip and all the IP headers in some circumstances. That's also on the TODO list, but maybe not for now.

I refrained from reviewing anything under securty/ because, ownership and potential extensive discussions on C++. Nevertheless, I recommend using comprehensible variable and function names and to stay consistent with the existing code.

Can you indicate the conventions you prefer? I can them create a .clang-tidy to enforce these.

Lastly, it's worth noting that while you've successfully generalized Remote Config, its usage here introduces an avoidable dependency. We should consider delivering first WAF without Remote Config, then, adding this feature later.

I can create a version without remote config once I've addressed the comments here and added testing.

@dmehala
Copy link
Contributor

dmehala commented Mar 30, 2024

Cool, thanks.

I am on holiday and will be back April 10th, am I blocking you to move forward? If not, then I can revisit once I am back. My goal is to cut a new release and then we can start yoloing.

@cataphract
Copy link
Contributor Author

I am on holiday and will be back April 10th, am I blocking you to move forward? If not, then I can revisit once I am back. My goal is to cut a new release and then we can start yoloing.

The 10th is fine. This week we're working on random stuff (aka "Inovation Week") so I'm pausing work on this.

@@ -0,0 +1,2 @@
* @DataDog/apm-proxy
src/security/ @DataDog/asm-libddwaf
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the @DataDog/asm-cpp group with only us two in it. @dmehala can you include us as maintainers or admin of both this and the dd-trace-cpp repository?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We now have at least write access, which should be sufficient for now.

}

template <typename T = ddwaf_obj>
T get(std::string_view key) const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at is also used on std maps to access using a key.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I prefer to use get to fetch a key, and at to fetch the map entry on a numeric index (which mirrors at on arrays).

}

template <typename T = ddwaf_obj>
T &get_entry_unchecked(std::size_t i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why get_entry_unchecked as opposed to get_unchecked?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm changing it to at_unchecked, as mentioned above.

@cataphract cataphract force-pushed the glopes/waf branch 5 times, most recently from 0c672fb to 82ca78f Compare April 15, 2024 14:43
@dmehala dmehala requested a review from nayeem-kamal April 15, 2024 17:02
@cataphract
Copy link
Contributor Author

cataphract commented Apr 16, 2024

@dmehala I see you have a branch for cpp-20... I'm doing work that overlaps a bit (to support building libddwaf), with musl (glibc-compatible)/libc++. That way we don't need separate builds for alpine

@cataphract cataphract force-pushed the glopes/waf branch 10 times, most recently from ac51b90 to 8a49169 Compare April 18, 2024 10:13
@cataphract cataphract force-pushed the glopes/waf branch 4 times, most recently from 56b91d9 to 889c97c Compare April 23, 2024 14:34
@cataphract cataphract requested review from dmehala and Anilm3 April 23, 2024 17:36
@cataphract cataphract force-pushed the glopes/waf branch 4 times, most recently from eb94db1 to ea9df4a Compare April 26, 2024 12:33
} catch (const std::exception &e) {
ngx_log_error(NGX_LOG_ERR, cycle->log, 0,
"Initialising security library failed: %s", e.what());
return NGX_ERROR;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's somewhat debatable if this should prevent tracing from working, although perhaps we can reconsider later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue it should not prevent tracing from working.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have two options:

  • Since this is going to be a beta initially, we could fail for now since the original module containing only tracing is still a separate one.
  • We could override the relevant enabled configuration to off and allow tracing to continue.

Either way, in my view this isn't a blocker for now, but we should definitely reevaluate before GA.

@dmehala dmehala requested review from pablomartinezbernardo and removed request for nayeem-kamal May 3, 2024 13:14
Copy link
Contributor

@dmehala dmehala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a few questions/comments there and there, overall you did a good job.

You greatly improved our CI, which is something I wanted to do since I joined but never really got the time/invested enough time to do it. Thank you.

There's still a pending question regarding which version of dd-trace-cpp should be used. We agreed the first iteration with ASM should not use Remote Config, thus, dd-trace-cpp should point to the latest version v0.2.0.

Additionally, the main README should mention ASM is supported, how to build with ASM and also configuration.md should have a dedicated section introducing new ASM directives.

- checkout
- run: git submodule sync && git submodule update --init --recursive
- run:
command: 'make build-musl'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those that mean build-musl does cross-compilation? If yes, let's just call the target build instead of build-musl

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It builds against musl in a linux VM. I think "build" should be reserved for using the toolchain on the host machine. It would be strange if "make build" would generate a library for Linux when run on Mac OS.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow I was not focused when I wrote that. I meant, now, musl is the default libc?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tried running build-and-test-all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have. I then generated this draft release with the release.py.new.unmerged script

Copy link
Contributor

@dmehala dmehala May 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing, I was afraid to encounter an issue on older versions. Thank you for simplifying that mess. 🙏

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this file expected to be part of the PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So. This file is my version of release.py before you changed released.py to replace quoting style and other stuff that caused conflicts. Rather than trying to resolve the conflicts, I'd prefer that you review this file to see if you agree with the naming given to the artifacts. Anil already said that WAF shouldn't be used, but rather "appsec". The names also change in two other ways: they're not image specific, the only thing that matters is version and architecture, so the name of the image is stripped from the name. Also I'm uploading the debug symbols as well,.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I resolved the conflict in e361779

Before GA I think it's worth revisiting if we should produce two versions of the module (TRACING and TRACING + APPSEC) or one module with all products but with good default and the possibility to disable/enable per products. What do you think @Anilm3 @cataphract ?

Copy link
Contributor

@Anilm3 Anilm3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No further concerns from me, aside from:

  • The unmerged script that needs to be reconciled, I think we can move on once @dmehala is satisfied.
  • There should be a small section in the README.md explaining the configuration changes required to get ASM working, with an example.

Overall, fantastic work as usual, my tests so far, while not exhaustive, have worked as expected, although the subsequent system tests PR will be exhaustive enough.

} catch (const std::exception &e) {
ngx_log_error(NGX_LOG_ERR, cycle->log, 0,
"Initialising security library failed: %s", e.what());
return NGX_ERROR;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have two options:

  • Since this is going to be a beta initially, we could fail for now since the original module containing only tracing is still a separate one.
  • We could override the relevant enabled configuration to off and allow tracing to continue.

Either way, in my view this isn't a blocker for now, but we should definitely reevaluate before GA.

Copy link
Contributor

@dmehala dmehala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work! :shipit:

Left to do before GA:

  • X module per product or one big module with all products?
  • Update public documentation (new artifact scheme).
  • Add a simple example in example folder.
  • Small section in the README.md explaining the configuration changes required to get ASM working.

@cataphract
Copy link
Contributor Author

@dmehala

  • X module per product or one big module with all products?

I think we can leave two modules for now. The thing I'm more afraid of is that someone may have a build without --threads and that the module fails to load due failing symbol resolution.

  • Update public documentation (new artifact scheme).

I'm going to open a PR there.

  • Add a simple example in example folder.

I've updated the example.

  • Small section in the README.md explaining the configuration changes required to get ASM working.

Done as well.

I also found some memory corruption doing benchmarking that I've addressed in 8a22ddf . I'm not 100% sure it's the best approach (as opposed to changing connection->read/write->handler) but we'll see.

@cataphract cataphract merged commit 0de7eb0 into master May 14, 2024
@cataphract cataphract deleted the glopes/waf branch May 14, 2024 10:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants