-
Notifications
You must be signed in to change notification settings - Fork 10
WAF #65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WAF #65
Conversation
492443d
to
cb7d395
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started to review 2-3 days ago under the assumption that the code was feature-complete. However, as I delved deeper, I found missing features, like HTTP/JSON template. It appears that new features are still being introduced, giving the impression that this should be treated as a draft PR.
I have not tested it but good job. However, I've notice a fundamental issue with the usage of single_trace
, but that's probably because you are still actively working on the WAF implementation.
Here is a comprehensive list of areas we should address before delivering:
- Dependency on
rapidjson
. - Clear separation between Tracing and WAF: Comment related to
RequestTracing
. - Thread Pool usage: Unless I'm mistaken, we should avoid created a thread pool per
datadog_waf_thread_pool_name
directives. - Thorough configuration testing: More extensive testing is necessary.
OWNERSHIP
file.- Build
- Introduce build argument to build either
Tracing
orTracing + ASM
. - Generate a single
.so
for ease of use for the customer.
- Introduce build argument to build either
- Enhanced observability: Add a span on
on_access
including details on the config file used, the rule executed, client ip and all WAF tags?
I refrained from reviewing anything under securty/
because, ownership and potential extensive discussions on C++. Nevertheless, I recommend using comprehensible variable and function names and to stay consistent with the existing code.
Lastly, it's worth noting that while you've successfully generalized Remote Config, its usage here introduces an avoidable dependency. We should consider delivering first WAF without Remote Config, then, adding this feature later.
Yes, I was doing some work and refactoring, but it's feature complete at this point (maybe I'll add the metrics, idk).
Yes, I'm not finished writing the tests.
Will do.
Will do. See also my comment about updating the toolchain and the minimum glibc version.
We have some standard metrics that track the time spent in the WAF, but I'm not sure I'll implement them in this first version. We also have reporting of client_ip and all the IP headers in some circumstances. That's also on the TODO list, but maybe not for now.
Can you indicate the conventions you prefer? I can them create a .clang-tidy to enforce these.
I can create a version without remote config once I've addressed the comments here and added testing. |
Cool, thanks. I am on holiday and will be back April 10th, am I blocking you to move forward? If not, then I can revisit once I am back. My goal is to cut a new release and then we can start yoloing. |
The 10th is fine. This week we're working on random stuff (aka "Inovation Week") so I'm pausing work on this. |
.github/CODEOWNERS
Outdated
@@ -0,0 +1,2 @@ | |||
* @DataDog/apm-proxy | |||
src/security/ @DataDog/asm-libddwaf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added the @DataDog/asm-cpp group with only us two in it. @dmehala can you include us as maintainers or admin of both this and the dd-trace-cpp repository?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We now have at least write access, which should be sufficient for now.
} | ||
|
||
template <typename T = ddwaf_obj> | ||
T get(std::string_view key) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at
is also used on std maps to access using a key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I prefer to use get
to fetch a key, and at
to fetch the map entry on a numeric index (which mirrors at
on arrays).
src/security/ddwaf_obj.h
Outdated
} | ||
|
||
template <typename T = ddwaf_obj> | ||
T &get_entry_unchecked(std::size_t i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why get_entry_unchecked
as opposed to get_unchecked
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm changing it to at_unchecked
, as mentioned above.
0c672fb
to
82ca78f
Compare
@dmehala I see you have a branch for cpp-20... I'm doing work that overlaps a bit (to support building libddwaf), with musl (glibc-compatible)/libc++. That way we don't need separate builds for alpine |
ac51b90
to
8a49169
Compare
56b91d9
to
889c97c
Compare
eb94db1
to
ea9df4a
Compare
} catch (const std::exception &e) { | ||
ngx_log_error(NGX_LOG_ERR, cycle->log, 0, | ||
"Initialising security library failed: %s", e.what()); | ||
return NGX_ERROR; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it's somewhat debatable if this should prevent tracing from working, although perhaps we can reconsider later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would argue it should not prevent tracing from working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have two options:
- Since this is going to be a beta initially, we could fail for now since the original module containing only tracing is still a separate one.
- We could override the relevant
enabled
configuration tooff
and allow tracing to continue.
Either way, in my view this isn't a blocker for now, but we should definitely reevaluate before GA.
5128c38
to
3a6946e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a few questions/comments there and there, overall you did a good job.
You greatly improved our CI, which is something I wanted to do since I joined but never really got the time/invested enough time to do it. Thank you.
There's still a pending question regarding which version of dd-trace-cpp should be used. We agreed the first iteration with ASM should not use Remote Config, thus, dd-trace-cpp should point to the latest version v0.2.0
.
Additionally, the main README should mention ASM is supported, how to build with ASM and also configuration.md
should have a dedicated section introducing new ASM directives.
- checkout | ||
- run: git submodule sync && git submodule update --init --recursive | ||
- run: | ||
command: 'make build-musl' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those that mean build-musl does cross-compilation? If yes, let's just call the target build instead of build-musl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It builds against musl in a linux VM. I think "build" should be reserved for using the toolchain on the host machine. It would be strange if "make build" would generate a library for Linux when run on Mac OS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow I was not focused when I wrote that. I meant, now, musl is the default libc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you tried running build-and-test-all
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have. I then generated this draft release with the release.py.new.unmerged
script
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing, I was afraid to encounter an issue on older versions. Thank you for simplifying that mess. 🙏
bin/release.py.new.unmerged
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this file expected to be part of the PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So. This file is my version of release.py
before you changed released.py
to replace quoting style and other stuff that caused conflicts. Rather than trying to resolve the conflicts, I'd prefer that you review this file to see if you agree with the naming given to the artifacts. Anil already said that WAF shouldn't be used, but rather "appsec". The names also change in two other ways: they're not image specific, the only thing that matters is version and architecture, so the name of the image is stripped from the name. Also I'm uploading the debug symbols as well,.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I resolved the conflict in e361779
Before GA I think it's worth revisiting if we should produce two versions of the module (TRACING and TRACING + APPSEC) or one module with all products but with good default and the possibility to disable/enable per products. What do you think @Anilm3 @cataphract ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No further concerns from me, aside from:
- The unmerged script that needs to be reconciled, I think we can move on once @dmehala is satisfied.
- There should be a small section in the README.md explaining the configuration changes required to get ASM working, with an example.
Overall, fantastic work as usual, my tests so far, while not exhaustive, have worked as expected, although the subsequent system tests PR will be exhaustive enough.
} catch (const std::exception &e) { | ||
ngx_log_error(NGX_LOG_ERR, cycle->log, 0, | ||
"Initialising security library failed: %s", e.what()); | ||
return NGX_ERROR; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have two options:
- Since this is going to be a beta initially, we could fail for now since the original module containing only tracing is still a separate one.
- We could override the relevant
enabled
configuration tooff
and allow tracing to continue.
Either way, in my view this isn't a blocker for now, but we should definitely reevaluate before GA.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing work!
Left to do before GA:
- X module per product or one big module with all products?
- Update public documentation (new artifact scheme).
- Add a simple example in
example
folder. - Small section in the
README.md
explaining the configuration changes required to get ASM working.
I think we can leave two modules for now. The thing I'm more afraid of is that someone may have a build without
I'm going to open a PR there.
I've updated the example.
Done as well. I also found some memory corruption doing benchmarking that I've addressed in 8a22ddf . I'm not 100% sure it's the best approach (as opposed to changing |
Adds WAF support and changes several aspects of the build process. The "build images" are gone; there is only one build image for each arch. The build image is musl/libc++ based and works on both musl and glibc. Increases the compiler requirements to C++20, since libddwaf requires that.