Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add options support for the SQL obfuscator and update CGo bindings #8556

Merged
merged 23 commits into from
Jul 9, 2021

Conversation

alexbarksdale
Copy link
Member

@alexbarksdale alexbarksdale commented Jul 6, 2021

What does this PR do?

Adds support to provide options to configure the behavior of the SQL obfuscator, as well as update the python CGo bindings to consume the obfuscator options from the integration.

Motivation

The current iteration of the obfuscator is unable to handle unique use cases, and this PR aims to make it more generic. For example, there is an option to quantize SQL tables via a feature flag. If this flag is enabled, it's an all or nothing scenario, meaning if one agent is monitoring two database instances, and one needs obfuscation but the other explicitly does not, we cannot adapt to this requirement.

Additional Notes

Important context

The primary change in this PR includes the implementation of the newly added options to configure the obfuscator for the CGo bindings used by various integrations. As mentioned before, the quantize_sql_tables option is currently an all or nothing option, we need this to be configurable per database instance. This PR has removed the use of the feature flag and instead consumes the options to enable/disable it. For further details, please read below on this option will now be configured.

How the CGo bindings will consume options from the integration

  1. For each integration, there is an instance config that stores various config options for one particular instance. For the purpose of this explanation, the integrations we are focusing on is MySQL and PostgreSQL since quantize_sql_tables will be moved into their instance configs and passed into the CGo bindings.
  2. The database integrations utilize the obfuscation library in the agent to obfuscate a SQL statement via C bindings. The C bindings invoke the obfuscate library in Go when this Python method is used.
  3. In short the quantize_sql_tables option and future options that need to be at a per instance level will configure the obfuscator as such: Pull instance config at the integration -> pass the value to a Python method that implements a CGo binding -> trigger C code -> invoke Go code

Performance of the options unmarshalling in the CGo binding

To keep things as performant as possible, I opted to use the easyjson library to perform the unmarshalling of the options from the integration. The snippet below is a benchmark that compares the use of easyjson to encoding/json.

cpu: Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz

easyjson
---------------
BenchmarkSQLObfuscationConfigEasyJSONDeserialization/Range:1-8           3721394               311.1 ns/op             0 B/op          0 allocs/op
BenchmarkSQLObfuscationConfigEasyJSONDeserialization/Range:10-8          3860244               305.7 ns/op             0 B/op          0 allocs/op
BenchmarkSQLObfuscationConfigEasyJSONDeserialization/Range:100-8         3802209               304.6 ns/op             0 B/op          0 allocs/op
BenchmarkSQLObfuscationConfigEasyJSONDeserialization/Range:1000-8        3899506               311.0 ns/op             0 B/op          0 allocs/op
BenchmarkSQLObfuscationConfigEasyJSONDeserialization/Range:10000-8       3774478               307.1 ns/op             0 B/op          0 allocs/op
BenchmarkSQLObfuscationConfigEasyJSONDeserialization/Range:100000-8      3848799               319.3 ns/op             0 B/op          0 allocs/op

encoding/json
---------------
BenchmarkSQLObfuscationConfigRegularJSONDeserialization/Range:1-8        1282687               921.2 ns/op           208 B/op          4 allocs/op
BenchmarkSQLObfuscationConfigRegularJSONDeserialization/Range:10-8       1298893               920.0 ns/op           208 B/op          4 allocs/op
BenchmarkSQLObfuscationConfigRegularJSONDeserialization/Range:100-8      1302337               923.8 ns/op           208 B/op          4 allocs/op
BenchmarkSQLObfuscationConfigRegularJSONDeserialization/Range:1000-8     1295082               925.8 ns/op           208 B/op          4 allocs/op
BenchmarkSQLObfuscationConfigRegularJSONDeserialization/Range:10000-8    1289006               934.6 ns/op           208 B/op          4 allocs/op
BenchmarkSQLObfuscationConfigRegularJSONDeserialization/Range:100000-8   1283230               996.3 ns/op           208 B/op          4 allocs/op

Integration changes for added context

Describe how to test your changes

  1. Test the CGo bindings by running: invoke rtloader.test at the root dir.
  2. Test that the use of easyjson results in the same output as encoding/json by running: invoke test --targets=./pkg/trace/config --skip-linters at the root dir.
  3. Test to see if the obfuscator is reading the new config (that stores the options) by running: invoke test --targets=./pkg/trace/obfuscate --skip-linters at the root dir.

Checklist

  • A release note has been added or the changelog/no-changelog label has been applied.

@alexbarksdale alexbarksdale added this to the 7.30.0 milestone Jul 6, 2021
@alexbarksdale alexbarksdale requested a review from gbbr July 6, 2021 05:10
@alexbarksdale alexbarksdale marked this pull request as ready for review July 6, 2021 05:10
@alexbarksdale alexbarksdale requested review from a team as code owners July 6, 2021 05:10
@alexbarksdale alexbarksdale added the team/agent-apm trace-agent label Jul 6, 2021
@olivielpeau
Copy link
Member

Since lazyInitObfuscator only initializes the obfuscator once for the core agent process' lifetime, only the first traceconfig.SQLObfuscationConfig that's passed to it will be taken into account. So only the first integration instance that uses the obfuscator will see its obfuscation config taken into account. Am I missing something?

Addressing this either requires:

  • having the ability to update the global obfuscator's config on the fly on each obfuscation call (probably has a perf overhead, and will have to be made thread-safe for example with proper locking), or
  • instantiating one obfuscator instance per each check instance (for example, passing the check ID that's unique to an integration instance, similar to what's done for the metric submission calls that use a per-check "sender": ; note that when the integration instance is unscheduled, this obfuscator instance would need to be cleaned up as well, see how the "sender" is cleaned up:
    aggregator.DestroySender(c.id)
    )

@alexbarksdale
Copy link
Member Author

Since lazyInitObfuscator only initializes the obfuscator once for the core agent process' lifetime, only the first traceconfig.SQLObfuscationConfig that's passed to it will be taken into account. So only the first integration instance that uses the obfuscator will see its obfuscation config taken into account. Am I missing something?

Addressing this either requires:

  • having the ability to update the global obfuscator's config on the fly on each obfuscation call (probably has a perf overhead, and will have to be made thread-safe for example with proper locking), or
  • instantiating one obfuscator instance per each check instance (for example, passing the check ID that's unique to an integration instance, similar to what's done for the metric submission calls that use a per-check "sender":
    ; note that when the integration instance is unscheduled, this obfuscator instance would need to be cleaned up as well, see how the "sender" is cleaned up:
    aggregator.DestroySender(c.id)

    )

You're totally right, I updated it to where we pass the options directly to the method rather than trying to use the receiver struct to take/use options. a186696c9124afcb921358a7a06833d33c7b4246

Copy link
Contributor

@sgnn7 sgnn7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexbarksdale I left some comments but since some of this is a bit above my head, I'll try to see if someone else from agent-core can review as well.

pkg/collector/python/datadog_agent.go Outdated Show resolved Hide resolved
rtloader/test/datadog_agent/datadog_agent.go Outdated Show resolved Hide resolved
rtloader/test/datadog_agent/datadog_agent.go Show resolved Hide resolved
pkg/collector/python/datadog_agent.go Outdated Show resolved Hide resolved
pkg/trace/config/apply_test.go Outdated Show resolved Hide resolved
pkg/trace/config/config_test.go Outdated Show resolved Hide resolved
pkg/trace/obfuscate/sql.go Outdated Show resolved Hide resolved
pkg/trace/obfuscate/sql_test.go Outdated Show resolved Hide resolved
pkg/trace/obfuscate/sql.go Outdated Show resolved Hide resolved
Copy link
Contributor

@sgnn7 sgnn7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's pending feedback for this comment but overall LGTM

Copy link
Contributor

@remeh remeh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me!
I've left a few comments, two of them not blocking anything but please make sure to review the one concerning the error return.

pkg/trace/obfuscate/sql.go Outdated Show resolved Hide resolved
pkg/trace/obfuscate/json.go Outdated Show resolved Hide resolved
pkg/trace/obfuscate/sql.go Outdated Show resolved Hide resolved
@alexbarksdale
Copy link
Member Author

Update: Tested things locally with the integration changes and everything is looking good!
PostgreSQL integration change: link
MySQL integration change: link

pkg/trace/config/config_test.go Outdated Show resolved Hide resolved
pkg/trace/obfuscate/sql.go Outdated Show resolved Hide resolved
pkg/trace/config/apply_test.go Outdated Show resolved Hide resolved
pkg/trace/config/apply_test.go Outdated Show resolved Hide resolved
pkg/trace/config/apply_test.go Outdated Show resolved Hide resolved
pkg/trace/obfuscate/sql.go Outdated Show resolved Hide resolved
pkg/trace/obfuscate/sql.go Outdated Show resolved Hide resolved
pkg/trace/obfuscate/sql_test.go Outdated Show resolved Hide resolved
pkg/trace/obfuscate/sql.go Show resolved Hide resolved
@alexbarksdale alexbarksdale requested a review from gbbr July 8, 2021 16:57
pkg/trace/obfuscate/obfuscate.go Outdated Show resolved Hide resolved
@alexbarksdale alexbarksdale requested a review from djova July 8, 2021 22:47
pkg/trace/config/apply_test.go Outdated Show resolved Hide resolved
pkg/trace/config/apply_test.go Outdated Show resolved Hide resolved
pkg/trace/obfuscate/sql.go Show resolved Hide resolved
pkg/trace/obfuscate/json.go Outdated Show resolved Hide resolved
pkg/trace/config/apply.go Outdated Show resolved Hide resolved
@alexbarksdale alexbarksdale requested a review from gbbr July 9, 2021 15:07
pkg/trace/obfuscate/sql.go Outdated Show resolved Hide resolved
Copy link
Contributor

@gbbr gbbr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine to unblock you, but we should find a better way to do it. We can refactor in another PR.

@alexbarksdale alexbarksdale requested review from a team as code owners July 9, 2021 17:32
@alexbarksdale alexbarksdale force-pushed the alex.barksdale/obfuscator-options branch from b743e43 to 8d17878 Compare July 9, 2021 17:51
@alexbarksdale alexbarksdale removed request for a team July 9, 2021 17:51
@alexbarksdale
Copy link
Member Author

Apologies for the goof up, please ignore.

@alexbarksdale alexbarksdale merged commit 8a861ba into main Jul 9, 2021
@alexbarksdale alexbarksdale deleted the alex.barksdale/obfuscator-options branch July 9, 2021 19:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team/agent-apm trace-agent
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants