Redact properties before string-ifying or logging#49
Merged
gerlowskija merged 2 commits intoapache:crossdc-wipfrom Oct 11, 2022
Merged
Conversation
Prior to this commit, certain log messages during Producer/Consumer startup contained potentially sensitive information, such as SSL passwords, etc. This commit amends the string-ification of KafkaCrossDcConf and other Map/Properties types to "redact" properties whose key/name contains a block-list of values. This list currently consists of "password" and "credentials": any property name containing these substrings (ignoring case) will be redacted before logging.
markrmiller
added a commit
that referenced
this pull request
Nov 16, 2022
* WIP for Cross DC consumer (#5) * CrossDc Consumer wip * Temp commit * wip commit * Draft commit, cleanup code * WIP commit * Fix tests, WIP commit * Refactor MessageProcessor and some cleanup. Co-authored-by: Anshum Gupta <anshum@apache.org> * Support for integration tests with SolrCloud. (#6) I've had a few little issues that I've been chasing, so to get something in, I pulled out kafka and some other items and setup a minimal set of changes to use SolrCloud in tests and I have a simple test that runs the same messsage processor test but with a real cloud instance instead of a mock. Hang on a moment and I'll get kafka and the rest back in. * Add back in the EmbeddedKafkaCluster. (#7) * WIP for UpdateRequestProcessor and other producer related files (#8) * IntegrationTest WIP (#9) * Fix a compile error, add solr-core as a dependency as the URP requires it (#10) * Move cross dc producer code into separate module (#11) * Move cross dc producer code into a separate module * Fix package name in javadocs * Create a commons module for crossdc (#12) * Integration tests for cross dc consumer with SolrCloud (#14) * Support for integration tests with SolrCloud. * Add back in the EmbeddedKafkaCluster. * Working out remaining issues via manual testing. (#15) * Tie up * Fix a variety of issues. * Work out SolrAndKafkaIntegrationTest (#16) Each of the 2 tests still needs to be run separately - need to fix up running both in same run. * Test Passing (#17) * Producer only needs one of the many runtime dependencies. (#18) * Experiment with Producer in crossdc package. (#19) * Simplify build artifacts. (#20) * Revert "Experiment with Producer in crossdc package. (#19)" This reverts commit 9007aa6. * Simplify build artifacts and allow for future shading if necessary with commons and producer only creating a single uber jar artifact. * Allow central config from ZK and build updates. (#21) * Docs and enable flag. (#22) * Add explicit error logging around CrossDC config. (#23) * Work around access limitations (what you can do here seems to vary by Java version) (#24) * Fix startup checks (#26) This change corrects the `if` statements in the consumer start, so that it actually checks the config being passed. Previously it just checked the `bootstrapServers` three times. * KafkaCrossDcConf: Add a human readable toString (#27) This change adds a human readable `toString` function to `KafkaCrossDcConf`. * Update Solr version, build improvement for some envs, include missing solrj dep (#28) * Cleanup some of the testing issues. (#25) * Consumer: Add initial metrics (#29) This change adds initial support for Dropwizard Metrics to the Consumer. With this we could allow users to send metrics to a number of reporting backends. * Inefficient DBQ (#31) * Finish setting up the retry queue based on existing implementation. (#30) * Reindex Test, cover a few more test cases, additional config options. (#32) * Reindex Test, cover a few more test cases, additional config options. Adds a basic test focused on reindexing, covers a few more test cases: more replicas and shards, different shard count on primary and secondary dc, allows consumer group to be configured, allows crossdc zk prop file location to be configured. * Remove TODO, some minor fixes, doc. * Add maxPolledRecords and some logging cleanup. * Config improvements and queue bug fixes. (#34) * Configuration improvements, cleanup, minor fixes. (#35) * Explicit SSL Config (#36) * SSL config passed on to Kafka and a variety of general cleanup. (#37) * Flush producer on close to prevent losing any pending updates. (#38) * Config override test and cleanup. (#39) * Flush producer on close to prevent losing any pending updates. * Add a config override test and some cleanup. * Cleanup Kafka config class. (#40) * Improve test to cover a bit more. (#41) * Beef up config override test, additional config logging. (#42) * Allow empty string as a config property value. (#43) * Check docSize against max before mirroring (#45) * Check docSize against max before mirroring Prior to this commit, the MirroringUpdateProcessor had no check to ensure that docs weren't running afoul of the batch size limit set at the Kafka level. This commit changes this to ensure that docs exceeding this limit are not mirrored. These offending docs may still be indexed, based on the value of the URP's "indexUnmirrorableDocs" config property (which defaults to 'false' if not set). * Fix log message printed when mirroring skipped for large doc (#46) * Check batch-size against "max" before mirroring (#48) Prior to this commit, MirroringUpdateProcessor checked that individual documents didn't exceed the mirroring batch-size limit on their own, but no check existed to ensure that the entire list of docs doesn't exceed the limit. This commit adds that check (in MirroringUpdateProcessor.finish()). If the configured max batch-size is exceeded, MUP will log out an error (which includes the IDs of all documents in the batch). * Redact properties before string-ifying or logging (#49) Prior to this commit, certain log messages during Producer/Consumer startup contained potentially sensitive information, such as SSL passwords, etc. This commit amends the string-ification of KafkaCrossDcConf and other Map/Properties types to "redact" properties whose key/name contains a block-list of values. This list currently consists of "password" and "credentials": any property name containing these substrings (ignoring case) will be redacted before logging. * Remove a couple wip testing tools. Co-authored-by: Anshum Gupta <anshum@apache.org> Co-authored-by: Patrik Greco <pgreco@apple.com> Co-authored-by: Jason Gerlowski <gerlowskija@apache.org>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Prior to this commit, certain log messages during Producer/Consumer startup contained potentially sensitive information, such as SSL passwords, etc.
This commit amends the string-ification of KafkaCrossDcConf and other Map/Properties types to "redact" properties whose key/name contains a block-list of values. This list currently consists of "password" and "credentials": any property name containing these substrings (ignoring case) will be redacted before logging.