-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: move scrubbing logger to logx #1319
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This diff is yak shaving for a subsequent diff that attempts to mitigate potential IP addresses leaks caused by using happy eyeballs more aggressively in the codebase. The general idea is that we previously had a netxlite implementation that gave IPv4 preference over IPv6, meaning that we would end up using IPv4 in most cases and very rarely IPv6. As part of my work to make beacons possible, I have introduced happy eyeballs, which means we may sometimes use IPv4 instead of IPv6. This fact makes it more likely that we include the IPv6 address of a probe when we know its IPv4 address or the other way around into measurements. To mitigate this possible issue before it actually is possible (note that I have not yet changed how we discover the probe IP), I am going to proactively modify the serialization of HTTP bodies and headers to scrub. However, to do this, I need to decouple the scrubber package from model such that internal/model/archival.go can use the scrubber package. Part of ooni/probe#2531
bassosimone
added a commit
that referenced
this pull request
Sep 28, 2023
There are cases where we know we have binary data in input and we want the output to be binary data using the dictionary encoding like `{"format":"base64","data":"..."}`. Such cases are, for example, DNS messages bodies. There is no need for us to pass through the MaybeArchivalBinaryData in such cases. Additionally, this makes MaybeArchivalBinaryData a bit more complex than it probably needs to be. What's more, ArchivalBinaryData is for when you do not require to scrub the data. I want to introduce new data types that automatically perform scrubbing when they're used. But this puts even more pressure on MaybeArchivalBinaryData, and hence this commit, to start differentiating between: 1. always binary data vs maybe binary data 2. no scrubbing requires vs scrubbing required The rationale for doing this set of changes has been explained in #1319. The reference issue is ooni/probe#2531. For now, this commit just adds the new type and tests for it without using the type, which we'll do later once we have added better marshal/unmarshal testing for the interested types.
bassosimone
added a commit
that referenced
this pull request
Sep 28, 2023
There are cases where we know we have binary data in input and we want the output to be binary data using the dictionary encoding like `{"format":"base64","data":"..."}`. Such cases are, for example, DNS messages bodies. There is no need for us to pass through the MaybeArchivalBinaryData in such cases. Additionally, this makes MaybeArchivalBinaryData a bit more complex than it probably needs to be. What's more, ArchivalBinaryData is for when you do not require to scrub the data. I want to introduce new data types that automatically perform scrubbing when they're used. But this puts even more pressure on MaybeArchivalBinaryData, and hence this commit, to start differentiating between: 1. always binary data vs maybe binary data 2. no scrubbing required vs scrubbing required The rationale for doing this set of changes has been explained in #1319. The reference issue is ooni/probe#2531. For now, this commit just adds the new type and tests for it without using the type, which we'll do later once we have added better marshal/unmarshal testing for the interested types.
bassosimone
added a commit
that referenced
this pull request
Sep 28, 2023
I am about to modify this structure to unconditionally use ArchivalBinaryData rather than ArchivalMaybeBinaryData, following the plan that I explained in #1319. Before doing that, I want MORE test coverage for ArchivalTLSOrQUICHandshakeResult. Part of ooni/probe#2531
bassosimone
added a commit
that referenced
this pull request
Sep 28, 2023
I am about to modify this structure to unconditionally use ArchivalBinaryData rather than ArchivalMaybeBinaryData, following the plan that I explained in #1319. Before doing that, I want MORE test coverage for ArchivalTLSOrQUICHandshakeResult. Part of ooni/probe#2531
Murphy-OrangeMud
pushed a commit
to Murphy-OrangeMud/probe-cli
that referenced
this pull request
Feb 13, 2024
This diff is yak shaving for a subsequent diff that attempts to mitigate potential IP addresses leaks caused by using happy eyeballs more aggressively in the codebase. The general idea is that we previously had a netxlite implementation that gave IPv4 preference over IPv6, meaning that we would end up using IPv4 in most cases and very rarely IPv6. As part of my work to make beacons possible, I have introduced happy eyeballs, which means we may sometimes use IPv4 instead of IPv6 if we adopt happy eyeballs more widely. This fact makes it more likely that we include the IPv6 address of a probe when we know its IPv4 address or the other way around into measurements. To mitigate this possible issue before it actually is possible (note that I have not yet changed how we discover the probe IP), I am going to proactively modify the serialization of HTTP bodies and headers to scrub IP endpoints unconditionally. However, to do this, I need to decouple the scrubber package from model such that internal/model/archival.go can use the scrubber package. Part of ooni/probe#2531
Murphy-OrangeMud
pushed a commit
to Murphy-OrangeMud/probe-cli
that referenced
this pull request
Feb 13, 2024
There are cases where we know we have binary data in input and we want the output to be binary data using the dictionary encoding like `{"format":"base64","data":"..."}`. Such cases are, for example, DNS messages bodies. There is no need for us to pass through the MaybeArchivalBinaryData in such cases. Additionally, this makes MaybeArchivalBinaryData a bit more complex than it probably needs to be. What's more, ArchivalBinaryData is for when you do not require to scrub the data. I want to introduce new data types that automatically perform scrubbing when they're used. But this puts even more pressure on MaybeArchivalBinaryData, and hence this commit, to start differentiating between: 1. always binary data vs maybe binary data 2. no scrubbing required vs scrubbing required The rationale for doing this set of changes has been explained in ooni#1319. The reference issue is ooni/probe#2531. For now, this commit just adds the new type and tests for it without using the type, which we'll do later once we have added better marshal/unmarshal testing for the interested types.
Murphy-OrangeMud
pushed a commit
to Murphy-OrangeMud/probe-cli
that referenced
this pull request
Feb 13, 2024
I am about to modify this structure to unconditionally use ArchivalBinaryData rather than ArchivalMaybeBinaryData, following the plan that I explained in ooni#1319. Before doing that, I want MORE test coverage for ArchivalTLSOrQUICHandshakeResult. Part of ooni/probe#2531
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This diff is yak shaving for a subsequent diff that attempts to mitigate potential IP addresses leaks caused by using happy eyeballs more aggressively in the codebase. The general idea is that we previously had a netxlite implementation that gave IPv4 preference over IPv6, meaning that we would end up using IPv4 in most cases and very rarely IPv6. As part of my work to make beacons possible, I have introduced happy eyeballs, which means we may sometimes use IPv4 instead of IPv6 if we adopt happy eyeballs more widely. This fact makes it more likely that we include the IPv6 address of a probe when we know its IPv4 address or the other way around into measurements. To mitigate this possible issue before it actually is possible (note that I have not yet changed how we discover the probe IP), I am going to proactively modify the serialization of HTTP bodies and headers to scrub IP endpoints unconditionally. However, to do this, I need to decouple the scrubber package from model such that internal/model/archival.go can use the scrubber package.
Part of ooni/probe#2531