-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow optional rewrite rules before computing similarity scores during replay #499
Conversation
2a9b45a
to
4926e3b
Compare
// The following payloads are identical after rewrite rules have been applied. | ||
expect( | ||
computeSimilarity( | ||
"POST", | ||
"/test", | ||
{}, | ||
compactJsonBuffer({ | ||
name: "Jane Doe", | ||
email: | ||
"jane.doe-some-test-6f82fbbe-d36a-4c5c-b47b-84100122fbbc@example.com", | ||
age: 42, | ||
}), | ||
{ | ||
request: { | ||
method: "POST", | ||
path: "/test", | ||
headers: {}, | ||
body: wellFormattedJsonBuffer({ | ||
name: "Jane Doe", | ||
email: "jane.doe-some-test@example.com", | ||
age: 42, | ||
}), | ||
}, | ||
response: DUMMY_RESPONSE, | ||
}, | ||
new RewriteRules().appendRule( | ||
new RewriteRule( | ||
/-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}(@example.com)$/gi, | ||
"$1" | ||
) | ||
) | ||
) | ||
).toBe(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a new test that tests similarity with rewrite rules. The rest of the changes in this file are just formatting changes applied by the linter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good 👍 few nits/suggestions primarily around documentation
src/cli.ts
Outdated
import { RecordReplayServer } from "./server"; | ||
|
||
function commaSeparatedList(value: string) { | ||
return value ? value.split(",") : []; | ||
} | ||
|
||
const RE_REWRITE_RULE = /s\/(.+(?<!\\))\/(.+(?<!\\))\/([gims]*)/; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: We could use some more descriptive names here e.g. RE_SED_INPUT_VALIDATION
& RE_REPLACE_SED_CAPTURE_GROUPS
describe("RewriteRule", () => { | ||
describe("without capture groups", () => { | ||
it("applies the expected changes", () => { | ||
const rule = new RewriteRule(/-[a-z0-9]+-?/gi, ""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Could we add a comment explaining the regex to the expressions in these tests? Will help identify the intention and to verify the output is expected
src/cli.ts
Outdated
@@ -40,6 +78,12 @@ async function main(argv: string[]) { | |||
"--https-ca <filename.pem>", | |||
"Enable HTTPS server where the certificate was generated by this CA. Useful if you are using a self-signed certificate. Also requires --https-key and --https-cert." | |||
) | |||
.option<RewriteRules>( | |||
"--rewrite-before-diff [s/find/replace/g...]", | |||
"Provide regex-based rewrite rules over strings before passing them to the diffing algorithm. The regex rules use sed-style syntax. s/find/replace/ with an optional regex modifier suffixes. Capture groups can be used using sed-style \\N syntax. This only is only used during replaying existing tapes.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Provide regex-based rewrite rules over strings before passing them to the diffing algorithm. The regex rules use sed-style syntax. s/find/replace/ with an optional regex modifier suffixes. Capture groups can be used using sed-style \\N syntax. This only is only used during replaying existing tapes.", | |
"Provide regex-based rewrite rules over strings before passing them to the diffing algorithm. The regex rules use sed-style syntax. s/find/replace/ with an optional regex modifier suffixes. Capture groups can be used using sed-style \\N syntax. This is only used during replaying existing tapes.", |
src/cli.ts
Outdated
import { RecordReplayServer } from "./server"; | ||
|
||
function commaSeparatedList(value: string) { | ||
return value ? value.split(",") : []; | ||
} | ||
|
||
const RE_REWRITE_RULE = /s\/(.+(?<!\\))\/(.+(?<!\\))\/([gims]*)/; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question Should we have a comment here describing that this is sed-style regex? I know it's mentioned on line 17, but feels more relevant to have with the definition of the rule.
Do we want to link to a documentation of the style somewhere?
src/cli.ts
Outdated
} | ||
const [rawFind, rawReplace, rawFlags] = match.slice(1, 4); | ||
|
||
// Parse the find regex with the given regex flags. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Parse the find regex with the given regex flags. | |
// Parse the found regex with the given regex flags. |
catgie: "wuff", | ||
xyz: "I hate cats", | ||
}); | ||
expect(o1).toEqual({ dog: "woof", doggie: "wuff", xyz: "I hate dogs" }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"I hate dogs", blasphemy! 😂
for (const rule of this.rules) { | ||
s = rule.apply(value); | ||
} | ||
return (s as any) as T; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question Why is this first converted to any to then immediately be casted to T?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type of s
was resolved as a string with our manual check typeof value === "string"
and subsequent cast let s = value as string
. However the top level recursed function has a generic type T which it needs to return. string
and T
don't overlap and the cast to any
is required to allow us to cast it to T afterwards
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a limitation of TypeScript. Line 54 casts the generic type T
to string
. We need to cast string
back to T
in order to return it. The compiler does not allow casting string
directly to T
as there's no overlap of the types. Going via any
is the way you do a "cast to anything" cast in TypeScript.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My Typescript is nothing to brag about, but it looks okay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remaining comments are non-blocking. LGTM 👍
Ah crap. I can't merge this due to what looks to be a broken |
Right. I needed to sign the CLA. Makes sense, I guess. |
In order to be able to replay the correct tape more accurately, we want to be able to apply some rewrite rules before the similarity score is computed between a request and a recorded tape. The intent of these rewrite rules is to remove random values in both the request and the recorded tapes, so that the tapes match exactly (once the random data has been removed).
This PR adds a new varadic CLI argument to allow users to provide zero to many regex-based rewrite rules using sed-style replacement syntax. These regex replacements are performed over all strings that go through similarity scoring.