tools/goreplay-middleware: Add goreplay middleware #4496

Open
wants to merge 27 commits into
base: master

bartekn
Contributor

@bartekn bartekn commented Aug 2, 2022

PR Checklist

PR Structure

  • This PR has reasonably narrow scope (if not, break it down into smaller PRs).
  • This PR avoids mixing refactoring changes with feature changes (split into two PRs
    otherwise).
  • This PR's title starts with name of package that is most changed in the PR, ex.
    services/friendbot, or all or doc if the changes are broad or impact many
    packages.

Thoroughness

  • This PR adds tests for the most critical parts of the new functionality or fixes.
  • I've updated any docs (developer docs, .md
    files, etc.) affected by this change. Take a look in the docs folder for a given service,
    like this one.

Release planning

  • I've updated the relevant CHANGELOG (here for Horizon) if
    needed with deprecations, added features, breaking changes, and DB schema changes.
  • I've decided if this PR requires a new major/minor version according to
    semver, or if it's mainly a patch change. The PR is targeted at the next
    release branch if it's not a patch change.

What

Adds middleware for goreplay which checks if the mirrored response matches the original response.

Close #2840.

Why

goreplay middleware gives access to the requests and responses of both the original and mirrored targets. This allows us to replicate horizon-cmp functionality but on a larger scale (horizon-cmp sends its requests to public load balancers, so normal rate limiting applies to it).

Known limitations

Currently logs mismatched response bodies to stderr. In the future, we can send files to S3 and build some diff checker infrastructure on top of it.

@Shaptic
Contributor

Shaptic commented Aug 3, 2022

I'm concerned about using goreplay to mirror production traffic to the k8s cluster that @sreuland is deploying Horizon Lite to. Will the hardware be able to handle it? Can we otherwise tweak the replay settings to e.g. replay 10% of requests or at least filter them on a particular endpoint? And does the main prod that's doing the mirroring care at all about the performance of the servers it's mirroring to? For example if Lite takes 30s to fulfill a mirrored request, does the initiator care at all?

@bartekn
Contributor Author

bartekn commented Aug 4, 2022

Can we otherwise tweak the replay settings to e.g. replay 10% of requests or at least filter them on a particular endpoint?

Both can be done via CLI flags to the goreplay command.

And does the main prod that's doing the mirroring care at all about the performance of the servers it's mirroring to? For example if Lite takes 30s to fulfill a mirrored request, does the initiator care at all?

No, prod doesn't care about the mirrored targets, and mirroring doesn't affect it at all; see How Traffic Mirroring works.

if !req.ResponseEquals() {
	// TODO improve the message to at least print the requested path
	// TODO in the future publish the results to S3 for easier processing
	os.Stderr.WriteString("MISMATCH " + req.SerializeBase64() + "\n")

Contributor

Can these replay comparison results potentially intermix with normal Horizon app log output on the console, making it harder to read overall? If so, maybe it's worth adding an optional configurable target (defaulting to console stderr) for the goreplay results output early on?

Contributor Author

These messages land in a completely different log: the goreplay log. It's separate from the Horizon log. Ideally the diffs would be sent to an external server, but for testing purposes I wanted to start with something simple.

@sreuland
Contributor

Looks good. Would it be possible to show a small diagram of how the replay process works, for context? For example, the flow between prod (source), the target (mirror server), goreplay.log, and the Jenkins stellar-goreplay-service-action job. I was trying to understand where the 'rate limit' parameter from the Jenkins job would flow into here, or I may have misunderstood the context.

@bartekn
Contributor Author

bartekn commented Aug 17, 2022

@sreuland I added a short comment explaining how the middleware works. @stellar/horizon-committers I think this is ready for review because it helped a lot during Horizon v2.20.0 testing. PTAL.

@bartekn bartekn marked this pull request as ready for review August 17, 2022 10:19
@bartekn bartekn requested a review from a team August 17, 2022 10:19
@sreuland sreuland mentioned this pull request Jun 26, 2023
7 tasks
Development

Successfully merging this pull request may close these issues.

go-replay middleware/plugin for a/b regression testing
3 participants