Fix several bugs + add long-running scenario tests and build reports #12

Merged
merged 33 commits into from
May 21, 2014


robhruska
Member

New Tests

Adds Hudl.Mjolnir.SystemTests, which runs as a unit test, but is actually more of a long-running system test.

It's capable of serially running different scenarios to test and report on circuit breaker and thread pool behavior. Tests can hit a locally-running HTTP server whose response behavior can be configured.

Example tests:

  • Execute command once per second with HTTP endpoint immediately responding with 200.
  • Execute command once per second with HTTP endpoint delaying 15 seconds before sending 200 response.
  • Execute five commands per second with HTTP endpoint responding with 500, also setting custom breaker thresholds and pool sizes.
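As a rough illustration of an endpoint whose response behavior can be configured per scenario, here is a Python sketch. It is not the code in Hudl.Mjolnir.SystemTests; Behavior, Handler, start_server and fetch_status are hypothetical names, and the sketch binds to an ephemeral loopback port rather than a fixed URL.

```python
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Behavior:
    """Hypothetical per-scenario settings read by the handler on each request."""
    status = 200
    delay_seconds = 0.0


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Delay first, then answer with whatever status the scenario set.
        time.sleep(Behavior.delay_seconds)
        self.send_response(Behavior.status)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep test output quiet


def start_server():
    # Port 0 asks the OS for any free port on the loopback interface.
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_status(server):
    # One GET against the local server; returns the HTTP status code.
    connection = http.client.HTTPConnection(
        "127.0.0.1", server.server_address[1], timeout=5)
    try:
        connection.request("GET", "/")
        return connection.getresponse().status
    finally:
        connection.close()
```

A scenario would set Behavior.status and Behavior.delay_seconds before it starts issuing commands, then call server.shutdown() when it finishes.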

For each test, Riemann metrics are intercepted and built into a Highcharts report:

(Download actual HTML here)

[Image: Highcharts report built from the intercepted Riemann metrics]

Minor notelets:

  • Needs to run as admin for now since it starts an HTTP server (or it needs a URL ACL reservation added with netsh http add urlacl, either way). Ideally we'd probably run the HTTP server on a different machine (or at least a different VM) from the tests to keep the two isolated.

Fix errors not being counted by circuit breakers

After running a few scenarios and seeing a couple charts that didn't make sense to me, I discovered a pretty serious bug in Mjolnir where failures wouldn't get registered with the circuit breaker.

Since we weren't awaiting the result of ExecuteAsync() in our try/catch, we'd catch exceptions that were thrown directly from ExecuteAsync(), but wouldn't see exceptions thrown from the Task that it returned. Our unit tests were only testing the former, so we didn't catch this.

I fixed the issue and added a bunch of unit tests to make sure that we have the same behavior for exceptions coming out of ExecuteAsync(), both from the call itself and from its returned Task.
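The shape of the bug can be sketched in Python's asyncio, which has the same pitfall as a Task-returning method in C#. This is an illustrative sketch only, not Mjolnir's code: Breaker, execute_async, invoke_buggy, invoke_fixed and count_failures are hypothetical names, and the real Command does more than this.

```python
import asyncio


class Breaker:
    """Hypothetical stand-in for a circuit breaker's failure counter."""

    def __init__(self):
        self.failures = 0

    def mark_failure(self):
        self.failures += 1


def execute_async(fail_in_task):
    """Plain function that returns an awaitable, like a Task-returning method.

    It either raises directly from the call, or returns a coroutine that
    raises only once it is awaited.
    """
    if not fail_in_task:
        raise RuntimeError("thrown directly from the call")

    async def work():
        raise RuntimeError("thrown from the returned task")

    return work()


async def invoke_buggy(breaker, fail_in_task):
    try:
        pending = execute_async(fail_in_task)  # not awaited inside the try
    except RuntimeError:
        breaker.mark_failure()
        raise
    return await pending  # a failure here never reaches the breaker


async def invoke_fixed(breaker, fail_in_task):
    try:
        return await execute_async(fail_in_task)  # awaited inside the try
    except RuntimeError:
        breaker.mark_failure()
        raise


async def count_failures(invoke):
    # Run one direct failure and one task failure through the same breaker.
    breaker = Breaker()
    for fail_in_task in (False, True):
        try:
            await invoke(breaker, fail_in_task)
        except RuntimeError:
            pass
    return breaker.failures
```

With the buggy wrapper only the direct failure is counted; with the fixed wrapper both are, which is the behavior the new unit tests are meant to pin down.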

Riemann via CommandContext

PLAT-112 - In order to better control Riemann during testing, I changed all of the components to grab their default Riemann instance from CommandContext instead of using RiemannStats.Instance. The CommandContext property can be changed, so changing it early in the app (before breakers and pools initialize) will cause them all to use it as well.
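A minimal Python sketch of why the swap has to happen early, assuming each component captures the context's stats client when it is constructed (that capture-at-construction detail is an assumption; NullStats, RecordingStats and the stats attribute are hypothetical names, not Mjolnir's API):

```python
class NullStats:
    """Default client that discards everything."""

    def event(self, service, state):
        pass


class RecordingStats:
    """Test client that keeps events in memory for later reporting."""

    def __init__(self):
        self.events = []

    def event(self, service, state):
        self.events.append((service, state))


class CommandContext:
    # Shared, replaceable default used by components created afterwards.
    stats = NullStats()


class CircuitBreaker:
    def __init__(self, key):
        self.key = key
        # Assumption: the client is read once, when the breaker is created.
        self._stats = CommandContext.stats

    def trip(self):
        self._stats.event("breaker " + self.key, "tripped")
```

In this sketch a breaker created before the swap keeps reporting to the old client, so a test harness would replace CommandContext.stats before anything else initializes.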

Fix Cancellation with [Command]

PLAT-21 - CancellationTokens provided by Mjolnir (via the Timeout property for [Command]) weren't being passed through to CancellationToken parameters in Bifrost service calls.

The Mjolnir token will now replace empty or null parameter values in Bifrost method calls if a CancellationToken parameter exists. If a token was explicitly provided to the call, it won't be replaced.
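The replacement rule can be sketched as follows. This is a Python illustration of the rule as stated above, not the [Command] interceptor itself; CancellationToken, EMPTY (standing in for an empty token), apply_timeout_token and get_user are all hypothetical names.

```python
import inspect


class CancellationToken:
    """Hypothetical token type; only identity matters for this sketch."""

    def __init__(self, name):
        self.name = name


# Stand-in for an "empty" token, i.e. one the caller did not really supply.
EMPTY = CancellationToken("empty")


def apply_timeout_token(method, args, timeout_token):
    """Return a copy of the positional args with an unset token filled in.

    Only parameters annotated as CancellationToken are considered, and a
    token the caller passed explicitly is left untouched.
    """
    filled = list(args)
    parameters = inspect.signature(method).parameters.values()
    for index, parameter in enumerate(parameters):
        if parameter.annotation is CancellationToken and index < len(filled):
            if filled[index] is None or filled[index] is EMPTY:
                filled[index] = timeout_token
    return filled


def get_user(user_id: int, token: CancellationToken):
    """Example service method with a token parameter."""
    return user_id, token
```

For example, apply_timeout_token(get_user, (7, None), timeout) yields the arguments with the timeout token in the second position, while an explicitly supplied token passes through unchanged.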

Avoid creating unnecessary ConfigurableValues

PLAT-113 - We'd create and then discard ConfigurableValues on every GetCircuitBreaker() call, which may have exacerbated an existing memory leak in our config library. Moved them into the delegate that gets called on breaker creation.
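The difference is easiest to see by counting allocations. The Python sketch below assumes a get-or-add style cache keyed by breaker name; ConfigurableValue here is a hypothetical counter-only stand-in, and get_or_add, get_breaker_eager and get_breaker_lazy are illustrative names rather than Mjolnir's.

```python
class ConfigurableValue:
    """Hypothetical stand-in that just counts how many were constructed."""

    created = 0

    def __init__(self, key, default):
        ConfigurableValue.created += 1
        self.key = key
        self.default = default


class Breaker:
    def __init__(self, threshold):
        self.threshold = threshold


_breakers = {}


def get_or_add(key, factory):
    # The factory runs only the first time a key is requested.
    if key not in _breakers:
        _breakers[key] = factory()
    return _breakers[key]


def get_breaker_eager(key):
    # Before: the value is built on every call, then discarded on cache hits.
    threshold = ConfigurableValue(key + ".threshold", 50)
    return get_or_add(key, lambda: Breaker(threshold))


def get_breaker_lazy(key):
    # After: the value is built inside the factory, so only on creation.
    return get_or_add(
        key, lambda: Breaker(ConfigurableValue(key + ".threshold", 50)))
```

Calling the eager version three times for one key constructs three values, while the lazy version constructs one.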

Updated project README

Updated the README for this project to have examples and useful content.

Change some test class visibilities and sleep times.
To facilitate controlling Riemann for all of Mjolnir during system
testing.
Discovered a bug in Command where exceptions thrown directly from
ExecuteAsync() were handled correctly, but exceptions thrown from its
returned Task weren't marking failures in the circuit breaker (because we
weren't awaiting the task in our try/catch).

Fixed this bug and added a bunch of unit tests to make sure we have
identical behavior regardless of where the exception is thrown from in
ExecuteAsync().

Also updated the system test to dump its metrics to file per-scenario.
…s' into SystemTests

Conflicts:
	Hudl.Mjolnir/Properties/AssemblyInfo.cs
@robhruska robhruska changed the title Fix major circuit breaker bug, add long-running scenario tests and build reports Fix several bugs and add long-running scenario tests and build reports May 20, 2014
@robhruska robhruska changed the title Fix several bugs and add long-running scenario tests and build reports Fix several bugs + add long-running scenario tests and build reports May 20, 2014
robhruska added a commit that referenced this pull request May 21, 2014
Fix several bugs + add long-running scenario tests and build reports
@robhruska robhruska merged commit aae659c into master May 21, 2014
@robhruska robhruska deleted the SystemTests branch May 21, 2014 03:38