This repository has been archived by the owner on May 3, 2024. It is now read-only.

feat(createRequestHtmlFragment): implemented circuit breaker #111

Merged
merged 9 commits into from
May 1, 2020

Conversation

10xLaCroixDrinker
Member

@10xLaCroixDrinker 10xLaCroixDrinker commented Apr 21, 2020

Description

This implements a circuit breaker using opossum. The circuit opens when event loop delay exceeds 250ms (the default; this is configurable). While the circuit is open there is no SSR. The circuit goes into a half-open state after 10s. At that point it will attempt the full request again and, if that succeeds, the circuit will close.
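
The state transitions described above can be sketched in a few lines. This is a simplified stand-in written for illustration, not opossum's implementation: the class name `SimpleBreaker`, its `open()` method, and the injectable `now` clock are all hypothetical, and unlike opossum it reopens on a single failure rather than using an error-rate threshold.

```javascript
// Simplified circuit breaker for illustration only (not opossum's implementation).
class SimpleBreaker {
  constructor(action, { resetTimeout = 10000, fallback = () => undefined, now = Date.now } = {}) {
    this.action = action;
    this.resetTimeout = resetTimeout; // ms to wait before a trial request
    this.fallback = fallback;
    this.now = now; // injectable clock, so the sketch can be tested without waiting
    this.state = 'closed';
    this.openedAt = 0;
  }

  // Would be called by a health check, e.g. when event loop delay is too high.
  open() {
    this.state = 'open';
    this.openedAt = this.now();
  }

  async fire(...args) {
    // Once resetTimeout has elapsed, let one trial request through.
    if (this.state === 'open' && this.now() - this.openedAt >= this.resetTimeout) {
      this.state = 'halfOpen';
    }
    if (this.state === 'open') {
      // Skip the expensive work entirely while the circuit is open.
      return this.fallback(...args);
    }
    try {
      const result = await this.action(...args);
      this.state = 'closed'; // a successful request closes the circuit
      return result;
    } catch (error) {
      this.open(); // a failed request (re)opens the circuit
      return this.fallback(...args);
    }
  }
}
```

In this sketch the fallback plays the role of "no SSR": while the circuit is open, `fire` returns the fallback value without calling the protected function at all.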

Motivation and Context

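This provides significant performance improvements: in the local load test below, average latency fell from 1124.38 ms on master to 312.1 ms on this branch.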

How Has This Been Tested?

In addition to the unit tests, I did some load testing locally. I ran a json server using json-server http://jsonplaceholder.typicode.com/db -d 800 -p 1337, then changed SSR Frank to make its requests to http://localhost:1337/todos. Once that was set up I called autocannon with autocannon localhost:3000/healthy-frank/ssr-frank -c 200 -d 30 --headers "Accept-Language= en-USen;q=0.5" --headers "User-Agent= curl/7.5.0" . Below are the results for master and for this branch.

(Note: the circuit breaker is disabled for perf tests by setting eventLoopDelayThreshold to Infinity. To validate locally, that line should be removed.)

master

Running 30s test @ http://localhost:3000/healthy-frank/ssr-frank
200 connections

┌─────────┬────────┬─────────┬─────────┬─────────┬────────────┬─────────┬────────────┐
│ Stat    │ 2.5%   │ 50%     │ 97.5%   │ 99%     │ Avg        │ Stdev   │ Max        │
├─────────┼────────┼─────────┼─────────┼─────────┼────────────┼─────────┼────────────┤
│ Latency │ 980 ms │ 1110 ms │ 1426 ms │ 1517 ms │ 1124.38 ms │ 99.5 ms │ 1590.64 ms │
└─────────┴────────┴─────────┴─────────┴─────────┴────────────┴─────────┴────────────┘
┌───────────┬─────┬──────┬─────────┬─────────┬─────────┬─────────┬─────────┐
│ Stat      │ 1%  │ 2.5% │ 50%     │ 97.5%   │ Avg     │ Stdev   │ Min     │
├───────────┼─────┼──────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Req/Sec   │ 0   │ 0    │ 184     │ 200     │ 175.1   │ 36.24   │ 144     │
├───────────┼─────┼──────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Bytes/Sec │ 0 B │ 0 B  │ 11.1 MB │ 13.4 MB │ 9.75 MB │ 3.82 MB │ 1.73 MB │
└───────────┴─────┴──────┴─────────┴─────────┴─────────┴─────────┴─────────┘

Req/Bytes counts sampled once per second.

5k requests in 30.14s, 292 MB read

This branch

Running 30s test @ http://localhost:3000/healthy-frank/ssr-frank
200 connections

┌─────────┬────────┬────────┬────────┬────────┬──────────┬──────────┬────────────┐
│ Stat    │ 2.5%   │ 50%    │ 97.5%  │ 99%    │ Avg      │ Stdev    │ Max        │
├─────────┼────────┼────────┼────────┼────────┼──────────┼──────────┼────────────┤
│ Latency │ 239 ms │ 291 ms │ 441 ms │ 463 ms │ 312.1 ms │ 99.35 ms │ 1450.22 ms │
└─────────┴────────┴────────┴────────┴────────┴──────────┴──────────┴────────────┘
┌───────────┬─────────┬─────────┬─────────┬─────────┬─────────┬─────────┬─────────┐
│ Stat      │ 1%      │ 2.5%    │ 50%     │ 97.5%   │ Avg     │ Stdev   │ Min     │
├───────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Req/Sec   │ 229     │ 229     │ 661     │ 807     │ 637.5   │ 124.65  │ 229     │
├───────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Bytes/Sec │ 1.91 MB │ 1.91 MB │ 5.51 MB │ 6.72 MB │ 5.31 MB │ 1.03 MB │ 1.91 MB │
└───────────┴─────────┴─────────┴─────────┴─────────┴─────────┴─────────┴─────────┘

Req/Bytes counts sampled once per second.

19k requests in 30.09s, 159 MB read

Types of Changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation (adding or updating documentation)

Checklist:

  • My change requires a change to the documentation and I have updated the documentation accordingly.
  • These changes should be applied to a maintenance branch.
  • This change requires cross browser checks.
  • Performance tests should be run against the server prior to merging.
  • This change impacts caching for client browsers.
  • This change impacts HTTP headers.
  • This change adds additional environment variable requirements for One App users.
  • I have added the Apache 2.0 license header to any new files created.

What is the Impact to Developers Using One App?

No impact to developers other than SSR turning off in cases where the circuit opens.

@10xLaCroixDrinker 10xLaCroixDrinker requested review from a team as code owners April 21, 2020 19:10
@@ -229,4 +245,86 @@ describe('createRequestHtmlFragment', () => {
    expect(next).toHaveBeenCalled();
    /* eslint-enable no-console */
  });

  it('should open the circuit when event loop lag is > 30ms', async () => {
    expect.assertions(5);
Contributor

Is this required with an async test?

Member Author

yes

Contributor

checked locally, expect.assertions(5); can be removed.

Member Author

expect.assertions ensures we get past the promise to all the assertions.
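
To illustrate the point about assertion counts: a test can pass vacuously when it finishes before its assertions run. The sketch below uses a tiny hand-written counter (`createExpect` and `verifyCount` are hypothetical stand-ins for Jest's `expect` and `expect.assertions`, not Jest itself) to show how counting catches that case.

```javascript
// Hypothetical stand-in for Jest's expect / expect.assertions, for illustration only.
function createExpect() {
  let ran = 0;
  const expect = (actual) => ({
    toBe(expected) {
      ran += 1;
      if (actual !== expected) throw new Error(`expected ${expected}, got ${actual}`);
    },
  });
  // Throws unless exactly n assertions have run by the time the test body finishes.
  expect.verifyCount = (n) => {
    if (ran !== n) throw new Error(`expected ${n} assertions, but ${ran} ran`);
  };
  return expect;
}

// A test body whose assertion runs in a callback that the test never waits for.
async function forgetfulTest(expect) {
  setImmediate(() => expect(1 + 1).toBe(2));
}

// A test body that waits for the callback before asserting.
async function carefulTest(expect) {
  await new Promise((resolve) => setImmediate(resolve));
  expect(1 + 1).toBe(2);
}

// Runs a test body, then checks the assertion count the way a test runner would.
async function run(testBody, expectedCount) {
  const expect = createExpect();
  await testBody(expect);
  try {
    expect.verifyCount(expectedCount);
    return 'pass';
  } catch (error) {
    return `fail: ${error.message}`;
  }
}
```

Without the count check, `forgetfulTest` would be reported as passing even though its assertion had not run when the test ended.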

} from '../../../src/server/utils/circuitBreaker';

describe('Circuit breaker', () => {
  it('should be an opossum circuit breaker', () => {
Contributor

Does this matter if everything else works?

Member Author

not necessarily

@@ -18,6 +18,7 @@ import csp from './csp';
import createFrankLikeFetch from './createFrankLikeFetch';

export default {
  eventLoopLagThreshold: Infinity,
Contributor

Do these not work with the default?

Member Author

This prevents the circuit from opening during the integration tests

@oneamexbot
Contributor

oneamexbot commented Apr 22, 2020

📊 Bundle Size Report

file name              size on disk  gzip
app.js                 112.6KB       31.4KB
runtime.js             15KB          5.3KB
vendors.js             128.4KB       38KB
app~vendors.js         405.9KB       105.9KB
legacy/app.js          119.3KB       33KB
legacy/runtime.js      15KB          5.3KB
legacy/vendors.js      163.4KB       44.9KB
legacy/app~vendors.js  412KB         107.6KB

Generated by 🚫 dangerJS against 9047392

const realHrtime = process.hrtime;
const mockHrtime = (...args) => realHrtime(...args);
mockHrtime.bigint = jest.fn();
process.hrtime = mockHrtime;


I think this should not really be required here, i.e. the breaker should be automatically disabled in tests.

Member Author

This is just needed for the tests that are asserting on the functionality of the breaker

const breaker = new CircuitBreaker(getModuleData, options);
// Just need to connect opossum to prometheus
// eslint-disable-next-line no-unused-vars
const metrics = new PrometheusMetrics(breaker, register);


I would probably not expose this at this level. Why was it added here?

Member Author

Just keeping it close to the code, but yes, it makes more sense to go in src/server/metrics

}
}, 100);

export default breaker;


I think a more reusable and testable implementation would be to make this a factory, so we can avoid pulling in holocron in here, but rather have the getModuleData passed in to the factory instead.
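
One way to read this suggestion, as a sketch only: a factory that receives both the function to protect and the breaker class, so the module imports neither holocron nor a concrete `getModuleData`. The names `createCircuitBreaker`, `FakeBreaker`, and `getModuleDataStub` are hypothetical; in the PR the class would be opossum's `CircuitBreaker`, which the diff constructs as `new CircuitBreaker(getModuleData, options)`.

```javascript
// Factory: everything the breaker depends on is passed in; nothing is imported here.
function createCircuitBreaker(BreakerClass, asyncFunctionThatMightFail, options = {}) {
  // Illustrative default (10s before a trial request); callers can override it.
  const defaults = { resetTimeout: 10000 };
  return new BreakerClass(asyncFunctionThatMightFail, { ...defaults, ...options });
}

// Hypothetical test double with the same constructor shape as the breaker in the diff.
class FakeBreaker {
  constructor(action, options) {
    this.action = action;
    this.options = options;
  }

  fire(...args) {
    return this.action(...args);
  }
}

// In a unit test, a stub replaces getModuleData without mocking holocron.
const getModuleDataStub = async ({ modules }) => modules.length;
const breaker = createCircuitBreaker(FakeBreaker, getModuleDataStub, { resetTimeout: 5000 });
```

The trade-off is one extra wiring step at the call site in exchange for a module that can be tested with plain stubs.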

let eventLoopLagThreshold = 30;

export const setEventLoopLagThreshold = (n) => {
  eventLoopLagThreshold = Number(n) || 30;
Member Author

@mcollina what are your thoughts on this default value?


30ms is very aggressive. I would increase it quite a bit. The 100-500ms range seems more OK for an SSR app (most React renders take up to 80-100ms).
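
For reference, a sketch of how the `Number(n) || 30` fallback in the setter under discussion behaves for a few inputs. The getter `getEventLoopLagThreshold` is added here only so the value can be inspected; it is an assumption, not necessarily part of the PR.

```javascript
let eventLoopLagThreshold = 30;

const setEventLoopLagThreshold = (n) => {
  // Any value that converts to 0 or NaN falls back to the default of 30.
  eventLoopLagThreshold = Number(n) || 30;
};

// Hypothetical getter, added only to inspect the result.
const getEventLoopLagThreshold = () => eventLoopLagThreshold;

setEventLoopLagThreshold('250');
console.log(getEventLoopLagThreshold()); // 250 (numeric strings are converted)

setEventLoopLagThreshold(undefined);
console.log(getEventLoopLagThreshold()); // 30 (NaN falls back to the default)

setEventLoopLagThreshold(0);
console.log(getEventLoopLagThreshold()); // 30 (0 is falsy, so it cannot be set)

setEventLoopLagThreshold(Infinity);
console.log(getEventLoopLagThreshold()); // Infinity (no finite lag can exceed it)
```

The last case is the one the integration test config relies on: with a threshold of `Infinity` the circuit has no way to open.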

breaker.healthCheck(async () => {
  if (!getModule(rootModuleName)) return;
  const start = process.hrtime.bigint();
  await immediate();


You might want to use https://nodejs.org/dist/latest-v12.x/docs/api/perf_hooks.html#perf_hooks_perf_hooks_monitoreventloopdelay_options to measure event loop latency. It adds less overhead.
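
A minimal sketch of what this suggestion could look like, assuming Node 12 or later. `monitorEventLoopDelay` and the histogram's `enable`, `disable`, `reset`, and `mean` are real `perf_hooks` APIs; the wrapper `createDelayCheck` and its parameter names are hypothetical and not taken from the PR.

```javascript
const { monitorEventLoopDelay } = require('perf_hooks');

// Wraps a delay histogram in a check that a breaker's health check could call.
function createDelayCheck(
  thresholdMs = 250,
  histogram = monitorEventLoopDelay({ resolution: 10 })
) {
  histogram.enable();
  return {
    // True when the mean delay since the previous call exceeds the threshold.
    isOverloaded() {
      const meanMs = histogram.mean / 1e6; // the histogram reports nanoseconds
      histogram.reset();
      return meanMs > thresholdMs; // NaN (no samples yet) compares as false
    },
    // Stops sampling so the monitor no longer does any work.
    stop() {
      histogram.disable();
    },
  };
}
```

With a threshold of `Infinity`, `isOverloaded()` always returns false, which is consistent with how the integration test config keeps the circuit from opening.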

@10xLaCroixDrinker
Member Author

@mcollina I did some refactoring here, think you could review again?

mcollina
mcollina previously approved these changes May 1, 2020

@mcollina mcollina left a comment


lgtm

@@ -18,6 +18,7 @@ import csp from './csp';
import createFrankLikeFetch from './createFrankLikeFetch';

export default {
  eventLoopDelayThreshold: Infinity,

Should this be set to a smaller amount?

Member Author

This is just to prevent the circuit from opening during the integration tests

mtomcal
mtomcal previously approved these changes May 1, 2020
if (disableScripts || renderPartialOnly) {
  await dispatch(composeModules(routeModules));
} else {
  const fallback = await breaker.fire({ dispatch, modules: routeModules });
Contributor

Maybe rename breaker to show it's related to getModuleData.

Suggested change
- const fallback = await breaker.fire({ dispatch, modules: routeModules });
+ const fallback = await getModuleDataBreaker.fire({ dispatch, modules: routeModules });

@10xLaCroixDrinker 10xLaCroixDrinker dismissed stale reviews from mtomcal and mcollina via 9047392 May 1, 2020 18:39
@10xLaCroixDrinker 10xLaCroixDrinker merged commit e10f707 into master May 1, 2020
@10xLaCroixDrinker 10xLaCroixDrinker deleted the feature/circuit-breaker branch May 1, 2020 18:58
@PixnBits PixnBits mentioned this pull request May 14, 2020
12 tasks