Healthcheck service #24817
Nice! 🎉
Couple of initial high level bits
const router = Router();
let handler = () => Promise.resolve({ status: 'ok' });

router.get('/healthcheck', async (_request: Request, response: Response) => {
Thinking we should have this separated from the main routes, probably something like /.backstage/health, but potentially grouped a bit
export interface HttpRouterHealthCheckConfig {
  handler: () => Promise<any>;
}

/**
 * @public
 */
export interface HttpRouterService {
What do you think about keeping health checks at the root level instead? Something like /.backstage/health/v1/readiness for checking whether the entire instance is up, and /.backstage/health/v1/<plugin-id>/readiness for individual plugins?
yeah nice!
but the other way around, /.backstage/health/v1/readiness/<plugin-id>
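For illustration, the path layout agreed on above could be sketched as a small helper. The helper name, the constant, and the ProbeKind type are hypothetical and only show where the plugin ID ends up; they are not part of this PR.

```typescript
// Hypothetical helper: builds the health check paths proposed in this thread.
type ProbeKind = 'liveness' | 'readiness';

const HEALTH_ROOT = '/.backstage/health/v1';

function healthPath(kind: ProbeKind, pluginId?: string): string {
  // Instance-wide probe when no plugin ID is given.
  if (pluginId === undefined) {
    return `${HEALTH_ROOT}/${kind}`;
  }
  // Per-plugin probe: the plugin ID comes last, after the probe kind.
  return `${HEALTH_ROOT}/${kind}/${encodeURIComponent(pluginId)}`;
}
```

With this layout, every readiness path shares the prefix /.backstage/health/v1/readiness, whether it targets the instance or one plugin.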
// @public (undocumented)
export interface HttpRouterService {
  // (undocumented)
  addAuthPolicy(policy: HttpRouterServiceAuthPolicy): void;
  // (undocumented)
  healthCheckConfig(healthCheckOptions: HttpRouterHealthCheckConfig): void;
Not sure this should live on the http router tbh. Or framed a bit differently: if we can make this a separate service we probably should. I'm thinking this should probably be additive as well so that it's possible to add health checks from modules etc.
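To make "additive" a bit more concrete, here is a rough sketch of a separate registry that several plugins or modules could add checks to. Everything in it (HealthCheckRegistry, CheckResult, add, runAll) is a hypothetical name for illustration and not an existing Backstage API.

```typescript
// Hypothetical sketch: none of these names come from Backstage.
type CheckResult = { status: 'ok' } | { status: 'error'; message: string };
type HealthCheck = () => Promise<CheckResult>;

class HealthCheckRegistry {
  private readonly checks: Array<{ id: string; check: HealthCheck }> = [];

  // Additive: each plugin or module registers its own check under a unique ID.
  add(id: string, check: HealthCheck): void {
    if (this.checks.some(entry => entry.id === id)) {
      throw new Error(`Health check '${id}' is already registered`);
    }
    this.checks.push({ id, check });
  }

  // Runs every check. A check that throws is reported as an error
  // for that ID instead of failing the whole run.
  async runAll(): Promise<Record<string, CheckResult>> {
    const results: Record<string, CheckResult> = {};
    await Promise.all(
      this.checks.map(async ({ id, check }) => {
        try {
          results[id] = await check();
        } catch (error) {
          results[id] = { status: 'error', message: String(error) };
        }
      }),
    );
    return results;
  }
}
```

Keeping the registry apart from the HTTP router means the router only has to expose the results, while checks can be contributed from anywhere.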
// @public (undocumented)
export interface HttpRouterHealthCheckConfig {
  // (undocumented)
  handler: () => Promise<any>;
I think it's important to consider a liveness + readiness split. It can help avoid situations where instances are shut down because they timed out at startup, while still being able to be pretty aggressive about timeouts when it comes to whether the service is functional at all.
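A minimal sketch of that split, assuming the backend moves through three phases. The ProbeState class and its phase names are made up for illustration; the point is only that liveness stays healthy during a slow startup while readiness does not.

```typescript
// Hypothetical sketch of the liveness/readiness split; not a Backstage API.
type Phase = 'starting' | 'running' | 'stopping';
type ProbeResult = { status: number; payload: { status: 'ok' | 'error' } };

class ProbeState {
  private phase: Phase = 'starting';

  markStarted(): void {
    this.phase = 'running';
  }

  markStopping(): void {
    this.phase = 'stopping';
  }

  // Liveness: the process can answer requests at all. It reports healthy
  // during startup too, so a slow start does not get the instance killed.
  liveness(): ProbeResult {
    return { status: 200, payload: { status: 'ok' } };
  }

  // Readiness: the instance should only receive traffic while fully running.
  readiness(): ProbeResult {
    return this.phase === 'running'
      ? { status: 200, payload: { status: 'ok' } }
      : { status: 503, payload: { status: 'error' } };
  }
}
```

An orchestrator can then use a short timeout on the liveness probe and a more patient policy on the readiness probe.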
do you have any suggestion for a possible root level readiness implementation?
Thank you 🎉, left minor comments. Could we also deprecate the legacy handler from the backend-common package?
Looks great, left minor comments 🙌🏻
@@ -58,6 +59,7 @@ export const defaultServiceFactories = [
  userInfoServiceFactory(),
  urlReaderServiceFactory(),
  eventsServiceFactory(),
  healthServiceFactory(),
I believe we also have to provide a mock implementation here
I guess this is not used anymore, right?
I guess this file can be deleted, right?
.changeset/bright-panthers-leave.md
'@backstage/backend-app-api': patch
---

Added a new health service which adds new endpoints for health checks.
It would be nice to add this service to the core services docs:
https://backstage.io/docs/backend-system/core-services/index
Nice! 👍
Couple of nits
 * The service reference for the plugin scoped {@link HealthService}.
 */
export const health = createServiceRef<
  import('./HealthService').HealthService
I think it's prolly best to keep this as void, since right now you can pass anything. We can still add methods in the future even if it starts out as void.
router.get('/v1/readiness', async (_request: Request, response: Response) => {
  if (!isRunning) {
    throw new Error('Backend has not started yet');
  }
  response.json({ status: 'ok' });
});

router.get('/v1/liveness', async (_request: Request, response: Response) => {
  response.json({ status: 'ok' });
});
Could you add a quick description to each of these with what their purpose is?
router.get('/v1/readiness', async (_request: Request, response: Response) => {
  if (!isRunning) {
    throw new Error('Backend has not started yet');
Thinking this should be a 503 ServiceUnavailableError. Perhaps we shouldn't throw an error in this case either, so that we can have a more consistent response? Instead, set an explicit status code and return whatever the opposite of status: 'ok' is.
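Roughly what that could look like, as a sketch rather than the final implementation. ResponseLike is a minimal stand-in for the parts of the Express response used here, createReadinessHandler is a hypothetical name, and status: 'error' is only an assumed value for "the opposite of ok".

```typescript
// Minimal stand-in for the parts of an Express response used below (assumption).
interface ResponseLike {
  status(code: number): ResponseLike;
  json(body: unknown): void;
}

// Hypothetical factory: returns a readiness handler that never throws.
function createReadinessHandler(isRunning: () => boolean) {
  return (_request: unknown, response: ResponseLike): void => {
    if (!isRunning()) {
      // Explicit 503 with the same body shape as the success case.
      // 'error' is an assumed value, not one chosen in this PR.
      response
        .status(503)
        .json({ status: 'error', message: 'Backend has not started yet' });
      return;
    }
    response.status(200).json({ status: 'ok' });
  };
}
```

Both branches return the same JSON shape, so a client can always read the status field regardless of the HTTP status code.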
 * The service reference for the plugin scoped {@link HealthService}.
 */
export const health = createServiceRef<
  import('./HealthService').HealthService
>({ id: 'core.health', scope: 'root' });
Suggested change
-  * The service reference for the plugin scoped {@link HealthService}.
-  */
- export const health = createServiceRef<
-   import('./HealthService').HealthService
- >({ id: 'core.health', scope: 'root' });
+  * The service reference for the root scoped {@link RootHealthService}.
+  */
+ export const rootHealth = createServiceRef<
+   import('./RootHealthService').RootHealthService
+ >({ id: 'core.rootHealth', scope: 'root' });
Every single other root scoped thing is named like this - we should probably follow suit here. Make sure to change to that naming pattern everywhere else too.
I'd also be interested in what the future might be in terms of per-plugin liveness/health. That's sort of where the most interesting developments might be in the future. This root scoped one could at most do simple lifecycle matching like it does today, or possibly offer some form of rollup of the states of all installed plugins (not even sure that's a good idea though). I mean, how often do we really desire to look at the liveness of effectively a machine that hosts many plugins? What does that even mean?
I believe we should deprecate these functions and types as well in backend-common:
backstage/packages/backend-common/api-report.md
Lines 188 to 192 in 6fa100a
export function createStatusCheckRouter(options: {
  logger: LoggerService;
  path?: string;
  statusCheck?: StatusCheck;
}): Promise<express.Router>;
export type StatusCheck = () => Promise<any>;
backstage/packages/backend-common/api-report.md
Lines 644 to 646 in 6fa100a
export function statusCheckHandler(
  options?: StatusCheckHandlerOptions,
): Promise<RequestHandler>;
backstage/packages/backend-common/api-report.md
Lines 649 to 651 in 6fa100a
export interface StatusCheckHandlerOptions {
  statusCheck?: StatusCheck;
}
good catch, I'm gonna do it on a following PR
Signed-off-by: Vincenzo Scamporlino <vincenzos@spotify.com>
const backend = createBackend();

class MyRootHealthService implements RootHealthService {
  async getLiveness(): Promise<{ status: number; payload?: any }> {
the payloads could have been JsonValue
});

it(`should return a 500 response if the server has stopped`, async () => {
  const lifecycle = mockServices.rootLifecycle.mock();
you can also do
const lifecycle = mockServices.rootLifecycle.mock({
addStartupHook: jest.fn(() => {
...
},
const { indexPath, configure = defaultConfigure } = options ?? {};
const logger = rootLogger.child({ service: 'rootHttpRouter' });
const app = express();

const router = DefaultRootHttpRouter.create({ indexPath });
const middleware = MiddlewareFactory.create({ config, logger });
const routes = router.handler();

const healthRouter = createHealthRouter({ health });
I don't think that this file should have any changes at all.
The rootHealthServiceFactory should have a dependency on coreServices.rootHttpRouter and just call .use on it with the right path.
Your DefaultRootHealthService could have a create method that accepts the root http router service as an argument and does that work. You could also make your rootHealthServiceFactory be of the options callback form, where you can pass in custom liveness and readiness callbacks as an adopter, and then the DefaultRootHealthService uses those instead of the default ones if they were given.
I believe this approach simplifies matters in the event that the adopter wishes to provide their own implementation of the health service. With your approach, they would essentially have to replicate what the default implementation of the healthServiceFactory is doing, remembering the path where the middleware should be added. This could potentially lead to problems if they attach a middleware to a non-standard path.
This alternative approach simplifies things: they only need to provide their own implementation of the getLiveness and getReadiness methods. No raw operations on the root router are necessary.
What I meant was,
backend.add(
  rootHealthServiceFactory({
    getLiveness: async () => ...,
  })
)
as the first level of adaptation, and if you need more,
backend.add(
  createServiceFactory({
    service: coreServices.rootHealth,
    deps: {
      rootHttpRouter: coreServices.rootHttpRouter,
    },
    async factory({ rootHttpRouter }) {
      // performs all registration etc inside create
      return DefaultRootHealthService.create({
        rootHttpRouter,
        getLiveness: async () => ...,
      });
    }
  })
)
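To spell out the create idea from the snippets above, here is a self-contained sketch. It does not use the real Backstage or Express types: RootHttpRouterLike, HealthResult, and SketchRootHealthService are hypothetical stand-ins, and the handler signature is simplified to a function of the sub-path.

```typescript
// Hypothetical, self-contained sketch; the real Backstage types differ.
type HealthResult = { status: number; payload?: Record<string, unknown> };
type HealthCallback = () => Promise<HealthResult>;
type SubPathHandler = (subPath: string) => Promise<HealthResult>;

// Stand-in for the root HTTP router service: only `use` is needed here.
interface RootHttpRouterLike {
  use(path: string, handler: SubPathHandler): void;
}

interface CreateOptions {
  rootHttpRouter: RootHttpRouterLike;
  getLiveness?: HealthCallback;
  getReadiness?: HealthCallback;
}

const defaultOk: HealthCallback = async () => ({
  status: 200,
  payload: { status: 'ok' },
});

class SketchRootHealthService {
  // The mount path lives here, so adopters cannot pick a non-standard one.
  static readonly path = '/.backstage/health';

  readonly getLiveness: HealthCallback;
  readonly getReadiness: HealthCallback;

  private constructor(getLiveness: HealthCallback, getReadiness: HealthCallback) {
    this.getLiveness = getLiveness;
    this.getReadiness = getReadiness;
  }

  // Performs all registration; custom callbacks replace the defaults if given.
  static create(options: CreateOptions): SketchRootHealthService {
    const service = new SketchRootHealthService(
      options.getLiveness ?? defaultOk,
      options.getReadiness ?? defaultOk,
    );
    options.rootHttpRouter.use(SketchRootHealthService.path, subPath =>
      service.handle(subPath),
    );
    return service;
  }

  private async handle(subPath: string): Promise<HealthResult> {
    if (subPath === '/v1/liveness') return this.getLiveness();
    if (subPath === '/v1/readiness') return this.getReadiness();
    return { status: 404 };
  }
}
```

In this shape an adopter only supplies callbacks, and the path and routing stay inside create.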
of course, this is all moot once we have service modules :)
But yeah I see that we did the same type of thing for the auth and lifecycle middlewares in httpRouterServiceFactory. So maybe this is where we were going with all this anyway.
I think this approach simplifies the customization process for adopters and provides a more opinionated method for exposing the metrics
/**
 * Get the liveness status of the backend.
 */
getLiveness(): Promise<{ status: number; payload?: any }>;
yeah again, making these payloads JsonObject conveys the limitation that they are being serialized as a response body.
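A small sketch of what the narrower type buys. The JSON types are declared locally so the example stands alone; the assumption is that the project would import equivalent types from a shared package. HealthResult and toResponseBody are hypothetical names.

```typescript
// Local JSON types for a self-contained sketch (assumption: the project
// would import equivalent types rather than declare them like this).
type JsonPrimitive = string | number | boolean | null;
type JsonValue = JsonPrimitive | JsonValue[] | JsonObject;
interface JsonObject {
  [key: string]: JsonValue;
}

// Hypothetical result shape with the payload narrowed from `any`.
interface HealthResult {
  status: number;
  payload?: JsonObject;
}

// Serializes the payload as the response body. Because the payload is a
// JsonObject, nothing in it is dropped or altered by JSON.stringify.
function toResponseBody(result: HealthResult): string {
  return JSON.stringify(result.payload ?? {});
}

const ready: HealthResult = {
  status: 200,
  payload: { status: 'ok', checks: { database: 'ok' } },
};

// With `payload?: any`, a value such as a Date or a function would be
// accepted and then serialized lossily; with JsonObject the compiler
// rejects it before it reaches the response.
```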
Co-authored-by: Fredrik Adelöw <freben@gmail.com> Signed-off-by: Vincenzo Scamporlino <vincenzos@spotify.com>
Thank you for contributing to Backstage! The changes in this pull request will be part of the |
Hey, I just made a Pull Request!
This PR adds a new healthcheck service which adds a bunch of endpoints to the root HTTP Router
Closes #24548