-
-
Notifications
You must be signed in to change notification settings - Fork 100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add predefined health checks #2
Comments
I am currently investigating what is the best architecture to go with. health := healthcheck.NewHandler()
// Our app is not happy if we've got more than 100 goroutines running.
health.AddLivenessCheck("goroutine-threshold", healthcheck.GoroutineCountCheck(100))
// Our app is not ready if we can't resolve our upstream dependency in DNS.
health.AddReadinessCheck(
"upstream-dep-dns",
healthcheck.DNSResolveCheck("upstream.example.com", 50*time.Millisecond))
// Our app is not ready if we can't connect to our database (`var db *sql.DB`) in <1s.
health.AddReadinessCheck("database", healthcheck.DatabasePingCheck(db, 1*time.Second) I think this is a good approach and we should try to integrate it with the existing infrastructure of godday/terminus. |
A possible proposal: Provided by
@Injectable()
export class DatabaseHealth {
constructor(private connection: Connection)) {
}
async checkConnection(responseTime: Number): Promise<boolean> {
// connect to database (`var db *sql.DB`) in < {reponseTime}.
return true;
}
} Provided by user: import { Injectable } from '@nestjs/common';
import { TerminusOptions, TerminusOptionsFactory, DatabaseHealth } from '@nestjs/terminus';
@Injectable()
export class HealthService implements TerminusOptionsFactory {
constructor(private databaseHealth: DatabaseHealth){ }
async customCheck() {
return true;
}
async health(): Promise<any> {
return {
'database': await this.databaseHealth.checkConnection(1),
'custom': await this.customCheck()
};
}
async createTerminusOptions(): Promise<TerminusOptions> {
return {
healthChecks: { '/health': this.health }
};
}
} Would result into: {
"status": "ok",
"info": {
"database": true,
"custom": true
}
} What do you guys think? @weeco @kamilmysliwiec This approach would not break the API as it is used at the moment. |
Do you think the health check should be made in it's own module/class? Personally I expected the health check to be close to the code/instance which you want to run a health check on. However the most common use case for health checks would probably be providers, as they can be used to instantiate and await async operations (e. g. connection attempts). I don't see how we could implement a proper health check inside of the exported provider: export const fortniteClientProviders: Provider[] = [
{
provide: InjectToken.FortniteClient,
useFactory: async (config: ConfigService): Promise<FortniteClient> => {
const api: FortniteClient = new FortniteClient(config.fortniteClient, {
timeoutMs: 1.5 * 1000
});
await api.login();
return api;
},
inject: [InjectToken.ConfigService]
}
]; For this reason I believe HealthChecks should be added to modules: Module: @Module({
providers: [...fortniteClientProviders],
exports: [...fortniteClientProviders],
healthChecks: [FortniteClientHealthCheck]
})
export class FortniteClientModule {} Health check: @Injectable
export class FortniteClientHealthCheck implements HealthIndicator {
// If isHealthy() takes longer than this, consider the check as failed and consider this health indicator as down
protected healthCheckTimeoutMs: number = 1 * 1000;
// If false only status: up or down will be shown in the response
protected isPublic: boolean = true;
constructor(@Inject(InjectToken.FortniteClient) private readonly api: FortniteClient) {}
public async isHealthy(): HealthResponse {
const builder = new HealthBuilder("fortniteClient"); // Or use class name with applied lower camel case format instead? Then pass the builder as argument, so that you don't need to create an instance
try {
const testResponse = await api.ping();
if (testResponse != null) return builder.up();
return builder.down({ reason: "FortniteClient ping response is null or undefined" });
} catch (err) {
// Return down status, optionally along with more information
return builder.down({ host: api.host, ...err });
}
}
} which would result in either: Success: {
"status": "UP",
"diskSpace": {
"status": "UP",
"free": 56443746,
"threshold": 1345660
},
"fortniteClient": {
"status": "UP"
}
} or Down: {
"status":"DOWN",
"reason":"fortniteClient is DOWN",
"diskSpace":{
"status":"UP",
"free":56443746,
"threshold":1345660
},
"fortniteClient":{
"status":"DOWN",
"error": {....}
}
} I prefer this approach over your proposal because it is more "decoupled" and the health check is still implemented near the component you are checking against. What do you think? If you decide for this route you'd still need to think about the predefined health checks. The only problem I see here is the dependency injection of a "TypeORM / MySQL connection" for instance. Other than that you can easily write predefined Health Checks I believe |
@weeco I am not 100% sure if we're on the same page, but that what you proposed is more or less the same I propose. In my opinion having
Agree. My solution actually supports this. I do not recommend writing your health checks in Here is how I would implement your import { Injectable } from '@nestjs/common';
import { TerminusOptions, TerminusOptionsFactory } from '@nestjs/terminus';
import { FortniteClientHealthCheck } from './fortnite/fortnite-client.health.ts'
@Injectable()
export class HealthService implements TerminusOptionsFactory {
constructor(private fortniteClientHealthCheck: FortniteClientHealthCheck){ }
async health(): Promise<any> {
return {
'fortnite': await this.fortniteClientHealthCheck.isHealthy()
};
}
async createTerminusOptions(): Promise<TerminusOptions> {
return {
healthChecks: { '/health': this.health }
};
}
} So you have your actual health check still near your feature, but it is connected through a "root"-health provider called `health.service. One thing you'd need to refactor is the |
Can you elloborate if this "root health provider" is actually necessary? This sounds like boilerplate which could potentially be avoided. If I write one or more health checks I usually want it to be picked up as well. I can imagine it's not possible because you wouldn't be able to tell if it's a normal service or a health check if you put them all together in the providers property of a module. That's why I proposed to put it into a healthcheck property which indicates that this will be picked up in our "HealthCheckRepository". Is that the main difference between our two proposals? |
100% agree. Let's don't mess with However, I totally understand your point @weeco. Keeping healthchecks close to their domain sounds clear. I guess that another possible middle ground would be to delay instantiation of |
@kamilmysliwiec So then I can finally finish the documentation and release the first version :) —- One thing which stands out for me; this would mean that the Terminus feature of reacting to process signals like SIGTERM is not covered with Kamils proposal, because that was solved by the “root health provider”. In my opinion we should kick the signal feature out for the first release and then start the discussion afterwards, if that is okay? |
I believe the sig feature is important, but until recently no one has ever requested such a health check module, so I believe it would be a very solid module without sig hook as well :-). Looking forward to it. |
Things like The rest (e.g. |
Sorry for closing 😅 |
@weeco I tried to adapt to @kamilmysliwiec solution. You can see a working example with my dogs-app (not a cat person, sorry :D) What I've found is that the general behavior of In case a subsystem is down as shown here with {"status":"ok","info":{"dogs":{"status":"down","goodboys":5}}} The property So then you have to throw a error to change the state? This is hardcoded too in @godaddy/terminus, so in case a error is thrown it can only show this JSON response (no additional data / message): {
"status": "error"
} Maybe my use cases are not needed, so we could proceed with this implementation. Otherwise we may consider moving away from |
The usage looks very nice, good job. In my oppinion having detailed information on subsystems is a must have. Otherwise it may take some time to figure out why the healthcheck is failing. I believe it shouldn't take long to implement the godaddy/terminus featureset in Typescript and is worth the effort |
It wouldn't be so hard to implement it, but keep in mind we have to maintain this, think about testing and security etc. So that is why I created an issue on @godaddy/terminus and lets hear out if the guys from over there support us. If not it seems like we have no other option than just writing it for our own as you proposed... |
@kamilmysliwiec I had some discussions with @weeco: I experienced some issues with your proposal with registering the health check in the constructor as you proposed. this.terminusRegistry.register(this); The problem is with predefined health checks. Let's say nestjs/terminus module offers a predefined database health check. How does we register this? Certainly we can not do that in the constructor, because what if the user does not want to use our predefined health check in his ecosystem? Therefor a user must select the predefined health checks himself. In the end this would lead back to my proposal, having a "main" health check class, which binds health check to a health address as I proposed in my previous comment. @weeco which represents "the customer" in this case would be fine with this solution. Do you have another idea, or should I go with this? |
Couldn't we support two cases at once? I mean, combine options (manually defined checks) + registered ones (flexibility). |
@kamilmysliwiec I think it should be consistent in the end. Predefined health checks and user defined health checks should use the same interface and should be able to be registered the same way. I really do not see many other options than a main file.. |
I'm submitting a...
Expected behavior
As discussed on nestjs/nest#966 there should be prefedined health checks, so the user can easily access data like (as @weeco mentioned)
The text was updated successfully, but these errors were encountered: