Add predefined health checks #2

BrunnerLivio · 2018-09-06T14:26:59Z

I'm submitting a...


[ ] Regression 
[ ] Bug report
[x] Feature request
[ ] Documentation issue or request
[ ] Support request => Please do not submit support request here, instead post your question on Stack Overflow.

Expected behavior

As discussed on nestjs/nest#966 there should be prefedined health checks, so the user can easily access data like (as @weeco mentioned)

When all initialization stuff is done (asynchronous connections, binding routes, asynchronous modules, etc..)
When the status of the connections to the infrastructure services (such as a database) is good

The text was updated successfully, but these errors were encountered:

BrunnerLivio · 2018-09-14T15:44:14Z

I am currently investigating what is the best architecture to go with.
As an example I had a look into heptiolabs/healthcheck. This library written in Go-lang provides predefined healthchecks as such:

health := healthcheck.NewHandler()

// Our app is not happy if we've got more than 100 goroutines running.
health.AddLivenessCheck("goroutine-threshold", healthcheck.GoroutineCountCheck(100))

// Our app is not ready if we can't resolve our upstream dependency in DNS.
health.AddReadinessCheck(
    "upstream-dep-dns",
    healthcheck.DNSResolveCheck("upstream.example.com", 50*time.Millisecond))

// Our app is not ready if we can't connect to our database (`var db *sql.DB`) in <1s.
health.AddReadinessCheck("database", healthcheck.DatabasePingCheck(db, 1*time.Second)

I think this is a good approach and we should try to integrate it with the existing infrastructure of godday/terminus.

BrunnerLivio · 2018-09-14T16:10:24Z

A possible proposal:

Provided by @nestjs/terminus. Can be see as a "repository" full of health checks of the given topic (DatabaseHealth, DNSHealth, MachineHealth, etc.):

database.health.ts

@Injectable()
export class DatabaseHealth {
  constructor(private connection: Connection)) {
  }
  
  async checkConnection(responseTime: Number): Promise<boolean> {
    // connect to database (`var db *sql.DB`) in < {reponseTime}.
    return true;
  }
}

Provided by user:
health.service.ts

import { Injectable } from '@nestjs/common';
import { TerminusOptions, TerminusOptionsFactory, DatabaseHealth } from '@nestjs/terminus';

@Injectable()
export class HealthService implements TerminusOptionsFactory {
   constructor(private databaseHealth: DatabaseHealth){ }

  async customCheck() {
    return true;
  }

  async health(): Promise<any> {
    return {
      'database': await this.databaseHealth.checkConnection(1),
      'custom': await this.customCheck()
    };
  }

   async createTerminusOptions(): Promise<TerminusOptions> {
    return {
      healthChecks: { '/health': this.health }
    };
  }
}

Would result into:

{
  "status": "ok",
  "info": {
    "database": true,
    "custom": true
  }
}

What do you guys think? @weeco @kamilmysliwiec

This approach would not break the API as it is used at the moment.

weeco · 2018-09-16T15:59:03Z

Do you think the health check should be made in it's own module/class?

Personally I expected the health check to be close to the code/instance which you want to run a health check on. However the most common use case for health checks would probably be providers, as they can be used to instantiate and await async operations (e. g. connection attempts). I don't see how we could implement a proper health check inside of the exported provider:

export const fortniteClientProviders: Provider[] = [
  {
    provide: InjectToken.FortniteClient,
    useFactory: async (config: ConfigService): Promise<FortniteClient> => {
      const api: FortniteClient = new FortniteClient(config.fortniteClient, {
        timeoutMs: 1.5 * 1000
      });
      await api.login();

      return api;
    },
    inject: [InjectToken.ConfigService]
  }
];

For this reason I believe HealthChecks should be added to modules:

Module:

@Module({
  providers: [...fortniteClientProviders],
  exports: [...fortniteClientProviders],
  healthChecks: [FortniteClientHealthCheck]
})
export class FortniteClientModule {}

Health check:

@Injectable
export class FortniteClientHealthCheck implements HealthIndicator {
    // If isHealthy() takes longer than this, consider the check as failed and consider this health indicator as down
    protected healthCheckTimeoutMs: number = 1 * 1000;
    // If false only status: up or down will be shown in the response
    protected isPublic: boolean = true;

    constructor(@Inject(InjectToken.FortniteClient) private readonly api: FortniteClient) {}

    public async isHealthy(): HealthResponse {
        const builder = new HealthBuilder("fortniteClient"); // Or use class name with applied lower camel case format instead? Then pass the builder as argument, so that you don't need to create an instance
        try {
            const testResponse = await api.ping();
            if (testResponse != null) return builder.up();
            return builder.down({ reason: "FortniteClient ping response is null or undefined" });
        } catch (err) {
            // Return down status, optionally along with more information
            return builder.down({ host: api.host, ...err });
        }
    }
}

which would result in either:

Success:
HTTP response status code: 200
HTTP response body:

{
    "status": "UP", 
    "diskSpace": {
        "status": "UP",
        "free": 56443746,
        "threshold": 1345660
    },
   "fortniteClient": {
      "status": "UP"
    }
}

or

Down:
HTTP response status code: 503
HTTP response body:

{  
  "status":"DOWN",
  "reason":"fortniteClient is DOWN",
  "diskSpace":{  
    "status":"UP",
    "free":56443746,
    "threshold":1345660
  },
  "fortniteClient":{  
    "status":"DOWN",
    "error": {....}
  }
}

I prefer this approach over your proposal because it is more "decoupled" and the health check is still implemented near the component you are checking against. What do you think?

If you decide for this route you'd still need to think about the predefined health checks. The only problem I see here is the dependency injection of a "TypeORM / MySQL connection" for instance. Other than that you can easily write predefined Health Checks I believe

BrunnerLivio · 2018-09-17T09:47:05Z

@weeco I am not 100% sure if we're on the same page, but that what you proposed is more or less the same I propose.

In my opinion having healthChecks: Provider[] inside a @Module()-option is not a good idea. Everything, even interceptors go into providers: Provider[], why should Health Checks be the exception?

Personally I expected the health check to be close to the code/instance which you want to run a health check on.

Agree. My solution actually supports this. I do not recommend writing your health checks in health.service.ts as I did (just did it for simplicity reasons). Your health.service.ts is just to connect your defined health checks with terminus. But you do not have to write the check inside the service itself. You can provide them anywhere you want.

Here is how I would implement your FortniteClientHealthCheck in my proposal.

import { Injectable } from '@nestjs/common';
import { TerminusOptions, TerminusOptionsFactory } from '@nestjs/terminus';
import { FortniteClientHealthCheck } from './fortnite/fortnite-client.health.ts'

@Injectable()
export class HealthService implements TerminusOptionsFactory {
   constructor(private fortniteClientHealthCheck: FortniteClientHealthCheck){ }

  async health(): Promise<any> {
    return {
      'fortnite': await this.fortniteClientHealthCheck.isHealthy()
    };
  }

   async createTerminusOptions(): Promise<TerminusOptions> {
    return {
      healthChecks: { '/health': this.health }
    };
  }
}

So you have your actual health check still near your feature, but it is connected through a "root"-health provider called `health.service.

One thing you'd need to refactor is the HealthBuilder. I also thought about such a class, but I dropped that, because it is not really Angular like. Since it is not being injected by the constructor it is harder to test. So the developer who write the health check has to build it's own HealthBuilder, if he prefers such a solution.

weeco · 2018-09-19T08:33:52Z

Can you elloborate if this "root health provider" is actually necessary? This sounds like boilerplate which could potentially be avoided. If I write one or more health checks I usually want it to be picked up as well.

I can imagine it's not possible because you wouldn't be able to tell if it's a normal service or a health check if you put them all together in the providers property of a module. That's why I proposed to put it into a healthcheck property which indicates that this will be picked up in our "HealthCheckRepository". Is that the main difference between our two proposals?

kamilmysliwiec · 2018-09-19T10:56:36Z

In my opinion having healthChecks: Provider[] inside a @module()-option is not a good idea. Everything, even interceptors go into providers: Provider[], why should Health Checks be the exception?

100% agree. Let's don't mess with @Module() properties.

However, I totally understand your point @weeco. Keeping healthchecks close to their domain sounds clear. I guess that another possible middle ground would be to delay instantiation of terminus itself (for example move it to the OnModuleInit or OnApplicationBootstrap instead of creating it as a provider using options object). Then, provide a class like TerminusRegistry that exposes register method. Each controller/service/anything could inject TerminusRegistry and inside the constructor, call this.terminusRegistry.register(this) for instance. Registry would hold an array of healthchecks as a class property and during the OnModuleInit/Bootstrap, create terminus instance and pass down these registered methods. :)

BrunnerLivio · 2018-09-19T20:55:45Z

@kamilmysliwiec
Thanks a lot for joining the discussion. That sounds like a nice solution we all can agree on. Will have a look into implementing this asap and will add you guys as reviewers in the PR.

So then I can finally finish the documentation and release the first version :)

—-

One thing which stands out for me; this would mean that the Terminus feature of reacting to process signals like SIGTERM is not covered with Kamils proposal, because that was solved by the “root health provider”.

In my opinion we should kick the signal feature out for the first release and then start the discussion afterwards, if that is okay?

weeco · 2018-09-19T21:03:04Z

I believe the sig feature is important, but until recently no one has ever requested such a health check module, so I believe it would be a very solid module without sig hook as well :-). Looking forward to it.

kamilmysliwiec · 2018-09-22T08:38:48Z

One thing which stands out for me; this would mean that the Terminus feature of reacting to process signals like SIGTERM is not covered with Kamils proposal, because that was solved by the “root health provider”.

Things like onSignal and onShutdown might be covered in the same fashion as healthchecks. We could provide sth like TerminusLifecycleManager with addOnSignalHook() (?).

The rest (e.g. signal and logger) could work same as now (using createTerminusOptions).

kamilmysliwiec · 2018-09-22T08:39:01Z

Sorry for closing 😅

BrunnerLivio · 2018-09-26T12:38:08Z

@weeco I tried to adapt to @kamilmysliwiec solution. You can see a working example with my dogs-app (not a cat person, sorry :D)

What I've found is that the general behavior of @godaddy/terminus is not so pleasant.

In case a subsystem is down as shown here with info.dogs.status:

{"status":"ok","info":{"dogs":{"status":"down","goodboys":5}}}

The property status should also change to down, because info.dogs.status is "down", right?
This is not possible with @godaddy/terminus.. It expects if no error is thrown, everything is great. Therefore it will just automatically use status = "ok" as shown here

So then you have to throw a error to change the state? This is hardcoded too in @godaddy/terminus, so in case a error is thrown it can only show this JSON response (no additional data / message):

{
  "status": "error"
}

Maybe my use cases are not needed, so we could proceed with this implementation. Otherwise we may consider moving away from @godaddy/terminus or create issues on their repository.

weeco · 2018-09-26T14:20:24Z

The usage looks very nice, good job.

In my oppinion having detailed information on subsystems is a must have. Otherwise it may take some time to figure out why the healthcheck is failing. I believe it shouldn't take long to implement the godaddy/terminus featureset in Typescript and is worth the effort

BrunnerLivio · 2018-09-26T14:49:07Z

I believe it shouldn't take long to implement the godaddy/terminus featureset in Typescript and is worth the effort

It wouldn't be so hard to implement it, but keep in mind we have to maintain this, think about testing and security etc. So that is why I created an issue on @godaddy/terminus and lets hear out if the guys from over there support us. If not it seems like we have no other option than just writing it for our own as you proposed...

BrunnerLivio · 2018-10-10T18:59:37Z

@kamilmysliwiec I had some discussions with @weeco:

I experienced some issues with your proposal with registering the health check in the constructor as you proposed.

this.terminusRegistry.register(this);

The problem is with predefined health checks. Let's say nestjs/terminus module offers a predefined database health check. How does we register this? Certainly we can not do that in the constructor, because what if the user does not want to use our predefined health check in his ecosystem?

Therefor a user must select the predefined health checks himself. In the end this would lead back to my proposal, having a "main" health check class, which binds health check to a health address as I proposed in my previous comment. @weeco which represents "the customer" in this case would be fine with this solution.

Do you have another idea, or should I go with this?

kamilmysliwiec · 2018-10-10T19:02:42Z

Couldn't we support two cases at once? I mean, combine options (manually defined checks) + registered ones (flexibility).

BrunnerLivio · 2018-10-10T20:09:40Z

@kamilmysliwiec I think it should be consistent in the end. Predefined health checks and user defined health checks should use the same interface and should be able to be registered the same way. I really do not see many other options than a main file..

BrunnerLivio added the type: feature label Sep 6, 2018

BrunnerLivio mentioned this issue Sep 6, 2018

Add health check module nestjs/nest#966

Closed

BrunnerLivio added the state: open for proposals label Sep 6, 2018

BrunnerLivio mentioned this issue Sep 14, 2018

Missing documentation #5

Closed

kamilmysliwiec closed this as completed Sep 22, 2018

kamilmysliwiec reopened this Sep 22, 2018

BrunnerLivio mentioned this issue Sep 25, 2018

different package name on npm from docs #6

Closed

BrunnerLivio added this to the First release milestone Sep 26, 2018

BrunnerLivio mentioned this issue Sep 26, 2018

use with nestjs godaddy/terminus#70

Closed

BrunnerLivio closed this as completed Nov 7, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add predefined health checks #2

Add predefined health checks #2

BrunnerLivio commented Sep 6, 2018 •

edited

Loading

BrunnerLivio commented Sep 14, 2018

BrunnerLivio commented Sep 14, 2018 •

edited

Loading

weeco commented Sep 16, 2018 •

edited

Loading

BrunnerLivio commented Sep 17, 2018 •

edited

Loading

weeco commented Sep 19, 2018

kamilmysliwiec commented Sep 19, 2018 •

edited

Loading

BrunnerLivio commented Sep 19, 2018 •

edited

Loading

weeco commented Sep 19, 2018

kamilmysliwiec commented Sep 22, 2018

kamilmysliwiec commented Sep 22, 2018

BrunnerLivio commented Sep 26, 2018 •

edited

Loading

weeco commented Sep 26, 2018

BrunnerLivio commented Sep 26, 2018 •

edited

Loading

BrunnerLivio commented Oct 10, 2018 •

edited

Loading

kamilmysliwiec commented Oct 10, 2018

BrunnerLivio commented Oct 10, 2018

Add predefined health checks #2

Add predefined health checks #2

Comments

BrunnerLivio commented Sep 6, 2018 • edited Loading

I'm submitting a...

Expected behavior

BrunnerLivio commented Sep 14, 2018

BrunnerLivio commented Sep 14, 2018 • edited Loading

weeco commented Sep 16, 2018 • edited Loading

BrunnerLivio commented Sep 17, 2018 • edited Loading

weeco commented Sep 19, 2018

kamilmysliwiec commented Sep 19, 2018 • edited Loading

BrunnerLivio commented Sep 19, 2018 • edited Loading

weeco commented Sep 19, 2018

kamilmysliwiec commented Sep 22, 2018

kamilmysliwiec commented Sep 22, 2018

BrunnerLivio commented Sep 26, 2018 • edited Loading

weeco commented Sep 26, 2018

BrunnerLivio commented Sep 26, 2018 • edited Loading

BrunnerLivio commented Oct 10, 2018 • edited Loading

kamilmysliwiec commented Oct 10, 2018

BrunnerLivio commented Oct 10, 2018

BrunnerLivio commented Sep 6, 2018 •

edited

Loading

BrunnerLivio commented Sep 14, 2018 •

edited

Loading

weeco commented Sep 16, 2018 •

edited

Loading

BrunnerLivio commented Sep 17, 2018 •

edited

Loading

kamilmysliwiec commented Sep 19, 2018 •

edited

Loading

BrunnerLivio commented Sep 19, 2018 •

edited

Loading

BrunnerLivio commented Sep 26, 2018 •

edited

Loading

BrunnerLivio commented Sep 26, 2018 •

edited

Loading

BrunnerLivio commented Oct 10, 2018 •

edited

Loading