New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DNS health checker #429

Merged
merged 3 commits into from Sep 21, 2016

Conversation

Projects
None yet
3 participants
@pandax381
Contributor

pandax381 commented Sep 21, 2016

Hi,

I implemented DNS helath checker.
It has been running stable for several years in our system.

I want merge it to the built-in health checker.

There is also a way to use MISC_CHECK, but we avoided it for the following reasons.

  • MISC_CHECK requires /bin/sh

    MISC_CHECK is using the system(3). It is equivalent to the "/bin/sh -c". Therefore, we can't use MISC_CHECK in environment does not have /bin/sh. It is, e.g. not uncommon in the application container environment.

  • Risk of the process increase in MISC_CHECK

    If misc_path of the program has blocked, the process will increase. You can check the symptoms with the following settings.

    MISC_CHECK {
      misc_path "/bin/sleep 3600"
      misc_timeout 10 
    }
    

    It's results.

    UID   PID  PPID  PGID   SID COMMAND
      0 41010     1 41010 41010 /sbin/keepalived
      0 41013 41010 41010 41010  \_ /sbin/keepalived
      0 41361 41013 41010 41010  |   \_ /sbin/keepalived
      0 41362 41361 41010 41010  |   |   \_ sh -c /bin/sleep 3600
      0 41363 41362 41010 41010  |   |       \_ /bin/sleep 3600
      0 41364 41013 41010 41010  |   \_ /sbin/keepalived
      0 41365 41364 41010 41010  |   |   \_ sh -c /bin/sleep 3600
      0 41366 41365 41010 41010  |   |       \_ /bin/sleep 3600
      0 41367 41013 41010 41010  |   \_ /sbin/keepalived
      0 41368 41367 41010 41010  |       \_ sh -c /bin/sleep 3600
      0 41369 41368 41010 41010  |           \_ /bin/sleep 3600
      0 41014 41010 41010 41010  \_ /sbin/keepalived
      0 41019     1 41010 41010 sh -c /bin/sleep 3600
      0 41020 41019 41010 41010  \_ /bin/sleep 3600
      0 41025     1 41010 41010 sh -c /bin/sleep 3600
      0 41026 41025 41010 41010  \_ /bin/sleep 3600
      0 41031     1 41010 41010 sh -c /bin/sleep 3600
      0 41032 41031 41010 41010  \_ /bin/sleep 3600
    

    MISC_CHECK is calling system(3) after fork(2). In check_misc.c, it's sending SIGTERM and SIGKILL on timeout. However, the signal is received only in the process that called system(3), the subsequent process is not propagated.

    For this matter, I'm going to send the proposal of the pull request.

In preparing the DNS checker, also modified layer4 library. It's added support for implementation of the UDP-based health checker.

I believe to add to the built-in health checker is a good way, given the importance of the DNS.
Please consider the merge of this pull request.

Best regards,
Masaya Yamamoto

pandax381 added some commits Sep 20, 2016

Add support for UDP socket to layer4 library
Signed-off-by: Masaya Yamamoto <yamamoto-ma@klab.com>
Add DNS checker
Signed-off-by: Masaya Yamamoto <yamamoto-ma@klab.com>
Update documentation for DNS health checker
Signed-off-by: Masaya Yamamoto <yamamoto-ma@klab.com>
@pqarmitage

This looks really useful. There's just one minor thing though. Could you please update doc/keepalived.SYNOPSIS in line with your changes to doc/man/man5/keepalived.conf.5.

Many thanks

@pandax381

This comment has been minimized.

Show comment
Hide comment
@pandax381

pandax381 Sep 21, 2016

Contributor

Thank you for your review.

I updated doc/keepalived.conf.SYNOPSIS file as you pointed out.

Let me know if you are in need of anything!

Contributor

pandax381 commented Sep 21, 2016

Thank you for your review.

I updated doc/keepalived.conf.SYNOPSIS file as you pointed out.

Let me know if you are in need of anything!

@pqarmitage

This comment has been minimized.

Show comment
Hide comment
@pqarmitage

pqarmitage Sep 21, 2016

Collaborator

+1

Collaborator

pqarmitage commented Sep 21, 2016

+1

@acassen

This comment has been minimized.

Show comment
Hide comment
@acassen

acassen Sep 21, 2016

Owner

well done, thanks Masaya !

Owner

acassen commented Sep 21, 2016

well done, thanks Masaya !

@acassen acassen merged commit bc21563 into acassen:master Sep 21, 2016

@pandax381

This comment has been minimized.

Show comment
Hide comment
@pandax381

pandax381 Sep 23, 2016

Contributor

Thank you to merge my pull request!

Contributor

pandax381 commented Sep 23, 2016

Thank you to merge my pull request!

@pandax381 pandax381 deleted the pandax381:add-dns-checker branch Sep 23, 2016

@pandax381 pandax381 referenced this pull request Sep 27, 2016

Merged

Fix process increase #432

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment