Feature/smonit #142

Merged
merged 7 commits into from
Jun 14, 2017

Conversation

Member

@p4u p4u commented Jun 5, 2017

Information at issue: Tiny monitor daemon #137

p4u added 3 commits May 25, 2017 20:33
Small and modular monitor for daemons.
It executes every 5 minutes and runs a set of hooks to check the
daemons and the status of the system. Hooks are installed in /etc/smonit and
must define a function for quickfix (like a daemon restart) and longfix
(like rebuilding the daemon's configuration). Longfix is launched only if
the problem is not solved after 3 quickfixes. Logs are stored in /tmp/smonit/<daemon_name>.

First commit with dnsmasq and bmx6 hooks.

Signed-off-by: Pau Escrich <p4u@dabax.net>
Signed-off-by: Pau Escrich <p4u@dabax.net>
Fix date format. Add maxsize to remove log file if too big.

Signed-off-by: Pau Escrich <p4u@dabax.net>
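
The escalation rule described in the first commit message (quickfix first, longfix only after 3 unsuccessful quickfixes) can be sketched as follows. This is a minimal illustration, not the code merged in this PR: `run_hook`, `STATE_DIR`, the counter file, and resetting the counter after a longfix are all assumptions made for the example. Only the hook names `hook_check`, `hook_quickfix`, and `hook_longfix` mirror what the PR shows or describes.

```shell
#!/bin/sh
# Hypothetical state directory for this sketch (the PR only documents
# that logs live under /tmp/smonit/<daemon_name>).
STATE_DIR="${STATE_DIR:-/tmp/smonit-sketch}"
MAX_QUICKFIXES=3

# run_hook NAME
# Expects hook_check, hook_quickfix and hook_longfix to be defined by the
# hook file that was sourced beforehand. Intended to be called once per
# monitoring run (every 5 minutes in the PR).
run_hook() {
    name="$1"
    counter="$STATE_DIR/$name.quickfixes"
    mkdir -p "$STATE_DIR"

    if hook_check; then
        rm -f "$counter"            # healthy again: forget earlier failures
        return 0
    fi

    # Number of quickfixes already attempted for this ongoing failure.
    count=$(cat "$counter" 2>/dev/null || echo 0)

    if [ "$count" -lt "$MAX_QUICKFIXES" ]; then
        hook_quickfix
        echo $((count + 1)) > "$counter"
    else
        hook_longfix                # 3 quickfixes did not help: escalate
        rm -f "$counter"            # assumption: start over after a longfix
    fi
}
```

With a check that keeps failing, four consecutive runs would call the quickfix three times and then the longfix once.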
@p4u p4u added the in progress label Jun 5, 2017
@p4u p4u requested a review from nicopace June 5, 2017 13:36
@nicopace
Member

nicopace commented Jun 9, 2017

I couldn't test it, but I proofread it and it seems ok :)

@G10h4ck
Member

G10h4ck commented Jun 9, 2017

I have reviewed the code and made comments inline. Besides that, I believe that checkers like the one for bmx6 should be split into their own packages, like we do with lime-proto-*.

@p4u p4u requested a review from G10h4ck June 9, 2017 11:53
G10h4ck
G10h4ck previously requested changes Jun 9, 2017
/etc/init.d/cron reload
}

stop() {
Member

This is executed at each shutdown/reboot, thus causing writes to the flash. Moreover, it will cause another write on boot because the smonit line is no longer found in the crontab.

Member Author

This is right, I'll fix it.
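
One way to avoid the repeated flash writes raised above is to touch the crontab only when the entry is actually missing, and to leave it in place on stop. The sketch below illustrates that idea under assumptions: `ensure_cron_line` is a hypothetical helper, and the crontab path and smonit command shown in the usage comment are examples, not taken from the PR. It is not necessarily how the fix was implemented.

```shell
#!/bin/sh
# ensure_cron_line FILE LINE
# Appends LINE to FILE only if an identical line is not already present.
# Returns 0 if the file was modified, 1 if it was left untouched, so the
# caller can skip the cron reload (and the flash write) when nothing changed.
ensure_cron_line() {
    file="$1"
    line="$2"
    # -x: match whole lines only, -F: treat the cron line as a fixed string
    if grep -qxF -- "$line" "$file" 2>/dev/null; then
        return 1
    fi
    echo "$line" >> "$file"
    return 0
}

# Possible use in an init script (paths are illustrative assumptions):
#   start() {
#       ensure_cron_line /etc/crontabs/root "*/5 * * * * /usr/sbin/smonit" \
#           && /etc/init.d/cron reload
#   }
#   stop() { :; }   # leave the crontab alone, so reboots cause no writes
```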

/etc/init.d/bmx6 restart
}

function hook_longfix() {
Member

This function does nothing at the moment, so we should either implement it or change the log message accordingly.

Member Author

True. It was an example to help understand the concept. But I agree on changing the log message.

[ -x /usr/sbin/dnsmasq ] && [ -x /etc/init.d/dnsmasq ] && installed=yes
}

function hook_check() {
Member

Is it enough to just check whether the process is running?

Member Author

It is not enough. It is just an "initial" hook (almost an example) to help understand the concept.

/etc/init.d/dnsmasq restart
}

function hook_longfix() {
Member

This function does nothing at the moment, so we should either implement it or change the log message accordingly.

Member Author

True. It was an example to help understand the concept. But I agree on changing the log message.

Signed-off-by: Pau Escrich <p4u@dabax.net>
Member

@nicopace nicopace left a comment

It was locally tested by @p4u ... we will trust him :)

@nicopace nicopace merged commit b5bed1c into develop Jun 14, 2017
@nicopace nicopace mentioned this pull request Jun 14, 2017
@dangowrt
Member

Which functionality is provided by smonit which isn't already provided by procd? I'm wondering because to me it looks like everything smonit does can already be done using the procd 'respawn' parameter. Can you enlighten me?

@p4u
Member Author

p4u commented Jun 15, 2017

See #137

I agree, everything that can be controlled by procd must be moved there. The two snippets added are examples of sorts. The idea is to extend them to, for instance:

  • bmx6: check whether there is tunnel connectivity with the nodes (using ping). We have sometimes detected a strange bug which has not yet been identified; Smonit can help to identify it.

  • dnsmasq: sometimes it is still alive but does not resolve. Smonit might try to resolve a hostname and execute actions if that does not work.

Can these two things be handled by procd?
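
The two extended checks listed above could look roughly like this. It is a sketch under assumptions, not code from the PR: `CHECK_HOST`, `RESOLVE_CMD`, `PING_CMD`, and `peers_reachable` are hypothetical names, the test hostname is arbitrary, and the code assumes that the installed `nslookup` and `ping` exit non-zero on failure.

```shell
#!/bin/sh
# Hypothetical defaults; both commands can be overridden, e.g. for testing.
CHECK_HOST="${CHECK_HOST:-openwrt.org}"
RESOLVE_CMD="${RESOLVE_CMD:-nslookup}"
PING_CMD="${PING_CMD:-ping}"

# dnsmasq: the process may be alive yet not answering, so ask the local
# resolver on 127.0.0.1 for a name instead of only looking for the process.
hook_check() {
    "$RESOLVE_CMD" "$CHECK_HOST" 127.0.0.1 >/dev/null 2>&1
}

# bmx6: succeed only if every given peer address answers a single ping.
# How the list of peer addresses is obtained is left open here.
peers_reachable() {
    for peer in "$@"; do
        # -c 1: one probe, -W 2: wait at most 2 seconds for the reply
        "$PING_CMD" -c 1 -W 2 "$peer" >/dev/null 2>&1 || return 1
    done
    return 0
}
```

A failing `hook_check` would then go through the usual quickfix/longfix path, while a failing `peers_reachable` could mainly be logged to help track down the unidentified bug.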

@dangowrt
Member

dangowrt commented Jun 15, 2017

No, procd can handle restarts based on ubus events and config changes, and automatically respawn a process when it has died. Typically ubus events from netifd, such as interface link state changes, are used for the restarts. It is also already possible to tie services together so that dependent services are restarted whenever the service they depend on is restarted (such as restarting dnsmasq whenever bmx6 is restarted). The similar things you do in hotfix/bmx6-restart-watchping, and basically all of lime-apply, can easily be done using procd triggers.
The ideal pattern (in my opinion) to solve the examples you described above would be to have a daemon which statefully monitors things (like smonit does now) but emits ubus events rather than directly restarting services. Then you can either attach procd triggers to it or simply write scripts which fire on the occurrence of an object or event, using ubus wait_for or ubus listen.
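
The pattern suggested here, where the monitor reports a failure as a ubus event instead of restarting the service itself, might be sketched as follows. The event name `smonit.failure`, the JSON fields, and the `emit_failure` helper are assumptions for illustration; only the `ubus send` and `ubus listen` subcommands are existing ubus CLI features.

```shell
#!/bin/sh
# Overridable so the sketch can be exercised without a running ubusd.
UBUS_BIN="${UBUS_BIN:-ubus}"

# emit_failure DAEMON REASON
# Publishes a (hypothetical) smonit.failure event with a small JSON payload.
emit_failure() {
    "$UBUS_BIN" send smonit.failure "{\"daemon\":\"$1\",\"reason\":\"$2\"}"
}

# A consumer could react to those events with something like this, shown
# as a comment because it blocks until events arrive:
#   ubus listen smonit.failure | while read -r event; do
#       /etc/init.d/dnsmasq restart
#   done
```

Keeping detection (the monitor) separate from reaction (listeners or procd triggers) means the restart policy can change without touching the checks.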

@ilario ilario mentioned this pull request Jul 4, 2017
@altergui altergui deleted the feature/smonit branch April 9, 2019 15:39