Service Monitoring with statping #2

Closed
p-a-s-c-a-l opened this issue Jan 21, 2020 · 32 comments
Labels: bug (Something isn't working), validation (CSIS Validation Issues)

@p-a-s-c-a-l
Member

p-a-s-c-a-l commented Jan 21, 2020

Set-up service uptime monitoring for CSIS internal (Apache, PostGIS) and external services (AIT EMIKAT, ATOS & METEOGRID GeoServer).

@p-a-s-c-a-l
Member Author

p-a-s-c-a-l commented Feb 12, 2020

@jeanatcismet

Enabling 'Send Updates only' resets the configuration:

[screenshot: statping]

Can you please restore the correct settings? Maybe a problem with the database volume?
It's a known bug: statping/statping#356

Statping sends weirdly formatted HTML Mails, so we cannot use this feature to update GitHub Issues.

Furthermore, the 'View Service' Link in HTML Mails is missing the domain part:

<a href=3D"/service/32" class=3D"button button--blue" tar=
get=3D"_blank" style=3D"-webkit-text-size-adjust: none; background: #3869D4=
; border-color: #3869d4; border-radius: 3px; border-style: solid; border-wi=
dth: 10px 18px; box-shadow: 0 2px 3px rgba(0, 0, 0, 0.16); box-sizing: bord=
er-box; color: #FFF; display: inline-block; font-family: Arial, 'Helvetica =
Neue', Helvetica, sans-serif; text-decoration: none;">View Service</a> </td=
>
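
The snippet above is quoted-printable encoded (`=3D` stands for `=`, and a trailing `=` is a soft line break), which makes the link hard to read. As a minimal sketch, the following decodes a shortened excerpt of it (the `style` attribute is omitted) and confirms that the `href` is a relative path. `BASE_URL` is an assumption taken from the status page address mentioned later in this thread; joining it with the path shows what an absolute link would presumably look like.

```python
import quopri
import re
from urllib.parse import urljoin

# Shortened excerpt of the quoted-printable mail body quoted above
# (style attribute omitted for brevity).
RAW = (
    b'<a href=3D"/service/32" class=3D"button button--blue" tar=\n'
    b'get=3D"_blank">View Service</a>'
)

# Assumption: the public address of the status page.
BASE_URL = "https://health-check.clarity.cismet.de/"

# Decode "=3D" back to "=" and remove the soft line breaks.
decoded = quopri.decodestring(RAW).decode("utf-8")

# Extract the link target from the decoded HTML.
match = re.search(r'href="([^"]*)"', decoded)
href = match.group(1)

# Resolve the relative path against the assumed base URL.
absolute = urljoin(BASE_URL, href)

print(decoded)
print(href)
print(absolute)
```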

Does the domain name have to be configured in the docker-compose file?

@jeanatcismet

Does the domain name have to be configured in the docker-compose file?

No, there is a field for that in the settings. Apparently it has been reset too.
[screenshot: Auswahl_126]

@jeanatcismet

The services listed above are now all monitored at https://health-check.clarity.cismet.de/

The offline/online status of each service is shown on the front page:

[screenshot: Auswahl_127]

Below that, there is an overview of the health history of each service:

[screenshot: Auswahl_128]

Details for each individual service can be shown:

[screenshot: Auswahl_129]

This includes the reason for an offline status:

[screenshot: Auswahl_130]

@p-a-s-c-a-l
Member Author

p-a-s-c-a-l commented Feb 13, 2020

Furthermore, the 'View Service' Link in HTML Mails is missing the domain part

I've created an .env file and added DOMAN=https://health-check.clarity.cismet.de/ but that didn't work. Even worse, now all emails are missing the subject:

[screenshot]

@p-a-s-c-a-l
Member Author

Still no subject line in emails after removing the .env file. It seems that the configuration (database?) is messed up (again).

@p-a-s-c-a-l
Member Author

Perhaps we should switch to a tool that is more stable and used in production environments, e.g. https://cachethq.io/

@p-a-s-c-a-l
Member Author

p-a-s-c-a-l commented Feb 13, 2020

now all emails are missing the subject:

Seems to be related to this setting:

[screenshot]

If the limit is reached, mails will have no subject.
I have no idea what causes this behaviour.

@p-a-s-c-a-l
Member Author

p-a-s-c-a-l commented Feb 13, 2020

After increasing the limit, I had to disable email notifications because it's spamming the mailing lists. Apparently, this doesn't work as expected. We should look into an alternative.

@p-a-s-c-a-l
Member Author

p-a-s-c-a-l commented Feb 13, 2020

CSIS (Dev) - Summary (csis-dev.ait.ac.at) is Offline!
Your Statping service CSIS (Dev) - Summary (csis-dev.ait.ac.at) has been triggered with a HTTP status code of '200' and is currently offline based on your requirements. This failure was created on 2020-02-12 10:56:14.338441671 +0000 UTC.
Last Response
<!DOCTYPE html> <html ...

The HTTP status code is 200, but Statping still sends email notifications saying that the service is offline. 😞
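
A status code of 200 alone does not have to mean "online" if the check has further requirements. The following is only a sketch of how such a check could be evaluated, assuming it combines an expected status code with an optional expected-response pattern. `is_online` and its parameters are hypothetical names, not Statping's API, and whether this explains the mail above is not confirmed in this thread.

```python
import re
from typing import Optional


def is_online(
    status_code: int,
    body: str,
    expected_status: int = 200,
    expected_body: Optional[str] = None,
) -> bool:
    """Hypothetical check: both the status code and the body must match."""
    if status_code != expected_status:
        return False
    # If a pattern is configured, the response body must contain a match.
    if expected_body is not None and re.search(expected_body, body) is None:
        return False
    return True


# Status 200 with no body requirement counts as online.
plain = is_online(200, "<!DOCTYPE html> <html ...")

# Status 200 can still count as offline if the body pattern does not match.
# "expected marker" is a made-up pattern for illustration.
with_pattern = is_online(200, "<!DOCTYPE html> <html ...", expected_body="expected marker")

print(plain, with_pattern)
```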

@p-a-s-c-a-l
Member Author

[screenshot]

Title and e-mail body don't match.

@p-a-s-c-a-l
Member Author

p-a-s-c-a-l commented Feb 14, 2020

[screenshot]

[screenshot]

Unfortunately, mails are sent every minute while a service is offline, not only when it first goes offline.
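
For comparison, notifying only on status changes can be sketched as follows. This is not Statping's implementation; `UpdateOnlyNotifier` and its `send` callback are hypothetical names used for illustration.

```python
from typing import Callable, Dict


class UpdateOnlyNotifier:
    """Sends a message only when a service changes state (hypothetical sketch)."""

    def __init__(self, send: Callable[[str], None]) -> None:
        self._send = send
        self._last: Dict[str, bool] = {}  # last known state per service

    def report(self, service: str, online: bool) -> bool:
        """Record a check result; return True if a notification was sent."""
        previous = self._last.get(service)
        self._last[service] = online
        # First check and the service is up: nothing to report.
        if previous is None and online:
            return False
        # State unchanged since the last check: stay quiet.
        if previous == online:
            return False
        self._send(f"{service} is {'Online' if online else 'Offline'}!")
        return True


sent = []
notifier = UpdateOnlyNotifier(sent.append)

# One check per minute: up, then down for three minutes, then up again.
for state in [True, False, False, False, True]:
    notifier.report("CSIS (Dev) - Summary", state)

print(sent)
```

With per-check notifications, the same sequence would produce a mail for every offline minute; here only the two transitions are reported.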

p-a-s-c-a-l added the bug label Feb 14, 2020
@p-a-s-c-a-l
Member Author

Unfortunately, mails are sent every minute while a service is offline, not only when it first goes offline.

This can be avoided by enabling the "Send Updates only" option, which is currently broken: #2 (comment)

A patch is available here but the PR hasn't been merged yet.

Unfortunately, this particular setting is not available in the .env file or config.yml, so I've changed the flag directly in the SQLite database:

[screenshot]
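
Changing such a flag directly in SQLite could look roughly like the sketch below. The table and column names (`notifications`, `method`, `updates_only`) are hypothetical placeholders, since the actual Statping schema is not shown in this thread; the example runs against an in-memory database so that nothing real is modified.

```python
import sqlite3


def enable_updates_only(conn: sqlite3.Connection, method: str) -> int:
    """Set the (hypothetical) updates_only flag; return the number of rows changed."""
    cur = conn.execute(
        "UPDATE notifications SET updates_only = 1 WHERE method = ?",
        (method,),
    )
    conn.commit()
    return cur.rowcount


# In-memory stand-in for the real database file; the schema is made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notifications ("
    "method TEXT PRIMARY KEY, enabled INTEGER, updates_only INTEGER)"
)
conn.execute("INSERT INTO notifications VALUES ('email', 1, 0)")

changed = enable_updates_only(conn, "email")
flag = conn.execute(
    "SELECT updates_only FROM notifications WHERE method = 'email'"
).fetchone()[0]

print(changed, flag)
```

Against a real installation, it would presumably be safer to stop the container and back up the database file before editing it.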

@p-a-s-c-a-l
Member Author

Ha-ha, now the subject line is gone again. :hurtrealbad:

@p-a-s-c-a-l
Member Author

p-a-s-c-a-l commented Feb 14, 2020

$ docker-compose start
Starting drupal-db ... done
Starting drupal    ... done

CKAN - Tag List is Online!
Your Statping service CKAN - Tag List is back online. This service has been triggered with a HTTP status code of '200' and is currently online based on your requirements. Your service was reported online at 2020-02-11 18:17:19.529653361 +0000 UTC.

Started CSIS, not CKAN. Notifications are sent for the wrong service. 🤦‍♂
I'll stop wasting my time with this now.

@p-a-s-c-a-l
Member Author

p-a-s-c-a-l commented Feb 14, 2020

[screenshot]

[screenshot]

lol 😞

T1.4 Industrialization and Support automation moved this from To do to Done Feb 14, 2020
p-a-s-c-a-l changed the title from "Service Monitoring" to "Service Monitoring with statping" Feb 14, 2020
@p-a-s-c-a-l
Member Author

Temporarily reactivated till #7 is implemented. Dysfunctional mail notifications disabled.
