Maikel edited this page Oct 11, 2023 · 16 revisions

Wormly

Wormly is used for basic monitoring to verify a site is alive. An account costs AUD 25 per month. That includes alerts via SMS when a server is down. Here are some examples of the Australian configuration:

  • https://openfoodnetwork.org.au (check every 30 seconds, alert after 5 minutes)
    • Expected text: Food, unincorporated
    • Expected HTTP response: 200 OK
    • Min. SSL cert. validity (days): 5
  • https://openfoodnetwork.org.au/ (check every 12 hours, alert after 5 minutes)
    • Expected HTTP response: 200 OK
    • Min. SSL cert. validity (days): 20
  • https://openfoodnetwork.org.au/api/status/job_queue (check every 2 minutes, alert after 5 minutes)
    • Expected text: {"alive":true}
    • Expected HTTP response: 200 OK
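
As a rough illustration of what these checks verify, here is a minimal Python sketch that reproduces them locally. It is not Wormly's implementation or API: the function names (`evaluate_response`, `cert_days_left`, `check_site`) are hypothetical, and it assumes "expected text" means a plain substring match on the response body.

```python
import socket
import ssl
import urllib.error
import urllib.request
from datetime import datetime, timezone
from urllib.parse import urlsplit


def evaluate_response(status, body, expected_status=200, expected_text=None):
    """Return a list of failure messages; an empty list means the check passed."""
    failures = []
    if status != expected_status:
        failures.append(f"expected HTTP {expected_status}, got {status}")
    # Assumption: the expected text is matched as a plain substring.
    if expected_text is not None and expected_text not in body:
        failures.append(f"expected text {expected_text!r} not found")
    return failures


def cert_days_left(not_after, now=None):
    """Whole days until a certificate's notAfter date (ssl.getpeercert() format)."""
    now = now or datetime.now(timezone.utc)
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), timezone.utc)
    return (expires - now).days


def check_site(url, expected_text=None, min_cert_days=None, timeout=10):
    """Run one check against a live site. Performs real network requests;
    connection errors are not caught and propagate to the caller."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status, body = resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; keep the status code.
        status, body = err.code, ""
    failures = evaluate_response(status, body, expected_text=expected_text)

    if min_cert_days is not None:
        host = urlsplit(url).hostname
        ctx = ssl.create_default_context()
        with socket.create_connection((host, 443), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                days = cert_days_left(tls.getpeercert()["notAfter"])
        if days < min_cert_days:
            failures.append(f"certificate valid for only {days} more days")
    return failures
```

For example, `check_site("https://openfoodnetwork.org.au/api/status/job_queue", expected_text='{"alive":true}')` would mirror the job queue check, and passing `min_cert_days=20` would mirror the certificate check. The sketch runs a single check; the check interval and the "alert after 5 minutes" delay are left to whatever schedules it.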

The different alert times are configured via alert groups. We have separate groups for the delayed job queue and for SSL/TLS certificates, so that a dev doesn't get an urgent alert for a certificate that expires in 20 days. That still leaves them time to fix the configuration.

New Relic

All managed instances are monitored by New Relic. Application Performance Monitoring (APM) is only activated for au-prod because it increases server response times by 30% and our plan may not cover the data volume of all instances.

Alerts are set up for three infrastructure conditions:

  • Host not responding - selected hosts only; you need to add new hosts to this condition.
  • Memory almost full - all hosts (90%)
  • Disk almost full - all hosts (90%)
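
To see locally whether a host is close to the 90% threshold, a small Python sketch like the one below may help. It is only an illustration, not how New Relic evaluates the condition: the function names are hypothetical, and treating exactly 90% as "almost full" is an assumption.

```python
import shutil


def almost_full(used, total, threshold=0.90):
    """True when used/total reaches the threshold (assumed inclusive)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return used / total >= threshold


def disk_almost_full(path="/", threshold=0.90):
    """Apply the same threshold to the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)
    return almost_full(usage.used, usage.total, threshold)
```

Calling `disk_almost_full("/")` on a server returns `True` once the root filesystem is at least 90% used. The same `almost_full` helper can be fed memory figures from whatever source is available on the host.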

Notifications go to the Slack channel #devops-alerts. You can also set up your email or the mobile phone app to receive notifications.

Our account has not-for-profit status through the Open Food Foundation in Australia.

Uptime Kuma

Simple availability checker: https://kuma.openfoodnetwork.org.uk/status/global

Datadog

We used Datadog for several years, but it became too expensive because it charges per host and each country has its own server, so we switched to New Relic.