Switch from curl-based default Docker healthcheck to a CLI-based one #5342

Merged 2 commits on Feb 28, 2024
1 change: 1 addition & 0 deletions changes/5342.added
@@ -0,0 +1 @@
+Added `MigrationsBackend` to health-check, which will fail if any unapplied database migrations are present.
1 change: 1 addition & 0 deletions changes/5342.changed
@@ -0,0 +1 @@
+Changed default Docker HEALTHCHECK to use `nautobot-server health_check` CLI command.
6 changes: 0 additions & 6 deletions development/docker-compose.final.yml
@@ -8,12 +8,6 @@ services:
image: "local/nautobot-final:local-py${PYTHON_VER}"
ports:
- 8443:8443
-healthcheck:
-test:
-- "CMD"
-- "curl"
-- "-fk"
-- "https://localhost:8443/health/"
celery_worker:
image: "local/nautobot-final:local-py${PYTHON_VER}"
entrypoint: "nautobot-server celery worker -l INFO --events"
10 changes: 0 additions & 10 deletions development/docker-compose.yml
@@ -25,16 +25,6 @@ services:
env_file:
- dev.env
tty: true
-healthcheck:
-interval: 5s
-timeout: 5s
-start_period: 5m # it takes a WHILE to run initial migrations with an empty DB
-retries: 3
-test:
-- "CMD"
-- "curl"
-- "-f"
-- "http://localhost:8080/health/"
celery_worker:
image: "local/nautobot-dev:local-py${PYTHON_VER}"
ports:
4 changes: 3 additions & 1 deletion docker/Dockerfile
@@ -58,7 +58,9 @@ RUN --mount=type=cache,target="/root/.cache/pip",sharing=locked \
--mount=type=cache,target="/tmp",sharing=locked \
pip install --upgrade pip wheel

-HEALTHCHECK --interval=5s --timeout=5s --start-period=5s --retries=1 CMD curl --fail http://localhost:8080/health/ || exit 1
+# timeout/interval=10s because `nautobot-server` can be slow to start - https://github.com/nautobot/nautobot/issues/4292
+# start-period=5m because initial migrations can take several minutes to run on a fresh DB
+HEALTHCHECK --interval=10s --timeout=10s --start-period=5m --retries=3 CMD nautobot-server health_check

# Generate nautobot user and its required dirs for later consumption
RUN mkdir /opt/nautobot /opt/nautobot/.cache /prom_cache /source && \
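
For readers less familiar with Docker's probe semantics: a HEALTHCHECK command passes only if it exits 0 within the timeout, and the container is marked unhealthy after `retries` consecutive failures (failures during the start period are not counted). The sketch below is a minimal Python illustration of that rule, not Docker's implementation. `probe` and `is_unhealthy` are hypothetical helper names, and the demo uses stand-in commands rather than `nautobot-server health_check`.

```python
import subprocess
import sys


def probe(cmd, timeout):
    """Return True if `cmd` exits 0 within `timeout` seconds, else False."""
    try:
        result = subprocess.run(
            cmd,
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except (subprocess.TimeoutExpired, OSError):
        # A hung or missing command is treated as a failed probe.
        return False
    return result.returncode == 0


def is_unhealthy(results, retries):
    """Return True once the last `retries` probe results were all failures."""
    if retries < 1 or len(results) < retries:
        return False
    return not any(results[-retries:])


if __name__ == "__main__":
    # Stand-in commands; inside the image the probe would be
    # ["nautobot-server", "health_check"] (assumed to be on PATH).
    ok = [sys.executable, "-c", "raise SystemExit(0)"]
    bad = [sys.executable, "-c", "raise SystemExit(1)"]
    history = [probe(ok, 10), probe(bad, 10), probe(bad, 10), probe(bad, 10)]
    print(is_unhealthy(history, retries=3))
```

With the values in this diff (`--interval=10s --timeout=10s --retries=3`), a container that starts failing after the start period would be flagged after three consecutive failed probes, which is roughly half a minute to a minute depending on whether the probes fail quickly or run into the timeout.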
4 changes: 4 additions & 0 deletions nautobot/core/settings.py
@@ -425,6 +425,10 @@
"graphene_django",
"health_check",
"health_check.storage",
+# We have custom implementations of these in nautobot.extras.health_checks:
+# "health_check.db",
+# "health_check.contrib.migrations",
+# "health_check.contrib.redis",
"django_extensions",
"constance.backends.database",
"django_ajax_tables",
5 changes: 3 additions & 2 deletions nautobot/extras/apps.py
@@ -43,10 +43,11 @@ def convert_field_to_list_tags(field, registry=None):
"during the execution of the migration command for the first time."
)

-# Register the DatabaseBackend health check
-from nautobot.extras.health_checks import DatabaseBackend, RedisBackend
+# Register the DatabaseBackend, MigrationsBackend, and RedisBackend health checks
+from nautobot.extras.health_checks import DatabaseBackend, MigrationsBackend, RedisBackend

plugin_dir.register(DatabaseBackend)
+plugin_dir.register(MigrationsBackend)
plugin_dir.register(RedisBackend)

# Register built-in SecretsProvider classes
33 changes: 28 additions & 5 deletions nautobot/extras/health_checks.py
@@ -3,7 +3,8 @@
from urllib.parse import urlparse

from django.conf import settings
-from django.db import connection, DatabaseError, IntegrityError
+from django.db import connection, connections, DatabaseError, DEFAULT_DB_ALIAS, IntegrityError
+from django.db.migrations.executor import MigrationExecutor
from health_check.backends import BaseHealthCheckBackend
from health_check.exceptions import ServiceReturnedUnexpectedResult, ServiceUnavailable
from prometheus_client import Gauge
@@ -65,10 +66,32 @@ def check_status(self):
obj.title = "newtest"
obj.save()
obj.delete()
-except IntegrityError:
-raise ServiceReturnedUnexpectedResult("Integrity Error")
-except DatabaseError:
-raise ServiceUnavailable("Database error")
+except IntegrityError as e:
+self.add_error(ServiceReturnedUnexpectedResult("Integrity Error"), e)
+except DatabaseError as e:
+self.add_error(ServiceUnavailable("Database error"), e)
+
+
+class MigrationsBackend(NautobotHealthCheckBackend):
+"""Check whether all migrations have been applied."""
+
+metric = Gauge(
+"health_check_migrations_info",
+"State of migrations backend",
+multiprocess_mode=NautobotHealthCheckBackend.MULTIPROCESS_MODE,
+)
+
+def check_status(self):
+db_alias = getattr(settings, "HEALTHCHECK_MIGRATIONS_DB", DEFAULT_DB_ALIAS)
+try:
+executor = MigrationExecutor(connections[db_alias])
+plan = executor.migration_plan(executor.loader.graph.leaf_nodes())
+if plan:
+self.add_error(ServiceUnavailable("There are migrations not yet applied"))
+except DatabaseError as e:
+self.add_error(ServiceUnavailable("Database is not ready"), e)
+except Exception as e:
+self.add_error(ServiceUnavailable("Unknown error"), e)


class RedisHealthCheck(NautobotHealthCheckBackend):
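
The behavioural change in `health_checks.py` is that failures are recorded with `self.add_error(...)` instead of being raised, and that a non-empty migration plan counts as a failure. The stand-in sketch below illustrates that pattern in plain Python only. `SketchBackend`, `SketchMigrationsBackend`, the local `ServiceUnavailable` class, and the `known`/`applied` arguments are all hypothetical; they do not reproduce the django-health-check or Django migration APIs used in the diff.

```python
class ServiceUnavailable(Exception):
    """Hypothetical stand-in for the exception class imported in the diff."""


class SketchBackend:
    """Hypothetical minimal backend that collects errors instead of raising."""

    def __init__(self):
        self.errors = []

    def add_error(self, error, cause=None):
        # Record the failure so the caller can report every problem found.
        self.errors.append((error, cause))

    def check_status(self):
        raise NotImplementedError

    def run_check(self):
        """Run the check and return True if no errors were recorded."""
        self.errors = []
        self.check_status()
        return not self.errors


class SketchMigrationsBackend(SketchBackend):
    """Fails when any known migration has not been applied."""

    def __init__(self, known, applied):
        super().__init__()
        self.known = known      # migrations defined in code (assumed input)
        self.applied = applied  # migrations recorded as applied (assumed input)

    def check_status(self):
        try:
            # Simplified "plan": everything known that is not yet applied.
            plan = [m for m in self.known if m not in self.applied]
            if plan:
                self.add_error(ServiceUnavailable("There are migrations not yet applied"))
        except Exception as e:
            self.add_error(ServiceUnavailable("Unknown error"), e)
```

For example, `SketchMigrationsBackend([("extras", "0001"), ("extras", "0002")], {("extras", "0001")}).run_check()` reports a failure because one migration is pending, while passing both migrations as applied reports success.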