Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[kong] revamp database wait init containers #285

Merged
merged 10 commits into from
Feb 22, 2021
Merged

Conversation

rainest
Copy link
Contributor

@rainest rainest commented Feb 19, 2021

What this PR does / why we need it:

The main rationale for these changes is similar to the proposal to remove these waits entirely. Based on discussion in that PR, we favor this approach instead.

  • Allows users to disable database waits for compatibility with service meshes.
  • Uses the Kong image for wait-for-postgres init containers by default. This is fine for our stock images, as they include bash. The wait image configuration is now optional, for custom Kong images that don't include bash.
  • Disables the wait-for-postgres ConfigMap and mounts if it isn't actually in use.
  • Re-enables mesh proxies on Jobs.
  • Cleans up some DB documentation.
  • Adds a missing labels block to the init migrations job.

Special notes for your reviewer:

Supersedes #283 (similar goal, different approach) and #278 (cherry-picks that commit and rewords the title to follow this repo's format).

Checklist

[Place an '[x]' (no spaces) in all applicable fields. Please remove unrelated fields.]

  • PR is based off the current tip of the next branch and targets next, not main
  • New or modified sections of values.yaml are documented in the README.md
  • Title of the PR and commit headers start with chart name (e.g. [kong])

rainest and others added 6 commits February 17, 2021 15:21
Add a new waitForDatabase value, default false. When disabled,
wait-for-db and wait-for-postgres init containers do not run.

When using the Postgres subchart, Kong Pods will crash loop for a while
before Postgres comes online, but should still eventually enter Running.

This change provides compatibility with clusters that use a mesh proxy
to authenticate requests to the database.
Default to using the Kong image for migration job database check init
containers. The standard Kong images provide bash and can run our
bash-based check.
When we added jobAnnotations, the initial migrations job didn't have an
existing annotations block, and we forgot to add it. Annotations were
unintentionally placed in the labels block.
Prune some outdated Cassandra information. We haven't supported a
Cassandra sub-chart in quite some time (0.26.0 in Nov 2019) and have no
plans to re-add it. I don't see any reason to call out this old
functionality in the README still.

Break out the Postgres sub-chart info into a new subsection and flesh it
out with some additional info about what it is, how to enable it, and
what it does. Add our recommendation not to use it outside temporary
environments.
Kuma supports Jobs since 1.0.7, therefore no need to disable the sidecar
injection for it.

Signed-off-by: Nikolay Nikolaev <nikolay.nikolaev@konghq.com>
Flip the database wait toggle to enable it by default. We want to
preserve the existing init wait behavior for most users, while still
providing a viable configuration for mesh users. As this default may not
work for them, add an FAQ describing the issue, remediation, and side
effects.
charts/kong/values.yaml Outdated Show resolved Hide resolved
charts/kong/FAQs.md Outdated Show resolved Hide resolved
charts/kong/FAQs.md Outdated Show resolved Hide resolved
@@ -365,10 +365,12 @@ The name of the service used for the ingress controller's validation webhook
emptyDir: {}
- name: {{ template "kong.fullname" . }}-tmp
emptyDir: {}
{{- if (and (eq .Values.env.database "postgres") .Values.waitImage.enabled) }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic, as I understand it, is about "database being present", not about "database being of type postgres". Therefore, != off rather than == "postgres" feels more appropriate to me. Please change the logic to the former unless there's a compelling reason not to do so.

(applies to other similar checks "if database is postgres" in this PR as well)

Copy link
Contributor Author

@rainest rainest Feb 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a reason it's like this, though the check is a bit off. These waits, for the initial migrations, are only useful if Postgres being online or not is influenced by the chart lifecycle. This is only true if you use the sub-chart and Postgres spawns alongside the migration Jobs.

If you manage Postgres or Cassandra independently (all Cassandra instances are managed independently--we have no sub-chart for it), the chart lifecycle has no influence on whether the database is available or not. It either is, and migrations can start immediately, or it isn't, and they cannot. No amount of waiting helps if an independently-managed database isn't online; it makes sense to just let the failure happen immediately so users know they need to take action to make the database reachable. e5012be is more correct for these, since it changes the check from "any Postgres" to "only sub-chart Postgres".

The Deployment waits, however, are to see if migrations have completed. Those apply to any database and are always determined by the chart lifecycle. I did, however, copy the wrong logic when adding the waitImage.enabled for them 🤦 e4b534e fixes that.

Comment on lines 764 to 773
{{- if .Values.waitImage.unifiedRepoTag }}
image: "{{ .Values.waitImage.unifiedRepoTag }}"
{{- else }}
{{- else if .Values.waitImage.repository }}
image: "{{ .Values.waitImage.repository }}:{{ .Values.waitImage.tag }}"
{{- else }} {{/* default to the Kong image */}}
{{- if .Values.image.unifiedRepoTag }}
image: "{{ .Values.image.unifiedRepoTag }}"
{{- else }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
{{- end }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This if ladder seems to be adding some duplicated and rather complex logic. Could this be avoided by something similar to the below:

  • implementing a helper template (called e.g. getRepoTag) that takes an "image" object as scope, and returns either .unifiedRepoTag or a {{.repository}}:{{.tag}}
  • using the default function, something like this: image: {{ default (include "getRepoTag" .Values.Image) (include "getRepoTag" .Values.Image) }}

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good idea. This gets repeated in several locations and doesn't need to be. The tests for it were also incorrect. 3f64601

rainest and others added 3 commits February 19, 2021 10:02
Check .Values.postgresql.enabled instead of .Values.env.database =
"postgres" for migration job waits and associated volume mounts. These
checks will now run only if the sub-chart is enabled, not for any
release that uses Postgres.

When Postgres is spawned by the sub-chart, it starts alongside
migration jobs and needs some time to come online, so migrations cannot
start immediately.

If Postgres is not managed by this chart, it is either online or not
when migrations happen, and no amount of waiting will allow it to come
online. If it is not online already, the Postgres administrator will
need to take manual action outside the chart lifecycle in order for
migrations to proceed.
Co-authored-by: Michał Flendrich <michal@flendrich.pro>
Replace per-template logic for selecting either repository+tag image or
unifiedRepoTag with a helper.

Fix CI values that used an outdated "image.single" value and were in
fact using defaults.
Copy link
Member

@mflendrich mflendrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Before merging, please fix the error in the PR description (labels -> annotations).

@nickolaev
Copy link

Thank you to everyone involved in resolving this, seemingly small, yet blocking issue.

@shaneutt shaneutt requested a review from a team February 22, 2021 13:09
@hbagdi
Copy link
Member

hbagdi commented Feb 22, 2021

@rainest Thanks for working through this multiple times!
Looks good from a user experience perspective.

@rainest rainest merged commit 966b9f6 into next Feb 22, 2021
@rainest rainest deleted the feat/wait-for-revamp branch February 22, 2021 18:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants