Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

connect: use stronger validation that ingress gateways have compatible protocols defined for their upstreams #8470

Merged
merged 7 commits into from
Aug 12, 2020

Conversation

rboyer
Copy link
Member

@rboyer rboyer commented Aug 10, 2020

Fixes #8466

Since Consul 1.8.0 there was a bug in how ingress gateway protocol
compatibility was enforced. At the point in time that an ingress-gateway
config entry was modified the discovery chain for each upstream was
checked to ensure the ingress gateway protocol matched. Unfortunately
future modifications of other config entries were not validated against
existing ingress-gateway definitions, such as:

  1. create tcp ingress-gateway pointing to 'api' (ok)
  2. create service-defaults for 'api' setting protocol=http (worked, but not ok)
  3. create service-splitter or service-router for 'api' (worked, but caused an agent panic)

If you were to do these in a different order, it would fail without a
crash:

  1. create service-defaults for 'api' setting protocol=http (ok)
  2. create service-splitter or service-router for 'api' (ok)
  3. create tcp ingress-gateway pointing to 'api' (fail with message about
    protocol mismatch)

This PR introduces the missing validation. The two new behaviors are:

  1. create tcp ingress-gateway pointing to 'api' (ok)
  2. (NEW) create service-defaults for 'api' setting protocol=http ("ok" for back compat)
  3. (NEW) create service-splitter or service-router for 'api' (fail with
    message about protocol mismatch)

In consideration for any existing users that may be inadvertently be
falling into item (2) above, that is now officiall a valid configuration
to be in. For anyone falling into item (3) above while you cannot use
the API to manufacture that scenario anymore, anyone that has old (now
bad) data will still be able to have the agent use them just enough to
generate a new agent/proxycfg error message rather than a panic.
Unfortunately we just don't have enough information to properly fix the
config entries.

…e protocols defined for their upstreams

Fixes #8466

Since Consul 1.8.0 there was a bug in how ingress gateway protocol
compatibility was enforced. At the point in time that an ingress-gateway
config entry was modified the discovery chain for each upstream was
checked to ensure the ingress gateway protocol matched. Unfortunately
future modifications of other config entries were not validated against
existing ingress-gateway definitions, such as:

1. create tcp ingress-gateway pointing to 'api' (ok)
2. create service-defaults for 'api' setting protocol=http (worked, but not ok)
3. create service-splitter or service-router for 'api' (worked, but caused an agent panic)

If you were to do these in a different order, it would fail without a
crash:

1. create service-defaults for 'api' setting protocol=http (ok)
2. create service-splitter or service-router for 'api' (ok)
3. create tcp ingress-gateway pointing to 'api' (fail with message about
   protocol mismatch)

This PR introduces the missing validation. The two new behaviors are:

1. create tcp ingress-gateway pointing to 'api' (ok)
2. (NEW) create service-defaults for 'api' setting protocol=http ("ok" for back compat)
3. (NEW) create service-splitter or service-router for 'api' (fail with
   message about protocol mismatch)

In consideration for any existing users that may be inadvertently be
falling into item (2) above, that is now officiall a valid configuration
to be in. For anyone falling into item (3) above while you cannot use
the API to manufacture that scenario anymore, anyone that has old (now
bad) data will still be able to have the agent use them just enough to
generate a new agent/proxycfg error message rather than a panic.
Unfortunately we just don't have enough information to properly fix the
config entries.
@rboyer rboyer requested review from crhino and a team August 10, 2020 18:18
@rboyer rboyer self-assigned this Aug 10, 2020
@rboyer
Copy link
Member Author

rboyer commented Aug 10, 2020

Note I also flipped any panic instances in the agent/xds package to just return errors now. Only one of these mattered for #8466, but I figured it made sense to just fix them all.

} else if startNode.Type != structs.DiscoveryGraphNodeTypeResolver {
panic(fmt.Sprintf("unexpected first node in discovery chain using protocol=%q: %s", cfg.Protocol, startNode.Type))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the main place the linked issue was getting its panic from.

@@ -822,48 +918,6 @@ func configEntryWithOverridesTxn(
return configEntryTxn(tx, ws, kind, name, entMeta)
}

func validateProposedIngressProtocolsInServiceGraph(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer necessary as the general graph validation handles ingress now.

@crhino
Copy link
Contributor

crhino commented Aug 11, 2020

💯 on updating all of the other places that also panic in the xds package!

Testing this out and looking through the code everything seems nice to me, I just had one comment about the potential of allowing IG http -> svc tcp targets.

@rboyer rboyer requested a review from crhino August 11, 2020 20:36
Copy link
Contributor

@crhino crhino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@rboyer rboyer merged commit e3cd4a8 into master Aug 12, 2020
@rboyer rboyer deleted the ingress-config-fix branch August 12, 2020 16:19
@hashicorp-ci
Copy link
Contributor

🍒❌ Cherry pick of commit e3cd4a8 onto release/1.8.x failed! Build Log

rboyer added a commit that referenced this pull request Aug 12, 2020
…ys have compatible protocols defined for their upstreams

Backport of #8470 to 1.8.x
rboyer added a commit that referenced this pull request Aug 13, 2020
…ys have compatible protocols defined for their upstreams (#8494)

Backport of #8470 to 1.8.x
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consul crashed with error "panic: unexpected first node in discovery chain using protocol"
3 participants