Bug 1866299: Fixes Route53 Client Handling of GovCloud Partition #454

Merged

@danehans
Contributor

danehans commented Sep 3, 2020

  • Prefers the AWS region from the infrastructure config over local AWS environment variables.
  • Fixes the Route53 client's GovCloud handling for both standard and custom endpoints.

/assign @Miciah @knobunc
/cc @frobware @sgreene570

@openshift-ci-robot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) on Sep 3, 2020
@openshift-ci-robot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) on Sep 3, 2020
@danehans changed the title from "WIP: Prefers AWS Region From Config Over Local Env" to "Bug 1866299: Fixes Route53 Client Handling of GovCloud Partition" on Sep 4, 2020
@openshift-ci-robot added the bugzilla/severity-high (referenced Bugzilla bug's severity is high for the branch this PR is targeting) and bugzilla/valid-bug (indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting) labels and removed the do-not-merge/work-in-progress label on Sep 4, 2020
@openshift-ci-robot
Contributor

@danehans: This pull request references Bugzilla bug 1866299, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.6.0) matches configured target release for branch (4.6.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

Bug 1866299: Fixes Route53 Client Handling of GovCloud Partition

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

case sess.Config.Region != nil:
	region = aws.StringValue(sess.Config.Region)
	log.Info("using region from shared config", "region name", region)
default:
Contributor Author

The region preference has been inverted to prefer infrastructure config over local AWS env vars to improve local development and troubleshooting.
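
As a rough illustration of the inverted preference, the sketch below picks the region from the infrastructure config first and only falls back to whatever the local AWS session resolved (environment variables or shared config). The function name `chooseRegion` and its plain string parameters are hypothetical simplifications; the operator's actual code works with the AWS SDK session shown in the diff above.

```go
package main

import "errors"

// chooseRegion is a hypothetical helper, not the operator's real function.
// It prefers the region from the cluster infrastructure config and falls
// back to the region resolved by the local AWS session.
func chooseRegion(infraRegion, sessionRegion string) (string, error) {
	switch {
	case infraRegion != "":
		// Infrastructure config wins, even if local env vars are set.
		return infraRegion, nil
	case sessionRegion != "":
		// Fall back to env vars or shared config.
		return sessionRegion, nil
	default:
		return "", errors.New("region is required")
	}
}
```

With this ordering, a developer whose shell exports a different region still talks to the region the cluster was installed in.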

@@ -451,6 +453,7 @@ func (m *Provider) updateRecord(domain, zoneID, target, targetHostedZoneID, acti
  ResourceRecordSet: &route53.ResourceRecordSet{
  	Name: aws.String(domain),
  	Type: aws.String(route53.RRTypeA),
+ 	TTL:  aws.Int64(ttl),
Contributor Author

@Miciah I added TTL to Alias records for consistency between the record types. Let me know if you prefer that TTL only be added to CNAME records.

Contributor

I'm seeing the following repeated in the ingress operator logs from the last e2e-aws-operator run:

2020-09-05T01:08:17.596Z	ERROR	operator.dns_controller	dns/controller.go:181	failed to publish DNS record to zone	{"record": {"dnsName":"*.apps.ci-op-5pzpx1c2-43abb.origin-ci-int-aws.dev.rhcloud.com.","targets":["aa404373c70e04efebd0275f95643892-1078231396.us-west-2.elb.amazonaws.com"],"recordType":"CNAME","recordTTL":30}, "dnszone": {"id":"Z2GYOLTZHS5VK"}, "error": "failed to update alias in zone Z2GYOLTZHS5VK: couldn't update DNS record in zone Z2GYOLTZHS5VK: InvalidInput: Invalid request: Expected exactly one of [AliasTarget, all of [TTL, and ResourceRecords], or TrafficPolicyInstanceId], but found more than one in Change with [Action=UPSERT, Name=*.apps.ci-op-5pzpx1c2-43abb.origin-ci-int-aws.dev.rhcloud.com., Type=A, SetIdentifier=null]\n\tstatus code: 400, request id: a8d0dc74-4e7d-492b-bbb3-2d55696052ec"}

Contributor Author

@Miciah thanks for catching this. It's similar to the error that required me to add TTL to the CNAME record. Let me push a new commit with the TTL removed for the Alias record.
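
The Route 53 error above says a change must carry exactly one of AliasTarget, TTL together with ResourceRecords, or TrafficPolicyInstanceId. A minimal sketch of that rule for the first two cases follows; `recordSet` and `validate` are hypothetical stand-ins rather than the AWS SDK types, and TrafficPolicyInstanceId is left out for brevity.

```go
package main

import "errors"

// recordSet is a simplified, hypothetical stand-in for a Route 53 record
// set. It is not the AWS SDK type.
type recordSet struct {
	Name        string
	Type        string
	TTL         *int64   // only valid together with Records
	Records     []string // for example, CNAME targets
	AliasTarget *string  // DNS name of the alias target, such as an ELB
}

// validate rejects the combination that triggered the InvalidInput error:
// an alias record that also sets TTL (or resource records).
func (r recordSet) validate() error {
	hasAlias := r.AliasTarget != nil
	hasBasic := r.TTL != nil || len(r.Records) > 0
	switch {
	case hasAlias && hasBasic:
		return errors.New("alias records must not set TTL or resource records")
	case hasAlias:
		return nil
	case r.TTL != nil && len(r.Records) > 0:
		return nil
	default:
		return errors.New("non-alias records need both TTL and resource records")
	}
}
```

Under this rule, the A/alias record is valid only without a TTL, while the CNAME record is valid only with one.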

@openshift-ci-robot
Contributor

@danehans: This pull request references Bugzilla bug 1866299, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.6.0) matches configured target release for branch (4.6.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

Bug 1866299: Fixes Route53 Client Handling of GovCloud Partition

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

1 similar comment

Comment on lines 489 to 496
case m.config.ServiceEndpoints != nil:
	for _, ep := range m.config.ServiceEndpoints {
		if ep.Name == Route53Service {
			if strings.Contains(ep.URL, govCloudRoute53Region) {
				return true
			}
		}
	}
Contributor

Could this be simplified like the following?

Suggested change
- case m.config.ServiceEndpoints != nil:
- 	for _, ep := range m.config.ServiceEndpoints {
- 		if ep.Name == Route53Service {
- 			if strings.Contains(ep.URL, govCloudRoute53Region) {
- 				return true
- 			}
- 		}
- 	}
+ case strings.Contains(m.route53.Endpoint, govCloudRoute53Region):
+ 	return true

Is it necessary to check both the partition and the endpoint?

What do you think about making this a function for simplicity and testability?

```go
func (m *Provider) updateRecord(domain, zoneID, target, targetHostedZoneID, action string, ttl int64) error {
	// ...
	if clientEndpointIsGovCloud(&m.route53.Client.ClientInfo) {
		// ...
	}
	// ...
}

// clientEndpointIsGovCloud returns true if the provided client info references
// a US GovCloud API endpoint.
func clientEndpointIsGovCloud(clientInfo *metadata.ClientInfo) bool {
	return strings.Contains(clientInfo.Endpoint, govCloudRoute53Region)
}
```
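
A helper like this is easy to exercise in isolation. The sketch below takes the endpoint as a plain string rather than `*metadata.ClientInfo`, and assumes `us-gov` as the marker value; neither the value of `govCloudRoute53Region` nor any endpoint URL is shown in this conversation, so treat both as assumptions.

```go
package main

import "strings"

// govCloudMarker is an assumed value for illustration only; the real
// govCloudRoute53Region constant is not shown in this conversation.
const govCloudMarker = "us-gov"

// endpointIsGovCloud reports whether the endpoint URL contains the
// GovCloud marker. It is a plain substring check, so it matches standard
// and custom endpoints alike as long as the marker appears in the URL.
func endpointIsGovCloud(endpoint string) bool {
	return strings.Contains(endpoint, govCloudMarker)
}
```

Because the check is a substring match, a custom endpoint that does not include the marker would not be detected; that trade-off is worth keeping in mind when choosing the constant.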

@danehans force-pushed the dns_region_pref_cfg branch 2 times, most recently from 8de5780 to fb5c6a2, on September 4, 2020 23:29
@Miciah
Contributor

Miciah commented Sep 4, 2020

/lgtm

@openshift-ci-robot added the lgtm label (indicates that a PR is ready to be merged) on Sep 4, 2020
@danehans
Contributor Author

danehans commented Sep 4, 2020

@Miciah commit fb5c6a2 includes your suggestions in #454 (comment).

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

5 similar comments

@Miciah
Contributor

Miciah commented Sep 5, 2020

/hold
Specifying TTL seems to be causing problems; see #454 (comment).

@openshift-ci-robot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) on Sep 5, 2020
@danehans
Contributor Author

danehans commented Sep 8, 2020

level=error msg="Cluster operator authentication Degraded is True with OAuthRouteCheckEndpointAccessibleController_SyncError: OAuthRouteCheckEndpointAccessibleControllerDegraded: Get \"https://oauth-openshift.apps.ci-op-hr97ggsg-43abb.origin-ci-int-aws.dev.rhcloud.com/healthz\": dial tcp: lookup oauth-openshift.apps.ci-op-hr97ggsg-43abb.origin-ci-int-aws.dev.rhcloud.com on 172.30.0.10:53: no such host"

/test e2e-aws-operator

@openshift-ci-robot removed the lgtm label (indicates that a PR is ready to be merged) on Sep 8, 2020
@danehans
Contributor Author

danehans commented Sep 8, 2020

Commit 83bedf4 removes TTL from Alias record updates for the AWS DNS provider.

@Miciah
Contributor

Miciah commented Sep 8, 2020

/lgtm

@openshift-ci-robot added the lgtm label (indicates that a PR is ready to be merged) on Sep 8, 2020
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danehans, Miciah

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@danehans
Contributor Author

danehans commented Sep 8, 2020

/hold cancel

@openshift-ci-robot removed the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) on Sep 8, 2020
@openshift-merge-robot merged commit 757d879 into openshift:master on Sep 8, 2020
@openshift-ci-robot
Contributor

@danehans: All pull requests linked via external trackers have merged:

Bugzilla bug 1866299 has been moved to the MODIFIED state.

In response to this:

Bug 1866299: Fixes Route53 Client Handling of GovCloud Partition

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • bugzilla/severity-high: Referenced Bugzilla bug's severity is high for the branch this PR is targeting.
  • bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
  • lgtm: Indicates that a PR is ready to be merged.

5 participants