*: clean up AWS ELBs #242

Merged
merged 3 commits into from
Sep 21, 2018

Conversation

steveej
Contributor

@steveej steveej commented Sep 12, 2018

/hold

@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Sep 12, 2018
@steveej
Contributor Author

steveej commented Sep 12, 2018

/test e2e-aws-smoke

@@ -14,7 +14,8 @@ import (
)

const (
- caPath = "generated/tls/root-ca.crt"
+ caPath = "generated/tls/root-ca.crt"
+ tncPort uint = 49500
Contributor

Why specify the numeral type?
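For context on the question above, here is a minimal Go sketch of how a typed and an untyped constant behave differently. The names `untypedPort`, `typedPort`, and `addr` are hypothetical and not part of this PR; only the value 49500 comes from the diff.

```go
package main

import "fmt"

// Untyped constant: it takes whatever integer type the context requires.
const untypedPort = 49500

// Typed constant: always a uint, so using it as an int needs a conversion.
const typedPort uint = 49500

// addr formats a listen address from an int port (hypothetical helper).
func addr(port int) string {
	return fmt.Sprintf(":%d", port)
}

func main() {
	var asUint uint = untypedPort   // accepted without a conversion
	var asInt32 int32 = untypedPort // also accepted without a conversion
	fmt.Println(asUint, asInt32)

	fmt.Println(addr(untypedPort))    // untyped constant adapts to int
	fmt.Println(addr(int(typedPort))) // typed constant must be converted
}
```

Under these assumptions, leaving the constant untyped keeps call sites free of conversions, which is probably the motivation behind the question.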

@crawford
Contributor

/retest

@crawford
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed lgtm Indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Sep 13, 2018
@@ -14,7 +14,8 @@ import (
)

const (
- caPath = "generated/tls/root-ca.crt"
+ caPath = "generated/tls/root-ca.crt"
+ tncPort = 49500
Contributor

rename to ignServerPort

Contributor Author

That's surprising. Is ignition what's effectively listening on this port?

Contributor Author

Also if we make this change, is there any point keeping the acronym TNC at all?

Contributor

Yes, we are running the Ignition server on this port; we can drop the acronym TNC.

Contributor

I'd use mcsPort, since that is the actual service.

Contributor Author

Can we agree on doing this in another PR?

Contributor

I think it makes sense to change it here... since we are removing *-tnc from url.

@@ -117,15 +117,9 @@ resource "aws_instance" "master" {
), var.extra_tags)}"
}

- resource "aws_elb_attachment" "masters_tnc" {
+ resource "aws_elb_attachment" "masters_ctrlp" {
Contributor

@abhinavdahiya abhinavdahiya Sep 14, 2018

Just masters_internal or ctrlp_internal? We have lost the information that all of these are internal LBs.

Contributor Author

ctrlp_internal, between those. Long-term it might be possible to consolidate internal and external.

Contributor

long-term, @crawford wants to drop external lbs... :P

Contributor

Let's do ctrlp_internal.

@@ -126,18 +127,11 @@ func (c *ConfigGenerator) embedUserBlock(ignCfg *ignconfigtypes.Config) {
func (c *ConfigGenerator) getTNCURL(role string, query string) string {
Contributor

No longer TNC; getMCSURL?

Member

I'm in favor of writing out anything project-specific. Folks likely have seen URL before, which would make this getMachineConfigServerURL.

Contributor Author

I'd like to keep the rename out of scope here; see #242 (comment).

(I know I'm repeating myself here :-D)

@steveej
Contributor Author

steveej commented Sep 17, 2018

Here's something I noticed in the logs.

module.masters.aws_instance.master[1]: Creation complete after 33s (ID: i-02069d73e7c8e58a4)
aws_route53_record.tectonic_ctrlp_a: Creating...
  alias.#:                                 "" => "1"
  alias.2249508764.evaluate_target_health: "" => "true"
  alias.2249508764.name:                   "" => "ci-op-w1zlmf5c-5849d-ctrlp"
  alias.2249508764.zone_id:                "" => "Z35SXDOTRQ7X7K"
  fqdn:                                    "" => "<computed>"
  name:                                    "" => "ci-op-w1zlmf5c-5849d-ctrlp.origin-ci-int-aws.dev.rhcloud.com"
  type:                                    "" => "A"
  zone_id:                                 "" => "ZMT2Y1F79ZZSM"
module.dns.aws_route53_record.tectonic_api_external: Creating...
  alias.#:                                 "" => "1"
  alias.2552743047.evaluate_target_health: "" => "true"
  alias.2552743047.name:                   "" => "ci-op-w1zlmf5c-5849d-ext-1686418545.us-east-1.elb.amazonaws.com"
  alias.2552743047.zone_id:                "" => "Z35SXDOTRQ7X7K"
  fqdn:                                    "" => "<computed>"
  name:                                    "" => "ci-op-w1zlmf5c-5849d-api.origin-ci-int-aws.dev.rhcloud.com"
  type:                                    "" => "A"
  zone_id:                                 "" => "Z328TAU66YDJH7"

The alias name for the ctrlp_a record is just a hostname, whereas the alias name for the tectonic_api_external record is an FQDN. This might explain why the ctrlp_a alias doesn't belong to the correct zone: it might not have one. This needs more investigation.
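The observation about the two alias names can be checked mechanically. The following is a minimal Go sketch, assuming that "just a hostname" means a name with no dot in it; `isBareHostname` is a hypothetical helper and not part of the installer. The two sample names are copied from the log above.

```go
package main

import (
	"fmt"
	"strings"
)

// isBareHostname reports whether name has no domain part, that is, it
// contains no dot once an optional trailing root dot is stripped.
// Hypothetical helper for illustration only.
func isBareHostname(name string) bool {
	return !strings.Contains(strings.TrimSuffix(name, "."), ".")
}

func main() {
	names := []string{
		"ci-op-w1zlmf5c-5849d-ctrlp", // alias name of the ctrlp_a record
		"ci-op-w1zlmf5c-5849d-ext-1686418545.us-east-1.elb.amazonaws.com", // alias name of the api_external record
	}
	for _, n := range names {
		fmt.Printf("bare=%v %s\n", isBareHostname(n), n)
	}
}
```

A check like this only flags the symptom; whether the bare name is actually the cause of the zone mismatch is still the open question in the comment.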

@steveej steveej changed the title [WIP] aws: consolidate ELBs for TNC and API aws: consolidate ELBs for TNC and API Sep 19, 2018
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 19, 2018
@steveej
Contributor Author

steveej commented Sep 19, 2018

/hold cancel

I think the TNC -> MCS renaming should go to a different PR.

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 19, 2018
@@ -57,6 +31,14 @@ resource "aws_elb" "api_internal" {
interval = 5
}

# health_check {
Contributor

Why are these disabled?

Contributor Author

An ELB can only have one health_check. We'll need to come up with a different solution for these.

Member

> An ELB can only have one health_check.

So drop the commented-out code and file an issue? Do we need to continue to health-check 49500 independently?

Contributor

Can you add this as a comment above?

Contributor Author

Do we like TODOs in the code? If not I'll opt for removing the code altogether

Contributor

49500 doesn't need a health check because Ignition uses a random resolver. If the endpoint is down, Ignition will try another IP address.

Contributor Author

what if all are down?

Contributor

If they are all down, Ignition keeps retrying.
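The client-side behavior described in this thread (try another address when one is down, and keep retrying when all are down) can be sketched in Go as follows. This is an illustration under stated assumptions and not Ignition's actual code: `fetchFromAny`, `errDown`, the sample addresses, and the bounded `maxRounds` are all hypothetical, and a real client would likely sleep between rounds and might retry without a limit.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// errDown is a placeholder error for an unreachable endpoint.
var errDown = errors.New("endpoint down")

// fetchFromAny tries every address in random order and returns the first one
// for which try succeeds. It makes up to maxRounds passes over the list.
func fetchFromAny(addrs []string, maxRounds int, try func(addr string) error) (string, error) {
	if len(addrs) == 0 {
		return "", errors.New("no addresses to try")
	}
	if maxRounds < 1 {
		maxRounds = 1
	}
	var lastErr error
	for round := 0; round < maxRounds; round++ {
		// Shuffle a copy so the caller's slice is left untouched.
		order := append([]string(nil), addrs...)
		rand.Shuffle(len(order), func(i, j int) { order[i], order[j] = order[j], order[i] })
		for _, a := range order {
			if err := try(a); err != nil {
				lastErr = err
				continue // this address is down, move on to the next one
			}
			return a, nil
		}
	}
	return "", fmt.Errorf("all addresses failed after %d rounds: %w", maxRounds, lastErr)
}

func main() {
	addrs := []string{"10.0.0.1:49500", "10.0.0.2:49500", "10.0.0.3:49500"}
	healthy := "10.0.0.2:49500"
	got, err := fetchFromAny(addrs, 3, func(a string) error {
		if a != healthy {
			return errDown
		}
		return nil
	})
	fmt.Println(got, err)
}
```

With a client that behaves like this, an unhealthy backend costs one failed attempt rather than an outage, which is the argument made above for not health-checking port 49500 on the load balancer.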

Member

@wking wking left a comment

Looks good to me :). I left a few minor suggestions inline.

}

- output "aws_api_external_dns_name" {
+ output "aws_elb_api_external_dns_name" {
Member

nit: I don't know if we want to handle this incrementally or not, but aws_ is redundant for output variables in this AWS-only module. Also, I don't know if consumers will care if this is an elastic load balancer or not. So personally I'd prefer api_external_dns_name here.

But it's also nice to keep this diff small and the variables consistent, and aws_elb_api_external_... matches the pattern set by aws_elb_api_external_zone_id below.

Contributor Author

> But it's also nice to keep this diff small and the variables consistent, and aws_elb_api_external_... matches the pattern set by aws_elb_api_external_zone_id below.

That was my intention. Now it's consistent within the current scheme, which can be refactored in a follow-up to this change.

@@ -68,6 +28,16 @@ resource "aws_security_group_rule" "api_ingress_console" {
to_port = 6443
}

resource "aws_security_group_rule" "tnc_ingress" {
Member

tnc -> mcs rebranding? Or ignition_ingress? Or something.

Contributor Author

I'd like to keep the rename out of scope here; see #242 (comment).

Contributor

You can put it in a different commit, but it belongs in this PR.

Contributor Author

I think the rename is not related to the logic of this PR, which is to consolidate ELBs and not rename stuff. I added the commit anyway so y'all are happy ;-)

@openshift-bot openshift-bot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Sep 19, 2018
@steveej steveej changed the title aws: consolidate ELBs for TNC and API aws/ELBs: merge tnc with api_internal and cleanup; *: subst tnc/TNC with mcs/MCS Sep 20, 2018
@steveej
Contributor Author

steveej commented Sep 20, 2018

/retest

@steveej
Contributor Author

steveej commented Sep 20, 2018

/retest

@@ -17,7 +17,7 @@ import (
const (
generatedPath = "generated"
kcoConfigFileName = "kco-config.yaml"
- tncoConfigFileName = "tnco-config.yaml"
+ mcsoConfigFileName = "mcso-config.yaml"
Contributor

Drop this line; there is no mcso-config.yaml. Missed this in #232.

@@ -200,7 +200,7 @@ func (a *bootstrap) addBootstrapConfigFiles(config *ignition.Config, dependencie
// TODO (staebler) - missing the following from assets step
// /opt/tectonic/manifests/cluster-config.yaml
// /opt/tectonic/tectonic/cluster-config.yaml
- // /opt/tectonic/tnco-config.yaml
+ // /opt/tectonic/mcso-config.yaml
Contributor

Drop this line; there is no mcso-config.yaml. Missed this in #232.

@@ -35,7 +35,7 @@ var (
defaultIgnoredManifests = []string{
"bootstrap",
"kco-config.yaml",
- "tnco-config.yaml",
+ "mcso-config.yaml",
Contributor

Drop this line; there is no mcso-config.yaml. Missed this in #232.

@sallyom
Contributor

sallyom commented Sep 20, 2018

/test e2e-aws

@crawford crawford changed the title aws/ELBs: merge tnc with api_internal and cleanup; *: subst tnc/TNC with mcs/MCS *: clean up AWS ELBs Sep 20, 2018
@abhinavdahiya
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 20, 2018
@abhinavdahiya
Contributor

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 20, 2018
@crawford
Contributor

@steveej can you fix the libvirt networking as well? I think it's just a matter of dropping the TNC records. Also, did you not want to name this endpoint "control plane"?

@steveej
Contributor Author

steveej commented Sep 21, 2018

@crawford

> can you fix the libvirt networking as well?

Do you mean cleanup or is it actually broken?

> Also, did you not want to name this endpoint "control plane"?

I'm not too pedantic about the naming here, as we're still inconsistent overall. For further cleanups I'm in favor of keeping only the names public/private, as they realistically model the zones we're using. Thus, we could probably rename {external,internal} -> {public,private} and merge the console ELB into the public/private ELBs accordingly.

/retest unit

Remove the TNC ELB and move all associated resources to the
api_internal ELB. This also allows cleaning up the DNS A record and the
security group for the TNC. It also changes the FQDN for the TNC, which
is now the same as for the API, though it remains exclusive to the
internal zone.

Configure the ELB to listen on the TNC port directly.
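To illustrate the effect of the commit message above on clients, here is a minimal Go sketch that builds the endpoint from the API hostname plus the port from the diff. It assumes the `{name}-api.{domain}` record pattern seen in the CI log; `mcsEndpoint`, `mcsPort`, and the sample cluster name and domain are hypothetical and not taken from the installer.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// mcsPort is the port value from this PR's diff (there named tncPort).
const mcsPort = 49500

// mcsEndpoint returns host:port for the consolidated endpoint, assuming the
// "{name}-api.{domain}" record pattern. Hypothetical helper.
func mcsEndpoint(clusterName, baseDomain string) string {
	host := clusterName + "-api." + baseDomain
	return net.JoinHostPort(host, strconv.Itoa(mcsPort))
}

func main() {
	fmt.Println(mcsEndpoint("demo", "example.com")) // prints "demo-api.example.com:49500"
}
```

The point of the sketch is that no separate `-tnc` hostname is needed any more: the same name serves both purposes and only the port differs.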
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Sep 21, 2018
@steveej
Contributor Author

steveej commented Sep 21, 2018

I re-did the renaming for the content that slipped in via the rebase.

@abhinavdahiya since you've put this on hold, is there anything else missing IYO?

@crawford
Contributor

Worked for me with libvirt.

/retest
/hold cancel
/lgtm

@openshift-ci-robot openshift-ci-robot added lgtm Indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Sep 21, 2018
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya, crawford, steveeJ

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [abhinavdahiya,crawford,steveeJ]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit e03d43f into openshift:master Sep 21, 2018
@steveej steveej deleted the aws-consolidate-tnc-api-elb branch September 21, 2018 19:07
wking added a commit to wking/openshift-installer that referenced this pull request Nov 4, 2018
We used to assign both {name}-api and {name}-tnc to the bootstrap and
master nodes.  But we dropped {name}-tnc in 239373f (aws/ELBs: merge
tnc with api_internal and cleanup, 2018-09-19, openshift#242).  Now that it's
just the one entry, the local.hostnames indirection is an unnecessary
complication.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

8 participants