Conversation
Can one of the admins verify this patch?
- /bootstrap
- --config=/etc/cluster-config
- --port=45900
- --cert=/assets/tls/ca.crt
@enxebre where can I mount the CA assets from?
CA certs are dropped into https://github.com/coreos/tectonic-installer/blob/master/modules/ignition/ca_certs.tf#L19
Other kube certs are temporarily dropped into /opt/tectonic/tls
This probably relates to https://github.com/coreos/tectonic-installer/pull/2972/files
@thorfour I rebased this branch onto master and got this to a semi-working state in this commit enxebre@d6ff12c, which addresses the comments here and adds/removes some missing parts.
Feel free to cherry-pick it. Things TBD:
- Add the right permissions to the docker image so it can be downloaded at tectonic bootstrap time.
- Then verify the ignition config provided by the TNC satisfies the nodes, e.g. address https://github.com/coreos-inc/tectonic-operators/pull/286/files#r171519139
- Add support for REST-like TNC paths https://github.com/thorfour/tectonic-operators/blob/templates/controller/node/pkg/ignition/server.go#L81
- For now let's get this working over http. The first node bootstrapped gets its config through a CNAME pointing to S3, which does not support https (see INST-944); moreover, the TNC will need a proper certificate trusted by ignition. Once we address the points above we should move to https.
config.tf
Outdated
@@ -71,6 +71,7 @@ variable "tectonic_container_images" {
  awscli = "quay.io/coreos/awscli:025a357f05242fdad6a81e8a6b520098aa65a600"
  gcloudsdk = "google/cloud-sdk:178.0.0-alpine"
  bootkube = "quay.io/coreos/bootkube:v0.10.0"
  tnc_bootstrap = "quay.io/coreos/tectonic-node-controller-dev:fad3a8e284e2c414fdf1713c7e0ae9d1e1e487ba"
I can't download this image with my tectonic license
That's weird, since I added your user with read access on quay
modules/aws/master-asg/master.tf
Outdated
@@ -61,7 +61,7 @@ resource "aws_autoscaling_group" "masters" {

data "ignition_config" "ncg_master" {
  append {
-   source = "http://${var.cluster_name}-ncg.${var.base_domain}/ignition?profile=master"
+   source = "http://${var.cluster_name}-ncg.${var.base_domain}/ign/v1/role/master"
The docker image seems to be built from a branch which does not support REST-like paths, only querystrings: https://github.com/thorfour/tectonic-operators/blob/templates/controller/node/pkg/ignition/server.go#L81
Also, I tried to modify it and run `bazel build //controller/node/cmd/bootstrap`,
but got: `running external/local_config_cc/cc_wrapper.sh failed: exit status 1 ld: library not found for -lcrt0.o`
Yeah, the image was old. I pushed a new dev image that uses the REST path.
modules/aws/worker-asg/worker.tf
Outdated
@@ -27,7 +27,7 @@ data "aws_ami" "coreos_ami" {

data "ignition_config" "ncg_worker" {
  append {
-   source = "http://${var.cluster_name}-ncg.${var.base_domain}/ignition?profile=worker"
+   source = "http://${var.cluster_name}-ncg.${var.base_domain}/ign/v1/role/worker"
The docker image seems to be built from a branch which does not support REST-like paths, only querystrings: https://github.com/thorfour/tectonic-operators/blob/templates/controller/node/pkg/ignition/server.go#L81
Fixed in image: cb561ffb2b7c747a037231fab07a64fe6ebc7322
@@ -13,3 +13,24 @@ data:
  networkProfile: ${tectonic_networking}
  calicoConfig:
    mtu: ${calico_mtu}
  tnc-config: |
cluster-config.yaml is now generated at runtime by the cli, so this template is ignored; there's already a PR to delete it.
To get this config through the cluster-config we'll need to modify
https://github.com/coreos/tectonic-installer/blob/master/installer/pkg/config-generator/generator.go#L66
and
https://github.com/coreos/tectonic-config
kind: DaemonSet
metadata:
  name: tectonic-node-controller
  namespace: tectonic-system
this lives in tectonic-system but the config lives in kube-system, so it's not accessible to the pod
containers:
- name: tectonic-node-controller
  image: ${tnc_bootstrap_image}
  command:
this does not match the current image; docker inspect shows:
"Entrypoint": [
  "/app/controller/node/cmd/bootstrap/bootstrap"
],
So `command` should be removed and `args` added instead:
args:
- --config=/etc/cluster-config/tnc-config
- --port=45900
- --cert=/opt/tectonic/tls/ca.crt
- --key=/opt/tectonic/tls/ca.key
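Assuming the bootstrap binary parses these flags with the standard `flag` package (a guess; the actual implementation lives under controller/node/cmd/bootstrap), the argument shape above corresponds to something like:

```go
package main

import (
	"flag"
	"fmt"
)

// bootstrapFlags mirrors the args passed to the TNC bootstrap
// container above. This is a hypothetical sketch of the flag surface,
// not the actual TNC bootstrap source.
type bootstrapFlags struct {
	Config string
	Port   int
	Cert   string
	Key    string
}

func parseBootstrapFlags(args []string) (*bootstrapFlags, error) {
	var f bootstrapFlags
	fs := flag.NewFlagSet("bootstrap", flag.ContinueOnError)
	fs.StringVar(&f.Config, "config", "", "path to the tnc-config file")
	fs.IntVar(&f.Port, "port", 45900, "port the ignition server listens on")
	fs.StringVar(&f.Cert, "cert", "", "TLS certificate path")
	fs.StringVar(&f.Key, "key", "", "TLS key path")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return &f, nil
}

func main() {
	f, err := parseBootstrapFlags([]string{
		"--config=/etc/cluster-config/tnc-config",
		"--port=45900",
		"--cert=/opt/tectonic/tls/ca.crt",
		"--key=/opt/tectonic/tls/ca.key",
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", *f)
}
```

Whatever the real flag set is, the point stands: with the image's entrypoint already pointing at the binary, the manifest should pass these as `args`, not override `command`.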
@enxebre latest dev image
Current state of affairs: bootstrapping is stuck with …
Hey @thorfour, just need to set the s3 key for the bootstrap node back to the REST-like url: enxebre@c331064
🎉 It bootstraps a cluster! (kind of) tectonic comes up just fine, but I ended up with a single node in the cluster
I'm wondering if that has to do with the fact that I had to run the join step after tnc bootstrapping was already torn down?
Can one of the admins verify this patch?
Hey @thorfour. Currently when running … We know this PR already takes care of deploying the TNC successfully, so what needs to be done now is to ensure the TNC serves the right config so other nodes can join the cluster successfully.
This fixes the issue mentioned above, please cherry-pick it: enxebre@d7141f1
steps/assets/tectonic.tf
Outdated
https_proxy = "${var.tectonic_https_proxy_address}"
image_re = "${var.tectonic_image_re}"
kube_dns_service_ip = "${module.bootkube.kube_dns_service_ip}"
kubelet_node_label = "node-role.kubernetes.io/master"
this is setting the master label for both masters and workers
@enxebre all green. Let's do TLS and long-running switchover in a follow-up PR
Hey @thorfour, thanks a lot! This looks like a good start for fully transitioning to the TNC.
kubelet_image_url = "${replace(var.container_images["hyperkube"],var.image_re,"$1")}"
kubelet_image_tag = "${replace(var.container_images["hyperkube"],var.image_re,"$2")}"
iscsi_enabled = "${var.iscsi_enabled}"
kubeconfig_fetch_cmd = "${var.kubeconfig_fetch_cmd != "" ? "ExecStartPre=${var.kubeconfig_fetch_cmd}" : ""}"
wouldn't it be a bit less error-prone to have "ExecStartPre=" as part of the go template?
In 251147e (TNC bootstrapping, 2018-02-13, coreos/tectonic-installer#3053), the resource was renamed from 'ncg' to 'tnc', but the name property wasn't updated to match.