This repository has been archived by the owner on Jul 23, 2019. It is now read-only.

Rebasing kni-installer on openshift/installer #36

Merged
112 commits merged into openshift-metal3:master from latest-upstream, Apr 4, 2019

Conversation

stbenjam (Member) commented Apr 3, 2019

No description provided.

squeed and others added 30 commits March 14, 2019 10:52
This has already been merged in the operator; just need to update
the installer's cache of the CRD.

Someday we can get rid of this, but not yet.
…rces"

Bring the docs up to speed after 05f7e0d (create cluster: change
Creating cluster to Creating infrastructure resources,
2019-03-14, #1417).
This PR adds back support for creating machines and machinesets
with trunk support enabled.
The description of the wildcard DNS entry and the example did not match.
In 200f0c9 (pkg/destroy/aws: Remove some lastError local-variable
masking, 2019-03-18, #1434), I removed some local lastError variables
that masked the function-level variable.  But Matthew points out that
we were still clobbering that function-level variable with the
loop-level value [1], so a successful loop iteration might silently
clear a previously-set lastError.  This commit goes through and uses
'err' consistently for the loop-level variable.  When we have an
error, we log any previous lastError value before clobbering that
value with the new error.  It's up to the caller to decide how they
want to handle any final lastError; they can log it or not as they see
fit.

I've demoted the lastError logging from Info to Debug, because the
destroy logic usually uses debug for errors (e.g. DependencyViolation
errors), and I don't see a point to trying to classify errors as
expected or unexpected.

[1]: openshift/installer#1434 (comment)
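
A minimal sketch of the error-tracking pattern described above, assuming a
logrus logger; the function names are illustrative stand-ins rather than the
installer's actual API:

```
package main

import (
	"errors"
	"fmt"

	"github.com/sirupsen/logrus"
)

// deleteAll keeps the most recent error in lastError without letting a
// later successful iteration clear it.
func deleteAll(ids []string, logger logrus.FieldLogger) error {
	var lastError error
	for _, id := range ids {
		// Loop-level 'err': a nil result here never clobbers lastError.
		if err := deleteOne(id); err != nil {
			if lastError != nil {
				// Log the previous error at debug level before replacing it.
				logger.Debug(lastError)
			}
			lastError = err
		}
	}
	// The caller decides how (or whether) to report the final lastError.
	return lastError
}

// deleteOne stands in for the per-resource deletion call.
func deleteOne(id string) error {
	if id == "bad" {
		return errors.New("DependencyViolation: " + id)
	}
	return nil
}

func main() {
	fmt.Println(deleteAll([]string{"bad", "ok", "bad"}, logrus.New()))
}
```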
Originally we installed the nss_wrapper package from epel-testing,
I think because it wasn't available in the epel repo (I'm not 100% sure).
We can now install from the stable epel repo, so we no longer need the
epel-testing repo.  That's good, because epel-testing is no longer
configured in the base image (the build was failing until I removed
it, and I realized we no longer needed it).
This adds some Terraform that can be used to create the infrastructure
for an OpenShift cluster on vSphere.

See upi/vsphere/README.md for some instructions on how to perform an
install. The process is very rough and not streamlined at the moment,
but it mostly works.
pkg/destroy/aws: Remove lastError value masking
As reported in openshift/installer#1341, the credential validation errors out when you try to run iam:SimulatePrincipalPolicy on IAM creds that belong to the AWS account's root user. Vendor in an updated cloud-credential-operator with the changes to detect and allow the root creds through (with a stern warning printed out).

dep ensure -update github.com/openshift/cloud-credential-operator
update vendor of cloud-credential-operator (allow use of root creds)
upi/vsphere: Add initial support for vSphere UPI
Since fd1349c (cmd/openshift-install/create: Log progressing
messages, 2019-03-18, #1432), we log progress messages while waiting.
But I'd forgotten to log them in the timeout message, which could lead
to logs like:

  $ curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-4.0/5971/artifacts/e2e-aws/installer/.openshift_install.log | grep -B3 level=fatal
  time="2019-03-21T02:17:31Z" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.0.0-0.ci-2019-03-21-015242: 95% complete"
  time="2019-03-21T02:36:11Z" level=debug msg="Still waiting for the cluster to initialize: Could not update rolebinding \"openshift-cluster-storage-operator/cluster-storage-operator\" (231 of 310): the server has forbidden updates to this resource"
  time="2019-03-21T02:36:41Z" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.0.0-0.ci-2019-03-21-015242: 98% complete"
  time="2019-03-21T02:42:24Z" level=fatal msg="failed to initialize the cluster: timed out waiting for the condition"

That's not very helpful if you're looking at stderr and our default
info level.  With this commit, we'll get:

  failed to initialize the cluster: Working towards 4.0.0-0.alpha-2019-03-21-015242: 98% complete: timed out waiting for the condition

Even better would be to get the "forbidden updates to this resource"
message, but that's up to the cluster-version operator to set more
helpful failing/progressing messages.
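
A rough sketch of the idea, assuming a simplified wait loop in place of the
installer's ClusterVersion polling; the channel and function names here are
illustrative only:

```
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waitForInitialization remembers the most recent progress message so it
// can be folded into the timeout error instead of being dropped.
func waitForInitialization(ctx context.Context, progress <-chan string) error {
	var lastMessage string
	for {
		select {
		case msg := <-progress:
			lastMessage = msg
		case <-ctx.Done():
			err := errors.New("timed out waiting for the condition")
			if lastMessage != "" {
				// e.g. "Working towards 4.0.0-...: 98% complete: timed out ..."
				return fmt.Errorf("%s: %v", lastMessage, err)
			}
			return err
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	progress := make(chan string, 1)
	progress <- "Working towards 4.0.0: 98% complete"
	fmt.Println(waitForInitialization(ctx, progress))
}
```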
docs/user/customization: Catch up with "Creating infrastructure resources"
Add support for an rhcos template that uses thin provisioning.

Update terraform.tfvars.example to give details about the rhcos-latest
template.

This also removes some unused variables and comments out the code
around setting static IP addresses for the machines. Static IP
addresses are not working yet.
upi/vsphere: support rhcos-latest template
In RHEL 8, journald switched to `DynamicUser=yes`, so we can't reference
the user at Ignition time.  Let's hack around this by adding a fixed
version of the user and doing the chown.
It builds an image containing binaries like jq, terraform, awscli, oc, etc. to allow bringing up UPI infrastructure.
It also contains the `upi` directory, which holds the various Terraform and CloudFormation templates that are used to create infrastructure resources.
cmd/openshift-install/create: Log progress on timeout too
bootstrap: Work around systemd-journal-gateway DynamicUser=yes
images/installer: add image that can be used to install UPI platforms
The cluster object was necessary for the machine controllers to function. This dep is being removed, and there's no reason for this object to exist at the moment.
Remove cluster-api cluster object dependency
As part of our release process we build container images for installer
that are added to the release image (which has a cryptographic
relationship to the images it contains, giving strong integrity). A
consumer should be able to download and install a locked installer
binary that uses the payload. However, we would prefer to not
rebuild the binary outside of the image, but instead have:

1. a source for the binary from the payload
2. the binary be locked to the payload it comes from

This commit allows a build system above the payload to extract the
installer binary for linux from the image (other platforms later)
and perform a replacement on the binary itself, patching:

```
_RELEASE_IMAGE_LOCATION_XXXXXXXXX...
```

with

```
quay.io/openshift/ocp-release:v4.0\x00
```

without requiring a recompilation of the binary. The internal code
checks the constant and verifies bounds (panicking if necessary)
and then returns the updated constant. This allows a simpler
replacement process to customize a binary for both external use
and offline use that is locked to a payload.
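
A rough sketch of how such a patchable placeholder can be read at runtime;
the variable name, marker, and helper below are assumptions for
illustration, not the installer's actual symbols:

```
package main

import (
	"fmt"
	"strings"
)

// The linker keeps this literal in the binary, where a build system can
// overwrite it in place with a real pull spec terminated by a NUL byte.
// (The real placeholder is much longer than shown here.)
var releaseImageLocation = "_RELEASE_IMAGE_LOCATION_XXXXXXXXXXXXXXXXXXXXXXXX"

// pinnedReleaseImage returns the patched value, or false if the placeholder
// was never substituted.
func pinnedReleaseImage() (string, bool) {
	// If the marker prefix is still present, the binary was never patched.
	if strings.HasPrefix(releaseImageLocation, "_RELEASE_IMAGE_LOCATION_") {
		return "", false
	}
	// A patched value must be NUL-terminated so it fits inside the
	// placeholder; anything else means the replacement overran its bounds.
	nul := strings.IndexByte(releaseImageLocation, 0)
	if nul < 0 {
		panic("release image replacement is missing its NUL terminator")
	}
	return releaseImageLocation[:nul], true
}

func main() {
	if img, ok := pinnedReleaseImage(); ok {
		fmt.Println("pinned release image:", img)
	} else {
		fmt.Println("placeholder not substituted; using the default release image")
	}
}
```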
…o_have_replacement

release: Allow release image to be directly substituted into binary
Through 0d891e1 (Merge pull request #1446 from staebler/vsphere_tf,
2019-03-21).
openshift-merge-robot and others added 19 commits April 1, 2019 08:59
openstack: add image for openstack ci
Bug 1659970: data/aws/route53: Block private Route 53 zone on public record
BUG 1670700: data/data/bootstrap: add --etcd-metric-ca to MCO bootstrap
image: Add a production "installer-artifacts" image for Mac binary
pkg/asset/manifests/infrastructure: Set InfrastructureName
Assets are required to build, but hack/build-go.sh cannot handle
cross-architecture asset generation. Explicitly generate them before
invoking the script.

Failed when CI tried to build:

+ go build ... ./cmd/openshift-install
data/unpack.go:12:15: undefined: Assets
pkg/destroy/aws: Destroy NAT gateways by VPC too
The kubeadmin user should only be used to temporarily access the
console, and only until an admin configures a proper Identity Provider.
There is no reason to log in as the kubeadmin user via the CLI.  If OAuth
is broken in a cluster, an admin can still access via the CLI as system:admin,
but will not be able to access via kube:admin.
Modify kubeadmin usage message, admins should not use kubeadmin via CLI
Through 58a2767 (Merge pull request #1497 from
vrutkovs/upi-multistage-cli, 2019-03-29).
Catching up with c734361 (Remove cluster-api object as this is not
needed anymore, 2019-03-22, #1449), 1408d8a (*: use
kube-etcd-cert-signer release image, 2019-03-27, #1477), and possibly
others.  Generated with:

  $ openshift-install graph | dot -Tsvg >docs/design/resource_dep.svg

using:

  $ dot -V
  dot - graphviz version 2.30.1 (20170916.1124)
Catching up with a31e12f (release: Allow release image to be
directly substituted into binary, 2019-03-15, #1422).
The bootstrap node is failing to uncompress the BIOS image with `t1.small` capacity,
and Packet also seems to be failing to deploy servers in SJC1. Using `any` allows Terraform to create the servers in any available datacenter.
CHANGELOG: Document changes since 0.15.0
fix: Gopkg.lock after running dep ensure on pkg/terraform/exec
upi/metal: update the instance location and size
hack/build: Update release-pin location to defaultReleaseImageOriginal
stbenjam (Member, Author) commented Apr 3, 2019

This rebase pulls in a newer version of RHCOS (410.8.20190325.0) in hack/build.sh. I was running into issues with coredns not starting, but that turned out to be that I needed the same fixes from openshift-metal3/dev-scripts@401ba61.

I did that, but the bootstrap node is still not bringing up the k8s API. There are no errors in bootkube, but all progress seems to stop.

markmc (Contributor) commented Apr 3, 2019

Note that you're likely seeing the bootstrap launched with an ootpa image whereas the masters are being given a maipo image by dev-scripts. See #37 and openshift-metal3/dev-scripts#287

derekhiggins (Collaborator) commented
Build FAILURE, see build http://10.8.144.11:8080/job/dev-tools/516/

markmc (Contributor) commented Apr 4, 2019

Ok, the rebase/merge looks good mechanically to me, and our path to fixing everything will be based on this rebase ... so I'm going ahead and merging

markmc merged commit 9e54579 into openshift-metal3:master on Apr 4, 2019
stbenjam deleted the latest-upstream branch on April 4, 2019 at 12:23