Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🌱 Simplify CAPD quick start #3514

Conversation

fabriziopandini
Copy link
Member

What this PR does / why we need it:
This PR simplifies the quick start with the docker provider; this impacts also the clusterctl DX, and more specifically

  • CAPD is now an embedded provider in clusterctl, so the UX is the same of all the other providers
  • CAPD artifacts are now included in the CAPI release (infrastructure-components.yaml, cluster-template.yam)
  • The quick start doc is updated according to ^^
  • The local-override hack was deleted
  • A new create-local-repository hack was introduced to allow the developer to create a local repository starting from sources files
  • The clusterctl for developers dos is updated according to ^^

Which issue(s) this PR fixes:
Fixes #

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 20, 2020
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 20, 2020
Makefile Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
@vincepri
Copy link
Member

Thoughts about renaming the artifacts to CAPD where D stands for development? I want to make it super explicit that this provider should never be used in production, and I'm thinking to rename it altogether in v1alpha4, thoughts?

@fabriziopandini
Copy link
Member Author

fabriziopandini commented Aug 21, 2020

I want to make it super explicit that this provider should never be used in production, and I'm thinking to rename it altogether in v1alpha4, thoughts?

TBH I don't think this is necessary. The fact that CAPD is for development only is already well documented and the fact the current implementation is limited to a single machine is a hard stop-gap for any production use case.

In the end, no name/-dev prefix will prevent a user to do crazy things if he really wants, and, given that IMO we already did a good job in advising people about CAPD limitations, I'm OK with the current name.

@vincepri
Copy link
Member

Is the cluster-template.yaml published as well?

@fabriziopandini
Copy link
Member Author

fabriziopandini commented Aug 21, 2020

Is the cluster-template.yaml published as well?

yes. all the CAPD templates are going to be part of the release; currently, there is only one

@vincepri
Copy link
Member

Let's name it at least cluster-template-development?

@fabriziopandini
Copy link
Member Author

Let's name it at least cluster-template-development?

Sure, but this requires --flavor when doing clusterctl config file (I'm going to document it)

@vincepri
Copy link
Member

vincepri commented Aug 21, 2020

Could we add support in the metadata file / internal for the default template to look at?

Is the template file getting copied into the out/ folder when make release is executed?

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 21, 2020
@fabriziopandini
Copy link
Member Author

@vincepri

Could we add support in the metadata file / internal for the default template to look at?

This is tricky, because the template name is generated by the yaml processor and currently this does not get in input any of metadata/provider config

Is the template file getting copied into the out/ folder when make release is executed?

yes!

@vincepri
Copy link
Member

LGTM pending squash, thanks for doing this and applying the requested changes! The documentation looks even better now with --flavor development :)

@fabriziopandini
Copy link
Member Author

/hold
this should be unblocker just before release because it touches the documentation

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 21, 2020
Copy link
Contributor

@elmiko elmiko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is awesome @fabriziopandini !

i have not tested this workflow, but as i have recently broken my test stand i will do so on monday. i left a couple comments inline.

docs/book/src/clusterctl/developers.md Outdated Show resolved Hide resolved
docs/book/src/user/quick-start.md Outdated Show resolved Hide resolved
@vincepri
Copy link
Member

/milestone v0.3.9

@k8s-ci-robot k8s-ci-robot added this to the v0.3.9 milestone Aug 24, 2020
@wfernandes
Copy link
Contributor

/assign
Reviewing. Will try and follow instructions since it has been a while since I did anything with CAPD 🙂

@elmiko
Copy link
Contributor

elmiko commented Aug 24, 2020

i tried to follow the developer instructions and ran into this error during the step for building artifacts. it looks like the dockerfile is referencing an image that is not pulling cleanly for me (gcr.io/distroless/static:latest). did i miss a step?

updated
i had some weird network outage, sorry.

Copy link
Contributor

@elmiko elmiko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey Fabrizio, i've tried to duplicate these instructions but i keep running into a few issues.

for reference, i am using Fedora 31, Docker 19.03, Go 1.13.14 and Kind 0.8.1.

i cleaned my environment (removed all old capi stuff), and rebuilt clusterctl and the capi images, these are some of the issues i see:

  1. clusterctl is looking for a template file named cluster-template.yaml but the one provided is cluster-template-deployment.yaml. fix was to rename my local file, this might be intended but we should mention in the docs if so.
  2. the kind management cluster is having issues pulling my local created images for the capi components. fix was to load all the images into the kind management cluster (with kind load docker-image) and then update the various infrastructure manifests to change the imagePullPolicy: Always to IfNotPresent.
  3. the initial control plane machine deploys fine but it seems like additional control plane machines are never created and the workers are stuck in pending state.

not sure if i've messed something up locally, but i tried to follow the instructions from a clean environment.

@vincepri
Copy link
Member

/milestone v0.3.10

@k8s-ci-robot k8s-ci-robot modified the milestones: v0.3.9, v0.3.10 Aug 28, 2020
Copy link
Member

@vincepri vincepri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/milestone v0.3.9
/approve

@k8s-ci-robot k8s-ci-robot modified the milestones: v0.3.10, v0.3.9 Aug 28, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vincepri, wfernandes

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 28, 2020
@vincepri
Copy link
Member

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 31, 2020
@dlipovetsky
Copy link
Contributor

I verified creating a cluster with this PR last Friday.

Thank you, @fabriziopandini!

/lgtm

@fabriziopandini
Copy link
Member Author

/hold
This PR did not merge before v0.3.9 release due to test flakes 😭

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 1, 2020
@fabriziopandini
Copy link
Member Author

/retest

@vincepri
Copy link
Member

vincepri commented Sep 1, 2020

/milestone v0.3.10

@k8s-ci-robot k8s-ci-robot modified the milestones: v0.3.9, v0.3.10 Sep 1, 2020
@vincepri
Copy link
Member

vincepri commented Sep 2, 2020

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 2, 2020
@k8s-ci-robot k8s-ci-robot merged commit 0750183 into kubernetes-sigs:master Sep 2, 2020
@fabriziopandini fabriziopandini deleted the simpify-CAPD-quick-start branch September 3, 2020 09:03
@bboreham
Copy link
Contributor

bboreham commented Sep 9, 2020

I see the instruction to set KIND_EXPERIMENTAL_DOCKER_NETWORK was removed in this PR; I am finding things work better with that setting. Was I supposed to do something different to replace it?

@bboreham
Copy link
Contributor

bboreham commented Sep 9, 2020

Ignore previous question: the setting is still there on a different page.

@nistiwar
Copy link

I guess this code is not yet released, how do I use it to test. A trivial question maybe, but I am struggling at the exact steps, managed to get through the imagePull error by changing 'ALways' to 'ifNotPresent'. However am getting the following while creating workload cluster:

root@master-1:~# kubectl apply -f capi-quickstart.yaml
dockercluster.infrastructure.cluster.x-k8s.io/capi-quickstart created
dockermachinetemplate.infrastructure.cluster.x-k8s.io/capi-quickstart-control-plane created
dockermachinetemplate.infrastructure.cluster.x-k8s.io/capi-quickstart-md-0 created
Error from server (InternalError): error when creating "capi-quickstart.yaml": Internal error occurred: conversion webhook for cluster.x-k8s.io/v1alpha3, Kind=Cluster failed: the server could not find the requested resource
Error from server (InternalError): error when creating "capi-quickstart.yaml": Internal error occurred: conversion webhook for controlplane.cluster.x-k8s.io/v1alpha3, Kind=KubeadmControlPlane failed: the server could not find the requested resource
Error from server: error when creating "capi-quickstart.yaml": conversion webhook for bootstrap.cluster.x-k8s.io/v1alpha3, Kind=KubeadmConfigTemplate failed: the server could not find the requested resource
Error from server (InternalError): error when creating "capi-quickstart.yaml": Internal error occurred: conversion webhook for cluster.x-k8s.io/v1alpha3, Kind=MachineDeployment failed: the server could not find the requested resource

@wfernandes
Copy link
Contributor

@nistiwar I was able to reproduce this myself. This happens because the current storage version of the CRDs being stored is v1alpha4. We may need to update the test/infrastructure/docker/templates/cluster-template-development.yaml file. If you'd like, feel free to open an issue, else I can do it on your behalf.

@wfernandes
Copy link
Contributor

@nistiwar I ended up creating the issue referenced above and will submit a PR shortly. Thanks for surfacing this issue!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants