Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to reconcile bootstrap data between datastore and disk #3398

Merged
merged 31 commits into from
Oct 7, 2021

Conversation

briandowns
Copy link
Contributor

@briandowns briandowns commented Jun 3, 2021

Proposed Changes

Adds the ability for k3s to detect when there are differences in bootstrap data in either the datastore or on disk and to update the whichever one is older.

This also adds the ability to migrate bootstrap data from the "older" format to the "newer" format. This situation will happen when clusters are upgraded and the new version started. The new bootstrap data contains a timestamp field which the old format didn't have. That older data is read, determined to be old, migrated to be in the new format and saved back to the datastore.

Types of Changes

Verification

  1. Start a k3s server
  2. Remove /var/lib/rancher/k3s/server/tls
  3. Restart k3s
  4. Verify certificates are rewritten and kubectl can successfully run

or

  1. Start a k3s server
  2. Remove a file from the /var/lib/rancher/k3s/server/tls directory
  3. Restart k3s
  4. Verify that the file is replaced and the cluster is functional

See comment below for additional verification scenarios.

Linked Issues

#3015

Further Comments

@briandowns briandowns added the kind/bug Something isn't working label Jun 3, 2021
@briandowns briandowns requested a review from a team June 3, 2021 22:20
@briandowns briandowns self-assigned this Jun 3, 2021
@brandond
Copy link
Contributor

brandond commented Jun 3, 2021

Can we add additional tests?

Custom certs:

  1. Pre-create custom CA certs in /var/lib/rancher/k3s/server/tls using openssl
  2. Start K3s on server A connected to datastore 1
  3. Start K3s on server B connected to datastore 1
  4. Verify certificates are written on server B, and match those from server A

Changing datastore:

  1. Start K3s server A connected to external datastore 1
  2. Start K3s server B connected to external datastore 2
  3. Stop K3s on server B, change datastore address to point at datastore 1, start K3s
  4. Verify certificates are written on server B, match those from server A, and kubectl can successfully run

Certificate renewal:

  1. Start K3s server A connected to external datastore 1
  2. Start K3s server B connected to external datastore 1
  3. Stop K3s on both servers, and set time forward to 10 years in the future (after CA certs expire)
  4. Start K3s on server A
  5. Verify that CA and other cluster certs are updated on server A (are no longer expired), and kubectl can successfully run
  6. Start K3s on server B
  7. Verify certificates are written on server B, match those from server A (same expiration and CA hash), and kubectl can successfully run.

pkg/cluster/bootstrap.go Show resolved Hide resolved
pkg/cluster/bootstrap.go Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

codecov-commenter commented Jul 2, 2021

Codecov Report

Merging #3398 (57fda9a) into master (a1e3615) will decrease coverage by 0.15%.
The diff coverage is 10.57%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3398      +/-   ##
==========================================
- Coverage   11.61%   11.45%   -0.16%     
==========================================
  Files         136      136              
  Lines        8828     8983     +155     
==========================================
+ Hits         1025     1029       +4     
- Misses       7579     7730     +151     
  Partials      224      224              
Flag Coverage Δ
inttests 0.67% <0.00%> (-0.02%) ⬇️
unittests 11.10% <10.57%> (-0.15%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/cli/server/server.go 0.00% <0.00%> (ø)
pkg/clientaccess/kubeconfig.go 0.00% <ø> (ø)
pkg/cluster/bootstrap.go 0.00% <0.00%> (ø)
pkg/cluster/cluster.go 0.00% <0.00%> (ø)
pkg/cluster/storage.go 0.00% <0.00%> (ø)
pkg/daemons/control/deps/deps.go 44.53% <ø> (ø)
pkg/daemons/control/server.go 0.00% <0.00%> (ø)
pkg/etcd/etcd.go 16.58% <0.00%> (ø)
pkg/bootstrap/bootstrap.go 20.51% <16.66%> (-3.02%) ⬇️
pkg/clientaccess/token.go 89.83% <88.23%> (ø)
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a1e3615...57fda9a. Read the comment docs.

@xiaods
Copy link
Contributor

xiaods commented Jul 11, 2021

Dockerfile.dapper826934624 should be remove, this is generate file by dapper

@briandowns
Copy link
Contributor Author

Dockerfile.dapper826934624 should be remove, this is generate file by dapper

Correct. Last commit was a push to get the data up due to hardware failure. PR is t ready for review and will be marked WIP shortly.

@briandowns briandowns changed the title Add the ability to write bootstrap data on start when missing or incorrect [WIP] - Add the ability to write bootstrap data on start when missing or incorrect Jul 12, 2021
@briandowns briandowns force-pushed the issue-3015 branch 3 times, most recently from f455111 to 4d751aa Compare July 26, 2021 21:15
@briandowns briandowns changed the title [WIP] - Add the ability to write bootstrap data on start when missing or incorrect [WIP] - Add ability to reconcile bootstrap data between datastore and disk Jul 29, 2021
@briandowns briandowns changed the title [WIP] - Add ability to reconcile bootstrap data between datastore and disk Add ability to reconcile bootstrap data between datastore and disk Jul 29, 2021
@briandowns briandowns requested a review from a team July 29, 2021 20:39
pkg/bootstrap/bootstrap.go Outdated Show resolved Hide resolved
pkg/bootstrap/bootstrap.go Outdated Show resolved Hide resolved
pkg/cluster/storage.go Outdated Show resolved Hide resolved
pkg/util/tests/tests.go Outdated Show resolved Hide resolved
pkg/cluster/bootstrap.go Show resolved Hide resolved
pkg/cluster/bootstrap.go Outdated Show resolved Hide resolved
pkg/etcd/etcd.go Outdated Show resolved Hide resolved
@brandond
Copy link
Contributor

brandond commented Aug 2, 2021

This doesn't seem to do the right thing when using an external SQL datastore; it overwrites the files on disk from the bootstrap data without printing any message about why it is doing so.

# start K3s and let it run until the system stabilizes
k3s server --debug --datastore-endpoint 'mysql://root:password@tcp(db.dev-backend.k3s.khaus:3306)/k3s' --token token

# generate a CSR to renew the server-ca cert
openssl x509 -x509toreq -in /var/lib/rancher/k3s/server/tls/server-ca.crt -out server-ca.csr -signkey /var/lib/rancher/k3s/server/tls/server-ca.key

# renew server-ca cert with 10000 days of validity
openssl x509 -req -days 10000 -in server-ca.csr -out /var/lib/rancher/k3s/server/tls/server-ca.crt -signkey /var/lib/rancher/k3s/server/tls/server-ca.key

# verify file timestamps and expiration - timestamp on the .crt file should be newer
ls -la /var/lib/rancher/k3s/server/tls/server-ca*
openssl x509 -noout -text -in /var/lib/rancher/k3s/server/tls/server-ca.crt  | grep After

# start K3s again; let it run until the system stabilizes
k3s server --debug --datastore-endpoint 'mysql://root:password@tcp(db.dev-backend.k3s.khaus:3306)/k3s' --token token

# note that the file timestamps are now the same, and the expiration has reverted to the original value (10 years from initial creation)
ls -la /var/lib/rancher/k3s/server/tls/server-ca*
openssl x509 -noout -text -in /var/lib/rancher/k3s/server/tls/server-ca.crt  | grep After

@brandond
Copy link
Contributor

brandond commented Aug 2, 2021

The same behavior can be seen when using a sqlite datastore (no --datastore-endpoint or --cluster-init)

@briandowns
Copy link
Contributor Author

Changes made and pushed.

Signed-off-by: Brian Downs <brian.downs@gmail.com>
Signed-off-by: Brian Downs <brian.downs@gmail.com>
Signed-off-by: Brian Downs <brian.downs@gmail.com>
Signed-off-by: Brian Downs <brian.downs@gmail.com>
Signed-off-by: Brian Downs <brian.downs@gmail.com>
Signed-off-by: Brian Downs <brian.downs@gmail.com>
Signed-off-by: Brian Downs <brian.downs@gmail.com>
Signed-off-by: Brian Downs <brian.downs@gmail.com>
Signed-off-by: Brian Downs <brian.downs@gmail.com>
Signed-off-by: Brian Downs <brian.downs@gmail.com>
Signed-off-by: Brian Downs <brian.downs@gmail.com>
Signed-off-by: Brian Downs <brian.downs@gmail.com>
Copy link
Contributor

@brandond brandond left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, but I would like @rancher-max to make sure we're covering all the weird edge cases before we merge.

@briandowns briandowns merged commit ac7a8d8 into k3s-io:master Oct 7, 2021
briandowns added a commit to briandowns/k3s that referenced this pull request Oct 22, 2021
briandowns added a commit to briandowns/k3s that referenced this pull request Oct 22, 2021
briandowns added a commit to briandowns/k3s that referenced this pull request Oct 22, 2021
briandowns added a commit to briandowns/k3s that referenced this pull request Oct 23, 2021
brandond pushed a commit that referenced this pull request Oct 25, 2021
brandond added a commit to brandond/k3s that referenced this pull request Oct 27, 2021
… disk (k3s-io#3398)"

This reverts commits
9a4ca59
c9f6fa0
07f844c
48355dc

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
brandond added a commit that referenced this pull request Oct 27, 2021
… disk (#3398)"

This reverts commits
9a4ca59
c9f6fa0
07f844c
48355dc

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
briandowns added a commit to briandowns/k3s that referenced this pull request Nov 10, 2021
briandowns added a commit to briandowns/k3s that referenced this pull request Nov 10, 2021
briandowns added a commit to briandowns/k3s that referenced this pull request Nov 10, 2021
briandowns added a commit to briandowns/k3s that referenced this pull request Nov 10, 2021
briandowns added a commit to briandowns/k3s that referenced this pull request Nov 10, 2021
briandowns added a commit to briandowns/k3s that referenced this pull request Nov 10, 2021
briandowns added a commit that referenced this pull request Nov 10, 2021
Add ability to reconcile bootstrap data between datastore and disk (#3398)
briandowns added a commit that referenced this pull request Nov 10, 2021
Add ability to reconcile bootstrap data between datastore and disk (#3398)
briandowns added a commit that referenced this pull request Nov 10, 2021
Add ability to reconcile bootstrap data between datastore and disk (#3398)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants