Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core: restart ceph daemons when network updated #12791

Merged
merged 1 commit into from
Sep 1, 2023

Conversation

subhamkrai
Copy link
Contributor

We need to restart all the ceph daemons whenever
cephCluster network settings are modified like
requiremsgr2, encryption and compression. This
required for Ceph to consider the new settings
it require new ceph daemons all over.

Description of your changes:

Which issue is resolved by this Pull Request:
Resolves #12622

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Skip Tests for Docs: If this is only a documentation change, add the label skip-ci on the PR.
  • Reviewed the developer guide on Submitting a Pull Request
  • Pending release notes updated with breaking and/or notable changes for the next minor release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

@subhamkrai
Copy link
Contributor Author

making it draft since I need to test these changes

@@ -394,9 +394,37 @@ func ContainerEnvVarReference(envVarName string) string {
}

// DaemonEnvVars returns the container environment variables used by all Ceph daemons.
func DaemonEnvVars(image string) []v1.EnvVar {
func DaemonEnvVars(cephClusterSpec *cephv1.ClusterSpec) []v1.EnvVar {
var rookMsgr2EnvValue string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value doesn't really matter as long as it's unique for the settings, so how about if we do it with a single string format like this?

fmt.Sprintf(
  "msgr2_%t_encryption_%t_compression_%t", 
  cephClusterSpec.Network.Connections.RequireMsgr2,
  //encryption setting,
  //compression setting)

IIRC you'll also need to check if cephClusterSpec.Network.Connections is nil. If that is nil, we can skip adding the env var.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just keep the string without any special char in between like _ otherwise it will make code check longer unneccesarliy if the string value doesn't matter

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you show a code snippet? I thought it would just be a single fmt.Sprintf() as in my example, except checking for nil settings before calling fmt.Sprintf().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for example, let's say I remove the check cephClusterSpec.Network.Connections.Compression != nil then from the unit test you see it will through nil pointer which makes sense also right given these are struct settings apart from requireMsgr2 which is just a bool

--- FAIL: TestDaemonEnvVars (0.00s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x2619358]

goroutine 37 [running]:
testing.tRunner.func1.2({0x288d0c0, 0x3ea5220})
	/usr/local/go/src/testing/testing.go:1545 +0x238
testing.tRunner.func1()
	/usr/local/go/src/testing/testing.go:1548 +0x397
panic({0x288d0c0?, 0x3ea5220?})
	/usr/local/go/src/runtime/panic.go:914 +0x21f
github.com/rook/rook/pkg/operator/ceph/controller.ApplyNetworkEnv(0xc0000a3bc0)
	/home/srai/go/src/github.com/rook/pkg/operator/ceph/controller/spec.go:416 +0x78
github.com/rook/rook/pkg/operator/ceph/controller.TestDaemonEnvVars(0xc0005031e0?)
	/home/srai/go/src/github.com/rook/pkg/operator/ceph/controller/spec_test.go:487 +0x1a5
testing.tRunner(0xc0005031e0, 0x2ca3328)
	/usr/local/go/src/testing/testing.go:1595 +0xff
created by testing.(*T).Run in goroutine 1
	/usr/local/go/src/testing/testing.go:1648 +0x3ad
FAIL	github.com/rook/rook/pkg/operator/ceph/controller	0.015s
FAIL

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about like this? So it's a single format command once all the bools are determined.

if cephClusterSpec.Network.Connections != nil {
  msgr2Required := false
  encryptionEnabled := false
  compressionEnabled := false
  if cephClusterSpec.Network.Connections.RequireMsgr2 {
    msgr2Required = true
  } 
  if cephClusterSpec.Network.Connections.Encryption != nil && cephClusterSpec.Network.Connections.Encryption.Enabled{
    encryptionEnabled = true
  }
  if cephClusterSpec.Network.Connections.Compression != nil && cephClusterSpec.Network.Connections.Compression.Enabled{
    compressionEnabled = true
  }
  envVarValue := fmt.Sprintf("msgr2_%t_encryption_%t_compression_%t", msgr2Required, encryptionRequired, compressionRequired)
  // create the env var
}

@subhamkrai subhamkrai force-pushed the restart-ceph-pods branch 2 times, most recently from a758ed8 to cdcd3d9 Compare August 28, 2023 05:04
pkg/operator/ceph/controller/spec.go Show resolved Hide resolved
pkg/operator/ceph/controller/spec.go Outdated Show resolved Hide resolved
@subhamkrai
Copy link
Contributor Author

testing result

 printenv | grep ROOK_MSGR2
ROOK_MSGR2=msgr2compression
sh-4.4# exit
exit

~/go/src/github.com/rook/deploy/examples
srai@192 ~ (restart-ceph-pods) $ ka cluster-test.yaml 
configmap/rook-config-override unchanged
cephcluster.ceph.rook.io/my-cluster configured
cephblockpool.ceph.rook.io/builtin-mgr unchanged

~/go/src/github.com/rook/deploy/examples
srai@192 ~ (restart-ceph-pods) $ kc exec  -it rook-ceph-osd-0-679f6fdd9c-7fkwt -- sh
Defaulted container "osd" out of: osd, activate (init), expand-bluefs (init), chown-container-data-dir (init)
sh-4.4# printenv | grep ROOK_MSGR2
ROOK_MSGR2=msgr2
sh-4.4# exit
exit

pkg/operator/ceph/cluster/osd/spec.go Show resolved Hide resolved
rookMsgr2EnvValue = strings.Replace(rookMsgr2EnvValue, "mgr2", "", -1)
}

if cephClusterSpec.Network.Connections.Compression != nil && cephClusterSpec.Network.Connections.Compression.Enabled {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@travisn if encryption or compression is set to true do we automatically update the cluster CR with msgr2:true?
Or how it does works

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we keep watch on the cephcluster cr and every CR and if setting is change, we update accordingly

pkg/operator/ceph/controller/spec.go Show resolved Hide resolved
@parth-gr
Copy link
Member

testing result

 printenv | grep ROOK_MSGR2
ROOK_MSGR2=msgr2compression
sh-4.4# exit
exit

~/go/src/github.com/rook/deploy/examples
srai@192 ~ (restart-ceph-pods) $ ka cluster-test.yaml 
configmap/rook-config-override unchanged
cephcluster.ceph.rook.io/my-cluster configured
cephblockpool.ceph.rook.io/builtin-mgr unchanged

~/go/src/github.com/rook/deploy/examples
srai@192 ~ (restart-ceph-pods) $ kc exec  -it rook-ceph-osd-0-679f6fdd9c-7fkwt -- sh
Defaulted container "osd" out of: osd, activate (init), expand-bluefs (init), chown-container-data-dir (init)
sh-4.4# printenv | grep ROOK_MSGR2
ROOK_MSGR2=msgr2
sh-4.4# exit
exit

Can you show at the pod level, instead of container level

We need to restart all the ceph daemons whenever
cephCluster network settings are modified like
requiremsgr2, encryption and compression. This
required for Ceph to consider the new settings
it require new ceph daemons all over.

Signed-off-by: subhamkrai <srai@redhat.com>
@subhamkrai
Copy link
Contributor Author

testing result

 printenv | grep ROOK_MSGR2
ROOK_MSGR2=msgr2compression
sh-4.4# exit
exit

~/go/src/github.com/rook/deploy/examples
srai@192 ~ (restart-ceph-pods) $ ka cluster-test.yaml 
configmap/rook-config-override unchanged
cephcluster.ceph.rook.io/my-cluster configured
cephblockpool.ceph.rook.io/builtin-mgr unchanged

~/go/src/github.com/rook/deploy/examples
srai@192 ~ (restart-ceph-pods) $ kc exec  -it rook-ceph-osd-0-679f6fdd9c-7fkwt -- sh
Defaulted container "osd" out of: osd, activate (init), expand-bluefs (init), chown-container-data-dir (init)
sh-4.4# printenv | grep ROOK_MSGR2
ROOK_MSGR2=msgr2
sh-4.4# exit
exit

Can you show at the pod level, instead of container level

I don't think that's really matter as we just need to confirm the env settings but since you asked here you go both pod level and container level

 kc get pod rook-ceph-osd-0-58589b9dc8-k9kb7  -oyaml  | grep ROOK_MSGR2 -A1
    - name: ROOK_MSGR2
      value: msgr2_true_encryption_false_compression_true
~/go/src/github.com/rook/deploy/examples
srai@192 ~ (restart-ceph-pods) $ kc exec -it rook-ceph-osd-0-58589b9dc8-k9kb7 -- sh  
\Defaulted container "osd" out of: osd, activate (init), expand-bluefs (init), chown-container-data-dir (init)
\sh-4.4# \printenv ROOK_MSGR2
msgr2_true_encryption_false_compression_true
sh-4.4# 

@travisn travisn merged commit 3f67a78 into rook:master Sep 1, 2023
50 checks passed
@subhamkrai subhamkrai deleted the restart-ceph-pods branch September 4, 2023 06:55
travisn added a commit that referenced this pull request Sep 12, 2023
core: restart ceph daemons when network updated (backport #12791)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Restart all ceph daemons when Encryption Compression(any Network.Connections options) change
3 participants