Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set the server_addr for the dashboard and prometheus modules #2408

Merged
merged 1 commit into from Dec 21, 2018

Conversation

travisn
Copy link
Member

@travisn travisn commented Dec 20, 2018

Signed-off-by: travisn tnielsen@redhat.com

Description of your changes:
The server_addr needs to be set on the mgr so the prometheus and dashboard mgr modules can bind to the pod ip when starting their endpoints. The mgr key is not sufficient to set this setting, so the admin key is used in the config-init container, then the admin key is deleted before the init container completes.

Which issue is resolved by this Pull Request:
Resolves #2335

Checklist:

  • Documentation has been updated, if necessary.
  • Pending release notes updated with breaking and/or notable changes, if necessary.
  • Upgrade from previous release is tested and upgrade user guide is updated, if necessary.
  • Code generation (make codegen) has been run to update object specifications, if necessary.
  • Comments have been added or updated based on the standards set in CONTRIBUTING.md

@galexrt galexrt self-requested a review December 20, 2018 17:02
Copy link
Member

@BlaineEXE BlaineEXE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

To note the thought I mentioned in the meeting today: I have been having the thought that I would wish the operator could do more of the config work and we could get rid of the need for calling the rook binary for init. I think this PR is good as-is; I am noticing here that it is a fair bit of additional code to add this support into the binary init path, and it would be nice to not have to keep doing this in the future.

@travisn
Copy link
Member Author

travisn commented Dec 20, 2018

@BlaineEXE @galexrt Per our discussion instead of setting the server_addr to the pod ip in the mgr init container, I successfully tested that setting it to 0.0.0.0 is also working. While this does appear to fix both #2397 and #2171, this would only work for ipv4 clusters. If there is an ipv6 cluster, we should leave the binding to the default [::].

Leaving this PR as-is I believe would work with both ipv4 and ipv6 since we would be binding to the correct pod ip either way. I do prefer the approach of setting 0.0.0.0 centrally in the operator, although this means we need a new setting in the operator.yaml to indicate if we're running ipv4 or ipv6. Any other suggestions?


clusterName := "ceph"
context.ConfigDir = "/etc"
settingPath := fmt.Sprintf("mgr/prometheus/server_addr")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use fmt.Sprintf() here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'll clean this up

return fmt.Errorf("setting prometheus server_addr failed. %+v", err)
}

settingPath = fmt.Sprintf("mgr/dashboard/server_addr")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

@@ -102,7 +102,8 @@ func (c *Cluster) toggleDashboardModule() error {
}

func (c *Cluster) configureDashboardModule() error {
hasChanged, err := client.MgrSetConfig(c.context, c.Namespace, c.cephVersion.Name, "mgr/dashboard/url_prefix", c.dashboard.UrlPrefix)
allMgrs := ""
hasChanged, err := client.MgrSetConfig(c.context, c.Namespace, allMgrs, c.cephVersion.Name, "mgr/dashboard/url_prefix", c.dashboard.UrlPrefix)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

allMgrs is empty. It is used in the MgrSetConfig func to build the mgr ID. Is that right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it applies to all mgrs, the empty string is needed. I'll rework this to make it more readable.

@@ -98,6 +98,7 @@ func (c *Cluster) makeConfigInitContainer(mgrConfig *mgrConfig) v1.Container {
}}},
k8sutil.PodIPEnvVar(k8sutil.PrivateIPEnvVar),
k8sutil.PodIPEnvVar(k8sutil.PublicIPEnvVar),
k8sutil.PodIPEnvVar("ROOK_MGR_MODULE_SERVER_ADDR"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have two other env vars that have the Pod IP in them (see line 99-100) do we really need a new env var here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while it's not strictly necessary, i wanted to be clear about the purpose of this setting. If the public and private ip configuration changes in the future, we should reevaluate this setting as well. This way it will be more obvious to evaluate it.

Signed-off-by: travisn <tnielsen@redhat.com>
@BlaineEXE
Copy link
Member

BlaineEXE commented Dec 20, 2018

I think I would lean toward the solution that supports both IPv4 and IPv6 without special support for either. I think there could be many ways to do that, though. I'm not sure what might happen if the k8s network is IPv4 and the host network is on IPv6 (or vice versa).

If we know that the mgr binding will always be on the cluster network, we could go the route of setting a config parameter with the 4/6 type of IP for the overlay network, but it would seem better to me to detect that automatically and not make the user configure it at all. Or if we know for certain that the mgr will use the host network, we could detect that type.

A lot of my concerns are based around the fact that many Ceph clusters will likely have 3 networks (mgmt, client, and cluster data). And to be fair, I haven't seen anywhere that Rook has plans for how to achieve that kind of support. And my level of ignorance about this subject is high.

I think my bottom line opinion at present is that if Rook is supposed to support IPv4 and IPv6 now, we keep the initial patch. But if Rook currently just supports IPv4, setting 0.0.0.0 is easier, and we can cross the IPv6 hurdle later when we are ready to tackle it.

@dyusupov
Copy link
Contributor

dyusupov commented Dec 20, 2018 via email

@BlaineEXE
Copy link
Member

BlaineEXE commented Dec 20, 2018

I think that bears more looking into, and this seems to meet a lot of my current thoughts for what is needed for host networking. It also seems similar in scope to CNI Genie, which I have been strongly advised to avoid, so I do have some hesitation.

With Ceph at least, we will also be fighting any sort of layering between the pods and the network. @sebastian-philipp has quoted a figure that I now forget that Ceph performance degrades substantially even just by enabling iptables with non-blocking rules on bare metal hosts. I believe it was around 60% performance degradation.

I think for all Rook storage backends, host networking's main feature will be to increase network throughput and reduce latency, so we may want to make some decisions around what Rook wants to support. Do we risk degrading performance by going through CNI layers that are heavy and increasing security, or do we leave the networks as unimpeded as possible and instead give strong recommendations about isolating the different networks for security purposes? And do we want to give the user the option of either?

@dyusupov
Copy link
Contributor

dyusupov commented Dec 20, 2018 via email

@travisn
Copy link
Member Author

travisn commented Dec 20, 2018

Great discussion around networking. I would propose that this PR go ahead and merge since it is compatible with either ipv4 or ipv6.

@dyusupov
Copy link
Contributor

dyusupov commented Dec 20, 2018 via email

@travisn travisn merged commit fa0e260 into rook:master Dec 21, 2018
@travisn travisn deleted the register-mgr-server-addr branch January 7, 2019 20:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ceph MGR: 2 modules failed on default install
4 participants