Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue with head pod not monitered by Prometheus under certain condition #963

Conversation

Yicheng-Lu-llll
Copy link
Contributor

Why are these changes needed?

KubeRay is anticipated to provide a Prometheus metrics endpoint on port 8080.

However, if users specify alternative ports in Spec.HeadGroupSpec.Template.Spec.Containers[ray_index].Ports without setting the metrics port, the head service will exclude the default metrics port. As a result, the service monitor will be unable to scrape metrics. Refer to the following code snippet for the existing logic.

func getServicePorts(cluster rayiov1alpha1.RayCluster) map[string]int32 {
ports, err := getPortsFromCluster(cluster)
// Assign default ports
if err != nil || len(ports) == 0 {
ports = getDefaultPorts()
}

This can be reproduced by using any sample yaml files, for example, running:

kind delete cluster
kind create cluster
# kuberay repo path
rootPath="/home/ubuntu/kuberay" 
cd $rootPath
# install the kube-prometheus-stack chart and related custom resources, including ServiceMonitor, PodMonitor and PrometheusRule
./install/prometheus/install.sh  

# install kuberay-operator
helm install kuberay-operator kuberay/kuberay-operator --version 0.4.0

# install ray cluster
kubectl apply -f $rootPath/ray-operator/config/samples/ray-cluster.complete.yaml

kubectl port-forward --address 0.0.0.0 prometheus-prometheus-kube-prometheus-prometheus-0 -n prometheus-system 9090:9090
# see result in http://localhost:9090/targets?search=ray. Only ray-workers-monitor is set.

Related issue number

Checks

After this PR, ray-workers-monitor and ray-head-monitor are all set.
image

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>
Copy link
Member

@kevin85421 kevin85421 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind adding a test?

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>
@Yicheng-Lu-llll Yicheng-Lu-llll force-pushed the addDefaultMetricsPortToHeadService branch from 98673a2 to 09bda5e Compare March 17, 2023 22:22
Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>
Copy link
Member

@kevin85421 kevin85421 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kevin85421 kevin85421 merged commit 19ddf04 into ray-project:master Mar 18, 2023
lowang-bh pushed a commit to lowang-bh/kuberay that referenced this pull request Sep 24, 2023
…dition (ray-project#963)

Fix issue with head pod not monitered by Prometheus under certain condition
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants