
[Refactor][1/n] Replace logrus with logr to keep logging consistent #1835

Merged: 2 commits merged into ray-project:master from remove-logrus on Jan 26, 2024

Conversation

@rueian (Contributor) commented on Jan 16, 2024:

Why are these changes needed?

We currently write logs through two different interfaces: logr and logrus.

Following the discussion in #1793 (comment), we would like to revisit and unify our logging format for further uses that require a unified structured format, such as filtering logs by a k8s namespace.

This PR replaces the usages of logrus with logr, which is recommended by kubebuilder, to unify our logging format. I will add namespace logging in the next PR.

Related issue number

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@@ -87,7 +86,7 @@ func BuildIngressForHeadService(cluster rayv1.RayCluster) (*networkingv1.Ingress
// Get ingress class name from rayCluster annotations. this is a required field to use ingress.
ingressClassName, ok := cluster.Annotations[IngressClassAnnotationKey]
if !ok {
- logrus.Warn(fmt.Sprintf("ingress class annotation is not set for cluster %s/%s", cluster.Namespace, cluster.Name))
+ log.Info(fmt.Sprintf("ingress class annotation is not set for cluster %s/%s", cluster.Namespace, cluster.Name))
rueian (Contributor, Author):

This log variable is initialized at

var log = logf.Log.WithName("controllers").WithName("RayCluster")

@@ -34,7 +34,7 @@ const (
EnableInitContainerInjectionEnvKey = "ENABLE_INIT_CONTAINER_INJECTION"
)

var log = logf.Log.WithName("RayCluster-Controller")
var log = logf.Log.WithName("controllers").WithName("RayCluster")
rueian (Contributor, Author):

Make it consistent with

log := ctrl.Log.WithName("controllers").WithName("RayCluster")

@kevin85421 kevin85421 self-requested a review January 16, 2024 22:05
@kevin85421 kevin85421 self-assigned this Jan 16, 2024
@kevin85421 (Member) left a comment:

Nice! I like this refactor PR. cc @astefanutti would you mind reviewing this PR? Thanks!

@@ -32,6 +31,8 @@ const (
DefaultDomainName = "cluster.local"
)

+ var log = logf.Log.WithName("controllers").WithName("RayCluster")
kevin85421 (Member):

It seems possible for the RayJob/RayService controllers to also call these utility functions.

rueian (Contributor, Author):

It would be better to use log := ctrl.LoggerFrom(ctx), as mentioned by @astefanutti.

@@ -87,7 +86,7 @@ func BuildIngressForHeadService(cluster rayv1.RayCluster) (*networkingv1.Ingress
// Get ingress class name from rayCluster annotations. this is a required field to use ingress.
ingressClassName, ok := cluster.Annotations[IngressClassAnnotationKey]
if !ok {
- logrus.Warn(fmt.Sprintf("ingress class annotation is not set for cluster %s/%s", cluster.Namespace, cluster.Name))
+ log.Info(fmt.Sprintf("ingress class annotation is not set for cluster %s/%s", cluster.Namespace, cluster.Name))
astefanutti (Contributor):

It'd be preferable to avoid string formatting and use the structured form, e.g.:

log.Info("ingress class annotation is not set for cluster", cluster.Namespace, cluster.Name)

Maybe you're already planning to do it in a separate PR.

rueian (Contributor, Author):

Well, I am planning to add the namespace to the structured form, but I am not sure whether we should move all the logging parameters to the structured form. IMHO, the structured fields should serve special purposes only, such as filtering. Moving all parameters out of the message may reduce readability.

astefanutti (Contributor):

Right, having both the formatted message and the structured info sounds reasonable too.
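
A minimal sketch of that combined form (not from this diff; the key names are illustrative):

log.Info(
	fmt.Sprintf("ingress class annotation is not set for cluster %s/%s", cluster.Namespace, cluster.Name),
	"namespace", cluster.Namespace,
	"name", cluster.Name,
)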

@@ -32,6 +31,8 @@ const (
DefaultDomainName = "cluster.local"
)

+ var log = logf.Log.WithName("controllers").WithName("RayCluster")
astefanutti (Contributor):

For the case where utility functions are called by multiple controllers, it may be preferable to use the logger propagated in the context instead, e.g.:

log := ctrl.LoggerFrom(ctx)
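
For instance, a shared helper could pick up whichever controller's logger was propagated through the context (a sketch only; adding the ctx parameter is an assumed signature change, not part of this hunk):

func BuildIngressForHeadService(ctx context.Context, cluster rayv1.RayCluster) (*networkingv1.Ingress, error) {
	log := ctrl.LoggerFrom(ctx) // logger injected by whichever controller is calling
	log.Info("building ingress for head service", "namespace", cluster.Namespace, "name", cluster.Name)
	// ... build and return the ingress as before ...
}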

@@ -87,7 +86,7 @@ func BuildIngressForHeadService(cluster rayv1.RayCluster) (*networkingv1.Ingress
// Get ingress class name from rayCluster annotations. this is a required field to use ingress.
ingressClassName, ok := cluster.Annotations[IngressClassAnnotationKey]
if !ok {
- logrus.Warn(fmt.Sprintf("ingress class annotation is not set for cluster %s/%s", cluster.Namespace, cluster.Name))
+ log.Info(fmt.Sprintf("ingress class annotation is not set for cluster %s/%s", cluster.Namespace, cluster.Name))
astefanutti (Contributor):

Note, there is the logger provided by controller-runtime that can be retrieved from the context, e.g.:

log := ctrl.LoggerFrom(ctx)

This contains the reconciliation request id, and can be customised with common key / value pairs (e.g. namespace and name) with the LogConstructor option of the controller builder.

But a global variable can be a good approximation at the moment.
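
For reference, a sketch of that LogConstructor customisation (illustrative wiring, not from this PR):

import (
	"github.com/go-logr/logr"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/controller"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"
)

func (r *RayClusterReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&rayv1.RayCluster{}).
		WithOptions(controller.Options{
			// Attach common key/value pairs to the logger of every reconciliation.
			LogConstructor: func(req *reconcile.Request) logr.Logger {
				log := ctrl.Log.WithName("controllers").WithName("RayCluster")
				if req != nil {
					log = log.WithValues("namespace", req.Namespace, "name", req.Name)
				}
				return log
			},
		}).
		Complete(r)
}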

@rueian rueian force-pushed the remove-logrus branch 4 times, most recently from ea122a1 to 4bbbc3a Compare January 20, 2024 06:32
@rueian (Contributor, Author) commented on Jan 20, 2024:

Hi @astefanutti, @kevin85421

I have changed it to use log := ctrl.LoggerFrom(ctx) in 4bbbc3a.

@kevin85421 (Member):

Would you mind rebasing with the master branch? Thanks!

@astefanutti (Contributor) left a comment:

LGTM, thanks!

@@ -164,6 +164,7 @@ type RayClusterReconcilerOptions struct {
// Reconcile used to bridge the desired state with the current state
func (r *RayClusterReconciler) Reconcile(ctx context.Context, request ctrl.Request) (ctrl.Result, error) {
var err error
+ ctx = ctrl.LoggerInto(ctx, r.Log) // TODO: add reqeust namespace here
astefanutti (Contributor):

nit: "reqeust" -> "request"

rueian (Contributor, Author):

Thanks!

@kevin85421 (Member) left a comment:

I have several questions:

  • All RayCluster, RayJob, and RayService reconcilers seem to be running in a single process.
  • (Not sure) Because the default value of config.ReconcileConcurrency is 1, it seems to be only 1 thread for all reconcilers.
  • What would happen if config.ReconcileConcurrency were set to more than 1? If the RayCluster and RayService reconcilers were to run concurrently and both call ctx = ctrl.LoggerInto(ctx, r.Log), could that lead to a race condition when calling ctrl.LoggerFrom(ctx)?

@astefanutti (Contributor):

I have several questions:

  • All RayCluster, RayJob, and RayService reconcilers seem to be running in a single process.
  • (Not sure) Because the default value of config.ReconcileConcurrency is 1, it seems to be only 1 thread for all reconcilers.

The MaxConcurrentReconciles option controls the maximum number of goroutines that concurrently process items from a controller's event queue. This is per controller, so at the moment, with MaxConcurrentReconciles defaulted to 1, each of the RayCluster, RayJob, and RayService controllers has its Reconcile method called by a single goroutine, but independently from the others.
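
For example, the concurrency could be raised per controller (a sketch; the value 4 is arbitrary):

ctrl.NewControllerManagedBy(mgr).
	For(&rayv1.RayCluster{}).
	WithOptions(controller.Options{MaxConcurrentReconciles: 4}). // workers for this controller only
	Complete(r)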

  • What would happen if config.ReconcileConcurrency were set to more than 1? If the RayCluster and RayService reconcilers were to run concurrently and both call ctx = ctrl.LoggerInto(ctx, r.Log), could that lead to a race condition when calling ctrl.LoggerFrom(ctx)?

ctrl.LoggerInto returns a new context that is a child of the provided one, so the logger that is set is local to a single Reconcile method call and to the methods further down the call stack where the context is passed as an argument. Every concurrent invocation therefore has its own logger instance.
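
In other words (a sketch of the semantics; myLogger is a hypothetical logr.Logger):

parent := context.Background()
child := ctrl.LoggerInto(parent, myLogger) // derives a child context; parent is untouched
_ = ctrl.LoggerFrom(child)  // returns myLogger
_ = ctrl.LoggerFrom(parent) // still returns the default logger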

@rueian (Contributor, Author) commented on Jan 26, 2024:

Thank you @astefanutti for your detailed explanation!

Hi @kevin85421, the ctx variable is independent for each Reconcile(ctx context.Context, request ctrl.Request) call, and ctrl.LoggerInto does not modify the original context. Therefore, there is no race between concurrent calls to ctx = ctrl.LoggerInto(ctx, r.Log) and ctrl.LoggerFrom(ctx).

Here is a quick concurrent test to simulate the situation:

import (
	"context"
	"testing"

	ctrl "sigs.k8s.io/controller-runtime"
)

// test simulates a Reconcile call: it derives a child context carrying a logger and reads it back.
func test(ctx context.Context) {
	ctx = ctrl.LoggerInto(ctx, ctrl.Log)
	_ = ctrl.LoggerFrom(ctx)
}

func BenchmarkRaceLoggerInto(b *testing.B) {
	ctx := context.Background()
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			test(ctx)
		}
	})
}

Then run it with the race detector enabled (go test ./... -count=1 -race -bench BenchmarkRaceLoggerInto):

▶ go test ./... -count=1 -race -bench BenchmarkRaceLoggerInto
goos: darwin
goarch: arm64
BenchmarkRaceLoggerInto-10    	 1000000	      1055 ns/op
PASS
ok  	2.901s

As you can see, there is no race reported.

@kevin85421 (Member) left a comment:

Thanks @rueian and @astefanutti for the explanation! It is pretty helpful!

@kevin85421 kevin85421 merged commit 19e58b2 into ray-project:master Jan 26, 2024
23 checks passed