Validate all env. vars. before starting injecting env. vars #1141

Contributor

@avadhut123pisal avadhut123pisal commented Oct 5, 2022

This PR validates the environment variables before mutating the actual container. If the validation step fails, the subsequent steps (common environment variable injection and OTEL SDK configuration) are skipped.

Closes #1094
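
A minimal sketch of the validate-then-mutate idea, assuming simplified stand-ins for the Kubernetes types. `EnvVar`, `Container`, `validateEnv`, `hasEnv`, and `injectEnv` are hypothetical names for illustration, and the `ValueFrom` rule is an assumed validation criterion, not necessarily the one this PR implements:

```go
package main

import "fmt"

// EnvVar and Container are simplified stand-ins for the Kubernetes
// corev1 types; they are assumptions made to keep the sketch self-contained.
type EnvVar struct {
	Name      string
	Value     string
	ValueFrom *string // non-nil when the value comes from a reference
}

type Container struct {
	Name string
	Env  []EnvVar
}

// validateEnv returns an error if any named variable is already defined
// through ValueFrom. The rule itself is illustrative.
func validateEnv(env []EnvVar, names ...string) error {
	for _, name := range names {
		for _, e := range env {
			if e.Name == name && e.ValueFrom != nil {
				return fmt.Errorf("env var %s is defined via ValueFrom", name)
			}
		}
	}
	return nil
}

// hasEnv reports whether the container already defines the variable.
func hasEnv(env []EnvVar, name string) bool {
	for _, e := range env {
		if e.Name == name {
			return true
		}
	}
	return false
}

// injectEnv validates every variable first and mutates the container only
// when all checks pass, so a failure never leaves it half-configured.
func injectEnv(c *Container, vars []EnvVar) error {
	names := make([]string, 0, len(vars))
	for _, v := range vars {
		names = append(names, v.Name)
	}
	if err := validateEnv(c.Env, names...); err != nil {
		return err // nothing has been mutated yet
	}
	for _, v := range vars {
		if !hasEnv(c.Env, v.Name) {
			c.Env = append(c.Env, v)
		}
	}
	return nil
}

func main() {
	ref := "some-configmap-key"
	c := Container{Name: "app", Env: []EnvVar{{Name: "DOTNET_SHARED_STORE", ValueFrom: &ref}}}
	err := injectEnv(&c, []EnvVar{
		{Name: "DOTNET_STARTUP_HOOKS", Value: "/hooks"},
		{Name: "DOTNET_SHARED_STORE", Value: "/store"},
	})
	fmt.Println("error:", err, "env count:", len(c.Env))
}
```

Because validation runs over the full list before the first append, a rejected variable at the end of the list cannot leave earlier variables injected.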

@avadhut123pisal avadhut123pisal requested a review from a team as a code owner October 5, 2022 10:34
@avadhut123pisal avadhut123pisal marked this pull request as draft October 5, 2022 12:32
},
}

for _, test := range tests {
t.Run(test.name, func(t *testing.T) {
- pod := injectDotNetSDK(logr.Discard(), test.DotNet, test.pod, 0)
+ pod, sdkInjectionSkipped := injectDotNetSDK(logr.Discard(), test.DotNet, test.pod, 0)
Contributor

Can you change the contract to the positive value? sdkInjectionSkipped -> sdkInjected. It will require changes in all places.

BTW, the current contract reads like isDisabled=true, which is usually hard to understand and maintain.
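
A small sketch of the difference at the call site. Both helper functions are hypothetical and only model the two contracts; they are not the PR's code:

```go
package main

import "fmt"

// injectionSkipped models the negative contract: true means nothing was injected.
func injectionSkipped(valid bool) bool { return !valid }

// injected models the positive contract: true means the SDK was injected.
func injected(valid bool) bool { return valid }

func main() {
	valid := true

	// Negative contract: the common path needs a negation to read correctly.
	if !injectionSkipped(valid) {
		fmt.Println("configure SDK (negative contract)")
	}

	// Positive contract: the condition states directly what happened.
	if injected(valid) {
		fmt.Println("configure SDK (positive contract)")
	}
}
```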

Contributor Author

Done.

Member

Feel free to resolve.

@avadhut123pisal avadhut123pisal changed the title Skip env var injection and OTEL SDK configurations if agent injection is skipped Validate all env. vars. before starting injecting env. vars Oct 8, 2022
@avadhut123pisal avadhut123pisal marked this pull request as ready for review October 8, 2022 07:54
// caller checks if there is at least one container.
container := &pod.Spec.Containers[index]

// validate container environment variables.
Contributor

I think that most of the comments are redundant.

IMO `// caller checks if there is at least one container.` is a valid comment, but putting `// validate container environment variables.` in front of `validateContainerEnv` does not make sense.

Please check the whole PR in this context.

Contributor Author

Done

Member

Feel free to resolve

// validate container environment variables.
err := validateContainerEnv(container.Env, envDotNetStartupHook, envDotNetAdditionalDeps, envDotNetSharedStore)
if err != nil {
logger.Info("Skipping DotNet SDK injection", "reason:", err.Error(), "container Name", container.Name)
Member

I think this is how logr should be used

Suggested change
- logger.Info("Skipping DotNet SDK injection", "reason:", err.Error(), "container Name", container.Name)
+ logger.Info("Skipping DotNet SDK injection", "reason", err.Error(), "container", container.Name)
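
To illustrate why the keys matter, here is a sketch with a hypothetical `render` function. It is not logr; it only mirrors the `Info(msg string, keysAndValues ...any)` calling convention and assumes a simple `key="value"` output format:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// render is a hypothetical stand-in for a structured logger sink.
// It prints the message followed by key="value" pairs.
func render(msg string, keysAndValues ...any) string {
	var b strings.Builder
	b.WriteString(msg)
	for i := 0; i+1 < len(keysAndValues); i += 2 {
		fmt.Fprintf(&b, " %v=%q", keysAndValues[i], keysAndValues[i+1])
	}
	return b.String()
}

func main() {
	err := errors.New("env var is defined via ValueFrom")

	// Keys with punctuation or spaces end up verbatim in the structured output.
	fmt.Println(render("Skipping DotNet SDK injection", "reason:", err.Error(), "container Name", "app"))

	// Short identifier-like keys stay easy to query and filter.
	fmt.Println(render("Skipping DotNet SDK injection", "reason", err.Error(), "container", "app"))
}
```

Keys are treated as field names rather than as free text, so a trailing colon or an embedded space becomes part of the field name.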

Contributor Author

Done

Member

Feel free to resolve

err := validateContainerEnv(container.Env, envDotNetStartupHook, envDotNetAdditionalDeps, envDotNetSharedStore)
if err != nil {
logger.Info("Skipping DotNet SDK injection", "reason:", err.Error(), "container Name", container.Name)
return pod, false
Member

I would prefer that we return an error instead of a bool. The caller is already doing some logging.

Contributor Author

@pellared I didn't get your point (The caller is already doing some logging).

Member

See: https://github.com/avadhut123pisal/opentelemetry-operator/blob/2a6569c3b168e9f80dda10916d96579bcf4993d5/pkg/instrumentation/sdk.go#L102-L103

I also think that as a rule of thumb the function should either log or return an error. Returning false looks like returning an error without a description.

Member

What is more, if injectDotNetSDK did not log, the logger would not be needed as an argument, and the signature would become:

func injectDotNetSDK(dotNetSpec v1alpha1.DotNet, pod corev1.Pod, index int) (corev1.Pod, error)
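
A sketch of that split of responsibilities, assuming simplified stand-in types. `Pod`, `Container`, `injectSDK`, `inject`, and `errAlreadyDefined` are hypothetical, and the "already defined" rule is only an illustrative failure condition:

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// Container and Pod are simplified stand-ins for the corev1 types.
type Container struct {
	Name string
	Env  []string // env var names only, to keep the sketch small
}

type Pod struct{ Containers []Container }

// errAlreadyDefined is a hypothetical sentinel error for this sketch.
var errAlreadyDefined = errors.New("env var already defined")

// injectSDK takes no logger: it either mutates the container or returns
// an error that describes why it did not.
func injectSDK(pod Pod, index int) (Pod, error) {
	c := &pod.Containers[index]
	for _, name := range c.Env {
		if name == "DOTNET_STARTUP_HOOKS" {
			return pod, fmt.Errorf("%s: %w", name, errAlreadyDefined)
		}
	}
	c.Env = append(c.Env, "DOTNET_STARTUP_HOOKS")
	return pod, nil
}

// inject is the single place that logs; the error carries the reason.
func inject(pod Pod, index int, logf func(format string, args ...any)) Pod {
	pod, err := injectSDK(pod, index)
	if err != nil {
		logf("Skipping SDK injection: reason=%q container=%q", err.Error(), pod.Containers[index].Name)
	}
	return pod
}

func main() {
	pod := Pod{Containers: []Container{{Name: "app", Env: []string{"DOTNET_STARTUP_HOOKS"}}}}
	pod = inject(pod, 0, log.Printf)
	fmt.Println(len(pod.Containers[0].Env))
}
```

With this shape the callee is easy to unit test on its return value alone, and the message is logged once by the caller.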

Contributor Author

If we return the error from injectDotNetSDK and do not log inside it, then on the caller side, https://github.com/avadhut123pisal/opentelemetry-operator/blob/2a6569c3b168e9f80dda10916d96579bcf4993d5/pkg/instrumentation/sdk.go#L103
should we use that err value just to handle the condition and log the message, like this?

		pod, err = injectJavaagent(i.logger, otelinst.Spec.Java, pod, index)
		if err != nil {
			i.logger.Info("Skipping javaagent injection", "reason", err.Error(), "container", pod.Spec.Containers[index].Name)
			return pod
		}

Because if we want to propagate the error further up the call stack, we need to modify the signature of the inject function too.

Member

That is correct

Contributor Author

Done

Member

Feel free to resolve

return pod, false
}

// inject .Net instrumentation spec env vars.
Member

typo:

Suggested change
- // inject .Net instrumentation spec env vars.
+ // inject .NET instrumentation spec env vars.

Contributor Author

Done

Member

Feel free to resolve

}

- func trySetEnvVar(logger logr.Logger, container *corev1.Container, envVarName string, envVarValue string, concatValues bool) bool {
+ // set env var to the container.
+ func setDotNetEnvVar(container *corev1.Container, envVarName string, envVarValue string, concatValues bool) {
Member

I suggest describing what concatValues is supposed to offer.
AFAIK it should be set to true if the env var supports multiple values separated by `:`. If it is set to false, the original container's env var value has priority.
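
A sketch of those semantics, assuming simplified stand-in types. `setEnvVar`, `EnvVar`, `Container`, and the `concatEnvValues` constant are hypothetical names; only `doNotConcatEnvValues` appears in the diff:

```go
package main

import "fmt"

// EnvVar and Container are simplified stand-ins for the corev1 types.
type EnvVar struct{ Name, Value string }

type Container struct{ Env []EnvVar }

const (
	concatEnvValues      = true  // hypothetical counterpart, named for symmetry
	doNotConcatEnvValues = false // name taken from the diff
)

// setEnvVar adds the variable when it is absent. When it is present,
// concatValues decides whether the new value is appended after a ':'
// separator or the container's own value is left as it is.
func setEnvVar(c *Container, name, value string, concatValues bool) {
	for i := range c.Env {
		if c.Env[i].Name != name {
			continue
		}
		if concatValues {
			c.Env[i].Value = c.Env[i].Value + ":" + value
		}
		return // without concatenation the container's own value wins
	}
	c.Env = append(c.Env, EnvVar{Name: name, Value: value})
}

func main() {
	c := Container{Env: []EnvVar{
		{Name: "DOTNET_STARTUP_HOOKS", Value: "/app/hook.dll"},
		{Name: "OTEL_DOTNET_AUTO_HOME", Value: "/custom"},
	}}
	setEnvVar(&c, "DOTNET_STARTUP_HOOKS", "/otel/hook.dll", concatEnvValues)
	setEnvVar(&c, "OTEL_DOTNET_AUTO_HOME", "/otel", doNotConcatEnvValues)
	for _, e := range c.Env {
		fmt.Printf("%s=%s\n", e.Name, e.Value)
	}
}
```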

Contributor Author

Done

Member

Feel free to resolve

- if !trySetEnvVar(logger, &container, envDotNetOTelAutoHome, dotNetOTelAutoHomePath, doNotConcatEnvValues) {
- return pod
- }
+ setDotNetEnvVar(container, envDotNetOTelAutoHome, dotNetOTelAutoHomePath, doNotConcatEnvValues)
Member

@pellared pellared Oct 10, 2022

I think that we should additionally validate that the dotNetOTelAutoHomePath env var was not set in the original container. Otherwise, we cannot auto-instrument the .NET app. If it is already set, somebody has already set up .NET auto-instrumentation in the container.

@Kielek do you agree? I think it would be better addressed in a separate issue/PR.

Contributor Author

Yeah. We should address this one in a separate PR specific to .NET.

Member

I created #1156. Feel free to resolve this comment.

Member

@pavolloffay pavolloffay left a comment

It would be great to emit a k8s event if the injection fails.

@avadhut123pisal
Contributor Author

> It would be great to emit a k8s event if the injection fails.

Yes. I will raise a separate PR for that, as it would need changes in other places to get access to the event recorder.

@avadhut123pisal avadhut123pisal requested review from pellared and Kielek and removed request for pellared and Kielek October 10, 2022 15:48
@pavolloffay
Member

@pellared could you please review as well?

@pavolloffay
Member

or @Kielek could you please review?

pod, err = injectJavaagent(otelinst.Spec.Java, pod, index)
if err != nil {
i.logger.Info("Skipping javaagent injection", "reason", err.Error(), "container", pod.Spec.Containers[index].Name)
return pod
Member

@pellared pellared Oct 10, 2022

This return would skip other instrumentations from being processed. I see it as a bug.

PS. It would be good to add a unit test to make sure that such a bug would not be introduced in the future.

Contributor Author

As I understand it, the inject function is called for a single container at a time, so only one language instrumentation should be required for that particular container.

@pellared Please let me know, if I'm missing something.

Member

> only one language instrumentation should be required for that particular container

I do not see any docs, code, or reason that would disallow injecting more language instrumentations. You can have a container that has more than one process.

Contributor Author

Agreed.

Contributor Author

@pavolloffay Looking at the current implementation, it seems that using multiple instrumentations for the same container will not work. One reason is the duplicate volume mounts, because we use the same mount path. Other things might also break in the context of the init container.

I tried adding the annotations for two different language instrumentations, and it failed with the error:
Error creating: Pod "spring-petclinic-5d6d58d9b8-pp268" is invalid: spec.containers[0].volumeMounts[2].mountPath: Invalid value: "/otel-auto-instrumentation": must be unique
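
The uniqueness rule behind that error can be sketched as a pre-check. `VolumeMount` is a simplified stand-in for the corev1 type, and `firstDuplicateMountPath` and the volume names are hypothetical; the operator does not necessarily perform such a check:

```go
package main

import "fmt"

// VolumeMount is a simplified stand-in for corev1.VolumeMount.
type VolumeMount struct{ Name, MountPath string }

// firstDuplicateMountPath returns the first mount path that occurs more
// than once in the list, which is what the API server rejects.
func firstDuplicateMountPath(mounts []VolumeMount) (string, bool) {
	seen := make(map[string]struct{}, len(mounts))
	for _, m := range mounts {
		if _, ok := seen[m.MountPath]; ok {
			return m.MountPath, true
		}
		seen[m.MountPath] = struct{}{}
	}
	return "", false
}

func main() {
	// Two instrumentations that both mount at the same path (volume names are made up).
	mounts := []VolumeMount{
		{Name: "app-data", MountPath: "/data"},
		{Name: "instrumentation-a", MountPath: "/otel-auto-instrumentation"},
		{Name: "instrumentation-b", MountPath: "/otel-auto-instrumentation"},
	}
	if path, dup := firstDuplicateMountPath(mounts); dup {
		fmt.Println("duplicate mount path:", path)
	}
}
```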

Contributor Author

@pellared Considering the current implementation (multiple instrumentations for a single pod), I don't think the return statement would cause any issue.

Member

> The goal we had was to support multiple instrumentations for a single pod

@avadhut123pisal I suggest doing the following:

  1. Change the inject function so that it behaves as if multiple instrumentations for a single pod were supported (e.g. by using else instead of return pod in case of an error).
  2. Create an issue.
  3. Document it in README.md as a known issue.

@pavolloffay Does it seem reasonable?
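
A sketch of step 1, assuming a loop over instrumentations for brevity (the real inject function is not necessarily structured as a loop). `Pod`, `injector`, `failing`, `succeeding`, and `injectAll` are hypothetical names:

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// Pod is a simplified stand-in that records which instrumentations were applied.
type Pod struct{ Applied []string }

// injector pairs a language name with its injection function.
type injector struct {
	name string
	fn   func(Pod) (Pod, error)
}

// failing builds an injector whose validation always fails.
func failing(name string) injector {
	return injector{name, func(p Pod) (Pod, error) {
		return p, errors.New("validation failed")
	}}
}

// succeeding builds an injector that records itself on the pod.
func succeeding(name string) injector {
	return injector{name, func(p Pod) (Pod, error) {
		p.Applied = append(p.Applied, name)
		return p, nil
	}}
}

// injectAll logs a failed injection and moves on to the next one
// instead of returning early, so later instrumentations still run.
func injectAll(pod Pod, injectors []injector, logf func(format string, args ...any)) Pod {
	for _, in := range injectors {
		next, err := in.fn(pod)
		if err != nil {
			logf("Skipping %s injection: %v", in.name, err)
		} else {
			pod = next
		}
	}
	return pod
}

func main() {
	pod := injectAll(Pod{}, []injector{failing("java"), succeeding("python")}, log.Printf)
	fmt.Println(pod.Applied)
}
```

A unit test along these lines, with one failing and one succeeding instrumentation, would also guard against the early-return bug being reintroduced.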

Contributor Author

> The goal we had was to support multiple instrumentations for a single pod (e.g. one container java and other python etc.)
>
> @avadhut123pisal feel free to book an issue to resolve this limitation if it is important or open a PR to document this in the readme.

Sure!

Member

> @pellared Considering the current implementation (multiple instrumentations for a single pod), I don't think the return statement would cause any issue.

The code is written in an unmaintainable way. The "error handling" suggests that it works only for one instrumentation. The "happy path scenario" suggests that it supports multiple instrumentations.

Contributor Author

> > @pellared Considering the current implementation (multiple instrumentations for a single pod), I don't think the return statement would cause any issue.
>
> The code is written in an unmaintainable way. The "error handling" suggests that it works only for one instrumentation. The "happy path scenario" suggests that it supports multiple instrumentations.

Yeah. I got your point :)

Member

@pellared pellared left a comment

…avadhut123pisal/opentelemetry-operator into prevent-incomplete-auto-instrumentation
Member

@pellared pellared left a comment

LGTM (but I was not testing it 😬 )

@pavolloffay
Member

I am merging this based on the approvals.

@avadhut123pisal / @pellared please book the issue to simplify the injection code as discussed before.

@avadhut123pisal
Contributor Author

> I am merging this based on the approvals.
>
> @avadhut123pisal / @pellared please book the issue to simplify the injection code as discussed before.

#1158

ItielOlenick pushed a commit to ItielOlenick/opentelemetry-operator that referenced this pull request May 1, 2024
…emetry#1141)

* skips env var injection and sdk configurations if agent injection is skipped

* mutate container at the last of SDK injection step

* validate first and then mutate the container with env variables

* fixes go lint issues

* incorporates review comments

* fixes go lint issue

* removes return statement in case of failed instrumentation
Development

Successfully merging this pull request may close these issues.

.NET Intrumentation - validate all env. vars. before start injecting env. vars.
4 participants