Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Put EnableInjectionOrDie back on the main path #1772

Merged
merged 9 commits into from
Oct 6, 2020

Conversation

n3wscott
Copy link
Contributor

@n3wscott n3wscott commented Oct 5, 2020

Supersedes #1757
Reverts #1767

@googlebot googlebot added the cla: yes Indicates the PR's author has signed the CLA. label Oct 5, 2020
@knative-prow-robot knative-prow-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 5, 2020
@codecov
Copy link

codecov bot commented Oct 5, 2020

Codecov Report

❗ No coverage uploaded for pull request base (master@2478414). Click here to learn what that means.
The diff coverage is 14.70%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #1772   +/-   ##
=========================================
  Coverage          ?   68.23%           
=========================================
  Files             ?      218           
  Lines             ?     9295           
  Branches          ?        0           
=========================================
  Hits              ?     6342           
  Misses            ?     2637           
  Partials          ?      316           
Impacted Files Coverage Δ
codegen/cmd/injection-gen/generators/packages.go 0.00% <0.00%> (ø)
injection/config.go 0.00% <0.00%> (ø)
injection/injection.go 0.00% <0.00%> (ø)
test/helpers/dryrun.go 0.00% <0.00%> (ø)
test/helpers/name.go 94.28% <ø> (ø)
test/interactive/docker.go 42.42% <0.00%> (ø)
test/mako/config/benchmark.go 0.00% <0.00%> (ø)
test/mako/config/environment.go 0.00% <0.00%> (ø)
apis/duck/v1/binding_types.go 6.66% <6.66%> (ø)
apis/duck/v1/register.go 100.00% <100.00%> (ø)
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2478414...ff42410. Read the comment docs.

@n3wscott
Copy link
Contributor Author

n3wscott commented Oct 5, 2020

/retest

}

untyped := ctx.Value(informerStartChanKey{})
if untyped == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given return is interface{} this might not always work.
https://medium.com/@mangatmodi/go-check-nil-interface-the-right-way-d142776edef1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

@zhongduo
Copy link
Contributor

zhongduo commented Oct 6, 2020

Hi @n3wscott,

I think this PR needs some discussion. I understand you are trying to avoid changing the function signature to maintain compatibility, but I believe API compatibility also includes underlying logic too and this PR changes the logic considerably IMO. If I understand correctly, with this PR:

  1. For users other than shared_main, the informer start with be delayed for 5 seconds.
  2. For shared_main, it chooses to start the informer after 5 seconds or after controller setup, whichever comes first.

For 1, though I know the timeout is configurable, I do believe most users won't notice this change until they encounter a problem. I am afraid that we are making the issue harder to reproduce to most users and the magic 5 seconds can cause some unexpected problems. Eg, in some case if the controller setup takes about 5 seconds for whatever reason, they will experience some flakiness depending on the actual runtime > 5s or < 5s. Another possible problem is that some logic might be assuming informer start at that time, and this delay start will be quite difficult to debug. So to me it is not much different from the original logic, which starts the informer immediately.

For 2 or any other users that are aware of this problem and decide to start it manually, they can easily just use existing code without this PR. If we want to keep API and at the same time avoid duplicate code, here is what I would suggest:
a) Break the existing EnableInjectionOrDie to two parts: BeforeStartInformer and StartInfomer
b) Rewrite EnableInjectionOrDie to BeforeStartInformer + go StartInformer, which is same logic as the original EnableInjectionOrDie.
c) For shared_main or any users that want to start manually, we can do: BeforeStartInformer + do sth + StartInformer

Thanks,
Jimmy

@zhongduo
Copy link
Contributor

zhongduo commented Oct 6, 2020

I have updated #1757 with this idea, which will keep API compatibility, could you please see if that can be an alternative solution too?

@knative-prow-robot knative-prow-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 6, 2020
Copy link
Member

@mattmoor mattmoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Produced via:
gofmt -s -w $(find -path './vendor' -prune -o -path './third_party' -prune -o -name '*.pb.go' -prune -o -type f -name '*.go' -print)
goimports -w $(find -name '*.go' | grep -v vendor | grep -v third_party | grep -v .pb.go | grep -v wire_gen.go)

Copy link
Member

@mattmoor mattmoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Produced via:
gofmt -s -w $(find -path './vendor' -prune -o -path './third_party' -prune -o -name '*.pb.go' -prune -o -type f -name '*.go' -print)
goimports -w $(find -name '*.go' | grep -v vendor | grep -v third_party | grep -v .pb.go | grep -v wire_gen.go)

@n3wscott
Copy link
Contributor Author

n3wscott commented Oct 6, 2020

Hi @n3wscott,

I think this PR needs some discussion. I understand you are trying to avoid changing the function signature to maintain compatibility, but I believe API compatibility also includes underlying logic too and this PR changes the logic considerably IMO. If I understand correctly, with this PR:

  1. For users other than shared_main, the informer start with be delayed for 5 seconds.
  2. For shared_main, it chooses to start the informer after 5 seconds or after controller setup, whichever comes first.

For 1, though I know the timeout is configurable, I do believe most users won't notice this change until they encounter a problem. I am afraid that we are making the issue harder to reproduce to most users and the magic 5 seconds can cause some unexpected problems. Eg, in some case if the controller setup takes about 5 seconds for whatever reason, they will experience some flakiness depending on the actual runtime > 5s or < 5s. Another possible problem is that some logic might be assuming informer start at that time, and this delay start will be quite difficult to debug. So to me it is not much different from the original logic, which starts the informer immediately.

For 2 or any other users that are aware of this problem and decide to start it manually, they can easily just use existing code without this PR. If we want to keep API and at the same time avoid duplicate code, here is what I would suggest:
a) Break the existing EnableInjectionOrDie to two parts: BeforeStartInformer and StartInfomer
b) Rewrite EnableInjectionOrDie to BeforeStartInformer + go StartInformer, which is same logic as the original EnableInjectionOrDie.
c) For shared_main or any users that want to start manually, we can do: BeforeStartInformer + do sth + StartInformer

Thanks,
Jimmy

Thanks for the thoughtful response. I agree that the timeout was strange. I have moved the EnableInjectionOrDie, and changed the signature, it now returns back a callback the caller should call if it wants to start the informers. It is basically BeforeStartInformer + do sth + StartInformer except wrapped up in one call with a callback. I think the naming of BeforeStartInformer and StartInformer is strange and calling StartInformer alone will cause issues.

Using a callback means the caller can not have access to start informers out of order.

@n3wscott
Copy link
Contributor Author

n3wscott commented Oct 6, 2020

I have updated #1757 with this idea, which will keep API compatibility, could you please see if that can be an alternative solution too?

How do you like this way?

@n3wscott
Copy link
Contributor Author

n3wscott commented Oct 6, 2020

/retest

// Start the injection clients and informers.
logging.FromContext(ctx).Info("Starting informers...")
go func(ctx context.Context) {
go func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this go function? I think it is safer to let it finishes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be deleted asap after knative rolls to the new one, shared main now calls down to the new version. this method is not changed and is legacy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My original question is about using go func or not in the old function, but I guess the whole function will be removed anyway.

// EnableInjectionOrDie enables Knative Injection and starts the informers.
// Both Context and Config are optional. Returns context with rest config set
// and a function to start the informers after watches have been set up.
func EnableInjectionOrDie(ctx context.Context, cfg *rest.Config) (context.Context, func()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me, though the name is a little bit confusing as the original EnableInjectionOrDie is exported too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that is intended, the original one is in a different package and marked deprecated.

@@ -116,8 +117,7 @@ func quack(ctx context.Context, kc kubernetes.Interface, component string, leade
}

func main() {
ctx := signals.NewContext()
ctx = sharedmain.EnableInjectionOrDie(ctx, nil)
ctx, _ := injection.EnableInjectionOrDie(signals.NewContext(), nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have to do this, that means we are really trying to avoid calling the original function. I would recommend to have a roadmap to deprecate the original function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is marked deprecated and documented redirected to the new one version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's awesome. Is this on purpose not to call the start function here? Note that the original logic will have informers started here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might skip your attention. Otherwise LGTM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked and the duck does not use informers so it is not needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking.

"k8s.io/client-go/rest"
)

// EnableInjectionOrDie enables Knative Injection and starts the informers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove starts the informers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

// Start the injection clients and informers.
logging.FromContext(ctx).Info("Starting informers...")
go func(ctx context.Context) {
go func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My original question is about using go func or not in the old function, but I guess the whole function will be removed anyway.

@zhongduo
Copy link
Contributor

zhongduo commented Oct 6, 2020

Hi @n3wscott,
I think this PR needs some discussion. I understand you are trying to avoid changing the function signature to maintain compatibility, but I believe API compatibility also includes underlying logic too and this PR changes the logic considerably IMO. If I understand correctly, with this PR:

  1. For users other than shared_main, the informer start with be delayed for 5 seconds.
  2. For shared_main, it chooses to start the informer after 5 seconds or after controller setup, whichever comes first.

For 1, though I know the timeout is configurable, I do believe most users won't notice this change until they encounter a problem. I am afraid that we are making the issue harder to reproduce to most users and the magic 5 seconds can cause some unexpected problems. Eg, in some case if the controller setup takes about 5 seconds for whatever reason, they will experience some flakiness depending on the actual runtime > 5s or < 5s. Another possible problem is that some logic might be assuming informer start at that time, and this delay start will be quite difficult to debug. So to me it is not much different from the original logic, which starts the informer immediately.
For 2 or any other users that are aware of this problem and decide to start it manually, they can easily just use existing code without this PR. If we want to keep API and at the same time avoid duplicate code, here is what I would suggest:
a) Break the existing EnableInjectionOrDie to two parts: BeforeStartInformer and StartInfomer
b) Rewrite EnableInjectionOrDie to BeforeStartInformer + go StartInformer, which is same logic as the original EnableInjectionOrDie.
c) For shared_main or any users that want to start manually, we can do: BeforeStartInformer + do sth + StartInformer
Thanks,
Jimmy

Thanks for the thoughtful response. I agree that the timeout was strange. I have moved the EnableInjectionOrDie, and changed the signature, it now returns back a callback the caller should call if it wants to start the informers. It is basically BeforeStartInformer + do sth + StartInformer except wrapped up in one call with a callback. I think the naming of BeforeStartInformer and StartInformer is strange and calling StartInformer alone will cause issues.

Using a callback means the caller can not have access to start informers out of order.

The naming is just for illustration purpose only. :)

@n3wscott
Copy link
Contributor Author

n3wscott commented Oct 6, 2020

/retest

@n3wscott
Copy link
Contributor Author

n3wscott commented Oct 6, 2020

flakes: #1779

@n3wscott n3wscott closed this Oct 6, 2020
@n3wscott n3wscott reopened this Oct 6, 2020
@n3wscott
Copy link
Contributor Author

n3wscott commented Oct 6, 2020

g2g?

/cc @mattmoor

@n3wscott
Copy link
Contributor Author

n3wscott commented Oct 6, 2020

/retest

1 similar comment
@n3wscott
Copy link
Contributor Author

n3wscott commented Oct 6, 2020

/retest

injection/config.go Outdated Show resolved Hide resolved
// 2. Fallback to the KUBECONFIG environment variable.
// 3. Fallback to in-cluster config.
// 4. Fallback to the ~/.kube/config.
func GetRestConfig(serverURL, kubeconfig string) (*rest.Config, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: REST

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ick, how about GetKubeconfig

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is getting a rest.Config...

injection/config.go Outdated Show resolved Hide resolved
@@ -129,12 +130,13 @@ func GetLeaderElectionConfig(ctx context.Context) (*leaderelection.Config, error

// EnableInjectionOrDie enables Knative Injection and starts the informers.
// Both Context and Config are optional.
// Deprecated: use injection.EnableInjectionOrDie
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we change this method to:

   ctx, si := injection.EnableInjectionOrDie(ctx, cfg)
   go si()
   return ctx

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, good idea!

@@ -63,6 +63,7 @@ import (
// 2. Fallback to the KUBECONFIG environment variable.
// 3. Fallback to in-cluster config.
// 4. Fallback to the ~/.kube/config.
// Deprecated: use injection.GetRESTConfig
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this call the other method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea.

Copy link
Member

@mattmoor mattmoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 6, 2020
@mattmoor
Copy link
Member

mattmoor commented Oct 6, 2020

/approve

@knative-prow-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mattmoor, n3wscott

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 6, 2020
@knative-prow-robot knative-prow-robot merged commit 46761ba into knative:master Oct 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cla: yes Indicates the PR's author has signed the CLA. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants