Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the trace command for performance #5707

Open
gotwarlost opened this issue May 16, 2024 · 5 comments · May be fixed by #5726
Open

Improve the trace command for performance #5707

gotwarlost opened this issue May 16, 2024 · 5 comments · May be fixed by #5726
Milestone

Comments

@gotwarlost
Copy link

What problem are you facing?

We have a "facade" composition that creates ~500 resources including secrets. crossplane beta trace takes minutes to run for this. What we really want is for this to run in seconds so that we can do stuff like:

$ watch crossplane beta trace my-type/my-resource

and have it be a meaningful experience.

How could Crossplane help solve your problem?

Make the command run faster.

So, internally in our company, we have a copy of the internal packages that is used to implement crossplane beta trace (resource, xrm) etc. and we have modified them to get the performance we need. We found that there are 2 parts to improving the perf of this tool:

  • Support a --concurrency option that allows dependencies to be loaded in parallel rather than the one-at-a-time approach that is currently in place.
  • Add support for controlling the QPS and burst of the k8s client used to fetch resources so that it doesn't become the bottleneck for this.

I'm happy to contribute this code upstream if there is interest.

It would be nice if, in return, I get some of these packages to not be internal such that I can write a something-like-crossplane-trace tool on top of it, without copying a bunch of code.

If you've read so far...

you might as well read the rest.

The forked version of our command does a few more things than what is offered by crossplane beta trace. These are provider-specific and I imagine you wouldn't want this to be in the core command. But these have been super-helpful from a practical POV and we wouldn't want to lose that. That's why I'd like the internal packages exposed for use by other similar tools.

What we do in our tool that is different is:

  • For Object resources created by the K8s provider, we peek into the status of the manifest of the remote object to get its "true" status and messaging
  • We also look for the AsyncOperationFailure condition that is exposed by (at least) the AWS provider so that our tests can check for a managed resource to be Synced and Ready and not have had any async operation failures.
@gotwarlost gotwarlost added the enhancement New feature or request label May 16, 2024
@gotwarlost gotwarlost changed the title Improve the trace command for performance (and possibly other things) Improve the trace command for performance May 16, 2024
@negz
Copy link
Member

negz commented May 20, 2024

I'm happy to contribute this code upstream if there is interest.

There definitely is. This contribution would be welcome. 🙂

It would be nice if, in return, I get some of these packages to not be internal such that I can write a something-like-crossplane-trace tool on top of it, without copying a bunch of code.

This I'm less sure about. I've generally been pretty reluctant to open c/c code up for use as a library, because it creates a new support vector for the maintainers. Once a package is public, https://www.hyrumslaw.com kicks in and folks start to have expectations around how it will work. This means we can't make changes to internal APIs without potentially breaking someone.

I'm not completely against it, but generally I want to see that there's demand from more than a handful of potential consumers before we commit to expose code as a supported library. If there's only one or two consumers, I think it generally is faster for everyone to just make a copy.

What we do in our tool that is different is

I'm not super familiar with how beta trace is implemented, but this makes me wonder if it could support provider-aware plugins for things like this.

@stevendborrelli
Copy link
Contributor

The issue to export trace libraries is here #5547, if you want to comment on it.

@negz
Copy link
Member

negz commented May 20, 2024

Thanks @stevendborrelli I forgot about that issue. I suppose that's at least two consumers of a hypothetical CLI library. 🙂

@phisco
Copy link
Contributor

phisco commented May 20, 2024

I'm not super familiar with how beta trace is implemented, but this makes me wonder if it could support provider-aware plugins for things like this.

Sure, happy to close #5405 if we decide to go the more generic way.

@gotwarlost
Copy link
Author

I've generally been pretty reluctant to open c/c code up for use as a library, because it creates a new support vector for the maintainers.

I completely understand this viewpoint. Exposing a public API is a lot of maintenance work. I'm happy to wait for more use-cases to accumulate.

gotwarlost added a commit to gotwarlost/crossplane that referenced this issue May 22, 2024
fixes crossplane#5707

Add a loader with configurable concurrency to load resources in concurrent manner.
The xrm client delegates to the loader for resource load and supports a functional
option to set the concurrency.

Add a `--concurrency` flag for the `crank beta trace` command and configure the
zrm client appropriately.

Signed-off-by: gotwarlost <kananthmail-github@yahoo.com>
gotwarlost added a commit to gotwarlost/crossplane that referenced this issue May 22, 2024
fixes crossplane#5707

Add a loader with configurable concurrency to load resources in concurrent manner.
The xrm client delegates to the loader for resource load and supports a functional
option to set the concurrency.

Add a `--concurrency` flag for the `crank beta trace` command and configure the
xrm client appropriately.

Signed-off-by: gotwarlost <kananthmail-github@yahoo.com>
@jbw976 jbw976 added this to the v1.17 milestone May 23, 2024
gotwarlost added a commit to gotwarlost/crossplane that referenced this issue May 27, 2024
fixes crossplane#5707

Add a loader with configurable concurrency to load resources in concurrent manner.
The xrm client delegates to the loader for resource load and supports a functional
option to set the concurrency.

Add a `--concurrency` flag for the `crank beta trace` command and configure the
xrm client appropriately.

Signed-off-by: gotwarlost <kananthmail-github@yahoo.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: In Review
Development

Successfully merging a pull request may close this issue.

5 participants