Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memoize git loaders for speeding up build and localize commands #5298

Open
2 tasks done
komapa opened this issue Aug 28, 2023 · 5 comments
Open
2 tasks done

Memoize git loaders for speeding up build and localize commands #5298

komapa opened this issue Aug 28, 2023 · 5 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@komapa
Copy link

komapa commented Aug 28, 2023

Eschewed features

  • This issue is not requesting templating, unstuctured edits, build-time side-effects from args or env vars, or any other eschewed feature.

What would you like to have added?

Currently kustomize does not really use any logic to dedup remote git resources when building or localizing which causes these operations to be slow. Real example:

kustomize localize --scope ../ stag  5.23s user 6.32s system 29% cpu 39.464 total
❯ fgrep -R "gitlab" _infrastructure/manifests | cut -d" " -f2 | sort | uniq -c
   4 https://gitlab//kustomize/components/feature-flags?ref=master
   9 https://gitlab//kustomize/components/nomad-shim?ref=master
   7 https://gitlab//kustomize/components/optimizely?ref=master
   1 https://gitlab//kustomize/resources/env?ref=master
   8 https://gitlab//kustomize/resources/service?ref=master
   1 https://gitlab//kustomize/resources/worker?ref=master

With the super naive patch below, we get this result:

~/go/bin/kustomize localize --scope ../ stag  1.11s user 1.28s system 26% cpu 8.951 total

Why is this needed?

This is really naive implementation of what I think needs to be done:

Index: api/internal/localizer/locloader.go
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/api/internal/localizer/locloader.go b/api/internal/localizer/locloader.go
--- a/api/internal/localizer/locloader.go	(revision bd7f001c269b74cd3367d78297d4a05d1dbdc685)
+++ b/api/internal/localizer/locloader.go	(date 1693193678228)
@@ -41,8 +41,10 @@
 
 var _ ifc.Loader = &Loader{}
 
-// NewLoader is the factory method for Loader, under localize constraints, at rawTarget. For invalid localize arguments,
-// NewLoader returns an error.
+var loaders map[string]*Loader
+
+// NewLoader is the factory method for Loader, under localize constraints, at rawTarget.
+// For invalid localize arguments, NewLoader returns an error.
 func NewLoader(rawTarget string, rawScope string, rawNewDir string, fSys filesys.FileSystem) (*Loader, Args, error) {
 	// check earlier to avoid cleanup
 	repoSpec, err := git.NewRepoSpecFromURL(rawTarget)
@@ -109,6 +111,13 @@
 // New returns a Loader at path if path is a valid localize root.
 // Otherwise, New returns an error.
 func (ll *Loader) New(path string) (ifc.Loader, error) {
+	if loaders == nil {
+		loaders = make(map[string]*Loader)
+	}
+	if loaders[path] != nil {
+		return loaders[path], nil
+	}
+
 	ldr, err := ll.Loader.New(path)
 	if err != nil {
 		return nil, errors.WrapPrefixf(err, "invalid root reference")
@@ -126,10 +135,12 @@
 		return nil, errors.Errorf("localize remote root %q missing ref query string parameter", path)
 	}
 
-	return &Loader{
+	loaders[path] = &Loader{
 		fSys:   ll.fSys,
 		args:   ll.args,
 		Loader: ldr,
 		local:  ll.local && ldr.Repo() == "",
-	}, nil
+	}
+
+	return loaders[path], nil
 }

Can you accomplish the motivating task without this feature, and if so, how?

Yes, we can, unfortunately kustomize command get progressively slow as we add more manifests with more remote resources.

What other solutions have you considered?

There are a couple of other attempts to speed this operation up but I think those would probably benefit greatly from this solution instead:

Anything else we should know?

I am willing to work on a proper PR for this feature if the team behind kustomize agrees with the direction and willing to work with me to get it merged.

Feature ownership

  • I am interested in contributing this feature myself! 🎉
@komapa komapa added the kind/feature Categorizes issue or PR as related to a new feature. label Aug 28, 2023
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Aug 28, 2023
@k8s-ci-robot
Copy link
Contributor

This issue is currently awaiting triage.

SIG CLI takes a lead on issue triage for this repo, but any Kubernetes member can accept issues by applying the triage/accepted label.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@komapa komapa changed the title Memoize git loaders for speeding build and localize commands Memoize git loaders for speeding up build and localize commands Aug 28, 2023
@komapa
Copy link
Author

komapa commented Aug 30, 2023

Let me know what you think :)

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 27, 2024
@colinodell
Copy link
Contributor

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 14, 2024
@YorickH
Copy link

YorickH commented May 2, 2024

Can this be picked up? Deduping the remote git resources would make a huge difference for our build times.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.
Projects
None yet
Development

No branches or pull requests

5 participants