Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Create findExecutable which searches multiple paths for an executable of a particular executable and returns the first found path. #3162

Closed
wants to merge 2 commits into from

Conversation

arran4
Copy link
Contributor

@arran4 arran4 commented Aug 8, 2023

feat: Create findExecutable which searches multiple paths for an executable of a particular executable and returns the first found path.

Resolves: #3141

I didn't want to squash / force push so much so re-created this. I split out some of the other fixes too.

I think I might need to expose: os.PathListSeparator & os.PathSeparator as chezmoi.osPathListSeparator can you please indicate if that would be well received?

@halostatue
Copy link
Collaborator

This looks fairly good, but I’m not in a position to give it a deep once-over.

I think that the only question that I have immediately from the previous PR is that there had been a mild preference for findExecutable(needle string, haystack string[]) string rather than findExecutable(needle string, haystack ...string) string. This would mean calling as:

{{ findExecutable "needle" (list "straw" "straw" "straw") }}
{{/* ...or... */}}
{{ $haystack := (list "straw" "straw" "straw") }}
{{ findExecutable "needle" $haystack }}

I think I might need to expose: os.PathListSeparator & os.PathSeparator as chezmoi.osPathListSeparator can you please indicate if that would be well received?

I would be neutral on these; I don’t think that I would need them right now or even with this function present, but I don’t think that they’re necessarily unwelcome.

@arran4
Copy link
Contributor Author

arran4 commented Aug 8, 2023

Oh right. I just remember you saying something about varargs. Will address as it does impact me, and it looks like there isn't a spread operator.

package main

import (
	"os"
	"text/template"
)

var tt = "Count: {{ f $.D }}!~\n"

func main() {
	t, err := template.New("").Funcs(map[string]any{
		"f": func(strs ...string) int {
			return len(strs)
		},
	}).Parse(tt)
	if err != nil {
		panic(err)
	}
	err = t.Execute(os.Stdout, struct {
		D []string
	}{
		D: []string{
			"1",
			"2",
			"3",
		},
	})
	if err != nil {
		panic(err)
	}
}
Count: panic: template: :1:13: executing "" at <$.D>: wrong type for value; expected string; got []string

Will switch to a slice now.

Especially given robpike doesn't see a need:
golang/go#51674 (comment)

…ecutable of a particular name and returns the found path.

As per: twpayne#3141
@arran4
Copy link
Contributor Author

arran4 commented Aug 8, 2023

@halostatue done. Please review.

Copy link
Collaborator

@halostatue halostatue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most recent change looks like it will work nicely and goes well beyond my suggestion by recursively flattening string lists. I’m not sure that findExecutable(needle string, haystack ...any) string is preferable to findExecutable(needle string, haystack []string) string (should I end up using this, I plan on using a single (list "straw" "straw" "straw") approach, but I don’t think that there’s anything inherently wrong with supporting either model. @twpayne may feel differently.

That said, I think that the caching needs to be more predictable and stable for a given execution, so I’ve focussed heavily on that and the documentation. I’m still not sure that what I’ve suggested is right, and if the caching is made more predictable, my paragraph The return value of… would need to be rewritten.

The key value was based on onepasswordOutput in onepasswordtemplatefuncs.go, which is reasonably good for showing how to deal with compound input.

@@ -13,6 +13,21 @@ const nativeLineEnding = "\r\n"

var pathExts = strings.Split(os.Getenv("PATHEXT"), string(filepath.ListSeparator))

// findExecutableExtensions returns valid OS executable extensions for a given executable basename - does not check the
// existence.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be wrapped, but a suggested rewording:

// findExecutableExtensions returns all variant Windows executable extensions
// for the provided file if it does not already have an extension. The executable
// extensions are derived from %PATHEXT%.

Copy link
Contributor Author

@arran4 arran4 Aug 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. My wrapping it set at 180 chars and the editorconfig doesn't overwrite it:

chezmoi/.editorconfig

Lines 1 to 13 in 552d255

root = true
[*]
charset = utf-8
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true
[*.ps1]
charset = utf-8
end_of_line = crlf
insert_final_newline = true
trim_trailing_whitespace = true

Will address in update

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made change locally will push soon.


var (
foundExecutableCacheMutex sync.Mutex
foundExecutableCache = make(map[string]struct{})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the caching here makes the results more nondeterministic and, depending on the path list and filesystem, will not reduce Stat calls. I would suggest we want to have the behaviour be closer to the caching of lookPath:

The return value of the first successful call to lookPath is cached, and future calls to lookPath for the same file will return this path.

To enable this, the foundExecutableCache should be make(map[string]string).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this was basically trying to follow the convention of lookPath so I just adapted it. I did think about it. I didn't want to hash the sequence of paths provided so it only makes sense for reducing stat calls but realistically it can be completely removed.

However I agree, if kept, it should be moved up to before the 2nd loop and made into map[string]string. The problem being I wasn't consistently thorough when converting this PR. Sorry. Will address soon. However; should we even bother cache?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the caching here makes the results more nondeterministic and, depending on the path list and filesystem, will not reduce Stat calls.

Okay I reviewed this. I'm not sure I understand the case where it becomes more nondeterministic. -- It does assume a static file system during it's execution. To which the only solution I can think of is the removal of the cache entirely?

Since most of the review is around the cache I can't really proceed with this. I'm looking for direction here. @twpayne @halostatue . I am considering removing the cache as I am not sure if provides value.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't remove the cache, I would make it stricter. That is, for a given (needle, haystack[]) tuple, there is one guaranteed result for one run of chezmoi apply. In general, it means that your key would be something like key := needle + "\x01" + strings.Join(haystack, "\x00"). In this way, if I’m doing:

{{ findExecutable "vim" (list "/opt/local/bin" "/opt/homebrew/bin" "/usr/local/bin" "/usr/bin") }}

the result will be the same for every run of that parameter combination if any one of those returns a value (say it's /opt/homebrew/bin/vim). It's been a while, but I believe that right now, you’re still searching all of the parameters.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Essentially, the improved caching will come from the approach I suggested in an earlier comment.

// makes it useful for the resulting path of rc/profile files.
func FindExecutable(file string, paths ...string) (string, error) {
foundExecutableCacheMutex.Lock()
defer foundExecutableCacheMutex.Unlock()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be with a computed key value. It means that the cache takes up a bit more memory, but it results in a far more predictable output with better guaranteed Stat reduction.

This should be added to the beginning of the function:

key := file + "\x01" + strings.Join(paths, "\x00")
if path, ok := foundExecutableCache[key]; ok {
  return path, nil
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to defer this to @twpayne

continue
}
if IsExecutable(f) {
foundExecutableCache[path] = struct{}{}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be changed to foundExecutableCache[key] = path.

}
p := filepath.Join(dir, file)
for _, path := range findExecutableExtensions(p) {
if _, ok := foundExecutableCache[path]; ok {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be removed.

defer foundExecutableCacheMutex.Unlock()

// stolen from: /usr/lib/go-1.20/src/os/exec/lp_unix.go:52
for _, dir := range paths {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is inefficient because it recomputes findExecutableExtensions every time.

Add:

candidates := findExecutableExtensions(file)

outside of this loop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct since the path doesn't matter. Will address on update

continue
}
p := filepath.Join(dir, file)
for _, path := range findExecutableExtensions(p) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change this to:

for _, f := range candidates {
  path := filePath.Join(dir, f)

As far as I can tell, we can’t avoid the double loop, but we can reduce the number of internal loops by not calling findExecutableExtensions for each path entry.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per: #3162 (comment) You are correct since the path doesn't matter. Will address on update.

Not sure why I didn't catch it.

@arran4
Copy link
Contributor Author

arran4 commented Aug 9, 2023

Going to give @twpayne some time to direct too. So will revisit either when he responds, or in a month he should be back from his vacation by then.

@arran4
Copy link
Contributor Author

arran4 commented Sep 10, 2023

I'm not able to resume work on this until I get some direction regarding: #3162 (comment)

@twpayne
Copy link
Owner

twpayne commented Sep 13, 2023

+1 on all of @halostatue's feedback here.

This is a tricky and subtle template function to write. I would also underline what @halostatue has already said:

  1. Please separate concerns. For example, the suggestion of taking a list of paths instead of a string that is then split on some separator makes the code more composable by not intertwining the path separation with the search for executables.
  2. The cache serves two purposes: (1) performance (2) improving consistency. Without a cache, calls to findExecutable will return different results depending on when it is called. With a cache, the first result from findExecutable will be returned for all future calls in the same chezmoi run, which makes template behavior easier to reason about.
  3. The current documentation in this PR is verbose. Please follow the style used for the existing template functions which is short and precise.
  4. This PR already has over twenty comments for adding a single template function. The effort/return ratio is low, and I really want to thank @halostatue for his patience here. Please, follow chezmoi's existing style.

halostatue pushed a commit to halostatue/chezmoi that referenced this pull request Sep 21, 2023
Implement `findExecutable` with strong caching. The main change from
twpayne#3162 is that it now also accepts either a `file` or `file-list` for
searching, so that one of several different options can be searched
simultaneously for roughly equivalent implementations. See twpayne#3256 for the
inspiration for this particular change.

We *can* switch this to two template functions, but I think that the
core implementation would best remain the way that it is.

Resolves: twpayne#3141
Closes: twpayne#3162
Co-authored-by: Arran Ubels <arran4@gmail.com>
halostatue pushed a commit to halostatue/chezmoi that referenced this pull request Sep 21, 2023
Implement `findExecutable` with strong caching. The main change from
twpayne#3162 is that it now also accepts either a `file` or `file-list` for
searching, so that one of several different options can be searched
simultaneously for roughly equivalent implementations. See twpayne#3256 for the
inspiration for this particular change.

We *can* switch this to two template functions, but I think that the
core implementation would best remain the way that it is.

Resolves: twpayne#3141
Closes: twpayne#3162
Co-authored-by: Arran Ubels <arran4@gmail.com>
halostatue pushed a commit to halostatue/chezmoi that referenced this pull request Sep 21, 2023
Implement `findExecutable` with strong caching. The main change from
twpayne#3162 is that it now also accepts either a `file` or `file-list` for
searching, so that one of several different options can be searched
simultaneously for roughly equivalent implementations. See twpayne#3256 for the
inspiration for this particular change.

We *can* switch this to two template functions, but I think that the
core implementation would best remain the way that it is.

Resolves: twpayne#3141
Closes: twpayne#3162
Co-authored-by: Arran Ubels <arran4@gmail.com>
halostatue pushed a commit to halostatue/chezmoi that referenced this pull request Sep 22, 2023
Implemented `findExecutable` with strong caching, extended from twpayne#3162.
Also implemented `findOneExecutable` to search for any one executable
from a list.

Resolves: twpayne#3141
Closes: twpayne#3162
Co-authored-by: Arran Ubels <arran4@gmail.com>
halostatue pushed a commit to halostatue/chezmoi that referenced this pull request Sep 22, 2023
Implemented `findExecutable` with strong caching, extended from twpayne#3162.
Also implemented `findOneExecutable` to search for any one executable
from a list.

Resolves: twpayne#3141
Closes: twpayne#3162
Co-authored-by: Arran Ubels <arran4@gmail.com>
halostatue pushed a commit that referenced this pull request Sep 22, 2023
Implemented `findExecutable` with strong caching, extended from #3162.
Also implemented `findOneExecutable` to search for any one executable
from a list.

Resolves: #3141
Closes: #3162
Co-authored-by: Arran Ubels <arran4@gmail.com>
@arran4
Copy link
Contributor Author

arran4 commented Sep 23, 2023

Sorry guys looks like I failed to complete this.

@halostatue
Copy link
Collaborator

You did the heavy lifting. I had started exploring something else (lookOnePath) when I realized it would be a good fit for find executable, too so I finalized the caching and implemented the list creation, too.

@arran4
Copy link
Contributor Author

arran4 commented Sep 23, 2023

Need is the mother of invention.... Or in this case completion it seems.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 10, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

lookPath should accept an optional path override
3 participants