-
Notifications
You must be signed in to change notification settings - Fork 380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
label: trim non-word chars for import path to repo #688
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this. Couple comments below.
label/label.go
Outdated
@@ -191,11 +191,11 @@ func ImportPathToBazelRepoName(importpath string) string { | |||
importpath = strings.ToLower(importpath) | |||
components := strings.Split(importpath, "/") | |||
labels := strings.Split(components[0], ".") | |||
var reversed []string | |||
var reversed = make([]string, 0, len(labels)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're trying to avoid extra allocation here, we should use len(labels)+len(components)-1
, since we append again on L199.
Also, let's use reversed := ...
instead of var reversed = ...
since we're initializing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to prefer var
when defining anything that is semantically empty, especially when len(slice) == 0
, but I will defer to your preference/style guide.
label/label.go
Outdated
for i := range labels { | ||
l := labels[len(labels)-i-1] | ||
reversed = append(reversed, l) | ||
} | ||
repo := strings.Join(append(reversed, components[1:]...), "_") | ||
return strings.NewReplacer("-", "_", ".", "_").Replace(repo) | ||
return regexp.MustCompile(`\W`).ReplaceAllString(repo, "_") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's lift the regexp.MustCompile
out of the function to avoid compiling it every time.
Let's also match \W+
. Two or more adjacent non-word characters should be replaced by one underscore. So git.sr.ht/~sircmpwn/go-example
would become ht_sr_git_sircmpwn_go_example
.
This is probably complicated enough that we should have a small unit test in label_test.go
to check it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What name would you suggest for the lifted regexp.Regexp
?
This is not backwards compatible, currently github.com/user/go-.repo -> com_github_user_go__repo
. Is that OK?
I will add a unit test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the lifted regexp, maybe nonWordRe
? It'll be close, so it doesn't have to be too verbose.
About compatibility, I think it's okay. A change here should only affect newly added rules or labels without corresponding go_repository
. We try to match on existing go_repository
rules' importpath
attributes before we generate a new name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was also required to change strings.Join
, to a character that would be replaced, to prevent ht_sr_git_~sircmpwn_go-example -> ht_sr_git__sircmpwn_go_example
. Feel free to suggest a better character than .
.
Currently ImportPathToBazelRepoName converts -s and .s to _s. This does not appear to be required by Bazel's repo name conventions, but may follow best practices. Unfortunately, this means characters valid for import paths cause build failures, like git.sr.ht/~sircmpwn/go-example. This change replaces all consecutive non-word, \W+ or [^A-Za-z0-9_]+ characters, not just - and ., with _. This change also preallocates reversed, since its length is known.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks!
Currently ImportPathToBazelRepoName converts
-
s and.
s to_
s. This doesnot appear to be required by Bazel's repo name conventions, but may
follow best practices. Unfortunately, this means characters valid for
import paths cause build failures, like
git.sr.ht/~sircmpwn/go-example
.This change replaces all non-word,
\W
or[^A-Za-z0-9_]
characters, notjust
-
and.
, with_
.This change also preallocates reversed, since its length is known.