Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support custom forbid reason messages #11

Merged
merged 3 commits into from
Dec 20, 2021

Conversation

craigfurman
Copy link
Contributor

@craigfurman craigfurman commented Nov 17, 2021

In order to allow users to communicate intent to collaborators,
optionally embed custom messages into each forbidden pattern.

The syntax is as follows: identifier(# message goes here)?

Example: ^fmt\.Errorf(# Please don't use this!)?$

Regular expressions containing custom messages are effectively identical
to ones that don't, because the sub-expression containing it is marked
optional with a ?. All this commit does is parse out any recognized
custom message, and place it prominently in the tool's output. The
regular expression itself is omitted from the tool's output when a
custom message is specified.


We discussed this approach in #10. I can't think of a reason a user might start their patterns with #[, but if they do, this PR will break that use case.

This should automatically work from within golangci-lint, although I haven't tested that yet. If accepted, perhaps we should add a custom message example to that project's docs?

Closes #10.

forbidigo/forbidigo.go Outdated Show resolved Hide resolved
@ashanbrown
Copy link
Owner

Thanks @craigfurman for the PR and the thorough testcases. I was thinking more about this and I wanted to propose a slight variation to the syntax which would look like this:

^fmt.\\Printf(# no extra noise allowed)?$

What's interesting about this syntax is that the regex remains a valid regex. You can run it on your code (and verify the string) even without using forbidigo and it will match. We can simply extract the capture groups and report the ones that start with '#' (or maybe just support first and last capture group).

Golang even offers a parser at https://pkg.go.dev/regexp/syntax#Parse that we can use to parse it. See https://play.golang.org/p/KvL8L7GystE for an example of how it is parsed.

What would you think about this syntax change?

@craigfurman
Copy link
Contributor Author

Interesting! At first glance I didn't like that comment syntax, but I have to agree that it'd be neat not to double-parse the regex, which if I understand correctly we can do using that syntax.Parse function. I'm happy to have a go at this, might be a couple of days before I get time though.

@craigfurman
Copy link
Contributor Author

craigfurman commented Nov 28, 2021

I started taking a look at this again today (apologies for the week-long latency between replies here!) and it's a little fiddlier than I was expected. We either have to make certain assumptions about where in the submatch tree the comment will be if it is present, or do a bunch of regex matching on the regex components itself, which I think defeats the point of your suggestion a little bit. I've spiked out the first approach, built on top of this PR branch - note that it doesn't actually work yet, but I think it gives the gist of what I'm saying: craigfurman@b51de3b

While doing this, a thought occurred to be though: taking advantage of the fact that your proposed syntax also matches identifiers without that comment, we might not actually have to write any code. We could just document this (# comment)? trick and call it a day! The tool's output already prints the expression used, so the comment will be present too. WDYT?

@ashanbrown
Copy link
Owner

ashanbrown commented Nov 29, 2021

@craigfurman It did also occur to me that we don't really need a comment if there's obviously a comment in a regex, so we could definitely start using and encouraging this syntax right away in the docs. Parsing the comment just allows us to make the output a little more human-readable.

Thanks for taking a stab at it. I had been imaging a recursive function that just extracted any sub-expressions that began with "#" (joining them with comments). I suppose it would be nice if it also advised you to put a ? after the sub-expresssion (or frankly, only allow # at the beginning of a subexpression), because the '#' symbol is never going to match anything (since forbidigo is focused on package names and identifiers).

@craigfurman
Copy link
Contributor Author

Yeah I might have overthought this, extracting sub-expressions beginning with # is likely good enough, since that can't be contained in an identifier. How about we do that, and also replace, rather than append to, the "forbidden by pattern.." part when comments are present. That'd allow us to unconditionally use the regex provided by the user, whether it contains a comment or not, without duplicating the comment.

@craigfurman
Copy link
Contributor Author

I've implemented what we discussed above. It's effectively a totally different implementation to what I originally posted, so I've just squashed it in and amended the message.

I've changed the output to omit the "by pattern..." part when a custom message is specified, for compactness.

Ready for your review!

if err != nil {
return nil, fmt.Errorf("unable to compile pattern `%s`: %s", ptrn, err)
}
re, err := syntax.Parse(ptrn, syntax.Perl)
Copy link
Contributor Author

@craigfurman craigfurman Nov 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't spot a way in the stdlib to convert between a regexp.Regexp and a syntax.Regexp, so I think we might have to parse it twice if we want to support this feature, which is a bit of a shame.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's not ideal but it only happens once per expression so I'm not too concerned.

}

func parse(ptrn string) (*pattern, error) {
p := &pattern{}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a style thing but my preference here would be to avoid dot notation and construct the final pattern at the end before we return it

Copy link
Contributor Author

@craigfurman craigfurman Dec 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually lean that way too (not mutating structs if it can be avoided), not sure why I went the other way here. Probably a hangover from the first iteration of this PR.

forbidigo/forbidigo.go Outdated Show resolved Hide resolved
Copy link
Owner

@ashanbrown ashanbrown left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a couple of small suggestions but it looks pretty good to me.

assert.Equal(t, `fmt\.Errorf`, ptrn.pattern.String())
}

func TestParseValidPatternWithCustomMessage(t *testing.T) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to be thorough here we could also add a test for a nested subgroup, ensuring the recursion works.

I personally also like the structure of having a single TestParse method at the top-level and writing a t.Run() for each subtest. A table-style test might also be appropriate here, although there aren't that many variants.

In order to allow users to communicate intent to collaborators,
optionally embed custom messages into each forbidden pattern.

The syntax is as follows: `identifier(# message goes here)?`

Example: `^fmt\.Errorf(# Please don't use this!)?$`

Regular expressions containing custom messages are effectively identical
to ones that don't, because the sub-expression containing it is marked
optional with a `?`. All this commit does is parse out any recognized
custom message, and place it prominently in the tool's output. The
regular expression itself is omitted from the tool's output when a
custom message is specified.
@craigfurman
Copy link
Contributor Author

Oops, I forgot to add the nested subgroup test. Done now.

func extractComment(re *syntax.Regexp) string {
for _, sub := range re.Sub {
if len(sub.Sub) > 0 {
return extractComment(sub)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not noticing this during my first review, but I think we should only return here if extractComment returns a non-empty string. Otherwise, I think risk failing to return a comment if there are other sub-expressions in the pattern.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good catch! I added a commit containing a test case that indeed failed in the way I think you're describing, and the fix. Are there any other edge cases you had in mind, or does that capture it all?

Copy link
Owner

@ashanbrown ashanbrown left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Just noted one more potential issue that I think it should be easy to construct a test for a fix.

@craigfurman
Copy link
Contributor Author

Thanks for spotting that! Ready for another review, let me know if you'd prefer I squash that fix commit in first (or feel free to ask github to squash it). I've left it separate for review-ability until then.

forbidigo/forbidigo.go Outdated Show resolved Hide resolved
Copy link
Owner

@ashanbrown ashanbrown left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Left a couple of small suggestions

@ashanbrown ashanbrown merged commit 9d87656 into ashanbrown:master Dec 20, 2021
@craigfurman craigfurman deleted the custom-messages branch December 20, 2021 15:14
@craigfurman
Copy link
Contributor Author

Thanks @ashanbrown! Looks like you merged this before accepting your own second suggestion though. Either way, up to you! I look forward to using this when it lands in golangci-lint.

@ashanbrown
Copy link
Owner

Thanks @craigfurman . I think my second suggestion wasn't going to work, so I nixed it. Thank you for your contribution. I look forward to using it myself once it's merged into golangci-lint (which I think will happen automatically via dependabot).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Customize issue messages to describe intent
2 participants