-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Python requirements.txt scanning for pull requests #1225
Conversation
f54d224
to
bfeafd9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly questions rather than concerns. The tests made be feel much more confident than my own reasoning through the code.
"strings" | ||
|
||
pb "github.com/stacklok/mediator/pkg/api/protobuf/go/mediator/v1" | ||
) | ||
|
||
var ( | ||
pyRequirementsNameRegex = regexp.MustCompile(`\s*(>=|<=|==|>|<|!=)`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need to make this greedy so that it will match >=
rather than >
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(A test would verify this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the way the operators are ordered already ensures that >= would match before >. You had a point though that the way I was trying to find the lowest version was incomplete, I added a fix to the parser and some more tests. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turns out that RE2 and regexp
are both greedy. (I thought they were lazy!) I'm going to include the rest just because I find it interesting and already typed it out.
Ordering of alternations is irrelevant for regular expression engines (particularly RE2 and Go's regexp
module). regexp
builds a state machine and marches it through the expression, picking out the longest string (greedy) that matches.
Since the match is greedy, this will match the longest possible string, which works for what you want.
I have big respect that you were even able to reason about a regex - for me, regexes are a write-only code :-) I was writing the regex along with the tests |
Thanks for the review @evankanderson. I will address your comments through the day, but I want to get the Pi support up for review first, to get at least some high-level feedback there. |
Adds support for parsing requirements.txt in PRs.
GH API is again a bit odd here - if the suggestion is only one line, then it expects only the Line parameter to be set. If you set both to the same value you'd get: ``` 422 Unprocessable Entity [{Resource: Field: Code: Message:Pull request review thread start line must precede the end line.}] ```
…rements.txt Queries pypi for latest package versions and provides suggestions to the PR. Fixes: #913
python's requirements.txt allows specifying packages without versions. Because there' no vulnerability check we can do in absence of a version, let's just skip those. pip should install the latest package in this case anyway.
func (p *PyPiReply) LineHasDependency(line string) bool { | ||
nameMatch := util.PyRequestsNameRegexp.FindStringIndex(line) | ||
if nameMatch == nil { | ||
return false | ||
} | ||
|
||
name := strings.TrimSpace(line[:nameMatch[0]]) | ||
return name == p.Info.Name | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels like you might want to implement util.ParsePyRequirement(line string) (string, string)
that returns the package name and the version constraints (or "", ""
for not matching a requirement line). This method could then just return the first argument.
// IndentedString returns the patch suggestion for a requirement.txt file | ||
// This method satisfies the patchLocatorFormatter interface where different | ||
// package managers have different patch formats and different ways of presenting | ||
// them. Since PyPi doesn't indent, but can specify zero or multiple versions, we | ||
// don't care about the indent parameter. This is ripe for refactoring, though, | ||
// see the comment in the patchLocatorFormatter interface. | ||
func (p *PyPiReply) IndentedString(_ int, oldDepLine string, oldDep *pb.Dependency) string { | ||
return strings.Replace(oldDepLine, oldDep.Version, p.Info.Version, 1) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand the use cases here -- are we replacing one version with a new recommended version? If so, it seems like we might not need oldDepLine
and oldDepVersion
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the idea was to do a minimal replace on the version, keeping the rest of the line intact, mainly things like extras (package[extra_feature,extra_bloat]
. I would like to refactor this interface and the associated structure anyway.
func pyReqNormalizeLine(line string) string { | ||
if !strings.HasPrefix(line, "+") { | ||
return "" | ||
} | ||
line = strings.TrimPrefix(line, "+") | ||
|
||
// Remove inline comments | ||
if idx := strings.Index(line, "#"); idx != -1 { | ||
line = line[:idx] | ||
} | ||
|
||
return strings.TrimSpace(line) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this function only looking at added lines? Normalize doesn't quite seem like it covers that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pyExtractAddedDeps
seems like it might be a better name.
version := "" | ||
var lowestVersion string | ||
for _, match := range matches { | ||
if len(match) < 3 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not:
if len(match) < 3 { | |
if len(match) != util.PyRequestsVersionRegexp.NumSubexp() { |
(Which also seems unlikely to happen...)
} | ||
|
||
// Extract the name by grabbing everything up to the first operator | ||
nameMatch := util.PyRequestsNameRegexp.FindStringIndex(line) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another way to do this is to split the string into (dep-name)(version-strings)
first, and then doing the parsing loop above on the version-string part of the line. Since parsing is linear in the size of the input, there's probably not much performance gain either way.
Flask | ||
+requests>=2.0,<3 | ||
+pandas<0.25.0,>=0.24.0 | ||
+numpy==1.16.0`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great test!
description: "Single addition, greater or equal version", | ||
content: ` | ||
Flask | ||
+requests>=2.19.0`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to test lines that contain -
as well?
+requests>=2.19.0`, | |
-requests>=2.14.0 | |
+requests>=2.19.0`, |
Thanks for the review Evan, I filed #1236 to get back to your comments. I think they are all valid, but right now I'd prefer to move faster (and hopefully not break too many things). I'll merge. |
Adds support for ingesting patches that touch requirements.txt
This adds both support for replying to PRs that add python dependencies as
well as a building block for the Pi integration that will use the same code,
just pointed to Pi.
Fixes: #913