Add license category-change detection and scoring #86
* Modify DeltaCode.license_diff() to detect and identify the addition of certain categories of licenses.
* Refactor scoring structure and methods to discard use of score for delta identification.
* Modify sorting of delta results: by score descending, then alphabetically ascending.
* Modify utils.deltas() to adjust to refactored scoring structure.
* Add File.licenses_is_empty() and File.copyrights_is_empty() methods.
* Add Delta.is_modified() and Delta.is_unmodified() methods.
* Fix failing tests, add tests.

Signed-off-by: John M. Horan <johnmhoran@gmail.com>
* Clarify Delta sorting.
* Simplify set creation.
* Simplify implementation of category-change detection.
* Rename methods.
* Fix failing tests, add new test codebases and tests.

Signed-off-by: John M. Horan <johnmhoran@gmail.com>
src/deltacode/__init__.py
Outdated
delta.update(20, 'license info added')
# no license ==> 'Copyleft Limited' or higher
for item in sorted(unique_categories):
    if item in new_categories:
The logic should be reversed for lines 221-222:
for category in new_categories:
    if category in unique_categories:
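A minimal sketch of the two loop orders discussed in this comment. The category values below are hypothetical stand-ins (the real contents of unique_categories and new_categories are not shown in this thread), and the helper names are invented for illustration. Both orders select the same items; the suggested form walks only the categories actually present on the new file.

```python
# Hypothetical stand-ins; the real sets are built inside DeltaCode.license_diff().
unique_categories = {'Commercial', 'Copyleft', 'Copyleft Limited', 'Proprietary Free'}
new_categories = {'Copyleft Limited', 'Permissive'}


def matches_original(new_categories, unique_categories):
    """Original order: walk every tracked category and test it against the file."""
    return [item for item in sorted(unique_categories) if item in new_categories]


def matches_suggested(new_categories, unique_categories):
    """Suggested order: walk the file's categories and test them against the tracked set."""
    return [category for category in new_categories if category in unique_categories]


original = matches_original(new_categories, unique_categories)
suggested = matches_suggested(new_categories, unique_categories)
```

Either way the selected items are the set intersection of the two collections, so the choice mainly affects readability and how many elements are visited.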
src/deltacode/__init__.py
Outdated
return

if delta.new_file.licenses == [] and len(delta.old_file.licenses) > 0:
if not delta.new_file.has_licenses() and delta.old_file.has_licenses():
We should put this block (226-228) near the top, as it is the least logically complex code block.
src/deltacode/__init__.py
Outdated
@@ -211,6 +232,13 @@ def license_diff(self):

if new_keys != old_keys:
    delta.update(10, 'license change')
    for item in sorted(new_categories - old_categories):
Not sure we need to sort here.
@MaJuRG I sorted here (and earlier, line 221, mentioned in your comment above) to display the license category-change factors alphabetically -- i.e., in a consistent order -- thinking that would make it easier for a user to compare and analyze factors.
Removing the sort in both locations works fine -- I just need to modify a few failing tests in which I assert [list A] == [list B], which I can replace with assert set([list A]) == set([list B]).
Let's remove it from both; if anything, sorting will incur a performance cost.
Using set() works here since there are no duplicates, but perhaps it's better to use sorted() instead. Tested, works.
Does for item in new_categories - old_categories: not work?
Yes, it works -- I was referring to the modifications needed to fix the failing related tests -- there, we can use either set() or sorted() to compare the factors list comprehension with the expected Delta.factors list and not worry about the order of the lists' elements.
Ah yes, in the test cases it is up to you; sorting test results is always fine.
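The two order-insensitive comparisons mentioned in this thread can be sketched as follows. The factor strings are hypothetical examples and the helper names are invented for illustration. The one behavioral difference is that set() collapses duplicates while sorted() keeps them, so sorted() is the stricter check.

```python
# Hypothetical factor lists; real values would come from Delta.factors in a test.
expected = ['license change', 'copyleft limited added', 'proprietary free added']
actual = ['proprietary free added', 'license change', 'copyleft limited added']


def same_as_set(left, right):
    """Order-insensitive comparison that ignores duplicate entries."""
    return set(left) == set(right)


def same_as_sorted(left, right):
    """Order-insensitive comparison that still counts duplicate entries."""
    return sorted(left) == sorted(right)


# A list with an accidental duplicate passes the set() check but not sorted().
with_duplicate = actual + ['license change']
```

When the factors are known to contain no duplicates, as stated above, both checks accept and reject the same lists.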
src/deltacode/__init__.py
Outdated
unique_commercial_categories = set([
Let's not have a separate distinction/process for commercial licenses, meaning forget about my mentioning "anything ==> Commercial" in tickets or otherwise.
src/deltacode/__init__.py
Outdated
if len(old_categories & unique_categories) == 0 and item in unique_categories:
    delta.update(20, item.lower() + ' added')
# anything ==> 'Proprietary Free' or 'Commercial'
elif item in unique_commercial_categories:
Let's not have a separate distinction/process for commercial licenses, meaning forget about my mentioning "anything ==> Commercial" in tickets or otherwise.
* Stop sorting in category-change process, and modify related tests to compare lists that may have dissimilar ordering.
* Remove rule: anything ==> 'Proprietary Free' or 'Commercial'.

Signed-off-by: John M. Horan <johnmhoran@gmail.com>
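A rough sketch of what the category-change step might look like after the two changes in this commit (no sorting, no separate commercial rule). This is an illustration under stated assumptions, not the project's actual code: the Delta class here is a simplified stand-in, the assumption that update() adds to the score is a guess, the contents of UNIQUE_CATEGORIES are hypothetical, and score_category_changes is an invented name. Only the condition and the delta.update(20, item.lower() + ' added') call are taken from the diff under review.

```python
class Delta:
    """Simplified stand-in for deltacode's Delta (assumed shape, not the real class)."""

    def __init__(self):
        self.score = 0
        self.factors = []

    def update(self, score, factor):
        # Assumption: update() accumulates the score and records the factor.
        self.score += score
        self.factors.append(factor)


# Hypothetical set of tracked categories; the real values are not shown in the thread.
UNIQUE_CATEGORIES = frozenset(
    {'Commercial', 'Copyleft', 'Copyleft Limited', 'Proprietary Free'}
)


def score_category_changes(delta, old_categories, new_categories,
                           unique_categories=UNIQUE_CATEGORIES):
    """Flag tracked categories that are new to the file, without sorting."""
    for item in new_categories - old_categories:
        # Only flag when the old file had no tracked category at all.
        if len(old_categories & unique_categories) == 0 and item in unique_categories:
            delta.update(20, item.lower() + ' added')


delta = Delta()
score_category_changes(delta, {'Permissive'}, {'Permissive', 'Copyleft Limited'})
```

Because the loop iterates a set difference directly, the order of factors is not guaranteed when several categories are added at once, which is why the related tests compare lists without regard to order.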