Text matching #74
2 informational comments and 3 requested changes:
- Change the name of the `result` variable
- Create a constant for the number of words to match
- Remove 2 unused constants
Matcher matcher = result.matcher(compareLicenseText);
if(matcher.find()) {
    startIndex = matcher.start();
    endIndex = matcher.end();
Currently, `endIndex` will be the end of the body text. I plan on improving `nonOptionalTextToStartPattern` so that it yields a correct end index, so I would leave this code as is.
@tjasmith Could you also add a couple of unit tests?
Ok
src/org/spdx/crossref/Match.java
Outdated
} catch (SpdxCompareException e) {
    e.printStackTrace();
}
Pattern result = LicenseCompareHelper.nonOptionalTextToStartPattern(nonOptionalText, 10);
Suggest changing `10` to a constant like `NUMBER_MATCH_WORDS` to make it easy to adjust if needed.
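The suggestion above could look roughly like the following sketch. `NUMBER_MATCH_WORDS` is the name proposed in the review. `StartPatternSketch`, `startPattern`, and `findStart` are hypothetical stand-ins written only for illustration; they are not the implementation of `LicenseCompareHelper.nonOptionalTextToStartPattern`, which is not shown in this PR.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StartPatternSketch {
    /** Number of leading words used to locate the start of the license text. */
    static final int NUMBER_MATCH_WORDS = 10;

    /**
     * Hypothetical stand-in: builds a pattern from the first wordCount words
     * of text, allowing any run of whitespace between consecutive words.
     */
    static Pattern startPattern(String text, int wordCount) {
        String[] words = text.trim().split("\\s+");
        int n = Math.min(wordCount, words.length);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                regex.append("\\s+");
            }
            // Quote each word so punctuation is matched literally.
            regex.append(Pattern.quote(words[i]));
        }
        return Pattern.compile(regex.toString());
    }

    /** Returns the index where the license text starts, or -1 if it is not found. */
    static int findStart(String licenseText, String pageText) {
        Matcher matcher = startPattern(licenseText, NUMBER_MATCH_WORDS).matcher(pageText);
        return matcher.find() ? matcher.start() : -1;
    }
}
```

With the word count in one named constant, tuning the match only requires changing a single line rather than hunting for a literal `10` at each call site.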
@@ -32,4 +32,6 @@
public static final Integer CROSS_REF_INDEX_ISWAYBACKLINK = 3;
public static final Integer CROSS_REF_INDEX_MATCH = 4;
public static final Integer CROSS_REF_INDEX_TIMESTAMP = 5;
public static final Integer CROSS_REF_LICENSE_TEXT_START_CHAR_COUNT = 80;
public static final Integer CROSS_REF_LICENSE_TEXT_END_CHAR_COUNT = 60;
Are `CROSS_REF_LICENSE_TEXT_START_CHAR_COUNT` and `CROSS_REF_LICENSE_TEXT_END_CHAR_COUNT` still in use? If not, they should be removed.
I have removed them.
@tjasmith I published the new version of the SPDX tools and tested out this PR. It seems to work well in matching the licenses 👍 I also opened 2 new issues that would be nice to resolve before using this for the next license list publication: #76 #75. Let me know if you want to work on either of these.
try {
    nonOptionalText = LicenseCompareHelper.getNonOptionalLicenseText(license.getStandardLicenseTemplate(), true);
} catch (SpdxCompareException e) {
    e.printStackTrace();
@tjasmith Please replace this with logging and return a non-match; otherwise there will be null pointer exceptions or other issues.
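One possible shape for this change is sketched below. It uses `java.util.logging` only because it ships with the JDK; the project's actual logging framework is an assumption here. `MatchSketch`, `CompareException`, `TextSource`, and `matches` are hypothetical names standing in for the real `Match` code, `SpdxCompareException`, and the `LicenseCompareHelper` call.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class MatchSketch {
    private static final Logger LOGGER = Logger.getLogger(MatchSketch.class.getName());

    /** Hypothetical stand-in for SpdxCompareException. */
    static class CompareException extends Exception {
        CompareException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for the call that extracts the non-optional license text. */
    interface TextSource {
        String nonOptionalText() throws CompareException;
    }

    /**
     * Returns false (non-match) when the text cannot be extracted,
     * instead of printing a stack trace and continuing with a null value.
     */
    static boolean matches(TextSource source, String pageText) {
        String nonOptionalText;
        try {
            nonOptionalText = source.nonOptionalText();
        } catch (CompareException e) {
            LOGGER.log(Level.WARNING,
                    "Could not extract non-optional license text; treating as non-match", e);
            return false;
        }
        // Simplified comparison for the sketch; the real code matches via a regex pattern.
        return pageText.contains(nonOptionalText);
    }
}
```

Returning early from the `catch` block means no later code can dereference an unset `nonOptionalText`, which is the null pointer risk raised in the comment.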
    match = new Boolean(!matchBool).toString();
  }
} catch (SpdxCompareException e) {
    e.printStackTrace();
@tjasmith Please replace this with logging and return a non-match; otherwise there will be null pointer exceptions or other issues.
No description provided.