Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Text matching #74

Merged
merged 27 commits into from Nov 11, 2020
Merged

Text matching #74

merged 27 commits into from Nov 11, 2020

Conversation

tjasmith
Copy link
Collaborator

@tjasmith tjasmith commented Sep 1, 2020

No description provided.

Copy link
Member

@goneall goneall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 informational comments and 3 requested changes:

  • Change the name of the result variable
  • Create a constant for the number of words to match
  • Remove 2 unused constants

Matcher matcher = result.matcher(compareLicenseText);
if(matcher.find()) {
startIndex = matcher.start();
endIndex = matcher.end();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, the endIndex will return the end of the bodyText. I plan on improving the nonOptionalTextStartPattern to return a correct end index, so I would leave this code as is.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tjasmith Could you also add a couple of unit tests?

Ok

} catch (SpdxCompareException e) {
e.printStackTrace();
}
Pattern result = LicenseCompareHelper.nonOptionalTextToStartPattern(nonOptionalText, 10);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest changing 10 to a constant like NUMBER_MATCH_WORDS to make it easy to adjust if needed.

@@ -32,4 +32,6 @@
public static final Integer CROSS_REF_INDEX_ISWAYBACKLINK = 3;
public static final Integer CROSS_REF_INDEX_MATCH = 4;
public static final Integer CROSS_REF_INDEX_TIMESTAMP = 5;
public static final Integer CROSS_REF_LICENSE_TEXT_START_CHAR_COUNT = 80;
public static final Integer CROSS_REF_LICENSE_TEXT_END_CHAR_COUNT = 60;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the CROSS_REF_LICENSE_TEXT_START_CHAR_COUNT and CROSS_REF_LICENSE_TEXT_END_CHAR_COUNT still in use? If not, these should be removed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed them.

pom.xml Show resolved Hide resolved
src/org/spdx/crossref/Match.java Outdated Show resolved Hide resolved
@goneall
Copy link
Member

goneall commented Sep 2, 2020

@tjasmith Could you also add a couple of unit tests?

@goneall
Copy link
Member

goneall commented Sep 4, 2020

@tjasmith I published the new version of the SPDX tools and tested out this PR. It seems to work well in matching the licenses 👍
There is one change I'd like you to make before merging - handling the exceptions by logging and returning false. Once that's done I'll go ahead and merge.

I also added 2 new issues which would be nice to have before using for the next license list publication: #76 #75 - let me know if you want to work on either of these.

try {
nonOptionalText = LicenseCompareHelper.getNonOptionalLicenseText(license.getStandardLicenseTemplate(), true);
} catch (SpdxCompareException e) {
e.printStackTrace();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tjasmith Please replace this with logging and return non-match, otherwise there will be null pointer exceptions or other issues

match = new Boolean(!matchBool).toString();
}
} catch (SpdxCompareException e) {
e.printStackTrace();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tjasmith Please replace this with logging and return non-match, otherwise there will be null pointer exceptions or other issues

@goneall goneall merged commit 93aacd2 into spdx:master Nov 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants