Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance the check_license functionality + Check for duplicate and near matches of license submittal #128

Merged
merged 8 commits into from
Aug 29, 2019

Conversation

Ugtan
Copy link
Collaborator

@Ugtan Ugtan commented Jul 19, 2019

  • Enhance the check_license functionality by using close matches
    for optimization to narrow down the possible matches.
  • Check if is license is present on the SPDX License License(or check for duplicate license requests)
  • Check for near matches, if the license is a close match then, show diff view and ask the user if changes look substantial acc to him/her. if changes are substantial to create a new license request then create a new issue otherwise, if the license should match then create a Pull request.

@goneall
Copy link
Member

goneall commented Jul 21, 2019

I would like to do the original check license algorithm if there is a close match. The replacement algorithm doesn't take into account the templates and replaceable terms.

An algorithm like:

  • try the fast comparison
  • If there is 100% match - report a match
  • If there is < a threshold of a match, there is no match
  • If it is >= to a threshold and < 100% then run the original algorithm and report the results

@Ugtan
Copy link
Collaborator Author

Ugtan commented Jul 21, 2019

@goneall Isn't the isTextStandardLicense of the SPDX tools took account for the templates and replaceable terms? I have also tried testing it with few templates and it works for some of them as it works for text that returns isDifferentFound to be false in case if difference found it does not report it to be a license match and just report it to be a close match.
If its not right, can you explain it I'm confused.

Also, when I'm using the check_license with the original algo on the production for templates it returns "there are no matching spdx licenses".

Thanks

@goneall
Copy link
Member

goneall commented Jul 21, 2019

Isn't the isTextStandardLicense of the SPDX tools took account for the templates and replaceable terms?

Yes - that should take into account the template. I may have just missed the code. I was reading the file diffs for this PR and saw the code to call the SPDX tools removed, but didn't see the call to isTextStandardLicense

@Ugtan
Copy link
Collaborator Author

Ugtan commented Jul 21, 2019

@goneall Right. The function is called inside the spdx-license-matcher itself and that's why it is not visible in the current code.

But still I don't know the code behind isTextStandardLicense but when I use a template (eg AAL.template.txt) it reports that difference is found. I'm using the exact same template so, it shouldn't have returned any differences in the first place right?

@goneall
Copy link
Member

goneall commented Jul 22, 2019

I'm using the exact same template so, it shouldn't have returned any differences in the first place right?

The template is used to describe optional and variable text, you don't actually match against it - you match against the license text. But if you remove some optional text or change variable text in a license it will still match. For example, in the Apache 2.0 license, you can change Copyright [yyyy] [name of copyright owner] to Copyright SPDX and it will still give 100% match.

When you look at the license text on spdx.org/licenses, you will see optional text as blue and variable/alt text as red.

The code that implements that matching is in Java staring here.

@Ugtan
Copy link
Collaborator Author

Ugtan commented Jul 22, 2019

The template is used to describe optional and variable text, you don't actually match against it - you match against the license text. But if you remove some optional text or change variable text in a license it will still match. For example, in the Apache 2.0 license, you can change Copyright [yyyy] [name of copyright owner] to Copyright SPDX and it will still give 100% match.

I understood it. And I will check out the code. Thanks for the reply. :)

@Ugtan Ugtan force-pushed the improve-check-license branch 3 times, most recently from 519f966 to 25854d4 Compare August 18, 2019 18:13
@Ugtan Ugtan changed the title Enhance the check_license functionality Enhance the check_license functionality + Check for duplicate and near matches of license submittal Aug 18, 2019
@Ugtan
Copy link
Collaborator Author

Ugtan commented Aug 18, 2019

@goneall @techytushar @rtgdk I have updated the PR with the below changes. Please have a look at it. Thanks

In the PR I have done the following changes:

  • check license will now report for near matches i.e if a license is a close match.
  • license submittal will report for duplicate licenses(if the license is already present on the SPDX license list)
  • license submittal will report if a user tries to submit a license text which closely matches(based on SPDX matching guidelines) with that present on the license list. if it is a close match, ask the user if he/she thinks the changes are substantial. if yes, create a new issue(new license request) if no, then create a PR to modify the existing license.

Also, This function is not used anywhere as a similar function is there right below it, I think it was left out by mistake should I remove it as well?

@goneall
Copy link
Member

goneall commented Aug 23, 2019

Also, This function is not used anywhere as a similar function is there right below it, I think it was left out by mistake should I remove it as well?

If it is not being used, please remove it - we can always go back in git history if we need to refer to the old method.

@goneall
Copy link
Member

goneall commented Aug 23, 2019

@Ugtan Please update the README.md file with the instructions for installing and running the redis server and any other steps needed to successfully setup the online tools. After that, I'll test this out on my local machine.

I took a quick look at the files and it looks good.

@Ugtan
Copy link
Collaborator Author

Ugtan commented Aug 23, 2019

If it is not being used, please remove it - we can always go back in git history if we need to refer to the old method.

Sounds good.

Please update the README.md file with the instructions for installing and running the redis server and any other steps needed to successfully setup the online tools.

Sure, I will add it tomorrow.

@goneall
Copy link
Member

goneall commented Aug 24, 2019

@rtgdk @techytushar Could one of you do a quick review of the code and let me know if this is good to be merged.

Thanks

@Ugtan
Copy link
Collaborator Author

Ugtan commented Aug 25, 2019

@goneall I have added the documentation regarding the installation of the redis server also, I have removed the unnecessary code.

@goneall
Copy link
Member

goneall commented Aug 25, 2019

@Ugtan Running this on my local machine, I'm getting the following errors in the JavaScript console when trying to show the diff:

Uncaught ReferenceError: displayModal is not defined
    at generate_text_diff ((index):793)
    at HTMLButtonElement.<anonymous> ((index):618)
    at HTMLDocument.dispatch (jquery.min.js:3)
    at HTMLDocument.r.handle (jquery.min.js:3)
generate_text_diff @ (index):793
(anonymous) @ (index):618
dispatch @ jquery.min.js:3
r.handle @ jquery.min.js:3

I also noticed the following error when starting up the app (not sure if this is related to the problem):

script.js:102 Uncaught ReferenceError: CodeMirror is not defined
    at HTMLDocument.<anonymous> (script.js:102)
    at i (jquery.min.js:2)
    at Object.fireWith [as resolveWith] (jquery.min.js:2)
    at Function.ready (jquery.min.js:2)
    at HTMLDocument.K (jquery.min.js:2)

@Ugtan
Copy link
Collaborator Author

Ugtan commented Aug 25, 2019

Running this on my local machine, I'm getting the following errors in the JavaScript console when trying to show the diff:

I have fixed the error. Please have a look it should work now. I accidentally deleted the treeview js file while removing unnecessary files as I had imported all the files used in the xml editor. Sorry for the inconvenience.

I also noticed the following error when starting up the app (not sure if this is related to the problem):

I know why this issue is showing up. So, I'm importing a script js file of XML editor which handles the PR making stuff only in the html of the submit license page and in that script there is codemirror used to fix this issue we can shift the make pr function to some other file which will not show the error anymore. But I think it will not impact the functionality of this feature. Although, if you like I will create a new PR refactoring these function. @goneall

@goneall
Copy link
Member

goneall commented Aug 25, 2019

I have fixed the error. Please have a look it should work now.

It works - thanks!

... But I think it will not impact the functionality of this feature. Although, if you like I will create a new PR refactoring these function.

Let's go ahead and create a PR since you know how to fix it.

@goneall
Copy link
Member

goneall commented Aug 25, 2019

I had a chance to test it out and it worked for me. Nice work @Ugtan! Once @rtgdk or @techytushar reviews the Python code, I can merge.

In going through the workflow, we may want to change to submitting an issue rather than a PR for the case when there is a close match which should match the existing licenses since many users won't have the technical skills to create the PR. This can be a separate issue and something we add after GSoC since this would be a change in the requirements and the GSoC project implemented the project correctly for the project idea/requirements at the time.

@Ugtan
Copy link
Collaborator Author

Ugtan commented Aug 26, 2019

Let's go ahead and create a PR since you know how to fix it.

Sure.

In going through the workflow, we may want to change to submitting an issue rather than a PR for the case when there is a close match which should match

If in case the user wants the licenses to match? Then what if he wants to add a new license request?

@Ugtan
Copy link
Collaborator Author

Ugtan commented Aug 26, 2019

NVM. I just saw the issue and it makes sense now. I will soon gonna work on the issue. Should I add changes to this PR or create a new one @goneall ?

@goneall
Copy link
Member

goneall commented Aug 26, 2019

Should I add changes to this PR or create a new one?

@Ugtan Thanks for taking this on!

I would open a new PR once we merge this one - that way we can keep the GSoC PR's consistent with the project plan.

Once I get one more technical review of the Python code, I'll merge then we can create a new PR.

README.md Outdated

**For linux/Mac user use**

`sudo apt-get install redis-server`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On MAC, apt-get is not available. Put the command for HomeBrew.

@rtgdk
Copy link
Collaborator

rtgdk commented Aug 27, 2019

The changes works fine for me. Just a small change in README needed. Other than that, PR looks good, ready to be merged.

I hope tests will be added to verify the functionality (maybe selenium tests) and edge cases. Also, I hope this works well in production after setting up the redis server.

Another thing I noticed is the -e git://github.com/Ugtan/spdx-license-matcher.git#egg=spdx-license-matcher in requirements. I think it was made as part of the GSoC project. I might have missed previous conversation regarding this, but is it possible to move this to the spdx organisation(or you plan to do it later)? Since this would be better managed and visible in the spdx org repos.

@Ugtan
Copy link
Collaborator Author

Ugtan commented Aug 27, 2019

Hello @rtgdk,

Thank you so much for your feedback.

I hope tests will be added to verify the functionality (maybe selenium tests) and edge cases

Yeah, I will write some tests to verify the functionality.

Another thing I noticed is the -e git://github.com/Ugtan/spdx-license-matcher.git#egg=spdx-license-matcher in requirements

This is already transferred to SPDX.

@goneall
Copy link
Member

goneall commented Aug 27, 2019

@Ugtan It looks like when I merged your other PR, it created a conflict. If you could resolve the conflict and update the README for the mac I'll merge this in.

@Ugtan
Copy link
Collaborator Author

Ugtan commented Aug 28, 2019

@goneall I will resolve the merge conflicts soon.

Enhance the check_license functionality by using close matches
for optimization to narrow down the possible matches.
Add details on how to install redis server
Fixes the displayModal error while showing diff for close matches
in the license submittal feature.
@Ugtan
Copy link
Collaborator Author

Ugtan commented Aug 28, 2019

@goneall @techytushar @rtgdk I have resolved the merge conflicts as well as the changes that were suggested. Please have a look. Thanks

@goneall
Copy link
Member

goneall commented Aug 29, 2019

Thanks @Ugtan - merging now. Feel free to add a PR for issue #135

@goneall goneall merged commit f368bab into spdx:master Aug 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants