License detection improvements and review #3346

Merged
pombredanne merged 95 commits into develop from license-detection-improvements-and-review on Jun 11, 2023

Conversation

pombredanne
Member

@pombredanne pombredanne commented Apr 24, 2023

This PR adds misc. license and copyright detection improvements and the --todo scancode CLI option for review items summary.

Tasks

  • Reviewed contribution guidelines
  • PR is descriptively titled 📑 and links the original issue above 🔗
  • Tests pass -- look for a green checkbox ✔️ a few minutes after opening your PR
    Run tests locally to check for errors.
  • Commits are in a uniquely-named feature branch and have no merge conflicts 📁
  • Looked for possible updates in documentation and added updates if applicable
  • Updated CHANGELOG.rst

DennisClark and others added 30 commits April 24, 2023 17:55
Signed-off-by: Dennis Clark <dmclark@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
This is a 100% relevant reference

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
This helps to get a more accurate and correct detection

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Add new rules to improve detection accuracy in phpwiki.

Reference: #3256
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
These are mostly seen in some Debian copyright files

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
These are weird but seen frequently enough

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Also treat some capitalized words as not being proper nouns

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Do not treat a solo period as NN
Allow trailing "authors" as part of copyright
Handle more cases of truncated copyrights

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
…les' into license-detection-improvements-and-review
* Remove duplicated tests
* Do not strip leading @ sign in words
* Streamline some lexer regex (they were incorrectly handling dots)
* Add new copyright statement variants

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
* Adds a new --todo post-scan option to review potentially
  ambiguous license/package detections
* Adds a `todo` top level attribute with these review items
  which is a `ToDo` list for the reviewer

Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
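
For reviewers who want to consume the new `todo` top level attribute from a JSON scan result, here is a minimal sketch. It assumes only what the commit message above states, namely that the scan output is a JSON object with a top level `todo` list. The helper name `list_todo_items` and the sample data are hypothetical, and the fields inside each item are placeholders, not the actual ScanCode schema.

```python
import json


def list_todo_items(scan):
    """Return the review items of a parsed JSON scan as a list.

    `scan` is the dict loaded from a JSON scan output. Scans produced
    without the --todo option are assumed to have no `todo` key, so we
    fall back to an empty list.
    """
    return scan.get("todo") or []


# Hypothetical, heavily trimmed scan output used only for illustration.
# The "note" field is a placeholder and not a real ScanCode field.
sample_scan = json.loads(
    '{"headers": [], "todo": [{"note": "placeholder review item"}], "files": []}'
)

for item in list_todo_items(sample_scan):
    print(item)
```

Falling back to an empty list keeps the same code path usable for scans run with and without `--todo`.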
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
These solve the following license detection bugs:

* #3361
* #3360
* #3358
* #3355

Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
…-and-review' into license-detection-improvements-and-review
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
@AyanSinhaMahapatra AyanSinhaMahapatra force-pushed the license-detection-improvements-and-review branch from b36bb73 to 86151a8 Compare May 30, 2023 10:51
Reference: #3409
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
@AyanSinhaMahapatra AyanSinhaMahapatra force-pushed the license-detection-improvements-and-review branch from 86151a8 to 886bc39 Compare May 30, 2023 11:39
AyanSinhaMahapatra and others added 9 commits May 31, 2023 14:29
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Fixes unknown-spdx detections in license detection for NuGet .nuspec
files, where a dangling </licenseurl> markup tag would prevent the
SPDX license detection from working.

Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
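
To illustrate the kind of problem described above, here is a rough sketch of stripping a dangling markup tag from a license string before it is handed to an SPDX expression parser. This is not the actual fix from this commit: the function name `strip_dangling_markup` and the regular expression are assumptions made for illustration only.

```python
import re

# Matches a simple opening, closing or self-closing tag such as
# </licenseurl>. This pattern is an assumption for this sketch and is
# not a general XML parser.
_SIMPLE_TAG = re.compile(r"</?[A-Za-z][A-Za-z0-9]*\s*/?>")


def strip_dangling_markup(text):
    """Return `text` with simple leftover tags replaced by spaces."""
    return _SIMPLE_TAG.sub(" ", text).strip()


cleaned = strip_dangling_markup("MIT</licenseurl>")
print(cleaned)
```

Replacing the tag with a space rather than an empty string avoids gluing two adjacent tokens together when the tag sits in the middle of the text.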
For the form of "Contributors to the project..."

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Somehow we need to re-raise an exception for this to work correctly

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Do not fail to scan on timeout, instead keep on trucking and log errors.

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
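
The "keep going and log errors" behavior described above can be sketched as follows. This is a simplified illustration, not ScanCode's implementation: `scan_all` and the `scan_one` callable are hypothetical, and `scan_one` is assumed to raise the built-in `TimeoutError` when a file takes too long.

```python
import logging

logger = logging.getLogger("scan_sketch")


def scan_all(paths, scan_one):
    """Scan every path, recording timeouts as errors instead of aborting.

    Returns a (results, errors) pair of dicts keyed by path.
    """
    results = {}
    errors = {}
    for path in paths:
        try:
            results[path] = scan_one(path)
        except TimeoutError as e:
            # Log and remember the failure, then continue with the next
            # path rather than failing the whole scan.
            message = f"ERROR: timeout while scanning {path}: {e}"
            logger.error(message)
            errors[path] = message
    return results, errors
```

Keeping the errors in a separate mapping lets a caller report them alongside the successful results at the end of the run.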
@pombredanne pombredanne force-pushed the license-detection-improvements-and-review branch from b439760 to ca92534 Compare June 7, 2023 10:48
@AyanSinhaMahapatra AyanSinhaMahapatra force-pushed the license-detection-improvements-and-review branch 2 times, most recently from 05dd2ba to 6d439eb Compare June 9, 2023 14:27
Fixes a bug where top level license detections had license expressions
and IDs which were not present anywhere in the scan. This was because
we were dereferencing unknown license references after collecting the
unique license detections, so the modified identifiers and license
expressions were not present in the top-level license detections.

Fixes a bug where top level unique license detections
were not properly collected from package license detections.

Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
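
The ordering problem described above can be shown with a toy model. Everything here is hypothetical: the data layout, the function names and the `"-1"` identifier suffix are simplifications for illustration and do not mirror ScanCode's internals. The point is only that unique detections must be collected after the dereferencing step has rewritten expressions and identifiers.

```python
def collect_unique(files):
    """Collect unique (identifier, license_expression) pairs from all files."""
    return {
        (d["identifier"], d["license_expression"])
        for f in files
        for d in f["license_detections"]
    }


def dereference_unknown(files, resolved):
    """Rewrite unknown references in place using the `resolved` mapping."""
    for f in files:
        for d in f["license_detections"]:
            target = resolved.get(d["license_expression"])
            if target:
                d["license_expression"] = target
                # Hypothetical identifier scheme for this sketch only.
                d["identifier"] = target + "-1"


def summarize(files, resolved):
    """Dereference first, then collect, so the summary matches the files."""
    dereference_unknown(files, resolved)
    return collect_unique(files)
```

Calling `collect_unique` before `dereference_unknown` would return the stale pre-rewrite pairs, which is the class of mismatch the commit message describes.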
@AyanSinhaMahapatra AyanSinhaMahapatra force-pushed the license-detection-improvements-and-review branch from 6d439eb to 1bad407 Compare June 9, 2023 15:02
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
@AyanSinhaMahapatra AyanSinhaMahapatra force-pushed the license-detection-improvements-and-review branch from 48fe068 to a87a4d5 Compare June 9, 2023 17:21
Member Author

@pombredanne pombredanne left a comment

LGTM ... time to merge this!

@pombredanne pombredanne merged commit b52f7df into develop Jun 11, 2023
34 checks passed
@pombredanne pombredanne deleted the license-detection-improvements-and-review branch June 11, 2023 17:53
@AyanSinhaMahapatra AyanSinhaMahapatra restored the license-detection-improvements-and-review branch June 12, 2023 10:38
@pombredanne pombredanne deleted the license-detection-improvements-and-review branch June 29, 2023 16:40

4 participants