License rework #68
Merged
…atch contents of file to a spdx matcher
…onment variable instead
…onment variable instead - post rebase fix
# Conflicts: # gimie/sources/common/license.py
…ncode-toolkit to match license. Not finished yet - as the returned scancode-toolkit output still has to be parsed to extract the spdx license, and I'm still getting a strange connection timeout most of the time (but not always!)
…arse it, match it against a spdx license and return that license. still to do: Recursion - in case there are multiple licenses. Refactoring - get rid of all the file writes and unnecessary prints, and build some functions which do one job only.
…nction documentation and reformatted with Black.
…ypes, remove trailing slashes from repo_URL so GitHub API can parse it.
… one big "get-license" function which calls all the other functions in order, using the singular input of repo_url which the user must supply.
…ut it inside the get_license function as a default variable. And black reformat.
…stead of the subprocess call to the tool.
… to github TREES instead of CONTENTS. The rest is the same - with a slightly unnecessary list-dictionary comprehension that hurts my head, which I will probably delete in the future.
…stead of a normal file now to prevent concurrent writes to the same filename.
…ecessary info, and then reusing the result in two different parts of the code
…b.py, probably a file type problem...
…g for both gitlab and github. Removed doubled max license match function.
…d of being duplicated in both github and gitlab.
… with highest coverage
Ok, @cmdoret I think I'm ready for your review, only 65 commits later. 🛠️ Hopefully it's the last one!
…rm of datePublished
… print statement github.py
cmdoret
requested changes
Oct 11, 2023
Nice job! 😎
I had a few comments related to code organization, docstrings and type hints, but everything we need is here and it seems to work 👍
…e identifiers not fully fixed yet.
…in API license (as this only works in GH). Instead, the license is found in the repository, and the scancode API is used to extract a license ID from that text. This license ID (with bad capitalization) is fed to the SPDX License dictionary to extract a SPDX license ID which is the final value output in a triple. This also works if multiple license files are present in a repo.
… I look at the right one.
This PR implements license detection based on the GitHub/GitLab repository contents, as opposed to relying on the Git provider API to retrieve the license. In brief:
- `list_files()` method to retrieve the list of files in the repo (names + download URLs)
- `_get_license()` method which uses scancode to identify + classify license files
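
For readers skimming the thread, here is a rough sketch of the selection logic the description and commits refer to: find candidate license files by name, keep the match with the highest coverage, and normalize its identifier to an SPDX ID regardless of capitalization. It is not the code from this PR. The helper names (`is_license_file`, `to_spdx_id`, `best_match`), the abbreviated `SPDX_IDS` list, and the `license_id`/`coverage` keys are all assumptions made for illustration, and the match dictionaries are not meant to mirror scancode's actual output format.

```python
import re
from typing import Dict, List, Optional

# Hypothetical, abbreviated list; a real lookup would cover the full SPDX license list.
SPDX_IDS = ["MIT", "Apache-2.0", "GPL-3.0-only", "BSD-3-Clause"]

# Assumed naming convention: LICENSE, LICENCE or COPYING, with an optional extension.
LICENSE_NAME = re.compile(r"^(licen[cs]e|copying)(\.|$)", re.IGNORECASE)


def is_license_file(path: str) -> bool:
    """Return True if the file name (ignoring directories) looks like a license file."""
    name = path.rsplit("/", 1)[-1]
    return bool(LICENSE_NAME.match(name))


def to_spdx_id(raw_id: str) -> Optional[str]:
    """Map an identifier with arbitrary capitalization to its canonical SPDX ID."""
    lookup = {spdx.lower(): spdx for spdx in SPDX_IDS}
    return lookup.get(raw_id.lower())


def best_match(matches: List[Dict]) -> Optional[str]:
    """Pick the match with the highest coverage and return its SPDX ID, if known."""
    if not matches:
        return None
    top = max(matches, key=lambda m: m["coverage"])
    return to_spdx_id(top["license_id"])


if __name__ == "__main__":
    paths = ["README.md", "LICENSE", "docs/COPYING.txt", "src/license_utils.py"]
    print([p for p in paths if is_license_file(p)])
    print(best_match([
        {"license_id": "mit", "coverage": 98.5},
        {"license_id": "apache-2.0", "coverage": 12.0},
    ]))
```

Keeping the file-name filter, the coverage comparison, and the SPDX lookup as separate functions follows the review feedback about functions that do one job only, and lets the same logic be shared by the GitHub and GitLab extractors.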