Add a command line option to dump the license data #2738

Closed
fviernau opened this issue Oct 14, 2021 · 4 comments

@fviernau
Contributor

fviernau commented Oct 14, 2021

The OSS Review Toolkit (ORT) needs all license IDs, along with the respective license texts, for the ScanCode version it uses.
Currently it extracts them from the license data directory, see e.g. [1] [2]. So it depends on the directory structure as well as on the file formats of the ScanCode license data. A ScanCode command line option that dumps the following to a file (JSON, perhaps?) would remove that dependency:

  1. license IDs
  2. the corresponding license texts
  3. the corresponding license categories (not needed now, but nice to have IMO, as it would enable using the categories within ORT by implementing a conversion to [3])

[1] https://github.com/oss-review-toolkit/ort/blob/093a510aa8545cc61cc172d7a7a274f56df87d84/spdx-utils/build.gradle.kts#L62-L71
[2] https://github.com/oss-review-toolkit/ort/blob/093a510aa8545cc61cc172d7a7a274f56df87d84/spdx-utils/build.gradle.kts#L238-L343
[3] https://github.com/oss-review-toolkit/ort/blob/093a510aa8545cc61cc172d7a7a274f56df87d84/model/src/main/kotlin/licenses/LicenseClassifications.kt
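
For illustration, a consumer such as ORT might read a dump like the one requested above roughly as follows. This is only a sketch: the thread does not define the dump format, so the JSON layout (a list of records) and the field names `license_key`, `text`, and `category` are assumptions, and `load_license_dump` is a hypothetical helper rather than anything ScanCode or ORT provides.

```python
import json


def load_license_dump(json_text):
    """Map each license ID to its text and (optional) category.

    Assumes the dump is a JSON list of objects; the field names used
    here are hypothetical and may differ from any real dump format.
    """
    licenses = {}
    for record in json.loads(json_text):
        license_id = record["license_key"]  # assumed field name
        licenses[license_id] = {
            "text": record["text"],  # assumed field name
            "category": record.get("category"),  # optional, item 3 of the request
        }
    return licenses


# A tiny made-up dump, used only to exercise the function.
sample_dump = json.dumps([
    {"license_key": "mit", "text": "Permission is hereby granted...", "category": "Permissive"},
    {"license_key": "example-no-category", "text": "Example text"},
])

licenses = load_license_dump(sample_dump)
```

With a stable dump like this, the consumer depends only on the documented fields and no longer on the directory structure or file formats of the license data.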

@AyanSinhaMahapatra
Member

There are two options:

  • We add a command line option in SCTK (I was thinking something like --get-license-data folder_path) that just generates the .json/.yml files used to build the LicenseDB pages, with validation tests. This gives us a sort of API for license data that other tools like ORT can consume instead of reading the actual license files.
  • Do everything in option 1, plus generate the pages (.html files and the index.html) locally too. This would mean moving almost everything from the scancode-licensedb repository into scancode-toolkit (history preserved, of course), i.e. moving the code into a licensedcode file along with the templates and static files.

@pombredanne :

IMHO there are different things

  • SCTK shall be able to generate a dump that would include JSON/YML and HTML (with more or less flexibility TBD, but a single option for now is fine). The output should be exactly like the scancode-licensedb output.
  • The scancode-licensedb repo will stay as is, minus the generation code, and would instead use the CLI option in 1. to create itself (and the history stays in this repo entirely).

AyanSinhaMahapatra self-assigned this Sep 10, 2022
AyanSinhaMahapatra added a commit that referenced this issue Sep 12, 2022
Adds a new command line option: `--get-license-data` to:

* Dump license data in JSON, YAML and HTML formats.
* Also dumps the .LICENSE file with text and data as YAML frontmatter.
* Generates an index and a static website to view the data.

This is reusing code originally located at:
https://github.com/nexB/scancode-licensedb

Reference: #2738
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
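
The `.LICENSE` files with YAML frontmatter mentioned in the commit message above could be separated into a metadata block and the license text with a few lines of Python. This is a minimal sketch, assuming the frontmatter is delimited by `---` lines (the thread does not spell out the exact layout); `split_frontmatter` is a hypothetical helper, not part of ScanCode, and it leaves the YAML itself unparsed.

```python
def split_frontmatter(content):
    """Split a document into (frontmatter, body).

    Assumes the frontmatter, if present, starts on the first line with
    '---' and ends at the next '---' line. Returns an empty frontmatter
    string when no complete block is found.
    """
    lines = content.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", content
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            frontmatter = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:]).strip()
            return frontmatter, body
    # Opening delimiter without a closing one: treat everything as body.
    return "", content


# Made-up example content, not a real ScanCode license file.
example = "---\nkey: mit\ncategory: Permissive\n---\n\nPermission is hereby granted"
metadata, text = split_frontmatter(example)
```

The metadata string can then be handed to any YAML parser, while the body is the plain license text.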
@AyanSinhaMahapatra
Member

This is in develop now and will be released in a release candidate soon. @fviernau, can you try it out? More details are in the PR linked above.

AyanSinhaMahapatra added a commit that referenced this issue Jan 12, 2023
* Removes CLI option `--dump-license-data`
* Adds new console script `scancode-license-data --path PATH`
* Updates CHANGELOG

Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
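
A build script could call the new console script from Python along these lines. This is a sketch under assumptions: only the `scancode-license-data --path PATH` invocation comes from the commit message above; the helper names are hypothetical, and the sketch simply skips the dump when the script is not installed on `PATH`.

```python
import shutil
import subprocess


def build_dump_command(output_dir):
    """Return the argument list for `scancode-license-data --path PATH`."""
    return ["scancode-license-data", "--path", str(output_dir)]


def dump_license_data(output_dir):
    """Run the dump if the console script is available.

    Returns True when the command ran successfully and False when the
    script is not installed. A non-zero exit status raises
    subprocess.CalledProcessError because of check=True.
    """
    if shutil.which("scancode-license-data") is None:
        return False
    subprocess.run(build_dump_command(output_dir), check=True)
    return True
```

Passing the arguments as a list rather than a shell string avoids quoting problems when the output path contains spaces.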
@pombredanne
Member

@AyanSinhaMahapatra I think this is all done now, correct? If so, please close.

@AyanSinhaMahapatra
Member

@pombredanne yes!

This is now present as a separate console command, scancode-license-data, at https://github.com/nexB/scancode-toolkit/blob/develop/src/licensedcode/license_db.py#L208
