diff --git a/docs/source/archive/gsoc-toc.rst b/docs/source/archive/gsoc-toc.rst
index 9d0a27f..dd49183 100644
--- a/docs/source/archive/gsoc-toc.rst
+++ b/docs/source/archive/gsoc-toc.rst
@@ -14,10 +14,10 @@ GSoC 2022
 .. toctree::
    :maxdepth: 2
 
+   gsoc/reports/2022/scancode_kevin
    gsoc/reports/2022/scancodeio_akhil
    gsoc/reports/2022/scancode_workbench_omkar
    gsoc/reports/2022/vulnerablecode_vulntotal_keshav
-   gsoc/reports/2022/vulnerablecode_ziad
 
 
 GSoC 2021
diff --git a/docs/source/archive/gsoc/reports/2022/scancode_kevin.rst b/docs/source/archive/gsoc/reports/2022/scancode_kevin.rst
new file mode 100644
index 0000000..007a768
--- /dev/null
+++ b/docs/source/archive/gsoc/reports/2022/scancode_kevin.rst
@@ -0,0 +1,62 @@
+========================================================================
+Extending license detection to use licenses external to ScanCode Toolkit
+========================================================================
+
+
+| **Organization:** `AboutCode `_
+| **Project:** `Scancode Toolkit `_
+| **Mentee:** `Kevin Ji (KevinJi22) `_
+| **Mentors:** Philippe Ombredanne, AyanSinhaMahapatra, Jono Yang
+
+Overview
+--------
+
+When doing license detection, ScanCode uses the licenses and rules in the ScanCode LicenseDB.
+The goal of this project is to extend the capabilities of ScanCode license detection to include
+licenses that are external to LicenseDB, such as proprietary licenses to be kept within an
+organization. I also extended it to include licenses installed from external sources.
+
+Implementation
+--------------
+
+All the work I did is contained in `this single PR `_.
+I added a new command line option called ``--additional-license-directory`` that someone can use
+to include additional licenses/rules contained in other directories in the license index.
+Scancode Toolkit uses this license index when doing license detection.
+This option must be called with ``--reindex-licenses`` to explicitly regenerate the license cache,
+and then when doing license scans, users can just use the regular ``--license`` option and these
+additional licenses and/or rules will be used in license detection.
+
+This change also allows users to install directories of licenses or rules to their local machine,
+and then Scancode Toolkit will detect and include them in the license cache when someone is
+reindexing the licenses. If someone wants to create a directory of licenses or rules that they
+want to install and use in Scancode Toolkit, they must subclass a new Plugin class I added.
+This allows Scancode Toolkit to identify the location of these installed licenses/rules
+through a unique entry point and add them to the license index.
+
+Finally, all these changes are tested through multiple unit tests validating both correct
+behavior and error handling as needed.
+
+Post GSoC
+---------
+
+I would like to merge this PR into Scancode Toolkit, hopefully allowing users to leverage
+this feature to expand their license detection capabilities.
+
+Links
+-----
+
+`Project idea `_
+
+`Official GSoC project page `_
+
+`GSoC Proposal `_
+
+`Documentation page about the feature `_
+
+Acknowledgements
+----------------
+
+Thanks to Jono and Philippe for being my mentors. I enjoyed all the meetings, code reviews,
+and design discussions. Thank you for your time and your patience!