What is Scancode-Results-Analyzer

ScanCode Toolkit (ScanCode) detects licenses, copyrights, package manifests, direct dependencies, and more, in both source code and binary files.

ScanCode's license detection combines multiple techniques, based on automatons, inverted indexes, and multiple sequence alignments, to detect licenses accurately. Even so, the detection is not always accurate enough. The goal of this project is to improve the accuracy of license detection by leveraging the ClearlyDefined data set, in which ScanCode is used to scan millions of packages.

This project aims to:

  • Write tools and create models to analyze the accuracy of license detection at scale.
  • Detect areas where the accuracy could be improved.
  • Write reusable tools and models to assist in the semi-automated review of scan results.
  • Create new license detection rules semi-automatically to fix the detected anomalies.
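
As a rough illustration of the first two aims, the sketch below flags license matches whose score falls below a threshold. It is only a minimal example and not this project's actual code: the input layout is a simplified, hypothetical stand-in for ScanCode's JSON output (which has many more fields and differs between versions), and the function name `flag_low_score_matches` and the threshold of 90.0 are assumptions chosen for the example.

```python
# Hypothetical, simplified scan result used only for illustration.
# Real ScanCode JSON output has more fields, and its layout varies by version.
SAMPLE_SCAN = {
    "files": [
        {"path": "src/a.py", "licenses": [{"key": "mit", "score": 100.0}]},
        {
            "path": "src/b.c",
            "licenses": [
                {"key": "gpl-2.0", "score": 42.5},
                {"key": "bsd-new", "score": 99.0},
            ],
        },
        {"path": "README", "licenses": []},
    ]
}


def flag_low_score_matches(scan, threshold=90.0):
    """Return (path, license key, score) for each match scoring below threshold.

    The threshold is an arbitrary assumption, not a value taken from the project.
    """
    flagged = []
    for file_entry in scan.get("files", []):
        for match in file_entry.get("licenses", []):
            if match["score"] < threshold:
                flagged.append((file_entry["path"], match["key"], match["score"]))
    return flagged


for path, key, score in flag_low_score_matches(SAMPLE_SCAN):
    print(f"{path}: {key} matched with score {score}")
```

Matches flagged this way are only candidates for review; a low score does not by itself prove a wrong detection, which is why the reviews described above are semi-automated.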

Quickstart - Local Machine

  1. Download and install Anaconda.

    Verify your installation

  2. Navigate to the scancode-results-analyzer directory.

  3. Create the Conda Environment

    Run conda env create -f env_files/load_into_dataframes/environment.yml

  4. Activate the Conda Environment

    Run conda activate results-analyzer-load

  5. Open JupyterLab in this conda environment

    jupyter lab

  6. In the file browser on the left, navigate to the .ipynb file you want and click to open it.

  7. Run the cells using Shift+Enter.

  8. More Documentation

Quickstart - Google Colab

  1. Every Jupyter notebook (i.e. every .ipynb file) has an "Open In Colab" badge.

  2. Clicking the badge opens the notebook in Google Colab. Then run the first two groups of cells, which do the following:

    • Clone the scancode-results-analyzer GitHub repository so that the classes and data can be loaded into the notebook environment.

    • Install conda and some additional requirements from the environment.yml file.

  3. Everything is now set up and the code is ready to execute.

GSoC Project Details
