ScanCode toolkit

A typical software project often reuses hundreds of third-party packages. License, package, dependency, and origin information is not always easy to find and is rarely normalized: ScanCode discovers and normalizes this data for you.

Read more about ScanCode here: https://scancode-toolkit.readthedocs.io/.

Check out the code at https://github.com/nexB/scancode-toolkit

See also:

The ScanCode.io server project here: https://scancodeio.readthedocs.io
The ScanCode Workbench project for visualization of scancode results data: https://github.com/nexB/scancode-workbench
Other companion SCA projects for code origin, license and security analysis here: https://aboutcode.org

Build and tests status

We run 30,000+ tests on each commit on multiple CIs to ensure good platform compatibility with multiple versions of Windows, Linux, and macOS.

Build status is tracked on: Azure, RTD Build, GitHub Actions Docs, GitHub Actions Release.

Why use ScanCode?

As a standalone command-line tool, ScanCode is easy to install, run, and embed in your CI/CD processing pipeline. It runs on Windows, macOS, and Linux.
ScanCode is used by several projects and organizations such as the Eclipse Foundation, OpenEmbedded.org, the FSFE, the FSF, OSS Review Toolkit, ClearlyDefined.io, RedHat Fabric8 analytics, and many more.
ScanCode detects licenses, copyrights, package manifests, direct dependencies, and more in both source code and binary files. It is considered a best-in-class reference tool in this domain and is reused as the core tool for software composition data collection by several open source tools.
ScanCode provides the most accurate license detection engine and does a full comparison (also known as diff or red line comparison) between a database of license texts and your code, instead of relying only on approximate regex patterns, probabilistic search, edit distance, or machine learning.
Written in Python, ScanCode is easy to extend with plugins to contribute new and improved scanners, data summarization, package manifest parsers, and new outputs.
You can save your scan results as JSON, YAML, HTML, CycloneDX or SPDX or even create your own format with Jinja templates.
You can also run ScanCode server-side with the companion ScanCode.io web app to organize and store multiple scan projects, including scripted scanning pipelines.
ScanCode output data can be easily visualized and analysed using the ScanCode Workbench desktop app.
ScanCode is actively maintained and has a growing community of users and contributors.
ScanCode is heavily tested with an automated test suite of over 20,000 tests.
ScanCode has extensive and growing documentation.
ScanCode can process package, build manifest, and lockfile formats to collect Package URLs and extract metadata: Alpine packages, BUCK files, ABOUT files, Android apps, Autotools, Bazel, JavaScript Bower, Java Axis, MS Cab, Rust Cargo, Cocoapods, Chef, Chrome apps, PHP Composer and composer.lock, Conda, CPAN, Debian, Apple dmg, Java EAR, WAR, JAR, FreeBSD packages, Rubygems gemspec, Gemfile and Gemfile.lock, Go modules, Haxe packages, InstallShield installers, iOS apps, ISO images, Apache IVY, JBoss Sar, R CRAN, Apache Maven, Meteor, Mozilla extensions, MSI installers, JavaScript npm packages, package-lock.json, yarn.lock, NSIS Installers, NuGet, opam, Python PyPI setup.py, setup.cfg, and several related lockfile formats, semi-structured README files such as README.android, README.chromium, README.facebook, README.google, README.thirdparty, RPMs, Shell Archives, Squashfs images, Windows executables, the Windows registry, and a few more. See all available package parsers for the exhaustive list.
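
To illustrate what a collected Package URL looks like, here is a minimal sketch that splits a purl string into its main components. It assumes the general `pkg:type/namespace/name@version` layout of the Package URL format, ignores qualifiers and subpaths, and is not ScanCode's own parser. `Purl`, `parse_purl`, and the example strings are hypothetical names chosen for illustration.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote


class Purl(NamedTuple):
    """Hypothetical container for the main Package URL components."""
    type: str
    namespace: Optional[str]
    name: str
    version: Optional[str]


def parse_purl(purl: str) -> Purl:
    """Split a purl string into type, namespace, name, and version.

    Simplified sketch: qualifiers (?key=value) and subpaths (#path)
    are dropped rather than parsed.
    """
    scheme, sep, rest = purl.partition(":")
    if not sep or scheme.lower() != "pkg":
        raise ValueError(f"not a Package URL: {purl!r}")

    # Drop the optional subpath and qualifiers, then stray slashes.
    rest = rest.split("#", 1)[0].split("?", 1)[0].strip("/")

    package_type, sep, path = rest.partition("/")
    if not sep or not path:
        raise ValueError(f"missing package name: {purl!r}")

    # The version, when present, follows the last "@".
    path, at, version = path.rpartition("@")
    if not at:
        path, version = version, None

    # Everything before the last "/" is the namespace, if any.
    namespace, slash, name = path.rpartition("/")
    return Purl(
        type=package_type.lower(),
        namespace=unquote(namespace) if slash else None,
        name=unquote(name),
        version=unquote(version) if version else None,
    )


# Made-up example strings, not real scan output.
print(parse_purl("pkg:pypi/example-package@1.2.3"))
print(parse_purl("pkg:maven/org.example/example-lib@2.0?type=jar"))
```

For real projects, a maintained purl library is a safer choice than this sketch, which skips validation and qualifier handling.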

See our roadmap for upcoming features.

Documentation

The ScanCode documentation is hosted at scancode-toolkit.readthedocs.io.

If you are new to visualization of ScanCode results data, start with our newcomer page.

If you want to compare output changes between different versions of ScanCode, or want to look at scans generated by ScanCode, review our reference scans.

Installation

Before installing ScanCode, make sure that you have installed the prerequisites properly. This means installing Python 3.8 for x86/64 architectures. We support Python 3.8, 3.9, 3.10, 3.11, and 3.12.

See prerequisites for detailed information on the supported platforms and Python versions.

There are a few common ways to install ScanCode.

**Installation as an application: Install Python 3.8, download a release archive, extract and run**. This is the recommended installation method.
Development installation from source code using a git clone
Development installation as a library with "pip install scancode-toolkit" [Note that this is not supported on arm64 machines]
Run in a Docker container with a git clone and "docker run"
On Fedora 40+, install with "dnf install scancode-toolkit"

Quick Start

After ScanCode is installed successfully, you can run an example scan and print the results on screen as JSON:

scancode -clip --json-pp - samples

Follow the How to Run a Scan tutorial to perform a basic scan on the samples directory distributed by default with ScanCode.
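
Once a scan has been saved as JSON, the results can be post-processed with a few lines of Python. The sketch below is a rough illustration only: it assumes the output has a top-level `files` list whose entries carry `path` and `type` keys, and the exact layout may differ between ScanCode versions. `SAMPLE_SCAN` is a hypothetical, heavily trimmed stand-in for real output, and `summarize` is a made-up helper name.

```python
import json
from collections import Counter

# Hypothetical, trimmed-down stand-in for a ScanCode JSON result.
# Real output contains many more keys per entry.
SAMPLE_SCAN = """
{
  "files": [
    {"path": "samples", "type": "directory"},
    {"path": "samples/README", "type": "file"},
    {"path": "samples/zlib", "type": "directory"},
    {"path": "samples/zlib/zlib.h", "type": "file"}
  ]
}
"""


def summarize(scan_text):
    """Return (count of entries per type, sorted list of file paths)."""
    scan = json.loads(scan_text)
    resources = scan.get("files", [])  # assumed top-level key
    counts = Counter(r.get("type", "unknown") for r in resources)
    file_paths = sorted(r["path"] for r in resources if r.get("type") == "file")
    return counts, file_paths


counts, file_paths = summarize(SAMPLE_SCAN)
print(dict(counts))
print(file_paths)
```

To use this on a real scan, read the text from the file the scan was written to instead of `SAMPLE_SCAN`, and check the keys against the output of your installed version.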

See more command examples:

scancode --examples

See How to select what will be detected in a scan and How to specify the output format for more information.
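
When ScanCode is embedded in a script or CI job, it can help to assemble the command line programmatically. The following sketch builds an argument list from a chosen output format and set of detection options without running anything. The flag names reflect the commonly documented options and should be checked against `scancode --help` for your version. `OUTPUT_FLAGS`, `build_scan_command`, and `scan.json` are hypothetical names used for illustration.

```python
import shlex

# Assumed mapping of output formats to command-line flags; verify
# against "scancode --help" for the installed version.
OUTPUT_FLAGS = {
    "json": "--json",
    "json-pp": "--json-pp",
    "yaml": "--yaml",
    "html": "--html",
    "spdx-tv": "--spdx-tv",
    "spdx-rdf": "--spdx-rdf",
    "cyclonedx": "--cyclonedx",
}


def build_scan_command(target, output_file, fmt="json-pp",
                       detect=("copyright", "license", "info", "package")):
    """Return the scancode argument list for one scan (nothing is executed)."""
    if fmt not in OUTPUT_FLAGS:
        raise ValueError(f"unsupported output format: {fmt!r}")
    command = ["scancode"]
    command += [f"--{name}" for name in detect]  # long forms of -c, -l, -i, -p
    command += [OUTPUT_FLAGS[fmt], output_file, target]
    return command


cmd = build_scan_command("samples", "scan.json")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where ScanCode is installed.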

You can also refer to the command line options synopsis and an exhaustive list of all available command line options.

Archive extraction

By default, ScanCode does not extract files from tarballs, zip files, and other archives as part of the scan. Archives in a codebase must be extracted before running a scan: extractcode is a bundled utility that behaves as a mostly-universal archive extractor. For example, this command recursively extracts the mytar.tar.bz2 tarball into the mytar.tar.bz2-extract directory:

./extractcode mytar.tar.bz2

See all extractcode options and how to extract archives for details.
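
Because archives must be extracted before a scan, a quick pre-flight check can list the archives in a codebase that do not yet have a matching "-extract" directory next to them. This is a minimal sketch under stated assumptions: the suffix list is a small, non-exhaustive guess (extractcode itself recognizes many more formats), and `ARCHIVE_SUFFIXES`, `extract_dir_for`, and `pending_archives` are hypothetical helper names.

```python
import tempfile
from pathlib import Path

# Assumed, non-exhaustive list of archive suffixes for illustration.
ARCHIVE_SUFFIXES = (".tar", ".tar.gz", ".tgz", ".tar.bz2", ".tar.xz", ".zip")


def extract_dir_for(archive: Path) -> Path:
    """Return the sibling "<archive name>-extract" directory path."""
    return archive.with_name(archive.name + "-extract")


def pending_archives(root: Path) -> list:
    """List archive files under root with no "-extract" directory yet."""
    pending = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.name.endswith(ARCHIVE_SUFFIXES):
            if not extract_dir_for(path).exists():
                pending.append(path)
    return pending


# Demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "mytar.tar.bz2").touch()      # not extracted yet
    (root / "done.zip").touch()           # already extracted
    (root / "done.zip-extract").mkdir()
    (root / "notes.txt").touch()          # not an archive
    names = [p.name for p in pending_archives(root)]
    print(names)
```

Matching on file names alone is a simplification; it will miss archives with unusual or missing extensions.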

Support

If you have a problem or a suggestion, or have found a bug, please open a ticket at: https://github.com/nexB/scancode-toolkit/issues

For discussions and chats, we have:

an official Gitter channel for web-based chats. Gitter is now accessible through Element or an IRC bridge. There are other AboutCode project-specific channels available there too.
The discussion channel for scancode specifically aimed at users and developers using scancode-toolkit.

Source code and downloads

License

Apache-2.0 as the overall license
CC-BY-4.0 for reference datasets (initially in the Public Domain).
Multiple other secondary permissive or copyleft licenses (LGPL, MIT, BSD, GPL 2/3, etc.) for third-party components and test suite code and data.

See the NOTICE file and the .ABOUT files that document the origin and license of the third-party code used in ScanCode for more details.
