This is the open source repository for OSSPolice, presented in the paper Identifying Open-Source License Violation and 1-day Security Risk at Large Scale [pdf]. For technical details, please refer to the paper.

The project consists of four components: osspolice, redis, postgresql and rabbitmq. For quick usage, please skip to the Usage section.

  1. osspolice is the main code base for this project. It is configured through main/config; a template explaining the config options is available in main/config.tmpl. osspolice is used to index new C/C++ repos and Java artifacts, and to query apk/jar/so/dex files against the indexed database to find reused Open Source Software (OSS) and their versions.
    • For feature extraction, the Java parser is located in main/java_parser and the native parser in main/native_parser. The former is open source; the latter is not, because (a) there are many good alternatives for parsing native (C/C++) code, such as ctags and gtags, and (b) we have other work in progress. You can easily adapt the current native parser code to use other parsers.
  2. rabbitmq is the scheduler/broker for distributed deployment. The option CELERY_BROKER_URL in osspolice should be set to the URL of the scheduler.
  3. redis is the in-memory database used for indexing and searching. The options NATIVE_NODES, NATIVE_VERSION_NODES, JAVA_NODES, JAVA_VERSION_NODES and RESULT_NODES in osspolice should be set to match the redis database setup. Since keys do not collide across these databases, they can be merged into one.
    • NATIVE_NODES are prefixed with str-, func-, var-, file-, dir-, branch-, repo-. Reverse mapping replaces '-' with '_'.
    • JAVA_NODES are prefixed with strings-, classes-, normclasses-, centroids-, files-, dirs-, repo-. Reverse mapping prepends 'r-'.
    • NATIVE_VERSION_NODES and JAVA_VERSION_NODES are prefixed with software-, softwareversion-.
  4. postgresql is the database used for storing repo and artifact information. The options NATIVE_DBS and JAVA_DBS in osspolice should be set to match the postgresql database setup.
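
The redis key prefixes and reverse-mapping rules listed above can be illustrated with a small Python sketch. The prefixes are taken from the list; the helper names (native_reverse_prefix, java_reverse_prefix, make_key) and the identifier appended after the prefix are hypothetical, and applying the '-' to '_' replacement to the prefix itself is one reading of the rule, not something confirmed by the code base.

```python
# Sketch of the redis key prefixes described above.
# Prefixes come from the README; helper names and the identifier part of a
# key are hypothetical and only serve to illustrate the naming rules.

NATIVE_PREFIXES = ("str-", "func-", "var-", "file-", "dir-", "branch-", "repo-")
JAVA_PREFIXES = ("strings-", "classes-", "normclasses-", "centroids-",
                 "files-", "dirs-", "repo-")
VERSION_PREFIXES = ("software-", "softwareversion-")


def native_reverse_prefix(prefix: str) -> str:
    """Native reverse mapping: replace '-' with '_' (assumed to apply to the prefix)."""
    if prefix not in NATIVE_PREFIXES:
        raise ValueError(f"unknown native prefix: {prefix!r}")
    return prefix.replace("-", "_")


def java_reverse_prefix(prefix: str) -> str:
    """Java reverse mapping: prepend 'r-' to the prefix."""
    if prefix not in JAVA_PREFIXES:
        raise ValueError(f"unknown java prefix: {prefix!r}")
    return "r-" + prefix


def make_key(prefix: str, ident: str) -> str:
    """Join a prefix and a (hypothetical) identifier into a key string."""
    return f"{prefix}{ident}"


if __name__ == "__main__":
    print(make_key("func-", "12345"))                         # forward native key
    print(make_key(native_reverse_prefix("func-"), "12345"))  # reverse native key
    print(make_key(java_reverse_prefix("classes-"), "12345")) # reverse java key
```

Note that repo- appears in both the native and the Java prefix lists, so a prefix alone does not tell you which group a key belongs to.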


Dependencies

docker, docker-compose

Usage


If you are simply interested in testing the tool for your app, we are working on an online service. Please check back later!

If you are interested in building your own hierarchical indexing database, prebuilt postgresql databases are provided. You can load them using postgresql/ The lists of repos/artifacts used in the paper are also provided in the data folder; you can use them to build your own database.

If you are interested in comparing with our tool, we also have a prebuilt indexing database available. Please shoot us an email at for how to set this database up.

  1. start rabbitmq scheduler
    • cd rabbitmq && docker-compose up
  2. start redis database
    • cd redis && docker-compose up
  3. start postgresql database and load data
    • In one terminal, cd postgresql && docker-compose up
    • After postgresql starts, in another terminal, ./
  4. start osspolice
    • customize your main/config from main/config.tmpl: point the broker option to rabbitmq, the redis cluster options to the redis databases, and the postgresql options to the postgresql database.
    • start osspolice worker
      • docker-compose up
    • start osspolice master
      • Start osspolice interactively: docker-compose run osspolice /bin/bash
      • Add jobs to the broker: python apk_search $PATH_TO_APK
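
Before starting the worker, it can help to confirm that every option mentioned in this README is actually set. The following is a minimal sketch, not part of osspolice: it assumes the options from main/config have already been loaded into a plain Python dict (the real file format is described in main/config.tmpl), and the function name missing_options and the example values are hypothetical.

```python
# Minimal sketch: report which README-mentioned options are unset.
# Assumption: main/config has already been parsed into a dict by some other
# means; this code does not read the real config file format.

REQUIRED_OPTIONS = {
    "broker": ["CELERY_BROKER_URL"],
    "redis": ["NATIVE_NODES", "NATIVE_VERSION_NODES", "JAVA_NODES",
              "JAVA_VERSION_NODES", "RESULT_NODES"],
    "postgresql": ["NATIVE_DBS", "JAVA_DBS"],
}


def missing_options(config: dict) -> dict:
    """Return {group: [option, ...]} for options that are absent or blank."""
    missing = {}
    for group, names in REQUIRED_OPTIONS.items():
        unset = []
        for name in names:
            value = config.get(name)
            if value is None or not str(value).strip():
                unset.append(name)
        if unset:
            missing[group] = unset
    return missing


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example = {"CELERY_BROKER_URL": "amqp://example-host", "NATIVE_NODES": ""}
    for group, names in missing_options(example).items():
        print(f"{group}: missing {', '.join(names)}")
```

An empty result means the broker, redis and postgresql options are all present; it does not check that the services are reachable.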

Helper Scripts

  • Create GitHub accounts automatically. This script exploits the fact that GitHub accounts can be created with an invalid email address.
    • main/
  • Check the status of redis database. This script prints the status of indexed Native and Java database.
    • main/

TODO


  1. Support iOS and Windows app binaries
  2. Robustness of native_parser and java_parser
  3. Add support for Python, JS etc.


This software is licensed under GPL-3.0. Please check the terms and restrictions at


Identifying Open-Source License Violation and 1-day Security Risk at Large Scale







