Automated review of open source software projects

Core Infrastructure Initiative Census

Automated quantitative review of open source software projects.

This project contains programs and documentation to help identify open source software (OSS) projects that may need additional investment to improve security, by combining a variety of metrics.

Key files include:

The Python analysis program is released under the MIT license. To run, it requires BeautifulSoup and an API key from Black Duck Open Hub.

The documentation is released under the Creative Commons CC-BY license.

Some supporting data was sourced from the Black Duck Open Hub (formerly Ohloh), a free online community resource for discovering, evaluating, tracking, and comparing open source code and projects. We thank Black Duck for the data!

Description of this project

The Heartbleed vulnerability in OpenSSL highlighted three things: some open source software (OSS) is widely used and depended on, vulnerabilities in it can have serious ramifications, and yet some projects have not received the level of security analysis appropriate to their importance. Some OSS projects have many participants, perform in-depth security analyses, and produce software that is widely considered to have high quality and strong security. However, other OSS projects have small teams with limited time to do the tasks necessary for strong security. The trick is to identify which critical projects fall into the second bucket.

We have focused on automatically gathering metrics, especially those that suggest less active projects. We also provided a human estimate of each program's exposure to attack, and developed a scoring system to heuristically combine these metrics. These heuristics identified especially plausible candidates for further consideration. For our initial set of projects to examine, we took the set of packages installed by Debian base and added a set of packages that were identified as potentially concerning.
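As an illustration of what heuristically combining metrics can look like, here is a minimal sketch. The metric names (`contributors_12mo`, `exposure`, `cve_count`), the thresholds, and the point values are all hypothetical and chosen only for this example; they are not the scoring rules used by oss_package_analysis.py.

```python
# Hypothetical sketch: the metric names, thresholds, and point values below
# are assumptions for illustration, not the project's actual scoring rules.


def risk_score(metrics):
  """Return a heuristic score; higher suggests a closer look is warranted."""
  score = 0
  # Few recent contributors suggests a less active project.
  contributors = metrics.get('contributors_12mo')
  if contributors is None:
    score += 1  # Unknown activity is treated as a mild warning sign.
  elif contributors == 0:
    score += 3
  elif contributors <= 3:
    score += 2
  # Human estimate of exposure to attack, on an assumed 0 to 2 scale.
  score += metrics.get('exposure', 0)
  # Past vulnerability reports, capped so one metric cannot dominate.
  score += min(metrics.get('cve_count', 0), 3)
  return score


def rank(projects):
  """Return (score, name) pairs sorted from highest to lowest score."""
  scored = [(risk_score(metrics), name) for name, metrics in projects]
  return sorted(scored, reverse=True)


if __name__ == '__main__':
  sample = [
    ('libexample', {'contributors_12mo': 1, 'exposure': 2, 'cve_count': 5}),
    ('toolexample', {'contributors_12mo': 40, 'exposure': 0, 'cve_count': 0}),
  ]
  for score, name in rank(sample):
    print('%2d  %s' % (score, name))
```

Capping each metric's contribution is one possible design choice: it keeps a single noisy data source from deciding the ranking on its own, which matters when the output is only meant to suggest candidates for human review.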

Collaboration

We invite you to contribute via:

  • pull request - if you have a specific change to propose in the documentation, code, or data. We prefer these, since these are easy to merge and show exactly what the proposer has in mind.
  • issue - if you have an idea or bug report (but no specific change to pull).
  • mailing list - for general discussion of this project.

If you have a vulnerability report, please privately send an email to Marcus Streets (mstreets@linuxfoundation.org) and David A. Wheeler (dwheeler@ida.org). Please try to use TLS encryption when you send the email (many providers, like Gmail, will try to do this automatically).

Here are some examples of things you could do:

  • try different metrics and heuristics. Send us pull requests for the ones that you find experimentally make the most sense.
  • try different data sources.
  • review the data in projects_to_examine.csv and send corrections and elaborations.
  • suggest more projects to consider in the future.
  • mention additional relevant literature in the field.
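For the data-review suggestion above, a small helper can point out rows that may need elaboration. The sketch below is a standalone review aid written for Python 3, not part of the project's code. It assumes only that the CSV file has a header row; no particular column names are assumed, and `rows_with_blanks` is a hypothetical name.

```python
# Hypothetical review aid: assumes the CSV has a header row; no particular
# column names are assumed.
import csv
import io


def rows_with_blanks(csv_file):
  """Yield (line_number, [names of empty columns]) for incomplete rows."""
  reader = csv.DictReader(csv_file)
  for row in reader:
    blanks = [name for name, value in row.items()
              if name is not None and (value is None or not value.strip())]
    if blanks:
      yield reader.line_num, sorted(blanks)


if __name__ == '__main__':
  # Made-up sample data; replace with an opened projects_to_examine.csv.
  sample = io.StringIO('name,website,notes\n'
                       'foo,http://example.org,\n'
                       'bar,,ok\n')
  for line_num, blanks in rows_with_blanks(sample):
    print('line %d: empty %s' % (line_num, ', '.join(blanks)))
```

An empty cell is not necessarily an error, so the output is only a list of places to look before proposing corrections.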

Changes to the Python code should generally comply with Python PEP 8 but use 2 spaces per indentation level. Changes must pass "make analyze" (which runs the static analysis tool pyflakes) and "make test" (which runs the automated test suite). Changes that add major new functionality must extend the automated test suite as necessary to cover it. We use the "-t" and "-3" warning flags ("-t" warns about inconsistent tab usage; "-3" detects some Python 2/3 problems).

In the future we hope to add another static analysis tool, pylint. Changes therefore shouldn't add new pylint reports, and fixes for existing pylint reports are welcome (you can see them by running "make pylint"). The code is written in Python 2, but the goal is to avoid any construct that 2to3 can't automatically fix.
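As a rough sketch of the style described above, the following hypothetical function (not taken from oss_package_analysis.py) follows PEP 8 naming and spacing with 2-space indentation, and sticks to constructs that behave the same under Python 2 and Python 3.

```python
# Illustrative only: count_by_key is a hypothetical function, not project code.


def count_by_key(pairs):
  """Count how often each key appears in an iterable of (key, value) pairs."""
  counts = {}
  for key, _ in pairs:
    counts[key] = counts.get(key, 0) + 1
  return counts


if __name__ == '__main__':
  totals = count_by_key([('c', 1), ('python', 2), ('c', 3)])
  # sorted() over the dict avoids relying on list-returning keys().
  for language in sorted(totals):
    # A single parenthesized argument prints the same in Python 2 and 3.
    print('%s %d' % (language, totals[language]))
```

Avoiding patterns such as calling `.sort()` on the result of `keys()` keeps the code away from differences between the two language versions.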

Background

This work was sponsored by the Linux Foundation's Core Infrastructure Initiative.