EIP-1470: Smart Contract Weakness Classification (SWC) #1469

Open
thec00n opened this Issue Oct 4, 2018 · 4 comments

thec00n commented Oct 4, 2018


eip: 1470
title: Smart Contract Weakness Classification (SWC)
author: Gerhard Wagner (@thec00n)
discussions-to: #1469
status: Draft
type: Informational
created: 2018-09-18

Simple Summary

This EIP proposes a classification scheme for security weaknesses in Ethereum smart contracts.

Abstract

The SWC is a smart-contract-specific software weakness classification scheme for developers, tool vendors and security practitioners. It is loosely aligned with the terminology and structure of the Common Weakness Enumeration (CWE) scheme while overlaying a wide range of weakness variants that are specific to smart contracts.

The goals of the SWC scheme are as follows:

  • Provide a straightforward way to classify weaknesses in smart contract systems.
  • Provide a straightforward way to identify the weakness(es) that lead to a vulnerability in a smart contract system.
  • Define a common language for describing weaknesses in smart contract systems' architecture, design and code.
  • Train and increase the performance of smart contract security analysis tools.

Motivation

In the software security industry, it is widely accepted practice to use a common terminology and to classify security-related bugs and errors with a standardized scheme. While this has not stopped vulnerabilities from appearing in software, it has helped communities focusing on web applications, network protocols, IoT devices and various other fields to educate users and developers about the nature of security-related issues in their software. It has also allowed the security community to quickly understand vulnerabilities that occur in production systems, to perform root cause analysis, and to triage findings from various security analysis sources. In recent years, various organizations and companies have also published vulnerability data to identify the most widespread security issues. Two widely used examples are the SANS Top 25 Most Dangerous Software Errors and the OWASP Top 10. Neither publication would have been possible without a common classification scheme.

At present, no such classification scheme exists for weaknesses specific to Ethereum smart contracts. Common language and awareness of security weaknesses are mostly derived from academic papers, best practice guides and published articles. Findings from audit reports and security tool analysis add to the wide range of terminologies used to describe discovered weaknesses. Even for security experts, it is often time-consuming to understand the technical root cause and the risk associated with findings from different sources.

Rationale

While recognizing the current gap, the SWC does not aim to reinvent the wheel with regard to the classification of security weaknesses. Rather, it proposes to build on what has worked well in other parts of the software security community - specifically the Common Weakness Enumeration (CWE), a list of software vulnerability types that stands out in terms of adoption and breadth of coverage. While CWE does not describe any weaknesses specific to smart contracts, it does describe related weaknesses at higher abstraction layers. This EIP proposes to create smart-contract-specific variants while linking back to the larger spectrum of software errors and mistakes listed in the CWE that different platforms and technologies have in common.

Specification

Before discussing the SWC specification it is important to describe the terminology used:

  • Weakness: A software error or mistake that in the right conditions can by itself or coupled with other weaknesses lead to a vulnerability.
  • Vulnerability: A weakness or multiple weaknesses which directly or indirectly lead to an undesirable state in a smart contract system.
  • Variant: A specific weakness that is described at a very low level of detail, specific to Ethereum smart contracts. Each variant is assigned a unique SWC ID.
  • Relationships: CWE has a wide range of Base and Class types that group weaknesses on higher abstraction layers. The SWC uses Relationships to link smart contract weakness variants to existing Base or Class CWE types. Relationships provide context on how SWCs are linked to the wider group of software security weaknesses, and make it possible to generate useful visualisations and insights from issue data sets. In its current revision, this EIP proposes to link an SWC to its closest parent in the CWE.
  • SWC ID: A numeric identifier linked to a variant (e.g. SWC-101).
  • Test Case: A test case constitutes a micro-sample or real-world smart contract that demonstrates concrete instances of one or multiple SWC variants. Test cases serve as the basis for meaningful weakness classification and are useful to security analysis tool developers.

The SWC in its most basic form links a numeric identifier to a weakness variant. For example, the identifier SWC-101 is linked to the Integer Overflow and Underflow variant. While a list of weakness titles and unique IDs is useful by itself, it would be ambiguous without further details. Therefore, the SWC recommends adding a definition and test cases to every weakness variant.
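The identifier-to-variant mapping described above can be sketched in a few lines of Python. This is only an illustration: the `SWC-<digits>` pattern is inferred from the SWC-101 example, and the names `REGISTRY`, `parse_swc_id` and `lookup_title` are hypothetical, not part of the proposal.

```python
import re

# Assumed ID format: "SWC-" followed by digits, inferred from the SWC-101 example.
SWC_ID_PATTERN = re.compile(r"^SWC-(\d+)$")

# Minimal registry containing only the entry named in the text.
REGISTRY = {
    101: "Integer Overflow and Underflow",
}


def parse_swc_id(swc_id: str) -> int:
    """Return the numeric part of an identifier such as 'SWC-101'."""
    match = SWC_ID_PATTERN.match(swc_id)
    if match is None:
        raise ValueError(f"not a valid SWC ID: {swc_id!r}")
    return int(match.group(1))


def lookup_title(swc_id: str) -> str:
    """Resolve an SWC ID to its weakness title; raises KeyError if unknown."""
    return REGISTRY[parse_swc_id(swc_id)]
```

A bare lookup like this shows why the title alone is ambiguous: nothing in the mapping says what the weakness looks like in code, which is what the definition and test cases add.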

SWC definition

An SWC definition is formatted in Markdown so that it is easy to read and easy for tools to process. It consists of the following attributes.

  • Title: A name for the weakness that points to the technical root cause.
  • Relationships: Links a CWE Base or Class type to its SWC variant. The Integer Overflow and Underflow variant, for example, is linked to CWE-682 - Incorrect Calculation.
  • Description: Describes the nature and potential impact of the weakness on the contract system.
  • Remediation: Describes ways to fix the weakness.
  • References: Links to external references that contain relevant additional information on the weakness.
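As a rough sketch of how the five attributes above might be held in a tool and rendered to Markdown, consider the following. The class name `SWCDefinition`, the one-heading-per-attribute layout and the placeholder strings are assumptions made for illustration; the EIP does not prescribe a concrete Markdown layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SWCDefinition:
    """Hypothetical container for the five attributes of an SWC definition."""
    title: str
    relationships: str
    description: str
    remediation: str
    references: List[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render one heading per attribute (an assumed layout, not a mandated one)."""
        sections = [
            ("Title", self.title),
            ("Relationships", self.relationships),
            ("Description", self.description),
            ("Remediation", self.remediation),
            ("References", "\n".join(f"- {ref}" for ref in self.references)),
        ]
        return "\n\n".join(f"# {heading}\n\n{body}" for heading, body in sections)


# Title and relationship come from the text; the other values are placeholders.
overflow = SWCDefinition(
    title="Integer Overflow and Underflow",
    relationships="CWE-682 - Incorrect Calculation",
    description="(placeholder description)",
    remediation="(placeholder remediation)",
    references=["(placeholder reference)"],
)
```

Calling `overflow.to_markdown()` yields a document that a person can read directly and that a tool can split on headings.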

Test cases

Test cases include crafted as well as real-world samples of vulnerable smart contracts. A single test case consists of three components:

  1. Source code of a smart contract sample; e.g. Solidity, Vyper, etc.
  2. Compiled asset from an EVM compiler in machine readable format; e.g. JSON or ethPM.
  3. Test result configuration that describes which and how many instances of a weakness variant can be found in a given sample. The YAML schema for the proposed test result configuration is listed below.

description:
  type: string
  required: true
issues:
- id:
    type: string
    required: true
  count:
    type: number
    required: true
  locations:
  - bytecode_offsets:
    - type: number
    line_numbers:
    - type: number
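
To make the schema above more concrete, here is a minimal Python sketch that checks an already-parsed test result configuration (a plain `dict`, as a YAML parser would produce) against it. The function name, the error message wording and the sample values are hypothetical. The sketch also assumes that `issues` and `locations` are optional, since the schema does not mark them as required.

```python
from typing import Any, Dict, List


def _is_number(value: Any) -> bool:
    """True for ints and floats, but not for booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_test_result(config: Dict[str, Any]) -> List[str]:
    """Return a list of schema problems; an empty list means the config conforms."""
    errors: List[str] = []
    if not isinstance(config.get("description"), str):
        errors.append("description: required string")
    issues = config.get("issues", [])  # assumption: a missing list is allowed
    if not isinstance(issues, list):
        return errors + ["issues: must be a list"]
    for i, issue in enumerate(issues):
        if not isinstance(issue, dict):
            errors.append(f"issues[{i}]: must be a mapping")
            continue
        if not isinstance(issue.get("id"), str):
            errors.append(f"issues[{i}].id: required string")
        if not _is_number(issue.get("count")):
            errors.append(f"issues[{i}].count: required number")
        locations = issue.get("locations", [])
        if not isinstance(locations, list):
            errors.append(f"issues[{i}].locations: must be a list")
            continue
        for j, location in enumerate(locations):
            if not isinstance(location, dict):
                errors.append(f"issues[{i}].locations[{j}]: must be a mapping")
                continue
            for key in ("bytecode_offsets", "line_numbers"):
                values = location.get(key, [])
                if not isinstance(values, list) or not all(_is_number(v) for v in values):
                    errors.append(f"issues[{i}].locations[{j}].{key}: must be a list of numbers")
    return errors


# Hypothetical sample: the description, offset and line number are made up.
sample = {
    "description": "Hypothetical overflow sample",
    "issues": [
        {
            "id": "SWC-101",
            "count": 1,
            "locations": [{"bytecode_offsets": [120], "line_numbers": [7]}],
        }
    ],
}
```

For the sample above, `validate_test_result(sample)` returns an empty list. A config that omits `description` or `count` produces one message per missing field.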

Implementation

The Smart Contract Weakness Classification registry located in this GitHub repository uses the SWC scheme proposed in this EIP. A version rendered with GitHub Pages is also available here.

Copyright

Copyright and related rights waived via CC0.

thec00n changed the title from "EIP-XXXX: Smart Contract Weakness Classification (SWC)" to "EIP-1470: Smart Contract Weakness Classification (SWC)" on Oct 4, 2018

fubuloubu commented Oct 4, 2018

To reword the test case structure:

  1. Compiled asset (JSON format) from an EVM compiler (solc, Vyper, etc.). Uses the solc output format, but eventually could use the ethPM packaging spec.
  2. Source code (for reference only). Note: ethPM has an option to store source code and bytecode-sourcecode mappings.
  3. Expected results (YAML format). Contains a test case description and a listing of bugs that are expected to be found. If no list is provided, the expected result is that no bug is found.

Do you agree with this rewording?


Also, the naming scheme should differentiate on compiler, so different languages can be tried.

thec00n commented Oct 5, 2018

Thanks for your feedback @fubuloubu! I have reworded the paragraph based on your suggestions. I also realised that this should be worded more generically as we need to consider other languages and file formats in the future.

Test cases include crafted as well as real-world samples of vulnerable smart contracts. A single test case consists of three components:

  1. Source code of a smart contract sample; e.g. Solidity, Vyper, etc.
  2. Compiled asset from an EVM compiler in machine readable format; e.g. JSON or ethPM.
  3. Test result configuration that describes which and how many instances of a weakness variant can be found in a given sample. The YAML schema for the proposed test result configuration is listed below.

fubuloubu commented Oct 5, 2018

Let me also note that I think the primary result this EIP should seek to codify is the definition of the "test result configuration" file and any supplemental files. ethPM packages are a great format for compiled assets; source code and execution paths may also make sense.

Perhaps the configuration file should reference the relative location of these supplemental files?

The specification of a common weakness classification may be out of scope for an EIP. Indeed, there may be multiple such classifications covering different topics. We should seek to make them highly interoperable so as to enable consensus and collaboration across different benchmarks and classifications.

thec00n commented Oct 7, 2018

@fubuloubu It quickly became apparent to me that, with a growing set of test cases (or any security issue data set), it is very difficult to differentiate between weakness variants without a classification scheme: you create overlap and get lost in the ambiguity of what different people call something. It is also helpful to discuss weakness classification based on micro-samples and real-world smart contracts where instances of a weakness occur, together with a description, mitigation strategies, references and how the weakness relates to the bigger picture of software security weaknesses (CWE). So I think the test cases and the classification scheme go hand in hand, and doing one of them in isolation would be less meaningful.
