EDK II Code Scanning

Michael Kubacki edited this page Nov 16, 2022 · 1 revision

CodeQL is a code analysis engine developed by GitHub to automate security checks.

It is used for Code Scanning in the TianoCore edk2 repository.

Table of Contents

  1. Overview
  2. CodeQL Usage in edk2
  3. CodeQL CLI Local Commands
  4. The CodeQL Project

Overview

CodeQL is open source and free for open source projects. It is maintained by GitHub and naturally has excellent integration with GitHub projects. CodeQL uses a semantic code analysis engine to discover vulnerabilities in a number of programming languages (both compiled and interpreted).

General CodeQL Information

TianoCore uses CodeQL C/C++ queries to find common programming errors and security vulnerabilities in firmware code. Many open-source queries are officially supported and comprise the vulnerability analysis performed against the database.

CodeQL Query Repository

In addition, anyone can leverage the code analysis engine by writing a custom query. Information on writing a custom query is available in the official documentation.

CodeQL Query Documentation

The edk2 repository uses GitHub's Code Scanning feature (free for public repositories on GitHub.com) to show alerts directly in the repository and run CodeQL on pull requests and pushes to the repository.

About GitHub Code Scanning

Current CodeQL scanning results in the edk2 project are available in the "Actions" page of the GitHub repository.

edk2 CodeQL Workflow

A CodeQL command-line interface (CLI) is also available which can be run locally. A CodeQL CLI reference and manual are available in the documentation to learn how to use the CLI.

CodeQL CLI Documentation

At a high level, there are two main phases of CodeQL execution to be aware of:

  1. CodeQL database generation
  2. CodeQL database analysis

The CodeQL CLI hooks into the normal firmware build process to generate a CodeQL database. Once the database is generated, any number of CodeQL queries can be run against the database for analysis.

CodeQL analysis results can be stored in the SARIF (Static Analysis Results Interchange Format) file format.

CodeQL SARIF documentation

SARIF files are JSON following the SARIF specification/schema. The files can be opened with SARIF viewers to more conveniently view the results in the file.

For example, the SARIF Viewer extension for VS Code can open a .sarif file generated by the CodeQL CLI and allow you to click links directly to the problematic line in source files.
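Because SARIF files are plain JSON, results can also be inspected with a short script. The following is a minimal Python sketch, not an official tool. It assumes the SARIF 2.1.0 layout (`runs`, `results`, `ruleId`, `locations`); the embedded `EXAMPLE_SARIF` data, the file path, and the message text are hypothetical and exist only for illustration.

```python
import json


def load_sarif(path):
    """Read a SARIF file from disk; it is ordinary JSON."""
    with open(path, encoding="utf-8") as sarif_file:
        return json.load(sarif_file)


def summarize_sarif(sarif):
    """Return (rule id, file URI, start line, message) for every result location."""
    rows = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            rule_id = result.get("ruleId", "<unknown rule>")
            message = result.get("message", {}).get("text", "")
            for location in result.get("locations", []):
                physical = location.get("physicalLocation", {})
                uri = physical.get("artifactLocation", {}).get("uri", "<unknown file>")
                line = physical.get("region", {}).get("startLine", 0)
                rows.append((rule_id, uri, line, message))
    return rows


# Hypothetical, hand-written SARIF fragment (not real edk2 output).
EXAMPLE_SARIF = {
    "version": "2.1.0",
    "runs": [
        {
            "tool": {"driver": {"name": "CodeQL"}},
            "results": [
                {
                    "ruleId": "cpp/conditionally-uninitialized-variable",
                    "message": {"text": "Example message."},
                    "locations": [
                        {
                            "physicalLocation": {
                                "artifactLocation": {"uri": "ExamplePkg/Example.c"},
                                "region": {"startLine": 42},
                            }
                        }
                    ],
                }
            ],
        }
    ],
}

if __name__ == "__main__":
    for rule_id, uri, line, message in summarize_sarif(EXAMPLE_SARIF):
        print(f"{uri}:{line}: [{rule_id}] {message}")
```

To summarize a real results file, pass the output of `load_sarif("query-results.sarif")` to `summarize_sarif` instead of the embedded example.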

In summary, the edk2 repository runs CodeQL on pull requests and CI builds. Any alerts will be flagged in the pull request status checks area. The queries used by the edk2 repository are stored in the edk2 CodeQL query set file.

edk2 CodeQL Query Set

CodeQL Usage in edk2

CodeQL provides the capability to debug the actual queries and for our (TianoCore) community to write our own queries and even contribute back to the upstream repo when appropriate. In other cases, we might choose to keep our own queries in a separate TianoCore repo or within a directory in the edk2 code tree.

This is all part of CodeQL Scanning. Information on running additional custom queries in Code Scanning is available in the GitHub Code Scanning documentation.

In addition, CodeQL offers the flexibility to:

  • Build databases locally
  • Retrieve databases from server builds
  • Relatively quickly test queries locally against a database for a fast feedback loop
  • Suppress false positives
  • Customize the files and queries used in the edk2 project and quickly keep this list in sync between the server and local execution

Query Target List

While CodeQL can scan various languages including Python and C/C++, the TianoCore project is only focused on C/C++ checks at this time. TianoCore has an initial set of queries to evaluate shown below (checked boxes are done).

Additional queries completed:

Query Filtering in edk2

CodeQL query files (.ql files) contain metadata about the query. For example, cpp/conditionally-uninitialized-variable states the following about the query:

/**
 * @name Conditionally uninitialized variable
 * @description An initialization function is used to initialize a local variable, but the
 *              returned status code is not checked. The variable may be left in an uninitialized
 *              state, and reading the variable may result in undefined behavior.
 * @kind problem
 * @problem.severity warning
 * @security-severity 7.8
 * @id cpp/conditionally-uninitialized-variable
 * @tags security
 *       external/cwe/cwe-457
 */
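To make the structure of this metadata concrete, here is a rough Python sketch that reads the `@key value` properties out of the header comment shown above. The function name `parse_query_metadata` is hypothetical and the parsing is deliberately simple; it is not how the CodeQL tooling itself reads metadata.

```python
import re

# Header comment copied from the cpp/conditionally-uninitialized-variable query.
QUERY_HEADER = '''/**
 * @name Conditionally uninitialized variable
 * @description An initialization function is used to initialize a local variable, but the
 *              returned status code is not checked. The variable may be left in an uninitialized
 *              state, and reading the variable may result in undefined behavior.
 * @kind problem
 * @problem.severity warning
 * @security-severity 7.8
 * @id cpp/conditionally-uninitialized-variable
 * @tags security
 *       external/cwe/cwe-457
 */'''


def parse_query_metadata(header):
    """Collect '@key value' properties from a query header comment."""
    metadata = {}
    current_key = None
    for raw_line in header.splitlines():
        line = raw_line.strip()
        if line in ("/**", "*/"):
            continue  # comment delimiters carry no metadata
        line = line.lstrip("*").strip()  # drop the leading '*' decoration
        match = re.match(r"@([\w.-]+)\s+(.*)", line)
        if match:
            current_key, value = match.groups()
            metadata[current_key] = value.strip()
        elif current_key and line:
            # Continuation line: append it to the previous property.
            metadata[current_key] += " " + line
    return metadata


if __name__ == "__main__":
    for key, value in parse_query_metadata(QUERY_HEADER).items():
        print(f"{key}: {value}")
```

Multi-line properties such as `@description` and `@tags` are joined with single spaces, so the tags can be recovered with `metadata["tags"].split()`.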

edk2 can automatically include queries that meet certain criteria using "query filters". For example, a filter could include any problem query above a certain security-severity level, or all queries with security in their tags.
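The two example criteria can be illustrated with a small Python sketch. It only models the selection logic over query metadata; it is not CodeQL's query suite syntax and not the mechanism edk2 uses. The `QueryInfo` class, the helper names, the 7.0 threshold, and the two `example/...` queries are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueryInfo:
    query_id: str
    kind: str
    security_severity: Optional[float] = None
    tags: tuple = ()


def is_severe_problem(query, threshold):
    """True for 'problem' queries whose security-severity exceeds the threshold."""
    return (
        query.kind == "problem"
        and query.security_severity is not None
        and query.security_severity > threshold
    )


def has_tag(query, tag):
    """True when the query lists the given tag."""
    return tag in query.tags


QUERIES = [
    # Metadata taken from the query header shown earlier.
    QueryInfo("cpp/conditionally-uninitialized-variable", "problem", 7.8,
              ("security", "external/cwe/cwe-457")),
    # Hypothetical queries, included only to show what gets filtered out.
    QueryInfo("example/low-severity-check", "problem", 2.0, ("security",)),
    QueryInfo("example/style-check", "problem", None, ("maintainability",)),
]

if __name__ == "__main__":
    severe = [q.query_id for q in QUERIES if is_severe_problem(q, 7.0)]
    tagged = [q.query_id for q in QUERIES if has_tag(q, "security")]
    print("Above security-severity 7.0:", severe)
    print("Tagged 'security':", tagged)
```

A filter of this kind picks up new matching queries automatically, which is convenient but means the active query set can change without an explicit decision.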

Because edk2 favors consistency in CI results, the project maintains a relatively fixed query set that is updated with individual queries over time.

Note: Additional queries can be found here as well - https://lgtm.com/search?q=cpp&t=rules

Process for Suggesting New Queries for edk2

New query adoption in edk2 can be proposed by sending an RFC to the TianoCore development mailing list (devel@edk2.groups.io) with the query link and justification for adopting the query in edk2.

Everyone is welcome to suggest new queries.

Query Enabling Process

Enabling a new query may trigger zero to thousands of alerts. Therefore, two paths are used to enable a new query in the project.

  1. A single patch series - The first set of patches fixes the issues needed for the query to pass. The later set of patches enables the query.
  2. A query enabling branch - A branch is created where multiple contributors can work together on fixing issues related to enabling a new query. Once the branch is ready, the history is cleaned up into a patch series that is submitted to the edk2 project.

(1) is recommended if the query is relatively simple to enable and one or two people are doing the work. (2) is recommended if a lot of effort is needed to fix issues for the query, especially issues that span multiple packages.

If a query is deemed fruitless during enabling testing, it can simply be rejected. The goal for CodeQL in edk2 is to enable an effective set of queries that improve the codebase. As the list of enabled queries grows, total CodeQL coverage will increase against active pull requests. We want to have relevant and effective coverage.

CodeQL in Pull Requests

TianoCore is enabling CodeQL in a step-by-step fashion. The goal with this approach is to make steady progress enabling CodeQL to become more comprehensive and useful while not impacting day-to-day code contributions.

Throughout the process described in this section, CodeQL Code Scanning is a mandatory status check for edk2 pull requests.

Dismissing CodeQL Alerts

The following documentation describes how to dismiss alerts: Dismissing Alerts

Note: If a query produces a false positive, a GitHub issue can be submitted on the CodeQL repo issues page with the false-positive tag to help improve the query.

CodeQL CLI Local Commands

The CodeQL CLI can be used as follows to wrap around the edk2 build process (MdeModulePkg in this case) to generate a database in the directory cpp-database. The example shown uses stuart build commands.

codeql database create cpp-database --language=cpp --command="stuart_ci_build -c .pytool/CISettings.py -p MdeModulePkg -a IA32,X64 TOOL_CHAIN_TAG=VS2019 Target=DEBUG --clean" --overwrite

The following command can be used to generate a SARIF file (called query-results.sarif) from that database with the results of the cpp/conditionally-uninitialized-variable query:

codeql database analyze cpp-database codeql\cpp\ql\src\Security\CWE\CWE-457\ConditionallyUninitializedVariable.ql --format=sarifv2.1.0 --output=query-results.sarif
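The two commands can also be chained from a script. The Python sketch below assembles the same argument lists and, by default, only prints them. The helper names and the `--run` switch belong to this sketch and are hypothetical; the `codeql` arguments themselves are copied from the commands above. It assumes `codeql` and `stuart_ci_build` are on the PATH.

```python
import subprocess
import sys

# Values copied from the commands above.
DATABASE = "cpp-database"
BUILD_COMMAND = (
    "stuart_ci_build -c .pytool/CISettings.py -p MdeModulePkg "
    "-a IA32,X64 TOOL_CHAIN_TAG=VS2019 Target=DEBUG --clean"
)
QUERY = r"codeql\cpp\ql\src\Security\CWE\CWE-457\ConditionallyUninitializedVariable.ql"
OUTPUT = "query-results.sarif"


def create_command(database, build_command):
    """Phase 1: build the firmware under CodeQL to generate the database."""
    return [
        "codeql", "database", "create", database,
        "--language=cpp",
        f"--command={build_command}",
        "--overwrite",
    ]


def analyze_command(database, query, output):
    """Phase 2: run a query against the database and write SARIF output."""
    return [
        "codeql", "database", "analyze", database, query,
        "--format=sarifv2.1.0",
        f"--output={output}",
    ]


def main(argv):
    commands = [
        create_command(DATABASE, BUILD_COMMAND),
        analyze_command(DATABASE, QUERY, OUTPUT),
    ]
    for command in commands:
        print(command)
        # '--run' is a switch of this sketch only; without it nothing is executed.
        if "--run" in argv:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    main(sys.argv[1:])
```

Passing each command as a list keeps the nested build command as a single `--command=` argument, which avoids shell quoting problems.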

SARIF logs can be read by log viewers such as the Sarif Viewer extension for VS Code.

The CodeQL Project

CodeQL is an actively maintained project. Here is a comparison of edk2 commit activity versus CodeQL for reference:

Because CodeQL maintains a strong open-source presence, the TianoCore community should be able to file issues and submit pull requests to the project.


The original RFC for adoption of CodeQL in edk2 is available here for reference: Adoption of CodeQL in edk2
