A beginner’s guide to running and managing custom CodeQL queries

Transform your code into a structured database that you can use to surface security vulnerabilities and discover new insights.

Artwork: Micha Huigen

Denys Lashchevskyi // Staff Software Engineer, Betsson

The ReadME Project amplifies the voices of the open source community: the maintainers, developers, and teams whose contributions move the world forward every day.

Have you ever wished you could query your code the same way you query a SQL database? Well, that’s exactly what GitHub’s CodeQL enables you to do. It’s a semantic code analysis engine that transforms your code into a structured database that you can use to surface security vulnerabilities or discover new insights.

You don’t need to learn a thing about static analysis or structured queries to benefit from CodeQL. GitHub’s code scanning feature runs hundreds of predefined queries right out of the box—for free on public repositories, or as part of GitHub Advanced Security for enterprises. There are also many more niche “query packs” available that go far beyond the default scans. But while the number of ready-made queries is growing all the time, you can also create your own queries to meet your specific needs.

We’ve been writing custom CodeQL queries at Betsson for about two years, including ones to moderate package use, research and quantify code and quality metrics, and facilitate adherence to code structure and preferred architecture design. In this guide, I share some of what we’ve learned to help you get up and running with custom queries as quickly as possible.

In this guide, you will learn:

How to build a quick and minimal local setup.
How to create and run a simple custom query.
How to to add CodeQL scans to your CI with GitHub Actions.
More advanced custom query possibilities.

Set up a simple local environment

In this guide, we will use JavaScript and Visual Studio Code, but you should be able to follow along regardless of your language and code editor of choice. I invite you to fork this small repository where I collected most of the setup used for this article, including a minimal application called “health-app” that we can scan.

You need the CodeQL command-line interface (CLI) tool to create and configure databases, a language pack for your programming language of choice to convert your code into a query-able database, and one or more query packs. You can find the CLI, packs, and the Visual Studio Code plugin on the CodeQL tools page. For help setting everything up, you can refer to the CodeQL CLI quick-start documentation.

You’ll do most of your CodeQL work in the plugin for Visual Studio Code, or a similar plugin for your code editor of choice. The Visual Studio Code plugin enables you to connect to different scan targets, design queries using IntelliSense, and run/view results of your scans from produced SARIF (Static Analysis Results Interchange Format) reports.

Create and run your first custom query

Before you can run a scan, you need the following:

The project’s source code/repository
A CodeQL database built from that repository
A CodeQL configuration file for the project

Remember, your CodeQL setup—which includes scripts, packages, and databases—will live in a separate directory from the project you’re scanning.

Let’s start by running all commands from the project’s root directory (if you’re using my health-app repository, all of this has been done already):

Initiate CodeQL by running the following (“.” stands for the current directory):

codeql pack init -d . codeql

This will create the qlpack.yml file in a new subdirectory called codeql in the project’s root directory.

Configure qlpack.yml by adding the JavaScript language reference:

codeql pack add --dir ./codeql codeql/javascript-all

Create a database from your codebase:

codeql database create codeql/db -s . -l javascript

This creates a new subdirectory within the root directory called db.

Now let’s make our first custom query! Create a new file in your code editor with the following:

import javascript

from PackageDependencies deps, string name

where deps.getADependency(name, _)

select deps, "Dependency found'" + name + "'."

This is a simple query that will return all of a project’s dependencies. Save it as a .ql file inside the newly created codeql subdirectory of the project’s directory. Of course, you can create far more interesting and sophisticated queries, but let’s start here.

From the VS Code plugin, select the db directory you just created. Then right-click anywhere within the .ql file to run the first scan. The query should produce a list of package.json dependencies.

You can perform many different types of scans with CodeQL. For example, you could block vulnerable log4j usage at scale by disallowing affected versions of the package. You could update the example query we created above to explicitly disallow any library (dotenv in our case) by assigning appropriate security severity level (read on about security severity and alert settings for available options).

/**

* @name dependencies

* @description finds and lists referenced dependencies

* @kind problem

* @problem.severity error

* @security-severity 10.0

* @tags setup_check

* @id setup

*/

import javascript

from PackageDependencies deps, string name

where deps.getADependency(name, _) and name.matches("dotenv")

select deps, "Dependency found'" + name + "'."

You can learn more about static analysis and using CodeQL for vulnerability detection from GitHub’s recent tutorial. A more exotic use for CodeQL would be implementing fitness functions to proactively pursue architectural designs in a measurable way.

As you can see, running custom queries locally is quite simple. Now let’s take it up a level with GitHub Actions.

Automating CodeQL scans with GitHub Actions

The easiest way to run a custom query with GitHub Actions is with GitHub’s CodeQL Analysis workflow, which uses GitHub’s CodeQL action. It has three main components: setup, runner, and reporter. The setup and runner components are pretty self-explanatory. The reporter uploads scan results and a snapshot of your database to your repository context store, and makes them available in your Security tab. The best part is that you can download the database using a GitHub API call, should you want to investigate further or explore results in a semi-manual mode.

To run the custom dependency query we created above, be sure to add both the .ql and qlpack.yml files to your repository. Then set up the Actions workflow.

If you haven’t already enabled GitHub Actions for the repository, click Settings under your repository name. If you cannot see the Actions tab, select the “...” dropdown menu, then click Actions. Click the button that says I understand my workflows, go ahead and enable them.

On the Actions tab, click New workflow and search for CodeQL Analysis. There should be one result. Click the Configure button.

You should see an Actions workflow YAML file. Add this line to the file in the github/codeql-action/init section (remember to include the white space):

          queries: +./${{ env.CI_TMP_DIR }}/codeql/deps.ql

Click Commit. This should kick off a CodeQL scan. When the scan is complete you should see something like this in the repository’s Security tab:

Note: If you’re using my health-app repository, please be aware that the included codeql-custom.yml workflow requires GitHub Advanced Security. If you don’t have Advanced Security, you can still test the custom workflow by following the steps above.

While this process will work for testing our workflow, in the long run, it’s better to use a custom CodeQL configuration file, not the Actions workflow, to manage which custom queries you run.

Exploring further possibilities

You can create multiple-language or multiple-configuration setups to quickly gather more information from a single run or perform multiple scans at once. For example, instead of specifying languages up front, you can automatically detect which languages are used in a repository and spawn appropriate scans based on the results. Here is an example of working with the GitHub CLI to fetch information:

gh api repos/${{ env.CI_REPOSITORY }}/languages -q 'keys[]'

And this documentation details how to customize your CodeQL scans.

Make something cool? Share it!

Of course, we just scratched the surface of what can and should be done with CodeQL. There is much more to be discovered in the documentation and the application itself. As you explore this powerful platform, you’ll probably find yourself making things that other people can use. If you create a query that could be useful in practically all codebases, you can submit your query to the open source CodeQL query repository. If it’s a bit more niche—for example, a query that’s only applicable to actions written in JavaScript—you can create your own query pack and share it through GitHub Packages. I look forward to seeing what you come up with.

Denys Lashchevskyi is a staff software engineer for Betsson Group, with experience in DevOps and building developer tooling for automation, integrations, and scripting. He’s been an active GitHub Actions developer for the last two years, creating and integrating actions, script development, and repository access management solutions. He loves to share his knowledge with the community, prefers simple solutions for complex problems, and tries to delete code whenever possible.

About The
ReadME Project

Coding is usually seen as a solitary activity, but it’s actually the world’s largest community effort led by open source maintainers, contributors, and teams. These unsung heroes put in long hours to build software, fix issues, field questions, and manage communities.

The ReadME Project is part of GitHub’s ongoing effort to amplify the voices of the developer community. It’s an evolving space to engage with the community and explore the stories, challenges, technology, and culture that surround the world of open source.

Explore

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

A beginner’s guide to running and managing custom CodeQL queries

Set up a simple local environment

Create and run your first custom query

Automating CodeQL scans with GitHub Actions

Exploring further possibilities

Make something cool? Share it!

More stories

From chaos to clarity: Use code visibility to illuminate unfamiliar code

The case for using Rust in MLOps

Accelerate test-driven development with AI

About The
ReadME Project

Follow us:

Nominate a developer

Support the community

A beginner’s guide to running and managing custom CodeQL queries

Set up a simple local environment

Create and run your first custom query

Automating CodeQL scans with GitHub Actions

Exploring further possibilities

Make something cool? Share it!

More stories

From chaos to clarity: Use code visibility to illuminate unfamiliar code

The case for using Rust in MLOps

Accelerate test-driven development with AI

About The ReadME Project

Follow us:

Nominate a developer

Support the community

Sign Up For Newsletter

About The
ReadME Project