Skip to content

Add CodeQL Quantum models and queries (Java, C++) to experimental #19469

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 139 commits into from
May 12, 2025

Conversation

nicolaswill
Copy link
Contributor

@nicolaswill nicolaswill commented May 8, 2025

This pull request introduces libraries and queries for inventorying and analyzing cryptography through a shared language-independent model library, language-specific implementations, and library-specific models.

There is currently modelling for the Java Cryptography Architecture (JCA) and OpenSSL (C++), with DGML-based inventory graph output for both currently supported languages. Our Java implementation further provides a set of analysis queries and inventory subset ("slice") output queries.

nicolaswill and others added 30 commits January 23, 2025 12:46
Concepts for elliptic curves and misc. updates.
Move language-agnostic model to shared library
…nwrap and doFinal calls. Corrected pathing for init tracing to detect what mode is being set along a path. Added support for tracing the init operation mode argument to source. Since this involved creating an Operation Mode, changes were also made to make cipher block modes (CBC) more explicit (previously just called mode, but now that term is used for various purposes).
@nicolaswill nicolaswill added the no-change-note-required This PR does not need a change note label May 8, 2025
@michaelnebel
Copy link
Contributor

A couple of questions/comments:

  • Should we make a DCA suite for testing the quantum queries (let me know, if you need some assistance with this).
  • I have not reviewed the quality of the QL code itself. Is the intention that we do this as a part of migrating it out of experimental?

@michaelnebel
Copy link
Contributor

@nicolaswill

  • Should we make a DCA suite for testing the quantum queries (let me know, if you need some assistance with this)?
  • I have not reviewed the quality of the QL code itself. Is the intention that we do this as a part of migrating it out of experimental?
  • The internal PR with the qlpack name needs to be updated as well (and the submodule pointer needs to be bumped)

@nicolaswill
Copy link
Contributor Author

Thanks for your review!

Should we make a DCA suite for testing the quantum queries (let me know, if you need some assistance with this)?

Over the next month, we will be revising the existing queries and rapidly adding new ones. Already using DCA would be great if frequent changes to existing queries won't impede DCA testing.

I have not reviewed the quality of the QL code itself. Is the intention that we do this as a part of migrating it out of experimental?

That's the idea. At this stage, the CodeQL architecture in Model.qll is ready for a holistic review (and help there would be very much appreciated), but it's under active development, and and I am aware of changes that should be done before we move it out of experimental.

The internal PR with the qlpack name needs to be updated as well (and the submodule pointer needs to be bumped)

Done. I've bumped the submodule pointer to this branch's HEAD. We should probably squash this PR's messy commit history before merging though.

@nicolaswill nicolaswill requested a review from michaelnebel May 12, 2025 08:56
Copy link
Contributor

@michaelnebel michaelnebel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some sporadic comments; With a change of this magnitude, I don't think I can provide a meaningful semantic review.
Maybe it is worth considering to start adding

  • Tests
  • .qhelp.

/**
* An element that is flow-aware, i.e., it has an input and output node implicitly used for data flow analysis.
*/
abstract class FlowAwareElement extends LocatableElement {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FlowAwareElement currently has to be exposed for modeling generic data-flow between its instances at the language level. It's used in Language.qll in the current implementation:

predicate isSink(DataFlow::Node sink) {
  sink = any(Crypto::FlowAwareElement other).getInputNode()
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry, I should have elaborated:
For abstract classes that are not supposed to be extended outside the module where they are declared, we typically do something like:

abstract private class FlowAwareElementImpl extends LocatableElement {
...
}

final class FlowAwareElement = FlowAwareElementImpl;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I've changed FlowAwareElement to use that pattern as an example -- am also aware other similar instances exist where this pattern should be applied, but I will set that aside for later.

}

// Artifacts that may be outputs or inputs
newtype TKeyArtifactType =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider making this private and use KeyArtifactType outside this module.
Introduce helper predicates isSymmetricKeyType, isAssymetricKeyType and isUnknownKeyType on KeyArtifactType.

SM4() or
OtherSymmetricCipherType()

newtype TAsymmetricCipherType =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these types can be made private.
I think it is best practice to avoid exposing IPA types if possible.

Copy link
Contributor Author

@nicolaswill nicolaswill May 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're currently using IPA types to enumerate various disjoint algorithms and their unions, so those attributes, explicit typing, and the minimal syntax required made them an attractive option. We would otherwise have to make a separate class for each type in order to have an interface that enables writing code like the following:

  AESGCMAlgorithmNode() {
    this.getAlgorithmType() = Crypto::KeyOpAlg::TSymmetricCipher(Crypto::KeyOpAlg::AES()) and
    this.getModeOfOperation().getModeType() = Crypto::GCM()
  }
exists(string upper | upper = name.toUpperCase() |
      upper.matches("AES%") and
      type = KeyOpAlg::TSymmetricCipher(KeyOpAlg::AES())
      or
      upper = "DES" and
      type = KeyOpAlg::TSymmetricCipher(KeyOpAlg::DES())
      or
...

Would this ever be acceptable or do we have to implement an alternative?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't have to - these are just generic recommendations 😄
If you think it is best to keep it as it is - then definitely do so!

@nicolaswill
Copy link
Contributor Author

Annotating things with private is something I'm aware of needing to refactor in particular before promotion. We will do that once we have interfaces for the *Node classes we're comfortable with keeping stable, and those interfaces will likely live in another user-facing library file (perhaps Inventory.qll) that only private imports other libraries intended for library modeling (perhaps Model.qll / Instances.qll).

Thank you, though! A full-blown semantic review will have to be planned a bit or resourced and isn't what I had in mind right now before this goes on experimental.

@michaelnebel
Copy link
Contributor

michaelnebel commented May 12, 2025

DCA run: https://github.com/github/codeql-dca-main/issues/28769 (not sure if it works - but lets see 😄 )
This only covers problem/path problem queries - but it also looks like there are some table kind queries.

@michaelnebel
Copy link
Contributor

DCA finished - approximately 800 results on our Java suite: https://github.com/github/codeql-dca-main/blob/data/michaelnebel/quantum-experim__nightly__quantum-queries__1/reports/alert-comparison.md

Copy link
Contributor

@michaelnebel michaelnebel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic! 🎉

@nicolaswill
Copy link
Contributor Author

Thanks @michaelnebel. Lastly, should I squash the 139-commit history before merging?

@MathiasVP
Copy link
Contributor

There's no requirements to squash merge when merging into github/codeql. So it's fine to simply hit merge 🙂

@nicolaswill nicolaswill merged commit d3282a9 into github:main May 12, 2025
46 checks passed
@nicolaswill nicolaswill deleted the quantum-experimental branch May 12, 2025 17:38
@yalan2ny
Copy link

اجرای DCA: https://github.com/github/codeql-dca-main/issues/28769 (مطمئن نیستم که کار می‌کند - اما ببینیم 😄) این فقط پرس‌وجوهای مشکل/مشکل مسیر را پوشش می‌دهد - اما به نظر می‌رسد که tableپرس‌وجوهای دیگری هم وجود دارند.

##Hi, I tried it, it didn't work for me and showed a 404 error.thank you

@michaelnebel
Copy link
Contributor

اجرای DCA: github/codeql-dca-main#28769 (مطمئن نیستم که کار می‌کند - اما ببینیم 😄) این فقط پرس‌وجوهای مشکل/مشکل مسیر را پوشش می‌دهد - اما به نظر می‌رسد که tableپرس‌وجوهای دیگری هم وجود دارند.

##Hi, I tried it, it didn't work for me and showed a 404 error.thank you

Yes, sorry, that is an internal performance measurement tool.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C++ Java no-change-note-required This PR does not need a change note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants