Skip to content

Conversation

@IdrissRio
Copy link
Contributor

@IdrissRio IdrissRio commented Nov 25, 2025

See internal PR.

@github-actions github-actions bot added the C++ label Nov 25, 2025
@IdrissRio IdrissRio added depends on internal PR This PR should only be merged in sync with an internal Semmle PR labels Nov 25, 2025
@IdrissRio IdrissRio changed the title C/C++ overlay: Add basic Overlay.qll file C/C++ overlay: Add basic Overlay.qll file Nov 25, 2025
@IdrissRio IdrissRio marked this pull request as ready for review November 25, 2025 16:12
@IdrissRio IdrissRio requested a review from a team as a code owner November 25, 2025 16:12
Copilot AI review requested due to automatic review settings November 25, 2025 16:12
Copilot finished reviewing on behalf of IdrissRio November 25, 2025 16:19
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces overlay analysis support for C/C++ by adding a basic Overlay.qll file that defines predicates for handling entity discarding in overlay variants.

Key Changes:

  • Adds predicates to distinguish between overlay and base variants of C/C++ analysis
  • Implements logic to discard elements from the base variant when files have changed

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
cpp/ql/lib/semmle/code/cpp/internal/Overlay.qll New file defining overlay-specific predicates including variant detection, location handling, and entity discard logic
cpp/ql/lib/cpp.qll Imports the new Overlay module to integrate overlay functionality into the C/C++ library

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@IdrissRio IdrissRio added the no-change-note-required This PR does not need a change note label Nov 25, 2025
Copy link
Contributor

@jketema jketema left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions below.

Comment on lines 27 to 30
var_decls(e, _, _, _, loc)
or
// Indirect location (entities)
exists(@var_decl vd | var_decls(vd, e, _, _, _) | var_decls(vd, _, _, _, loc))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's so special about var decls that only those occur here? Is that just because the internal test requires these?

Why is the "indirect" case needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing special about var_decls. It is the only one present because of our internal test. It gives us a way to verify the most basic behavior before expanding coverage later.

The indirect case filters out variable entities that doesn't have direct locations. Their locations come from their @var_decl declarations. If we skip this, queries that use variables without checking locations would still pick up base entities coming from changed files.

If we modify our internal test to a simple query, i.e.,:

import cpp

from Variable v
select v

we would not filter out the variable from the base database unless the indirect case is there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this all means. I'm generally confused by the terminology you're using here ("indirect"), because that does not match anything I ever encountered.

What problem are you trying to solve by adding

exists(@var_decl vd | var_decls(vd, e, _, _, _) | var_decls(vd, _, _, _, loc))

here?

My concern with all of this is, that by adding all these intricacies here, we're going to get something that is non-maintainable/scalable.

de.existsInBase() and
overlayChangedFiles(de.getFilePath()) and
// Only discard if ALL file paths are in changed files
forall(string path | path = de.getFilePath() | overlayChangedFiles(path))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that there are two cases here:

  • Elements that have a location in the relation that define them (like variable declarations). In this case I don't think we need the forall, and we can do what you were doing before.
  • Elements that do not come with their own location, but where the location can come from one or more other elements (this is probably just variables, functions, and types?).

Do we want to make that distinction more explicit?

overlay[discard_entity]
private predicate discardable(@element e) {
e = any(Discardable d | d.existsInBase() and overlayChangedFiles(d.getFilePath()))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're going to need to do something more complex for C. For example, if you have foo.h:

#define A 1

and bar.h:

#if A
int a;
#elif B
int b;
#endif

then if you change foo.h to

#define B 1

then bar.h's declarations change, but bar.h has not changed.

It gets even more complex when you might have different header variants, e.g. we might have one include or bar.h where A is defined, but a different include elsewhere where B is defined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something we should do here, or should it happen when we detect which files changed?

…n unchanged file

Previously, an entity would be discarded if it had any location in a changed
file. This caused issues for entities with multiple declaration entries, such
as extern variables declared in one file and defined in another.

For example, given:
  // a.c (changed)       // b.c (unchanged)
  extern int x;          int x;

The variable `x` should be preserved because it has a location in the
unchanged file b.c, even though it also has a location in the changed file a.c.
@IdrissRio IdrissRio force-pushed the idrissrio/cpp/overlay/overlay.qll branch from a06c6d9 to 3d69286 Compare November 27, 2025 08:22
igfoo
igfoo previously approved these changes Nov 27, 2025
Split the discard predicate into two: one for single-location elements and one for multi-location elements.
@IdrissRio IdrissRio force-pushed the idrissrio/cpp/overlay/overlay.qll branch from 1624c09 to eac06dd Compare November 28, 2025 10:31
@IdrissRio IdrissRio merged commit 9fd31bf into main Nov 28, 2025
17 checks passed
@IdrissRio IdrissRio deleted the idrissrio/cpp/overlay/overlay.qll branch November 28, 2025 14:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C++ depends on internal PR This PR should only be merged in sync with an internal Semmle PR no-change-note-required This PR does not need a change note

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants