@asgerf asgerf commented Aug 27, 2025

Background: To ensure MaD (models-as-data) library models have a near-zero cost for codebases that don't use the modelled library, we prune models based on what packages are imported in the current codebase. As a result, we don't parse the access paths or synthesise data flow nodes for irrelevant models.

This means that in order to generate DataFlow::Node, we first have to compute imported paths. There is thus a dependency on the Import class. However, the Import class also depends on local data flow. We therefore have TEarlyStageNode, which is used by Import but does not contain flow summary-generated nodes.

For overlay mode, the TEarlyStageNode has no effect on locality as the entire newtype needs to be made local. The dependency above puts us in an "all or nothing" situation where a lot needs to be made local in order for DataFlow::Node to become local.

In order to simplify the problem, I'm cutting the dependency in this PR, so pruning of flow summaries only depends on an over-approximation, roughly based on what string literals appear in the program.
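The trade-off can be illustrated with a toy model outside of QL. The following is a minimal Python sketch, not the CodeQL implementation: the `MODELS` table, the regexes, and the helper names are all hypothetical stand-ins, and the regex-based import detection is far cruder than the real `Import` class. It only shows that the set of string literals is a superset of the imported paths, so pruning against it may keep a few extra models but never drops one that the precise check would keep.

```python
import re

# Hypothetical model table: package name -> model rows (stand-ins for MaD rows).
MODELS = {
    "express": ["express model row"],
    "lodash": ["lodash model row"],
    "left-pad": ["left-pad model row"],
}

# Rough stand-in for "any string literal in the program".
STRING_LITERAL = re.compile(r"""(["'])((?:\\.|(?!\1).)*)\1""")

# Rough stand-in for precise import detection: require("x"), ... from "x", import("x").
IMPORT = re.compile(
    r"""(?:require\(\s*|from\s+|import\s*\(\s*)(["'])((?:\\.|(?!\1).)*)\1"""
)


def string_literals(source):
    """Over-approximation: every quoted string counts as a possible package."""
    return {m.group(2) for m in STRING_LITERAL.finditer(source)}


def imported_paths(source):
    """Precise variant: only strings that appear in an import position."""
    return {m.group(2) for m in IMPORT.finditer(source)}


def prune(models, used):
    """Keep only the models whose package is considered used."""
    return {pkg: rows for pkg, rows in models.items() if pkg in used}


SOURCE = '''
const express = require("express");
const label = "lodash";  // a plain string, not an import
'''

precise = prune(MODELS, imported_paths(SOURCE))
approx = prune(MODELS, string_literals(SOURCE))
```

In this sketch `precise` keeps only the `express` models, while `approx` also keeps `lodash` because the string happens to appear in the program. `left-pad` is pruned either way, which is the case that matters for keeping unused models cheap.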

Note that there are more dependencies to be cut, but I'm trying to split this into small independent PRs.

Evaluation looks neutral.

@asgerf asgerf requested a review from a team as a code owner August 27, 2025 11:41
@Copilot Copilot AI review requested due to automatic review settings August 27, 2025 11:41
@github-actions github-actions bot added the JS label Aug 27, 2025
@Copilot Copilot AI left a comment

Pull Request Overview

This PR changes the pruning mechanism for MaD library models to avoid dependency on the Import class, simplifying the dependency graph for constructing DataFlow::Node. The change replaces precise import path detection with an over-approximation based on string literals to break circular dependencies between imports and data flow analysis.

  • Removes dependency on Import class in isPackageUsed predicate
  • Uses string literals as an over-approximation for imported packages
  • Adds support for string literal type expressions used in dynamic imports

- package = any(JS::Import imp).getImportedPathString()
+ // To simplify which dependencies are needed to construct DataFlow::Node, we don't want to rely on `Import` here.
+ // Just check all string literals.
+ package = any(JS::Expr imp).getStringValue()
Copilot AI Aug 27, 2025

Using any(JS::Expr imp).getStringValue() will iterate over all expressions in the codebase to find string values, which could be inefficient. Consider using a more specific expression type like JS::StringLiteral to reduce the search space.

Suggested change
- package = any(JS::Expr imp).getStringValue()
+ package = any(JS::StringLiteral imp).getStringValue()

@asgerf asgerf added the no-change-note-required This PR does not need a change note label Aug 27, 2025
@asgerf asgerf merged commit 4437f47 into github:main Aug 28, 2025
15 of 16 checks passed