
Re-resolve lexical accesses when new closures reach a shared node #532

Merged
khatchad merged 4 commits into master from fix-690-lexical-refresh on Jul 3, 2026

Conversation

@khatchad khatchad commented Jul 3, 2026

Member

Problem

At whole-project scale on the NLPGNN subject, tests/TG/EN/interactive.py's GenGPT2.predict/call lose their call-graph nodes while the same-named generation.py sibling keeps its own and absorbs both call sites' data (wala#690, split from wala#678). Both constructors bind and run correctly; the loss is in method dispatch downstream of the shared sample_sequence.step closure.
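A minimal sketch of the shape of the problem, assuming a much-reduced stand-in for the fixture. Only the names `GenGPT2`, `predict`, `sample_sequence`, and `step` come from the description above; the class bodies, the `make_sibling` helper, and the return values are hypothetical. At run time each sibling's `predict` is reached through the shared closure, so a sound call graph should contain a node for both.

```python
def sample_sequence(model, context):
    """Stand-in for the shared helper; `step` reads `model` lexically."""

    def step(tokens):
        # Lexical read: `model` comes from the enclosing sample_sequence frame.
        return model.predict(tokens)

    return step(context)


def make_sibling(script_name):
    """Hypothetical helper: build a same-named class, as if from a separate script."""

    class GenGPT2:
        def predict(self, tokens):
            return f"{script_name}:{len(tokens)}"

    return GenGPT2


# Two scripts, each with its own GenGPT2 and its own call site.
generation_model = make_sibling("generation.py")()
interactive_model = make_sibling("interactive.py")()

generation_out = sample_sequence(generation_model, [1, 2, 3])
interactive_out = sample_sequence(interactive_model, [4, 5])
```

Both calls go through the same `step` body, which is why a context-truncating analysis can end up with a single node serving two distinct closures.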

Root Cause

WALA's AstConstraintVisitor.visitLexical resolves a lexical access's defining frames from a one-time snapshot of the points-to set of the node's function object (v1); the side effect that would re-evaluate it is commented out upstream. Under 1-CFA, call-string truncation makes a single step node serve both siblings' closures. The closure whose dispatch created the node gets its model frame wired. The sibling's ScopeMappingInstanceKey arrives after the snapshot and never gets wired, so its model.predict dispatch never materializes and its method nodes silently vanish. The outcome is first-wins by fixpoint order: a ddmin (delta-debugging minimization) over the fixture found no single trigger file (each half of the extra scripts passes alone; the union fails). At small scale both closures are present at snapshot time, so the symptom there is the wala#685 cross-union instead.
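The starvation can be illustrated with a toy model. This is a sketch under stated assumptions, not WALA's API: `PointsToSet`, `context_1cfa`, `resolve_lexical_once`, and the call-site and frame strings are all hypothetical. It shows two call strings collapsing to one context, and a one-time resolution missing a closure that reaches the set later.

```python
class PointsToSet:
    """A growable set of abstract objects (here: closure instance keys)."""

    def __init__(self):
        self.values = []

    def add(self, value):
        if value not in self.values:
            self.values.append(value)


def context_1cfa(call_string):
    """Keep only the most recent call site (call-string length 1)."""
    return call_string[-1:]


def resolve_lexical_once(function_objects, wired_frames):
    """Wire a defining frame for each closure present right now (a snapshot)."""
    for closure in list(function_objects.values):
        wired_frames.add(closure["defining_frame"])


# Both siblings reach `step` through the same call site inside sample_sequence,
# so the truncated call strings are equal and one node serves both.
generation_ctx = context_1cfa(("generation.py:main", "sample_sequence:step"))
interactive_ctx = context_1cfa(("interactive.py:main", "sample_sequence:step"))

v1 = PointsToSet()  # function-object points-to set of the shared `step` node
wired = set()

v1.add({"name": "step", "defining_frame": "sample_sequence@generation.py"})
resolve_lexical_once(v1, wired)  # the snapshot is taken here

# The sibling's closure arrives later in the fixpoint; nothing re-runs.
v1.add({"name": "step", "defining_frame": "sample_sequence@interactive.py"})
```

After the second `add`, `v1` holds both closures but `wired` still holds only the first frame, which mirrors the first-wins behaviour described above.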

Fix

PythonConstraintVisitor overrides visitAstLexicalRead/visitAstLexicalWrite to register the missing side effect on v1's points-to set: every growth re-runs the superclass's lexical resolution, whose constraint additions are idempotent. This restores, scoped to the Python builder, the mechanism upstream disabled (reported upstream at wala/WALA#1990).
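The idea behind the fix can be sketched in the same toy style. The names here (`PointsToSet`, `add_side_effect`, `resolve_lexical`, `stats`) are hypothetical and do not mirror the WALA classes. The sketch assumes only what the description states: resolution re-runs whenever the set grows, and adding the same constraint twice is harmless.

```python
class PointsToSet:
    """A growable set that notifies registered side effects when it grows."""

    def __init__(self):
        self.values = []
        self.side_effects = []

    def add_side_effect(self, callback):
        self.side_effects.append(callback)
        callback()  # evaluate once against whatever is already present

    def add(self, value):
        if value in self.values:
            return  # no growth, so no re-evaluation
        self.values.append(value)
        for callback in self.side_effects:
            callback()


constraints = set()  # set semantics make re-adding a constraint a no-op
stats = {"runs": 0}
v1 = PointsToSet()


def resolve_lexical():
    """Re-run resolution over every closure currently in v1."""
    stats["runs"] += 1
    for frame in v1.values:
        constraints.add(("model", frame))


v1.add_side_effect(resolve_lexical)
v1.add("sample_sequence@generation.py")
v1.add("sample_sequence@interactive.py")  # the late arrival is wired too
v1.add("sample_sequence@interactive.py")  # duplicate: the set did not grow
```

Re-running the whole resolution on each growth is wasteful in this toy, but it stays correct because the constraint store ignores duplicates, which is the property the fix relies on.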

Guards

The NLPGNN subject is vendored verbatim (nlpgnn_full_proj, all 94 .py files, matching the consumer run's 94 seeded toplevels): testNlpgnnFullGeneration/testNlpgnnFullInteractive pin both siblings' predict nodes with symmetric parameter typing ([Dynamic, 1] of unknown dtype on vn 3). Before the fix, the interactive guard fails with "Function must exist in call graph". Per-sibling precision at the shared closure (the cross-sibling union) remains the subject of wala#685.
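What the two guards check can be sketched as follows. This is an illustration only: the real guards are JUnit tests, and the `require_node` helper, the node-name strings, and the dictionary layout are hypothetical. The failure message and the `[Dynamic, 1] of unknown dtype` typing on value number 3 are taken from the description above.

```python
def require_node(call_graph, function_name):
    """Return the node's recorded parameter types, or fail like the guard does."""
    if function_name not in call_graph:
        raise AssertionError(f"Function must exist in call graph: {function_name}")
    return call_graph[function_name]


GENERATION = "tests/TG/EN/generation.py/GenGPT2/predict"
INTERACTIVE = "tests/TG/EN/interactive.py/GenGPT2/predict"

# Hypothetical post-fix result: node name -> {value number: tensor type}.
call_graph = {
    GENERATION: {3: "[Dynamic, 1] of unknown dtype"},
    INTERACTIVE: {3: "[Dynamic, 1] of unknown dtype"},
}

generation_types = require_node(call_graph, GENERATION)
interactive_types = require_node(call_graph, INTERACTIVE)

# Symmetric typing: both siblings see the same type on value number 3.
symmetric = generation_types == interactive_types

# Hypothetical pre-fix result: the interactive node is missing entirely.
pre_fix_graph = {GENERATION: {3: "[Dynamic, 1] of unknown dtype"}}
try:
    require_node(pre_fix_graph, INTERACTIVE)
    pre_fix_failed = False
except AssertionError:
    pre_fix_failed = True
```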

Closes wala#690.

🤖 Generated with Claude Code

https://claude.ai/code/session_01Lk6J7cxYf5vsjm139L25pc

…la#690).

WALA's `AstConstraintVisitor.visitLexical` resolves a lexical access's
defining frames from a one-time snapshot of the node's function-object
points-to set; the re-evaluation side effect is commented out upstream.
When distinct closures of the same function share a call-graph node
under call-string context truncation, the closure arriving after the
snapshot never gets its frame wired, starving every dispatch downstream
of the access — at whole-project scale, one same-named sibling's method
nodes silently vanish. `PythonConstraintVisitor` now re-registers the
resolution as a side effect on the function value's points-to set, so
closure growth re-runs it; the constraint additions are idempotent.

Guards: the NLPGNN subject vendored verbatim (94 files) with both
`tests/TG/EN` siblings pinned to symmetric `predict` typing.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lk6J7cxYf5vsjm139L25pc

The pre-commit hook pins an older Black than CI's formatting check; the
newer style adds a blank line after module docstrings.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Lk6J7cxYf5vsjm139L25pc

Copilot AI left a comment


Pull request overview

This PR addresses a whole-project-scale call-graph soundness issue in WALA’s Python builder where lexical accesses in shared closure nodes can miss newly arriving closure instances, causing downstream dispatch (e.g., model.predict) to vanish from the call graph. The fix adds a Python-specific side-effect to re-resolve lexical reads/writes whenever the node’s function-object (VN 1) points-to set grows, and it adds a vendored NLPGNN “full project” fixture to guard the regression.

Changes:

  • Add a side-effect in PythonSSAPropagationCallGraphBuilder.PythonConstraintVisitor to re-run lexical resolution on closure growth for AstLexicalRead/AstLexicalWrite.
  • Vendor NLPGNN full-project Python sources and scripts into the test data tree to reproduce the cross-closure shared-node scenario.
  • Add/extend NLPGNN-related scripts in the fixture to ensure both generation.py and interactive.py patterns are present for regression coverage.

Reviewed changes

Copilot reviewed 96 out of 96 changed files in this pull request and generated 1 comment.

File Description
com.ibm.wala.cast.python/source/com/ibm/wala/cast/python/ipa/callgraph/PythonSSAPropagationCallGraphBuilder.java Adds a Python-builder-only side effect to refresh lexical resolution when new closure function objects reach VN 1.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/TG/EN/interactive.py Adds NLPGNN “interactive” generation script fixture to reproduce missing sibling predict nodes.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/TG/EN/generation.py Adds NLPGNN “generation” script fixture used as the sibling/control case.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_ZH/ner_data_preprocess.py Adds vendored NER preprocessing script fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_ZH/bert_ner_train.py Adds vendored BERT NER training script fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_ZH/bert_ner_test.py Adds vendored BERT NER test script fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_ZH/bert_ner_crf_train.py Adds vendored BERT+CRF NER training script fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_ZH/bert_ner_crf_test.py Adds vendored BERT+CRF NER test script fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_EN/data_processing.py Adds vendored English NER data processing fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_EN/bert_ner_train.py Adds vendored English BERT NER training fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_EN/bert_ner_test.py Adds vendored English BERT NER test fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_EN/albert_ner_train.py Adds vendored English ALBERT NER training fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_EN/albert_ner_test.py Adds vendored English ALBERT NER test fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/KG2E/run_tucker.py Adds vendored KG embedding training script fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/GNN/nodes_graph_classfication/train_graphsage.py Adds vendored GraphSAGE training script fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/GNN/nodes_graph_classfication/train_gin.py Adds vendored GIN training script fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/GNN/nodes_graph_classfication/train_gcn.py Adds vendored GCN training script fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/GNN/nodes_graph_classfication/train_gan.py Adds vendored GAT training script fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/GNN/gnn_for_nlp/text_sage.py Adds vendored text-graph GraphSAGE script fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/GNN/BERT-TextGCN/transformer.py Adds vendored BERT-TextGCN transformer layer fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/GNN/BERT-TextGCN/bert.py Adds vendored BERT-TextGCN model fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/GNN/BERT-TextGCN/attention.py Adds vendored BERT-TextGCN attention fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/GNN/auto_encoder/GAAE.py Adds vendored graph autoencoder script fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/CLS/TextCNN/text_cnn_train.py Adds vendored TextCNN training script fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/CLS/TextCNN/text_cnn_test.py Adds vendored TextCNN test script fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/CLS/BilstmAttention/bilstm_attention_train.py Adds vendored BiLSTM+attention training fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/CLS/BilstmAttention/bilstm_attention_test.py Adds vendored BiLSTM+attention test fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/CLS/BERT/bert_classification_train.py Adds vendored BERT classification training fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/CLS/BERT/bert_classification_test.py Adds vendored BERT classification test fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/CLS/ALBERT/albert_cls_train.py Adds vendored ALBERT classification training fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/CLS/ALBERT/albert_cls_test.py Adds vendored ALBERT classification test fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/setup.py Adds vendored setup script for the NLPGNN fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/tokenizers/gpt2_tokenization.py Adds vendored GPT-2 tokenization implementation used by the fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/tokenizers/__init__.py Adds vendored tokenizers package init (auto-import).
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/savers.py Adds vendored checkpoint saver helper.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/sample/samples.py Adds vendored sampling utilities; contains the shared step closure scenario.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/sample/__init__.py Adds vendored sample package init.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/optimizers/__init__.py Adds vendored optimizers package init (auto-import).
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/tucker.py Adds vendored TuckER model fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/TextGCN2019.py Adds vendored TextGCN model fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/TextCNN.py Adds vendored TextCNN model fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/RGCN.py Adds vendored RGCN model fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/PCNN.py Adds vendored PCNN placeholder fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/GraphSage.py Adds vendored GraphSAGE model fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/gpt2.py Adds vendored GPT-2 model fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/GIN.py Adds vendored GIN model fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/GCN.py Adds vendored GCN model fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/GAT.py Adds vendored GAT model fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/GAAE.py Adds vendored graph autoencoder layer fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/bert.py Adds vendored BERT model fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/albert.py Adds vendored ALBERT model fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/__init__.py Adds vendored models package init (explicit exports).
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/metrics/type.py Adds vendored typing helpers used by metrics.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/metrics/Losess.py Adds vendored loss helpers.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/metrics/__init__.py Adds vendored metrics package init (auto-import).
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/layers/transformer.py Adds vendored transformer layer fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/layers/normalization.py Adds vendored normalization utilities.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/layers/gpt2_transformer.py Adds vendored GPT-2 transformer block fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/layers/decoder.py Adds vendored decoding helper fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/layers/bilstm.py Adds vendored BiLSTM layer fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/layers/albert_transformer.py Adds vendored ALBERT transformer block fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/layers/__init__.py Adds vendored layers package init (auto-import).
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/gnn/TGCNConv.py Adds vendored TextGCN convolution fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/gnn/RGCNConv.py Adds vendored RGCN convolution fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/gnn/GSConv.py Adds vendored GraphSAGE convolution fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/gnn/glob.py Adds vendored graph utility placeholder.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/gnn/GINConv.py Adds vendored GIN convolution fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/gnn/GCNConv.py Adds vendored GCN convolution fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/gnn/GATConv.py Adds vendored GAT convolution fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/gnn/GAAEConv.py Adds vendored graph attention autoencoder convolution fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/gnn/__init__.py Adds vendored gnn package init (auto-import).
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/datas/word2vec.py Adds vendored word2vec/glove embedding loader fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/datas/__init__.py Adds vendored datas package init (explicit exports).
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/callbacks.py Adds vendored early-stopping callbacks fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/bpemd/bpe.py Adds vendored BPE training utilities fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/abandoned/scatter.py Adds vendored “abandoned” scatter utility fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/abandoned/GCNConvv0.py Adds vendored “abandoned” GCN implementation fixture.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/abandoned/__init__.py Adds vendored abandoned package init.
com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/__init__.py Adds vendored package root init.


codecov Bot commented Jul 3, 2026


Codecov Report

❌ Patch coverage is 89.28571% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.71%. Comparing base (3c77559) to head (01472ef).

Files with missing lines | Patch % | Lines
...allgraph/PythonSSAPropagationCallGraphBuilder.java | 85.71% | 3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master     #532      +/-   ##
============================================
+ Coverage     73.40%   73.71%   +0.30%     
- Complexity     3321     3372      +51     
============================================
  Files           298      298              
  Lines         22720    22748      +28     
  Branches       3830     3832       +2     
============================================
+ Hits          16678    16768      +90     
+ Misses         4466     4434      -32     
+ Partials       1576     1546      -30     
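The figures in the report can be cross-checked with a few lines of arithmetic. This is a sketch, not part of the Codecov output: the `coverage_percent` helper is hypothetical, and the patch size of 28 lines is inferred from the reported 89.28571% with 3 missing lines rather than stated by the report.

```python
def coverage_percent(hits, lines):
    """Line coverage as a percentage of tracked lines."""
    return 100.0 * hits / lines


# Totals from the coverage diff: hits over tracked lines.
base = coverage_percent(16678, 22720)
head = coverage_percent(16768, 22748)
delta = head - base

# Hits, misses, and partials should add up to the tracked line count.
base_total = 16678 + 4466 + 1576
head_total = 16768 + 4434 + 1546

# Patch coverage: 3 of an inferred 28 patch lines are missing coverage.
patch = coverage_percent(28 - 3, 28)

print(f"base={base:.4f}% head={head:.4f}% delta={delta:.4f} patch={patch:.5f}%")
```

The difference between the two totals comes out slightly above 0.30 percentage points, so it can be reported as +0.30% or +0.31% depending on how the value is rounded.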


Copilot AI left a comment


Pull request overview

Copilot reviewed 96 out of 96 changed files in this pull request and generated 2 comments.

khatchad commented Jul 3, 2026

Member Author

> ❌ Patch coverage is 88.88889% with 3 lines in your changes missing coverage. Please review.

The uncovered residue is the write-side twin and diagnostics: the guards exercise the read path (visitAstLexicalRead's refresh restores the starved model read on the 94-file fixture), while visitAstLexicalWrite's registration—the same operator wired symmetrically for soundness on closure-captured stores—plus the operator's toString and the negative equals branch have no dedicated fixture. Project coverage rises overall (+0.31%).

@khatchad khatchad enabled auto-merge July 3, 2026 19:56

Development

Successfully merging this pull request may close these issues.

A same-named class from a sibling script captures both constructor call sites in whole-project analysis

2 participants