Re-resolve lexical accesses when new closures reach a shared node #532
…la#690). WALA's `AstConstraintVisitor.visitLexical` resolves a lexical access's defining frames from a one-time snapshot of the node's function-object points-to set; the re-evaluation side effect is commented out upstream. When distinct closures of the same function share a call-graph node under call-string context truncation, the closure arriving after the snapshot never gets its frame wired, starving every dispatch downstream of the access — at whole-project scale, one same-named sibling's method nodes silently vanish. `PythonConstraintVisitor` now re-registers the resolution as a side effect on the function value's points-to set, so closure growth re-runs it; the constraint additions are idempotent. Guards: the NLPGNN subject vendored verbatim (94 files) with both `tests/TG/EN` siblings pinned to symmetric `predict` typing. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lk6J7cxYf5vsjm139L25pc
The pre-commit hook pins an older Black than CI's formatting check; the newer style adds a blank line after module docstrings. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lk6J7cxYf5vsjm139L25pc
Pull request overview
This PR addresses a whole-project-scale call-graph soundness issue in WALA’s Python builder where lexical accesses in shared closure nodes can miss newly arriving closure instances, causing downstream dispatch (e.g., model.predict) to vanish from the call graph. The fix adds a Python-specific side-effect to re-resolve lexical reads/writes whenever the node’s function-object (VN 1) points-to set grows, and it adds a vendored NLPGNN “full project” fixture to guard the regression.
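The pattern behind the overview can be pictured with a small reduction. This is a hypothetical sketch, not the vendored NLPGNN code: the names `sample_sequence`, `step`, `model`, and `predict` follow the PR text, while the two class names and the token values are invented for illustration. Both siblings reach the same nested `step` function, and each call creates a distinct closure whose lexical read of `model` must resolve to its own frame.

```python
# Hypothetical reduction of the shared-closure scenario; not the NLPGNN sources.


def sample_sequence(model, length):
    """Run `length` decoding steps; `step` reads `model` lexically."""

    def step(tokens):
        # `model` is a lexical (closure) read: it resolves to the frame of the
        # sample_sequence invocation that created this particular `step`.
        return model.predict(tokens)

    tokens = [0]
    for _ in range(length):
        tokens = step(tokens)
    return tokens


class GenerationGPT2:
    """Stands in for the GenGPT2 class of generation.py (assumed shape)."""

    def predict(self, tokens):
        return tokens + ["gen"]


class InteractiveGPT2:
    """Stands in for the same-named sibling in interactive.py (assumed shape)."""

    def predict(self, tokens):
        return tokens + ["int"]


# At run time each closure dispatches to its own model's predict.
generation_out = sample_sequence(GenerationGPT2(), 2)
interactive_out = sample_sequence(InteractiveGPT2(), 2)
```

A sound call graph needs an edge from `step` to both `predict` methods; the reported bug corresponds to the second edge going missing.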
Changes:
- Add a side effect in `PythonSSAPropagationCallGraphBuilder.PythonConstraintVisitor` to re-run lexical resolution on closure growth for `AstLexicalRead`/`AstLexicalWrite`.
- Vendor NLPGNN full-project Python sources and scripts into the test data tree to reproduce the cross-closure shared-node scenario.
- Add/extend NLPGNN-related scripts in the fixture to ensure both `generation.py` and `interactive.py` patterns are present for regression coverage.
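The shared-node scenario mentioned in the list arises when a call-string context is cut to a fixed length. The following is a toy model only, assuming a simple "keep the last k call sites" rule; the call-site labels are hypothetical and nothing here uses WALA's API. It shows why two scripts can end up analysed through one `step` node at k = 1 but not at k = 2.

```python
from collections import defaultdict


def truncate(call_string, k):
    """Keep only the last k call sites (a k-limited call-string context)."""
    return tuple(call_string[-k:]) if k > 0 else ()


# Hypothetical call chains leading to `step`: each script calls
# sample_sequence, which calls step from a single call site.
chains = {
    "generation.py": ["generation.py:main", "sample_sequence:call_step"],
    "interactive.py": ["interactive.py:main", "sample_sequence:call_step"],
}


def nodes_for_step(k):
    """Group the scripts by the (function, truncated context) node they reach."""
    nodes = defaultdict(list)
    for script, chain in chains.items():
        nodes[("step", truncate(chain, k))].append(script)
    return dict(nodes)


one_cfa = nodes_for_step(1)  # both scripts collapse onto one node
two_cfa = nodes_for_step(2)  # the extra call site keeps them apart
```

With one call site of context, the only thing that distinguishes the two siblings at the shared node is which closure objects have reached it, which is why the timing of their arrival matters.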
Reviewed changes
Copilot reviewed 96 out of 96 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| com.ibm.wala.cast.python/source/com/ibm/wala/cast/python/ipa/callgraph/PythonSSAPropagationCallGraphBuilder.java | Adds a Python-builder-only side effect to refresh lexical resolution when new closure function objects reach VN 1. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/TG/EN/interactive.py | Adds NLPGNN “interactive” generation script fixture to reproduce missing sibling predict nodes. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/TG/EN/generation.py | Adds NLPGNN “generation” script fixture used as the sibling/control case. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_ZH/ner_data_preprocess.py | Adds vendored NER preprocessing script fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_ZH/bert_ner_train.py | Adds vendored BERT NER training script fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_ZH/bert_ner_test.py | Adds vendored BERT NER test script fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_ZH/bert_ner_crf_train.py | Adds vendored BERT+CRF NER training script fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_ZH/bert_ner_crf_test.py | Adds vendored BERT+CRF NER test script fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_EN/data_processing.py | Adds vendored English NER data processing fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_EN/bert_ner_train.py | Adds vendored English BERT NER training fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_EN/bert_ner_test.py | Adds vendored English BERT NER test fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_EN/albert_ner_train.py | Adds vendored English ALBERT NER training fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/NER/NER_EN/albert_ner_test.py | Adds vendored English ALBERT NER test fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/KG2E/run_tucker.py | Adds vendored KG embedding training script fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/GNN/nodes_graph_classfication/train_graphsage.py | Adds vendored GraphSAGE training script fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/GNN/nodes_graph_classfication/train_gin.py | Adds vendored GIN training script fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/GNN/nodes_graph_classfication/train_gcn.py | Adds vendored GCN training script fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/GNN/nodes_graph_classfication/train_gan.py | Adds vendored GAT training script fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/GNN/gnn_for_nlp/text_sage.py | Adds vendored text-graph GraphSAGE script fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/GNN/BERT-TextGCN/transformer.py | Adds vendored BERT-TextGCN transformer layer fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/GNN/BERT-TextGCN/bert.py | Adds vendored BERT-TextGCN model fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/GNN/BERT-TextGCN/attention.py | Adds vendored BERT-TextGCN attention fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/GNN/auto_encoder/GAAE.py | Adds vendored graph autoencoder script fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/CLS/TextCNN/text_cnn_train.py | Adds vendored TextCNN training script fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/CLS/TextCNN/text_cnn_test.py | Adds vendored TextCNN test script fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/CLS/BilstmAttention/bilstm_attention_train.py | Adds vendored BiLSTM+attention training fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/CLS/BilstmAttention/bilstm_attention_test.py | Adds vendored BiLSTM+attention test fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/CLS/BERT/bert_classification_train.py | Adds vendored BERT classification training fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/CLS/BERT/bert_classification_test.py | Adds vendored BERT classification test fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/CLS/ALBERT/albert_cls_train.py | Adds vendored ALBERT classification training fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/tests/CLS/ALBERT/albert_cls_test.py | Adds vendored ALBERT classification test fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/setup.py | Adds vendored setup script for the NLPGNN fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/tokenizers/gpt2_tokenization.py | Adds vendored GPT-2 tokenization implementation used by the fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/tokenizers/__init__.py | Adds vendored tokenizers package init (auto-import). |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/savers.py | Adds vendored checkpoint saver helper. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/sample/samples.py | Adds vendored sampling utilities; contains the shared step closure scenario. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/sample/__init__.py | Adds vendored sample package init. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/optimizers/__init__.py | Adds vendored optimizers package init (auto-import). |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/tucker.py | Adds vendored TuckER model fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/TextGCN2019.py | Adds vendored TextGCN model fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/TextCNN.py | Adds vendored TextCNN model fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/RGCN.py | Adds vendored RGCN model fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/PCNN.py | Adds vendored PCNN placeholder fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/GraphSage.py | Adds vendored GraphSAGE model fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/gpt2.py | Adds vendored GPT-2 model fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/GIN.py | Adds vendored GIN model fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/GCN.py | Adds vendored GCN model fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/GAT.py | Adds vendored GAT model fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/GAAE.py | Adds vendored graph autoencoder layer fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/bert.py | Adds vendored BERT model fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/albert.py | Adds vendored ALBERT model fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/models/__init__.py | Adds vendored models package init (explicit exports). |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/metrics/type.py | Adds vendored typing helpers used by metrics. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/metrics/Losess.py | Adds vendored loss helpers. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/metrics/__init__.py | Adds vendored metrics package init (auto-import). |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/layers/transformer.py | Adds vendored transformer layer fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/layers/normalization.py | Adds vendored normalization utilities. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/layers/gpt2_transformer.py | Adds vendored GPT-2 transformer block fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/layers/decoder.py | Adds vendored decoding helper fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/layers/bilstm.py | Adds vendored BiLSTM layer fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/layers/albert_transformer.py | Adds vendored ALBERT transformer block fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/layers/__init__.py | Adds vendored layers package init (auto-import). |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/gnn/TGCNConv.py | Adds vendored TextGCN convolution fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/gnn/RGCNConv.py | Adds vendored RGCN convolution fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/gnn/GSConv.py | Adds vendored GraphSAGE convolution fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/gnn/glob.py | Adds vendored graph utility placeholder. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/gnn/GINConv.py | Adds vendored GIN convolution fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/gnn/GCNConv.py | Adds vendored GCN convolution fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/gnn/GATConv.py | Adds vendored GAT convolution fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/gnn/GAAEConv.py | Adds vendored graph attention autoencoder convolution fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/gnn/__init__.py | Adds vendored gnn package init (auto-import). |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/datas/word2vec.py | Adds vendored word2vec/glove embedding loader fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/datas/__init__.py | Adds vendored datas package init (explicit exports). |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/callbacks.py | Adds vendored early-stopping callbacks fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/bpemd/bpe.py | Adds vendored BPE training utilities fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/abandoned/scatter.py | Adds vendored “abandoned” scatter utility fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/abandoned/GCNConvv0.py | Adds vendored “abandoned” GCN implementation fixture. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/abandoned/__init__.py | Adds vendored abandoned package init. |
| com.ibm.wala.cast.python.test/data/nlpgnn_full_proj/nlpgnn/__init__.py | Adds vendored package root init. |
Codecov Report❌ Patch coverage is
Additional details and impacted files

Coverage Diff

| | master | #532 | +/- |
|---|---|---|---|
| Coverage | 73.40% | 73.71% | +0.30% |
| Complexity | 3321 | 3372 | +51 |
| Files | 298 | 298 | |
| Lines | 22720 | 22748 | +28 |
| Branches | 3830 | 3832 | +2 |
| Hits | 16678 | 16768 | +90 |
| Misses | 4466 | 4434 | -32 |
| Partials | 1576 | 1546 | -30 |
The uncovered residue is the write-side twin and diagnostics: the guards exercise the read path ( |
Problem

At whole-project scale on the NLPGNN subject, `tests/TG/EN/interactive.py`'s `GenGPT2.predict`/`call` lose their call-graph nodes while the same-named `generation.py` sibling keeps its own and absorbs both call sites' data (wala#690, split from wala#678). Both constructors bind and run correctly; the loss is in method dispatch downstream of the shared `sample_sequence.step` closure.

Root Cause

WALA's `AstConstraintVisitor.visitLexical` resolves a lexical access's defining frames from a one-time snapshot of the node's function-object (v1) points-to set; the re-evaluation side effect is commented out upstream. Under 1-CFA, call-string truncation makes one `step` node serve both siblings' closures. The closure whose dispatch created the node gets its `model` frame wired; the sibling's `ScopeMappingInstanceKey`, arriving after the snapshot, never does, so its `model.predict` dispatch never materializes and its method nodes silently vanish. The outcome is first-wins by fixpoint order: a ddmin over the fixture showed no single trigger file (each half of the extra scripts passes alone; the union fails), and at small scale both closures are present at snapshot time, which is the wala#685 cross-union instead.

Fix

`PythonConstraintVisitor` overrides `visitAstLexicalRead`/`visitAstLexicalWrite` to register the missing side effect on v1's points-to set: every growth re-runs the superclass's lexical resolution, whose constraint additions are idempotent. This restores, scoped to the Python builder, the mechanism upstream disabled (reported upstream at wala/WALA#1990).

Guards

The NLPGNN subject is vendored verbatim (`nlpgnn_full_proj`, all 94 `.py` files, matching the consumer run's 94 seeded toplevels): `testNlpgnnFullGeneration`/`testNlpgnnFullInteractive` pin both siblings' `predict` nodes with symmetric parameter typing (`[Dynamic, 1]` of unknown dtype on vn 3). Before the fix, the interactive guard fails with `Function must exist in call graph`; per-sibling precision at the shared closure (the cross-sibling union) remains wala#685.

Closes wala#690.
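The difference between a one-time snapshot and a re-registered side effect, as described in the root cause and fix, can be illustrated with a toy model. This is a sketch under stated assumptions: `PointsToSet`, `resolve`, and `analyse` are hypothetical names invented here, and none of this is WALA's actual API. It only shows that a listener on a growing set lets a late-arriving closure be wired, and that re-running resolution is harmless when the additions go into a set.

```python
class PointsToSet:
    """Toy growing set that notifies listeners whenever a new value arrives."""

    def __init__(self):
        self.values = []
        self.listeners = []

    def add(self, value):
        if value not in self.values:
            self.values.append(value)
            for listener in list(self.listeners):
                listener()


def resolve(function_objects, wired_frames):
    """Wire the defining frame of every closure known right now.

    `wired_frames` is a set, so running this again adds nothing twice;
    that stands in for the idempotent constraint additions.
    """
    for closure in function_objects.values:
        wired_frames.add(closure)


def analyse(re_register):
    """Return the closures whose frames end up wired."""
    v1 = PointsToSet()  # function objects reaching the shared node
    wired = set()
    v1.add("closure-from-generation")  # present when the access is visited
    resolve(v1, wired)  # the one-time snapshot
    if re_register:
        # Re-run resolution whenever v1 grows (the side effect).
        v1.listeners.append(lambda: resolve(v1, wired))
    v1.add("closure-from-interactive")  # arrives after the snapshot
    return wired


snapshot_only = analyse(re_register=False)
with_side_effect = analyse(re_register=True)
```

In this model `snapshot_only` contains just the first closure, while `with_side_effect` contains both, which mirrors the before and after behaviour the guards are meant to pin.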
🤖 Generated with Claude Code
https://claude.ai/code/session_01Lk6J7cxYf5vsjm139L25pc