This is the source code for the paper "An Information-Theoretic Parameter-Free Bayesian Framework for Probing Labeled Dependency Trees from Attention Score".
- It's suggested that you use Python 3.12, or at least Python 3.10+. Install the requirements following requirements.txt.
- The code is runnable on Linux; if you run it on Windows or macOS, some packages (like triton or peft) may not be supported.
- You should put the checkpoints of the used LLMs in a directory called pretrained-models, under the parent directory of this project root.
The implementation of IPBP is in explaination.ipynb. However, depending on your hardware limits, the process might vary. A safe process (tested on a server with a 24 GB RTX 4090 GPU and 64 GiB of RAM) is as follows (an automatic notebook execution tool, such as papermill, is suggested to simplify the process):
- Step one, attention score ($\mathcal{A}_{\cdot,\cdot;\cdot}$) gathering: in the `#papermill_description='INITIALIZE_ARGS'` cell, set `LOAD_ATTN_SAMPLES` to `False` and `BREAK_AFTER_EXTRACT` to `True`. After setting the args, run the notebook cell by cell; this runs explaination.ipynb until the cell with `#papermill_description=SAVE_CONCATENATED_FEATURES`.
- Step two, distribution ($\hat{f}(A_{b,h} \mid L=l)$, $\hat{f}(A_{b,h}, L=l)$) and MI estimation: set `LOAD_KDE_ESTIMS=False` and `LOAD_MI=False`, then execute the notebook. If your machine has enough memory (more than 100 GiB), you can run until the cell with `#papermill_description=INFERENCE_BY_ATTN_FEATURES`; this performs posterior-based tree extraction. If not, break before that cell.
- Step three (if machine memory is not enough): leave all args in the `#papermill_description='INITIALIZE_ARGS'` cell at their defaults, and run until the cell with `#papermill_description=INFERENCE_BY_ATTN_FEATURES`.
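To give intuition for the MI estimation in step two, here is a simplified, self-contained sketch. It uses a plain histogram estimate of $I(A;L)$ rather than the KDE estimator the notebook actually uses, and all names in it are illustrative rather than taken from this repository:

```python
import math
from collections import Counter

def mutual_information(scores, labels, n_bins=10):
    """Histogram estimate of I(A; L) from paired samples.

    `scores` are continuous attention scores, `labels` are discrete
    dependency labels. This is a simplified stand-in for the KDE-based
    estimator used in the notebook.
    """
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / n_bins or 1.0  # guard against all-equal scores
    bins = [min(int((a - lo) / width), n_bins - 1) for a in scores]

    n = len(scores)
    p_joint = Counter(zip(bins, labels))
    p_a = Counter(bins)
    p_l = Counter(labels)

    # MI = sum over (bin, label) of p(a,l) * log( p(a,l) / (p(a) p(l)) )
    mi = 0.0
    for (b, l), c in p_joint.items():
        mi += (c / n) * math.log(c * n / (p_a[b] * p_l[l]))
    return mi

# Scores that separate the labels give high MI;
# scores independent of the labels give ~0.
dep = mutual_information([0.1, 0.1, 0.9, 0.9] * 50,
                         ["nsubj", "nsubj", "obj", "obj"] * 50)
ind = mutual_information([0.1, 0.9, 0.1, 0.9] * 50,
                         ["nsubj", "nsubj", "obj", "obj"] * 50)
```

In the first call each score bin maps to exactly one label, so the estimate approaches $\log 2$ nats for two equiprobable labels; in the second the scores carry no label information and the estimate is zero.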
- (For non-parametric baselines): The implementation of Probeless and IoU is in explaination_baselines.ipynb; you can run it to save the Probeless and IoU matrices.
- (For parametric baselines): The implementation of LFF and the $\mathcal{V}$-Network is in explanation_baselines.py; running `python explaination.py mlp [ARGS]` will train the LFF networks, and `python explaination.py individual_mlp [ARGS]` trains the $\mathcal{V}$-Network.
After running the baselines and getting the matrix replacements for MI, open explaination.ipynb, set `LOAD_BASELINES=True` (with the other parameters at their defaults), and modify the paths in the `#papermill_description=LOAD_BASELINES` cell to make them consistent with your saved baseline matrices. Set the `INFER_MODE` environment variable and run the notebook until the cell with `#papermill_description=INFERENCE_BY_ATTN_FEATURES`.
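As a rough illustration of what posterior-based tree extraction produces, the sketch below greedily picks, for each token, the highest-scoring (head, label) pair from a posterior table. All names here are hypothetical, and this greedy decoder is not the notebook's extractor (in particular, it does not enforce tree well-formedness):

```python
def greedy_extract(posterior):
    """Pick, for each dependent token, the (head, label) pair with the
    highest posterior score.

    `posterior[dep - 1][head]` maps each candidate label to its posterior
    score; head position 0 stands for the artificial root. Illustrative
    only -- a real decoder would also enforce that the arcs form a tree.
    """
    arcs = []
    for dep, heads in enumerate(posterior, start=1):
        head, label, _ = max(
            ((h, lab, s)
             for h, labels in enumerate(heads)
             for lab, s in labels.items()),
            key=lambda t: t[2],
        )
        arcs.append((dep, head, label))
    return arcs

# Toy 2-token sentence: token 1 attaches to root as "root",
# token 2 attaches to token 1 as "obj".
post = [
    [{"root": 0.9}, {"obj": 0.1}, {"obj": 0.2}],
    [{"root": 0.2}, {"obj": 0.7}, {"obj": 0.1}],
]
arcs = greedy_extract(post)  # [(1, 0, 'root'), (2, 1, 'obj')]
```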
conclusion.ipynb includes the code to verify the following conclusions:
- Left/right dependency label proportion (Section 4.5)
- Average height/MI-weighted layer index correlation calculation (Section 4.5)
- Joint density visualization (Appendix A)
For token rank calculation, please run `python tokenrank_calculation.py -m <path_to_llm_checkpoint> -d <path_to_conll_data_file>`.
This repository is currently the same as the anonymized version (https://anonymous.4open.science/r/IPBP-99F1) from this paper's reviewing process. The implementations of many of the experiments added during the rebuttal phase are not integrated into it yet. So if you meet any problems running the code, or find some functionality missing, please remind me by sending an email to hongxu001@e.ntu.edu.sg (preferred) or leaving a GitHub issue. I'll handle it ASAP.