This is the source code for the paper "An Information-Theoretic Parameter-Free Bayesian Framework for Probing Labeled Dependency Trees from Attention Score".
- It's suggested that you use Python 3.12, or at least Python 3.10+. Install the requirements following requirements.txt.
- The code is runnable on Linux; if you run it on Windows or macOS, some packages (like triton or peft) may not be supported.
- You should put the checkpoints of the used LLMs in a directory called pretrained-models, under the parent directory of this project root.
The implementation of IPBP is in explaination.ipynb. However, depending on your hardware limits, the process might vary. A safe process (tested on a server with a 24 GB RTX 4090 GPU and 64 GiB of RAM) is as follows (an automatic notebook execution tool, such as papermill, is suggested to simplify the process):
- Step one, attention score ($\mathcal{A}_{\cdot,\cdot;\cdot}$) gathering: in the `#papermill_description='INITIALIZE_ARGS'` cell, set `LOAD_ATTN_SAMPLES` to `False` and `BREAK_AFTER_EXTRACT` to `True`. After setting the args, run the notebook cell by cell; this runs explaination.ipynb until the cell with `#papermill_description=SAVE_CONCATENATED_FEATURES`.
- Step two, distribution ($\hat{f}(A_{b,h} \mid L=l)$, $\hat{f}(A_{b,h}, L=l)$) and MI estimation: set `LOAD_KDE_ESTIMS=False` and `LOAD_MI=False`, then execute the notebook. If your machine has enough memory (more than 100 GiB), you can run until the cell with `#papermill_description=INFERENCE_BY_ATTN_FEATURES`; this performs posterior-based tree extraction. If not, break before that cell.
- Step three (if machine memory is not enough): leave all args in the `#papermill_description='INITIALIZE_ARGS'` cell at their defaults, and run until the cell with `#papermill_description=INFERENCE_BY_ATTN_FEATURES`.
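To give intuition for the MI estimation in step two, here is a simplified, self-contained sketch. It uses a plain histogram estimate of $I(A;L)$ rather than the KDE estimator the notebook actually uses, and all names in it are illustrative rather than taken from this repository:

```python
import math
from collections import Counter

def mutual_information(scores, labels, n_bins=10):
    """Histogram estimate of I(A; L) from paired samples.

    `scores` are continuous attention scores, `labels` are discrete
    dependency labels. This is a simplified stand-in for the KDE-based
    estimator used in the notebook.
    """
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / n_bins or 1.0  # guard against all-equal scores
    bins = [min(int((a - lo) / width), n_bins - 1) for a in scores]

    n = len(scores)
    p_joint = Counter(zip(bins, labels))
    p_a = Counter(bins)
    p_l = Counter(labels)

    # MI = sum over (bin, label) of p(a,l) * log( p(a,l) / (p(a) p(l)) )
    mi = 0.0
    for (b, l), c in p_joint.items():
        mi += (c / n) * math.log(c * n / (p_a[b] * p_l[l]))
    return mi

# Scores that separate the labels give high MI;
# scores independent of the labels give ~0.
dep = mutual_information([0.1, 0.1, 0.9, 0.9] * 50,
                         ["nsubj", "nsubj", "obj", "obj"] * 50)
ind = mutual_information([0.1, 0.9, 0.1, 0.9] * 50,
                         ["nsubj", "nsubj", "obj", "obj"] * 50)
```

In the first call each score bin maps to exactly one label, so the estimate approaches $\log 2$ nats for two equiprobable labels; in the second the scores carry no label information and the estimate is zero.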
- (For non-parametric baselines): The implementation of Probeless and IoU is in explaination_baselines.ipynb; you can run it to save the Probeless and IoU matrices.
- (For parametric baselines): The implementation of LFF and the $\mathcal{V}$-Network is in explanation_baselines.py; running `python explaination.py mlp [ARGS]` will train the LFF networks, and `python explaination.py individual_mlp [ARGS]` trains the $\mathcal{V}$-Network.
After running the baselines and getting the matrix replacements for MI, open explaination.ipynb, set `LOAD_BASELINES=True` (with the other parameters at their defaults), and modify the paths in the `#papermill_description=LOAD_BASELINES` cell to make them consistent with your saved baseline matrices. Set the `INFER_MODE` environment variable and run the notebook until the cell with `#papermill_description=INFERENCE_BY_ATTN_FEATURES`.
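As a rough illustration of what posterior-based tree extraction produces, the sketch below greedily picks, for each token, the highest-scoring (head, label) pair from a posterior table. All names here are hypothetical, and this greedy decoder is not the notebook's extractor (in particular, it does not enforce tree well-formedness):

```python
def greedy_extract(posterior):
    """Pick, for each dependent token, the (head, label) pair with the
    highest posterior score.

    `posterior[dep - 1][head]` maps each candidate label to its posterior
    score; head position 0 stands for the artificial root. Illustrative
    only -- a real decoder would also enforce that the arcs form a tree.
    """
    arcs = []
    for dep, heads in enumerate(posterior, start=1):
        head, label, _ = max(
            ((h, lab, s)
             for h, labels in enumerate(heads)
             for lab, s in labels.items()),
            key=lambda t: t[2],
        )
        arcs.append((dep, head, label))
    return arcs

# Toy 2-token sentence: token 1 attaches to root as "root",
# token 2 attaches to token 1 as "obj".
post = [
    [{"root": 0.9}, {"obj": 0.1}, {"obj": 0.2}],
    [{"root": 0.2}, {"obj": 0.7}, {"obj": 0.1}],
]
arcs = greedy_extract(post)  # [(1, 0, 'root'), (2, 1, 'obj')]
```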
conclusion.ipynb includes the code to verify the following conclusions:
- Left/right dependency label proportion (Section 4.5)
- Average height/MI-weighted layer index correlation calculation (Section 4.5)
- Joint density visualization (Appendix A)
For token rank calculation, please run `python tokenrank_calculation.py -m <path_to_llm_checkpoint> -d <path_to_conll_data_file>`.
This repository is currently the same as the anonymized version (https://anonymous.4open.science/r/IPBP-99F1) from this paper's reviewing process. The implementations of many of the experiments added during the rebuttal phase are not integrated into it yet. So if you meet any problems running the code, or find some functionality missing, please remind me by sending an email to hongxu001@e.ntu.edu.sg (preferred) or leaving a GitHub issue. I'll handle it ASAP.