KaiyangWan/InfoQA

InfoQA

This repository includes code and data to reproduce all results: dataset construction, single-pass baselines, InfoQA, and the fitting scripts that align empirical accuracy with our theoretical capacity curves.

Repository Layout

 ```
InfoQA/
├── datasets/ # Synthetic benchmark (controllable hops & noise)
│ ├── 1hop/ 2hop/ 3hop/ 4hop/ # JSON contexts for each hop and length bucket
│ ├── general_noise.json
│ ├── multi_hop_chain_company_stats.json
│ └── syn_data.py # Script to (re)generate the datasets
│
├── fitting/
│ └── draw_all.py # Fit empirical results to theory & plot curves
│
├── utils/
│ ├── utils.py
│ ├── infoqa_wo_decom.py # Ablation: without decomposition
│ ├── infoqa_wo_pru.py # Ablation: without pruning
│ ├── MHQA_direct.py # Direct prompting
│ ├── MHQA_cot.py # Chain-of-Thought
│ ├── MHQA_SC.py # Self-Consistency
│ ├── MHQA_SFRF.py # Self-Refine
│ ├── MHQA_ReAct.py # ReAct
│ ├── MHQA_plan_and_solve.py # Plan-and-Solve
│ ├── MHQA_self_ask.py # Self-Ask
│ └── MHQA_infoqa.py # InfoQA (proof-of-concept)
│
├── run_all.sh # Reproduce main experiments end-to-end
├── requirements.txt # Python dependencies
└── README.md
``` 

Setup

  • Python 3.10
  • Create a conda environment and install dependencies:
conda create -n infoqa python=3.10
conda activate infoqa
pip install -r requirements.txt

Reproducing Results

(Optional) Regenerate the Synthetic Benchmark
cd datasets
python syn_data.py
cd ..
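
After regenerating, a quick sanity check can confirm that each hop directory actually contains JSON files. The sketch below is not part of the repository: `count_json_files` is a hypothetical helper, and it only assumes the `1hop/` to `4hop/` directory names shown in the repository layout. It does not assume anything about the JSON schema or how files are grouped into length buckets.

```python
from pathlib import Path

# Directory names taken from the repository layout above
HOP_DIRS = ("1hop", "2hop", "3hop", "4hop")


def count_json_files(datasets_root):
    """Return {hop_dir: number of .json files found under it, recursively}."""
    root = Path(datasets_root)
    counts = {}
    for hop in HOP_DIRS:
        hop_dir = root / hop
        # rglob covers both flat and nested (e.g. per-length-bucket) layouts;
        # a missing directory is reported as 0 rather than raising
        counts[hop] = sum(1 for _ in hop_dir.rglob("*.json")) if hop_dir.is_dir() else 0
    return counts


if __name__ == "__main__":
    # Run from the repository root
    for hop, n in count_json_files("datasets").items():
        print(f"{hop}: {n} JSON file(s)")
```

A count of 0 for any hop level would suggest that `syn_data.py` did not finish or wrote its output somewhere else.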

Run All Methods (Baselines + InfoQA + Ablations)

bash run_all.sh

Default settings match the paper (temperature = 0.2, max generation length = 4096). Outputs, including metrics and logs, are written to method-specific folders (see run_all.sh).
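
For readers who want to reuse the same decoding settings in their own scripts, the defaults above could be captured in one place, roughly as sketched below. This is an illustration only: `DecodingConfig` and `output_dir` are hypothetical names, and the `outputs/<method>` folder layout is an assumption. The actual paths are defined in `run_all.sh`.

```python
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass(frozen=True)
class DecodingConfig:
    # Values stated in this README as the paper defaults
    temperature: float = 0.2
    max_generation_length: int = 4096


def output_dir(method: str, base: str = "outputs") -> Path:
    # Assumed layout: one folder per method (check run_all.sh for the real paths)
    return Path(base) / method


if __name__ == "__main__":
    cfg = DecodingConfig()
    print(asdict(cfg))
    print(output_dir("infoqa"))
```

Freezing the dataclass keeps the defaults from being changed by accident partway through a run.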

Fit Theory & Plot

cd fitting
python draw_all.py
cd ..
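
The general idea of fitting a parametric curve to accuracy measurements can be sketched in a few lines. The example below is not what `draw_all.py` does: this README does not state the functional form of the theoretical capacity curves, so a two-parameter logistic curve is used purely as a placeholder, and the fit is a brute-force least-squares grid search. The data points are synthetic, generated from the placeholder curve itself, and are not experimental results.

```python
import math


def logistic_curve(x, midpoint, slope):
    """Placeholder curve: falls from about 1 to about 0 as x passes `midpoint`."""
    return 1.0 / (1.0 + math.exp(slope * (x - midpoint)))


def sse(params, xs, ys):
    """Sum of squared errors between the curve and the observed points."""
    midpoint, slope = params
    return sum((logistic_curve(x, midpoint, slope) - y) ** 2 for x, y in zip(xs, ys))


def grid_fit(xs, ys, midpoints, slopes):
    """Return the (midpoint, slope) pair on the grid with the lowest squared error."""
    candidates = ((m, s) for m in midpoints for s in slopes)
    return min(candidates, key=lambda p: sse(p, xs, ys))


if __name__ == "__main__":
    # Synthetic points from the placeholder curve (illustration only)
    xs = [1, 2, 4, 8, 16]
    ys = [logistic_curve(x, 6.0, 0.5) for x in xs]
    midpoints = [m * 0.5 for m in range(2, 25)]  # 1.0 to 12.0 in steps of 0.5
    slopes = [s * 0.1 for s in range(1, 11)]     # 0.1 to 1.0 in steps of 0.1
    print(grid_fit(xs, ys, midpoints, slopes))
```

A grid search is slow but has no dependencies and cannot diverge. A gradient-based optimizer would be the usual choice once the real curve family is known.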
