ForesightLM studies whether token-level autoregressive language models can acquire sentence-level foresight through an auxiliary objective that predicts future sentence semantics at sentence boundaries.
The project includes:
- Core ForesightLM training scripts
- Baseline / Core / K-horizon ablations
- Semantic reranking experiments
- Future-head-aware reranking and calibration sweep
- WikiText-103 and WritingPrompts domain-transfer experiments
- Domain generation diagnostics
- Bootstrap confidence intervals
- Qualitative example mining
- Compute-cost accounting
- Human-evaluation blind item sheet
ForesightLM preserves token-level autoregressive generation while adding an auxiliary objective at sentence boundaries. At each boundary, a learned projection head predicts the embedding of a future sentence, and a frozen sentence encoder supplies the target embedding.
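The objective described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the project's training code: the cosine-distance loss, the function names `future_semantic_loss` and `combined_loss`, the reading of `k` as "number of sentences ahead", and the `aux_weight` parameter are all assumptions made for the example.

```python
import numpy as np


def future_semantic_loss(hidden_states, boundary_positions,
                         sentence_embeddings, projection, k=1):
    """Mean cosine distance between projected boundary states and the
    embedding of the sentence k steps ahead (assumed loss form).

    hidden_states:       (seq_len, d_model) language-model hidden states
    boundary_positions:  token index where sentence i ends, for each i
    sentence_embeddings: (num_sentences, d_enc) frozen-encoder outputs
    projection:          (d_model, d_enc) weights of the projection head
    """
    losses = []
    for i, pos in enumerate(boundary_positions):
        target_idx = i + k
        if target_idx >= len(sentence_embeddings):
            break  # no future sentence left to predict
        pred = hidden_states[pos] @ projection
        target = sentence_embeddings[target_idx]
        denom = np.linalg.norm(pred) * np.linalg.norm(target) + 1e-8
        losses.append(1.0 - (pred @ target) / denom)
    return float(np.mean(losses)) if losses else 0.0


def combined_loss(lm_loss, aux_loss, aux_weight=0.1):
    """Token-level LM loss plus the weighted auxiliary term.
    aux_weight is a hypothetical hyperparameter, not a project default."""
    return lm_loss + aux_weight * aux_loss
```

Because the auxiliary term only adds to the training loss, decoding in this sketch remains ordinary token-by-token generation, in line with the description above.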
See:
- results/reproducibility/foresightlm_repro_package/README_REPRODUCIBILITY.md
- results/reproducibility/foresightlm_repro_package/manifest.json
- results/reproducibility/foresightlm_repro_package/referenced_large_files_manifest.csv
Large JSONL files, model checkpoints, caches, and cluster logs are not committed. Their paths and SHA256 hashes are listed in the reproducibility manifests when available.
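If you obtain the large files separately, their hashes can be checked against the manifest values. The sketch below uses only the Python standard library. It assumes you have already extracted a mapping from file path to expected SHA256 hex digest; the manifest schema is not described here, so that extraction step is left out, and the helper names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream a file through SHA256 so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_files(expected):
    """expected: dict mapping file path -> expected SHA256 hex digest.
    Returns a dict mapping each path to 'ok', 'mismatch', or 'missing'."""
    report = {}
    for path, want in expected.items():
        if not Path(path).is_file():
            report[path] = "missing"
        elif sha256_of_file(path) == want.lower():
            report[path] = "ok"
        else:
            report[path] = "mismatch"
    return report
```

A `missing` status is expected for any file you have not downloaded, since these artifacts are not part of the repository.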
Blind annotator files are under:
- results/human_eval/human_eval_blind_items.csv
- results/human_eval/human_eval_instructions.md
The answer key is intentionally excluded from Git while annotation is active.
The Core ForesightLM DistilGPT-2 checkpoint is available at:
https://huggingface.co/Mandotosh/foresightlm-core-distilgpt2