This is an unofficial implementation of TaskLAMA.
🚧 This repo is currently under construction. Please treat the results as preliminary.
Original data link: [link]. This data is licensed under CC-BY 4.0.
Hugging Face data link: [link].
The statistics of the test data are listed below.
- A complex example: "Task: Make coconut rice - Assumption: Making coconut rice with ghee. - 15 steps"
- Task statistics
| #Task | #Task w/ line-order steps | #Task w/ DAG-order steps | #Task w/o Assumption |
|---|---|---|---|
| 478 | 341 | 137 | 12 |
- The number of steps in the test set
- The number of DAG roots (initial steps)
- The distribution of DAG width and depth
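
The root, width, and depth statistics above could be computed along the following lines. This is a minimal sketch, not this repo's actual code. It assumes each task's steps are numbered `0..num_steps-1` and its ordering constraints are given as `(before, after)` pairs. The function name `dag_stats` and this input format are hypothetical, and the real TaskLAMA schema may differ. Depth is taken here as the number of steps on the longest dependency chain, and width as the largest number of steps sharing the same level. Other definitions are possible.

```python
from collections import defaultdict, deque


def dag_stats(num_steps, edges):
    """Summarise one task's step graph (hypothetical input format).

    num_steps: number of steps, identified as 0..num_steps-1
    edges: iterable of (before, after) pairs meaning `before` precedes `after`
    """
    successors = defaultdict(list)
    indegree = [0] * num_steps
    for before, after in edges:
        successors[before].append(after)
        indegree[after] += 1

    # Roots (initial steps) are the steps with no prerequisite.
    roots = [s for s in range(num_steps) if indegree[s] == 0]

    # Kahn's topological traversal; level[s] is the length (in edges)
    # of the longest path from any root to step s.
    level = [0] * num_steps
    remaining = list(indegree)
    queue = deque(roots)
    visited = 0
    while queue:
        step = queue.popleft()
        visited += 1
        for nxt in successors[step]:
            level[nxt] = max(level[nxt], level[step] + 1)
            remaining[nxt] -= 1
            if remaining[nxt] == 0:
                queue.append(nxt)
    if visited != num_steps:
        raise ValueError("step graph contains a cycle, so it is not a DAG")

    # Count how many steps sit on each level.
    level_sizes = defaultdict(int)
    for lvl in level:
        level_sizes[lvl] += 1

    depth = max(level) + 1 if num_steps else 0
    width = max(level_sizes.values()) if num_steps else 0
    return {
        "roots": roots,
        "depth": depth,
        "width": width,
        # Assumption: a task counts as "line-order" when no two steps share a level.
        "is_line_order": num_steps > 0 and width == 1,
    }
```

For example, `dag_stats(4, [(0, 2), (1, 2), (2, 3)])` describes a task whose first two steps can be done in either order before the remaining two, so it has two roots and is not line-order under this definition.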
@misc{yuan2023tasklama,
title={TaskLAMA: Probing the Complex Task Understanding of Language Models},
author={Quan Yuan and Mehran Kazemi and Xin Xu and Isaac Noble and Vaiva Imbrasaite and Deepak Ramachandran},
year={2023},
eprint={2308.15299},
archivePrefix={arXiv},
primaryClass={cs.CL}
}