sidharthranjan/Tree-Linearizer

The code uses NLTK version 3.0.0.

The Hindi-Urdu Treebank (HUTB) corpus is in CoNLL format. The tree linearizer (variant-generation-code.py) implements an algorithm that takes the corpus dependency trees as input and outputs a set of counterfactual sentences in which the preverbal constituents are reordered among themselves while remaining before the verb. The public version of the HUTB corpus is available here: https://verbs.colorado.edu/hindiurdu/
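
The reordering idea described above can be illustrated with a minimal sketch. This is not the code in variant-generation-code.py. The toy sentence, the `(id, form, head)` tuple layout, and the function names `subtree_ids` and `preverbal_variants` are assumptions made for illustration. The sketch also assumes a projective tree whose root is the verb.

```python
from itertools import permutations

# Hypothetical toy tree, not taken from HUTB: (token id, word form, head id).
# A head id of 0 marks the root, assumed here to be the main verb.
TOY = [
    (1, "ram", 6),
    (2, "ne", 1),
    (3, "sita", 6),
    (4, "ko", 3),
    (5, "kitab", 6),
    (6, "di", 0),
]


def subtree_ids(tokens, head_id):
    """Return the sorted ids of head_id and all of its descendants."""
    children = {}
    for tid, _form, head in tokens:
        children.setdefault(head, []).append(tid)
    ids, stack = [], [head_id]
    while stack:
        node = stack.pop()
        ids.append(node)
        stack.extend(children.get(node, []))
    return sorted(ids)


def preverbal_variants(tokens):
    """Permute the root's preverbal dependents, keeping the rest in place."""
    forms = {tid: form for tid, form, _head in tokens}
    root = next(tid for tid, _form, head in tokens if head == 0)
    # A preverbal constituent is the subtree of a root dependent that
    # precedes the root in the sentence.
    deps = [tid for tid, _form, head in tokens if head == root and tid < root]
    constituents = [subtree_ids(tokens, d) for d in deps]
    covered = {i for c in constituents for i in c}
    # The verb and anything outside the preverbal constituents stay last.
    tail = [tid for tid, _form, _head in tokens if tid not in covered]
    variants = []
    for order in permutations(constituents):
        ids = [i for c in order for i in c] + tail
        variants.append(" ".join(forms[i] for i in ids))
    return variants


variants = preverbal_variants(TOY)
for sentence in variants:
    print(sentence)
```

With three preverbal constituents the sketch yields six orderings, including the original one, and the verb stays in final position in each.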

Please cite the following works if you use this code:
1. Ranjan, Sidharth, Rajakrishnan Rajkumar, and Sumeet Agarwal. "Locality and expectation effects in Hindi preverbal constituent ordering." Cognition 223 (2022): 104959.
2. Ranjan, Sidharth, et al. "Discourse Context Predictability Effects in Hindi Word Order." Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. 2022.
