Data for srl4il

We manually annotate predicate–argument structures for 600 L2-L1 sentence pairs as the basis for the semantic analysis of learner Chinese. The dataset covers four typologically different mother tongues: English (ENG), Japanese (JPN), Russian (RUS) and Arabic (ARA). The sub-corpus for each language consists of 150 sentence pairs.

This work was published at EMNLP 2018 under the title "Semantic Role Labeling for Learner Chinese: the Importance of Syntactic Parsing and L2-L1 Parallel Data". This project maintains the dataset. We hope the data will be helpful for your research on semantic parsing for interlanguage. If you use the dataset, please cite the following papers:

Zi Lin, Yuguang Duan, Yuanyuan Zhao, Weiwei Sun and Xiaojun Wan. Semantic Role Labeling for Learner Chinese: the Importance of Syntactic Parsing and L2-L1 Parallel Data. The 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018).

Yuanyuan Zhao, Nan Jiang, Weiwei Sun and Xiaojun Wan. Overview of the NLPCC 2018 Shared Task: Grammatical Error Correction. Natural Language Processing and Chinese Computing (NLPCC 2018).

