ruoyuxie/interpretable_dialect_classifier

Interpretable Dialect Classifier

This is the official repository for Extracting Lexical Features from Dialects via Interpretable Dialect Classifiers. We provide the code for training and evaluating the dialect classifiers, as well as the code for extracting and evaluating the lexical features.

Requirements

pip install -r requirements.txt

Data

The data used in this work are the FRMT, LSDC, ITDI, and Europarl v8 datasets. The processed data and data processing scripts are placed in the data directory.

Training

The training code for both the leave-one-out (LOO) and selfExplain approaches is in the model directory.
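
As a rough illustration of the general leave-one-out idea (not the code in the model directory), the sketch below scores each token by how much the classifier's probability for a given label drops when that token is removed. The names `leave_one_out_scores` and `toy_predict_proba`, and the toy cue-word classifier, are hypothetical stand-ins introduced only for this example; a real setup would plug in a trained dialect classifier.

```python
from typing import Callable, List, Sequence, Tuple


def leave_one_out_scores(
    tokens: Sequence[str],
    predict_proba: Callable[[Sequence[str]], List[float]],
    label: int,
) -> List[Tuple[str, float]]:
    """Score each token by the drop in P(label) when that token is removed."""
    base = predict_proba(tokens)[label]
    scores = []
    for i, tok in enumerate(tokens):
        # Rebuild the input without the i-th token.
        reduced = list(tokens[:i]) + list(tokens[i + 1:])
        scores.append((tok, base - predict_proba(reduced)[label]))
    return scores


def toy_predict_proba(tokens: Sequence[str]) -> List[float]:
    """Hypothetical two-class classifier: label 1 becomes more likely
    as more cue words appear. Stands in for a trained model."""
    cues = {"autocarro"}  # assumed cue word, for illustration only
    hits = sum(t in cues for t in tokens)
    p1 = hits / (hits + 1)
    return [1.0 - p1, p1]


if __name__ == "__main__":
    sentence = ["o", "autocarro", "chegou"]
    for token, score in leave_one_out_scores(sentence, toy_predict_proba, label=1):
        print(f"{token}\t{score:.2f}")
```

Tokens with the largest scores are the ones the classifier relies on most for that label, which is what makes them candidates for dialect-specific lexical features.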

Evaluation

The evaluation code and data, covering plausibility, sufficiency, and human evaluation, are in the evaluation directory.

Citation

If you use our tool, we'd appreciate it if you cite the following paper:

@misc{xie2024extracting,
      title={Extracting Lexical Features from Dialects via Interpretable Dialect Classifiers}, 
      author={Roy Xie and Orevaoghene Ahia and Yulia Tsvetkov and Antonios Anastasopoulos},
      year={2024},
      eprint={2402.17914},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
