
kornosk/log-odds-ratio


Log-odds-ratio with Informative Dirichlet priors

This is an implementation based on the paper Fightin’ Words: Lexical Feature Selection and Evaluation for Identifying the Content of Political Conflict.

It is used for language modeling for stance detection in the paper Knowledge Enhanced Masked Language Model for Stance Detection.

Please see our stance detection repo 🚀

Usage

  1. Run the following command:
python log_odds_ratio.py \
    --filepath_corpus_i=$FP_CORPUS_I \
    --filepath_corpus_j=$FP_CORPUS_J \
    --filepath_background_corpus=$BACKGROUND_CORPUS
  2. Among the generated files, check z_scores.txt, which lists words sorted by z-score. Words at the top are more strongly associated with corpus I, while words at the bottom are more strongly associated with corpus J, with respect to the background corpus.
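
For intuition about what the z-scores mean, here is a minimal sketch of the z-scored log-odds ratio with an informative Dirichlet prior, as described in the Fightin' Words paper. It is not the code in log_odds_ratio.py. The function name `log_odds_z_scores` is hypothetical, tokens are split on whitespace, and the prior for each word is assumed to be its raw count in the background corpus; the actual script may tokenize, scale the prior, or format its output differently.

```python
import math
from collections import Counter


def log_odds_z_scores(tokens_i, tokens_j, tokens_bg):
    """Return {word: z-score}; positive favors corpus I, negative favors corpus J.

    Hypothetical helper for illustration only. Assumes the background
    vocabulary contains at least two distinct words.
    """
    y_i, y_j = Counter(tokens_i), Counter(tokens_j)
    alpha = Counter(tokens_bg)  # assumed prior: raw background counts
    n_i, n_j = sum(y_i.values()), sum(y_j.values())
    alpha_0 = sum(alpha.values())

    z = {}
    for w, a in alpha.items():  # only background words have a nonzero prior
        # Prior-smoothed log-odds of w within each corpus.
        l_i = math.log((y_i[w] + a) / (n_i + alpha_0 - y_i[w] - a))
        l_j = math.log((y_j[w] + a) / (n_j + alpha_0 - y_j[w] - a))
        # Approximate variance of the log-odds difference.
        var = 1.0 / (y_i[w] + a) + 1.0 / (y_j[w] + a)
        z[w] = (l_i - l_j) / math.sqrt(var)
    return z


# Toy data, made up for this example.
corpus_i = "tax cut tax relief jobs".split()
corpus_j = "health care care jobs".split()
background = corpus_i + corpus_j + "tax care jobs the the".split()

scores = log_odds_z_scores(corpus_i, corpus_j, background)
for word, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word}\t{score:.3f}")
```

In this toy run, "tax" gets a positive score (it leans toward corpus I) and "care" a negative one (it leans toward corpus J), while "jobs", which appears once in each corpus, stays close to zero.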
