Source code and data used for the CIKM Readability 2015 short paper
You can obtain the text at:
The code used in this work relies on the ReadabilityCalculator Python module, which can be installed using pip:
$ pip install ReadabilityCalculator
It contains the readability scores for every document in the CLEF eHealth 2014/2015 dataset.
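To illustrate what a per-document readability score looks like, here is a minimal sketch that computes the Automated Readability Index (ARI) with the standard library only. It is not the ReadabilityCalculator API and not necessarily one of the measures used in the paper; the function name and the simple regex-based tokenization are assumptions made for this example.

```python
import re


def automated_readability_index(text: str) -> float:
    """Return the Automated Readability Index of `text`.

    Assumption: sentences end at '.', '!' or '?', and a word is a run of
    letters, digits or apostrophes. Real tools tokenize more carefully.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z0-9']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")
    characters = sum(len(w) for w in words)
    # ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43
    return 4.71 * characters / len(words) + 0.5 * len(words) / len(sentences) - 21.43


if __name__ == "__main__":
    print(round(automated_readability_index("The cat sat. The dog ran."), 2))
```

Lower values indicate easier text; very short, simple sentences can yield negative scores.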
It contains the distribution of words and sentences, for each preprocessing variant, over the documents in the CLEF eHealth 2014/2015 dataset.
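As a rough sketch of how such word and sentence statistics could be gathered per preprocessing variant, consider the following. The variant names ("raw", "cleaned"), the function names, and the regex-based splitting are hypothetical and chosen only for illustration; the repository's own scripts may count differently.

```python
import re
from statistics import mean


def count_words_and_sentences(text):
    """Return (number of words, number of sentences) for one document.

    Assumption: a sentence ends at '.', '!' or '?'; a word is a run of
    word characters.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text)
    return len(words), len(sentences)


def summarize(variants):
    """Map each variant name to its mean word and sentence counts.

    `variants` is a dict: variant name -> non-empty list of document texts.
    """
    summary = {}
    for name, docs in variants.items():
        counts = [count_words_and_sentences(d) for d in docs]
        summary[name] = {
            "mean_words": mean(c[0] for c in counts),
            "mean_sentences": mean(c[1] for c in counts),
        }
    return summary


if __name__ == "__main__":
    # Hypothetical toy data: the same two pages with and without boilerplate.
    example = {
        "raw": ["Home. About. Flu is a virus.", "Menu. Rest helps."],
        "cleaned": ["Flu is a virus.", "Rest helps."],
    }
    print(summarize(example))
```

Comparing the summaries across variants shows how much text each boilerplate removal option strips away, which in turn affects any readability formula based on word and sentence counts.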
It is the Lucene result list produced by a default VSM (vector space model) search using the topics from CLEF eHealth 2014.
Script that creates Table 2 of the paper.
Script that unpacks the original '.dat' files from CLEF and preprocesses them using any of the boilerplate removal options.
Python script used to create the files in readability_score.tar.gz.
Calculates the correlations between the ranked lists generated by different readability measures for the same Lucene-based initial ranked list.
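The kind of comparison described above can be sketched with a rank correlation coefficient. The example below uses Kendall's tau (tau-a, no ties) as an assumption; this README does not state which coefficient the script uses. The function name and the document identifiers are hypothetical.

```python
from itertools import combinations


def kendall_tau(ranking_a, ranking_b):
    """Kendall's tau-a between two rankings of the same documents.

    Each ranking is a list of unique document ids, best first.
    Returns 1.0 for identical order and -1.0 for fully reversed order.
    """
    if len(set(ranking_a)) != len(ranking_a) or len(set(ranking_b)) != len(ranking_b):
        raise ValueError("rankings must not contain duplicate ids")
    if set(ranking_a) != set(ranking_b):
        raise ValueError("rankings must contain the same document ids")
    n = len(ranking_a)
    if n < 2:
        raise ValueError("need at least two documents")

    position_b = {doc: i for i, doc in enumerate(ranking_b)}
    concordant = discordant = 0
    # combinations() yields pairs (x, y) with x ranked above y in ranking_a.
    for x, y in combinations(ranking_a, 2):
        if position_b[x] < position_b[y]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


if __name__ == "__main__":
    # Hypothetical re-rankings of the same initial result list.
    by_measure_1 = ["d1", "d2", "d3"]
    by_measure_2 = ["d1", "d3", "d2"]
    print(kendall_tau(by_measure_1, by_measure_2))
```

A value near 1 means two readability measures reorder the initial list in almost the same way; a value near 0 means their orderings are largely unrelated.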