Open-source Java library for learning rule models from decision examples and applying these models to classify or rank new examples.
Rule models are induced according to the VC-DomLEM sequential covering algorithm presented in [1]. Learning is preceded by an analysis of data consistency, which is based on rough set theory. More precisely, this library implements dominance-based rough set approaches: the original one (DRSA) [2] and its variable consistency extensions (VC-DRSA) [3]. Rule models can be used to classify new examples using the VC-DRSA classifier [4] or the MODE classifier [5]. During data analysis, missing attribute values are handled [6].
ruleLearn also allows validating constructed rule models using stratified cross-validation.
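To illustrate what "stratified" means here, the following is a minimal, self-contained sketch of assigning objects to folds so that each decision class is spread evenly across folds. It does not use the ruleLearn API; the class `StratifiedFoldsSketch`, the method `assignFolds`, and the sample data are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

// Illustrative sketch only; NOT part of the ruleLearn API.
final class StratifiedFoldsSketch {

    // Returns, for each object index, the number of the fold (0..k-1) it is assigned to.
    static int[] assignFolds(int[] decisions, int k, long seed) {
        // Group object indices by decision class.
        Map<Integer, List<Integer>> byClass = new TreeMap<>();
        for (int i = 0; i < decisions.length; i++) {
            byClass.computeIfAbsent(decisions[i], c -> new ArrayList<>()).add(i);
        }
        Random random = new Random(seed);
        int[] fold = new int[decisions.length];
        int next = 0; // keeps dealing across classes so that fold sizes stay balanced
        for (List<Integer> members : byClass.values()) {
            Collections.shuffle(members, random); // random order within a class
            for (int index : members) {
                fold[index] = next % k; // deal objects of this class round-robin
                next++;
            }
        }
        return fold;
    }

    public static void main(String[] args) {
        int[] decisions = {1, 1, 1, 1, 2, 2, 2, 2, 2, 2};
        int[] fold = assignFolds(decisions, 2, 42L);
        for (int i = 0; i < fold.length; i++) {
            System.out.println("object " + i + " (class " + decisions[i] + ") -> fold " + fold[i]);
        }
    }
}
```

With this scheme, the number of objects of a given class differs by at most one between any two folds, so each fold roughly preserves the class distribution of the whole data set.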
Data sets analyzed by ruleLearn are represented as decision tables, which are composed of objects described by attributes. Data sets should be provided in JSON+CSV or JSON+JSON format (metadata always in JSON format, evaluations of objects either in JSON or in CSV format).
We consider the following use cases, which are typical ways of using ruleLearn.
In this case, we have to calculate lower and upper approximations of unions of ordered decision classes represented in the data.
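As an illustration of these notions, here is a minimal, self-contained sketch of the classical DRSA lower and upper approximations of an upward union of classes (objects assigned to class t or better), following the definitions in [2]. It deliberately does not use the ruleLearn API: the class `DrsaApproximationSketch`, its methods, and the tiny data set are hypothetical, and all criteria are assumed to be gain-type with integer evaluations and no missing values.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; NOT part of the ruleLearn API.
final class DrsaApproximationSketch {

    // True if a is at least as good as b on every (gain-type) criterion.
    static boolean dominates(int[] a, int[] b) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] < b[i]) {
                return false;
            }
        }
        return true;
    }

    // Lower approximation of the union "class >= t": objects x such that
    // every object dominating x (including x itself) belongs to the union.
    static List<Integer> lowerApproximationUpward(int[][] evals, int[] decisions, int t) {
        List<Integer> result = new ArrayList<>();
        for (int x = 0; x < evals.length; x++) {
            boolean consistent = true;
            for (int y = 0; y < evals.length; y++) {
                if (dominates(evals[y], evals[x]) && decisions[y] < t) {
                    consistent = false;
                    break;
                }
            }
            if (consistent) {
                result.add(x);
            }
        }
        return result;
    }

    // Upper approximation of the union "class >= t": objects x that dominate
    // at least one object belonging to the union.
    static List<Integer> upperApproximationUpward(int[][] evals, int[] decisions, int t) {
        List<Integer> result = new ArrayList<>();
        for (int x = 0; x < evals.length; x++) {
            for (int y = 0; y < evals.length; y++) {
                if (dominates(evals[x], evals[y]) && decisions[y] >= t) {
                    result.add(x);
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Four objects evaluated on two criteria; object 2 dominates object 1
        // but is assigned to a worse class, which makes the pair inconsistent.
        int[][] evals = {{1, 1}, {2, 2}, {3, 2}, {3, 3}};
        int[] decisions = {1, 2, 1, 2};
        System.out.println(lowerApproximationUpward(evals, decisions, 2)); // prints [3]
        System.out.println(upperApproximationUpward(evals, decisions, 2)); // prints [1, 2, 3]
    }
}
```

In this example, objects 1 and 2 fall into the boundary of the union (upper minus lower approximation), because their class assignments violate the dominance principle.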
repositories {
maven { url 'https://jitpack.io' }
}
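// version of ruleLearn to fetch from JitPack
def RL_VERSION = '0.25.0'
dependencies {
// 'implementation' replaces the 'compile' configuration, which was removed in Gradle 7
implementation("com.github.ruleLearn:rulelearn:${RL_VERSION}")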
}
<repositories>
<repository>
<id>jitpack.io</id>
<url>https://jitpack.io</url>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>com.github.ruleLearn</groupId>
<artifactId>rulelearn</artifactId>
<version>0.25.0</version>
</dependency>
</dependencies>
Developers should either extend/supply a gradle.properties file (in the main directory of the project or in the USER_HOME/.gradle directory), specifying inside this file a local path to the installed JDK, e.g., on a Windows machine:
org.gradle.java.home=C:\\Program Files\\Java\\jdk-21
or set the JAVA_HOME environment variable to the local path of the installed JDK.
When importing ruleLearn into an IDE (e.g., Eclipse, IntelliJ IDEA), one should specify the following settings: UTF-8 encoding and LF (i.e., line feed) line endings.
[1]: Błaszczyński, J., Słowiński, R., Szeląg, M., Sequential Covering Rule Induction Algorithm for Variable Consistency Rough Set Approaches. Information Sciences, 181, 2011, pp. 987-1002.
[2]: Greco, S., Matarazzo, B., Słowiński, R., Rough Sets Theory for Multicriteria Decision Analysis. European Journal of Operational Research, 129(1), 2001, pp. 1-47.
[3]: Błaszczyński, J., Greco, S., Słowiński, R., Szeląg, M., Monotonic Variable Consistency Rough Set Approaches. International Journal of Approximate Reasoning, 50(7), 2009, pp. 979-999.
[4]: Błaszczyński, J., Greco, S., Słowiński, R., Multi-criteria classification - A new scheme for application of dominance-based decision rules. European Journal of Operational Research, 181(3), 2007, pp. 1030-1044.
[5]: Szeląg, M., Słowiński, R., Dominance-based Rough Set Approach to Bank Customer Satisfaction Analysis. [In]: P. Jędrzejowicz, I. Czarnowski, A. Skakovski, M. Forkiewicz, M. Szarmach, P. Wolski (Eds.), PP-RAI'2022, Proceedings of the 3rd Polish Conference on Artificial Intelligence, April 25-27, 2022, Gdynia, Poland, Publishing House of Gdynia Maritime University, Gdynia, Poland, pp. 147-150.
[6]: Szeląg, M., Błaszczyński, J., Słowiński, R., Rough Set Analysis of Classification Data with Missing Values. [In]: L. Polkowski et al. (Eds.): Rough Sets, International Joint Conference, IJCRS 2017, Olsztyn, Poland, July 3–7, 2017, Proceedings, Part I. Lecture Notes in Artificial Intelligence, vol. 10313, Springer, 2017, pp. 552–565.