Java library and command-line application for converting R models to PMML.
- Fast and memory-efficient:
- Can produce a 5 GB Random Forest PMML file in less than 1 minute on a desktop PC
- Supported model and transformation types:
- ada - Stochastic Boosting (SB) classification
- bagging - Bagging classification
- boosting - Boosting classification
- preProcess - Transformation methods "range", "center", "scale" and "medianImpute"
- train - Selected JPMML-R model types
- caretEnsemble - Ensemble regression and classification
- party - CHi-squared Automated Interaction Detection (CHAID) classification
- earth - Multivariate Adaptive Regression Spline (MARS) regression
- elmNN - Extreme Learning Machine (ELM) regression
- party - Evolutionary Learning of Trees (EvTree) regression and classification
- naiveBayes - Naive Bayes (NB) classification
- svm - Support Vector Machine (SVM) regression, classification and anomaly detection
- gbm - Gradient Boosting Machine (GBM) regression and classification
- glmnet (subtypes including multnet) - Generalized Linear Model with lasso or elasticnet regularization (GLMNet) regression and classification
- cv.glmnet - Cross-validated GLMNet regression and classification
- iForest - Isolation Forest (IF) anomaly detection
- WrappedModel - Selected JPMML-R model types
- nn - Neural Network (NN) regression
- multinom - Multinomial log-linear classification
- nnet.formula - Neural Network (NNet) regression and classification
- ctree - Conditional Inference Tree (CIT) classification
- party - Recursive Partytioning (Party) regression and classification
- mvr - Multivariate Regression (MVR) regression
- randomForest - Random Forest (RF) regression and classification
- ranger - Random Forest (RF) regression and classification
- lrm - Binary Logistic Regression (LR) classification
- ols - Ordinary Least Squares (OLS) regression
- rpart - Recursive Partitioning (RPart) regression and classification
- scorecard - Scorecard regression
- glm - Generalized Linear Model (GLM) regression and classification
- kmeans - K-Means clustering
- lm - Linear Model (LM) regression
- xgb.Booster - XGBoost (XGB) regression and classification
- Data pre-processing using model formulae:
- Interaction terms
- Logical operators
- Relational operators
- Arithmetic operators
- Exponentiation operators
- Arithmetic functions
- Production quality:
- Complete test coverage.
- Fully compliant with the JPMML-Evaluator library.
Prerequisite: Java 1.8 or newer.
Enter the project root directory and build using Apache Maven:
mvn clean install
The build produces an executable uber-JAR file target/jpmml-r-executable-1.4-SNAPSHOT.jar.
A typical workflow can be summarized as follows:
- Use R to train a model.
- Serialize the model in RDS data format to a file in a local filesystem.
- Use the JPMML-R command-line converter application to convert the RDS file into a PMML file.
The R side of operations
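The following R script trains a Random Forest (RF) model and saves it in RDS data format to the file rf.rds:
library("randomForest")
# Train a Random Forest classifier on the built-in iris dataset
rf = randomForest(Species ~ ., data = iris)
# Serialize the fitted model in RDS data format
saveRDS(rf, "rf.rds")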
The JPMML-R side of operations
Converting the RDS file rf.rds to a PMML file rf.pmml:
java -jar target/jpmml-r-executable-1.4-SNAPSHOT.jar --rds-input rf.rds --pmml-output rf.pmml
Getting help:
java -jar target/jpmml-r-executable-1.4-SNAPSHOT.jar --help
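When several models have been saved, the conversion command shown above can be wrapped in a small shell loop. The following is a minimal sketch rather than part of JPMML-R: the helper names pmml_name and convert_all are hypothetical, and it assumes that the uber-JAR sits at the path used in this README and that every input file name ends in .rds. Only the --rds-input and --pmml-output options already shown are used.

```shell
# Path to the uber-JAR; override by setting JAR before sourcing this snippet.
JAR="${JAR:-target/jpmml-r-executable-1.4-SNAPSHOT.jar}"

# pmml_name (hypothetical helper): map "model.rds" to "model.pmml".
pmml_name() {
    printf '%s\n' "${1%.rds}.pmml"
}

# convert_all (hypothetical helper): convert every existing RDS file given
# as an argument, writing the PMML file next to its input.
convert_all() {
    for rds in "$@"; do
        # Skip arguments that do not name an existing file
        # (for example an unmatched "*.rds" glob).
        [ -f "$rds" ] || continue
        java -jar "$JAR" --rds-input "$rds" --pmml-output "$(pmml_name "$rds")"
    done
}
```

Calling `convert_all ./*.rds` from the project root would then write one .pmml file next to each .rds file.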
The conversion of large files (1 GB and beyond) can be sped up by increasing the JVM heap size using the -Xms and -Xmx options:
java -Xms4G -Xmx8G -jar target/jpmml-r-executable-1.4-SNAPSHOT.jar --rds-input rf.rds --pmml-output rf.pmml
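How much heap to request depends on the model, and this README does not give a formula. As a rough illustration only, the sketch below derives an -Xmx value from the size of the RDS file. The helper name heap_gb, the 4x multiplier and the 4 GB floor are all assumptions made for this example, not recommendations from the JPMML-R project.

```shell
# heap_gb BYTES: print a suggested maximum heap size in whole gigabytes.
# Heuristic (an assumption, not a JPMML-R rule): 4x the file size, at least 4 GB.
heap_gb() {
    bytes="$1"
    # Round the file size up to a whole number of gigabytes.
    gb=$(( ($bytes + 1073741823) / 1073741824 ))
    heap=$(( $gb * 4 ))
    if [ "$heap" -lt 4 ]; then
        heap=4
    fi
    printf '%s\n' "$heap"
}

# Usage (wc -c prints the file size in bytes):
# size=$(wc -c < rf.rds)
# java -Xmx"$(heap_gb "$size")G" -jar target/jpmml-r-executable-1.4-SNAPSHOT.jar --rds-input rf.rds --pmml-output rf.pmml
```

If the conversion still runs out of memory, raising the value by hand remains the simplest remedy.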
Additional information
- Converting logistic regression models to PMML documents
- Deploying R language models on Apache Spark ML
JPMML-R is licensed under the terms and conditions of the GNU Affero General Public License, Version 3.0.
If you would like to use JPMML-R in a proprietary software project, then it is possible to enter into a licensing agreement which makes JPMML-R available under the terms and conditions of the BSD 3-Clause License instead.
JPMML-R is developed and maintained by Openscoring Ltd, Estonia.