JPMML-R

Java library and command-line application for converting R models to PMML.

Features

  • Fast and memory-efficient:
    • Can produce a 5 GB Random Forest PMML file in less than 1 minute on a desktop PC
  • Supported model and transformation types:
    • ada package:
      • ada - Stochastic Boosting (SB) classification
    • adabag package:
      • bagging - Bagging classification
      • boosting - Boosting classification
    • caret package:
      • preProcess - Transformation methods "range", "center", "scale" and "medianImpute"
      • train - Selected JPMML-R model types
    • caretEnsemble package:
      • caretEnsemble - Ensemble regression and classification
    • CHAID package:
      • party - CHi-squared Automated Interaction Detection (CHAID) classification
    • earth package:
      • earth - Multivariate Adaptive Regression Spline (MARS) regression
    • elmNN package:
      • elmNN - Extreme Learning Machine (ELM) regression
    • evtree package:
      • party - Evolutionary Learning of Trees (EvTree) regression and classification
    • e1071 package:
      • naiveBayes - Naive Bayes (NB) classification
      • svm - Support Vector Machine (SVM) regression, classification and anomaly detection
    • gbm package:
      • gbm - Gradient Boosting Machine (GBM) regression and classification
    • glmnet package:
      • glmnet (elnet, fishnet, lognet and multnet subtypes) - Generalized Linear Model with lasso or elasticnet regularization (GLMNet) regression and classification
      • cv.glmnet - Cross-validated GLMNet regression and classification
    • IsolationForest package:
      • iForest - Isolation Forest (IF) anomaly detection
    • neuralnet package:
      • nn - Neural Network (NN) regression
    • nnet package:
      • multinom - Multinomial log-linear classification
      • nnet.formula - Neural Network (NNet) regression and classification
    • party package:
      • ctree - Conditional Inference Tree (CIT) classification
    • partykit package:
      • party - Recursive Partytioning (Party) regression and classification
    • pls package:
      • mvr - Multivariate Regression (MVR) regression
    • randomForest package:
      • randomForest - Random Forest (RF) regression and classification
    • ranger package:
      • ranger - Random Forest (RF) regression and classification
    • rms package:
      • lrm - Binary Logistic Regression (LR) classification
      • ols - Ordinary Least Squares (OLS) regression
    • rpart package:
      • rpart - Recursive Partitioning (RPart) regression and classification
    • r2pmml package:
      • scorecard - Scorecard regression
    • stats package:
      • glm - Generalized Linear Model (GLM) regression and classification
      • kmeans - K-Means clustering
      • lm - Linear Model (LM) regression
    • xgboost package:
      • xgb.Booster - XGBoost (XGB) regression and classification
  • Production quality:
    • Complete test coverage.
    • Fully compliant with the JPMML-Evaluator library.

Prerequisites

  • Java 1.8 or newer.

Installation

Enter the project root directory and build using Apache Maven:

mvn clean install

The build produces an executable uber-JAR file target/jpmml-r-executable-1.3-SNAPSHOT.jar.
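
The version suffix in the JAR file name changes between releases, so scripts may prefer to locate the file by pattern rather than hard-code it. The following is a minimal sketch; the helper name find_jpmml_r_jar is hypothetical and not part of JPMML-R, and the jpmml-r-executable-*.jar pattern is assumed from the file name above:

```shell
# Hypothetical helper (not part of JPMML-R): print the path of the executable
# uber-JAR found in a build directory (default: target), or fail if none exists.
find_jpmml_r_jar() {
  build_dir="${1:-target}"
  for jar in "$build_dir"/jpmml-r-executable-*.jar; do
    # When the glob matches nothing it stays literal, so test for a real file.
    if [ -f "$jar" ]; then
      printf '%s\n' "$jar"
      return 0
    fi
  done
  echo "No jpmml-r-executable JAR found in $build_dir; run 'mvn clean install' first" >&2
  return 1
}
```

If several matching JAR files exist, the sketch returns the first one in glob order.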

Usage

A typical workflow can be summarized as follows:

  1. Use R to train a model.
  2. Serialize the model in RDS data format to a file on the local filesystem.
  3. Use the JPMML-R command-line converter application to turn the RDS file into a PMML file.

The R side of operations

The following R script trains a Random Forest (RF) model and saves it in RDS data format to a file rf.rds:

library("randomForest")

rf = randomForest(Species ~ ., data = iris)

saveRDS(rf, "rf.rds")

The JPMML-R side of operations

Converting the RDS file rf.rds to a PMML file rf.pmml:

java -jar target/jpmml-r-executable-1.3-SNAPSHOT.jar --rds-input rf.rds --pmml-output rf.pmml
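
When several models have been saved, the same command can be applied to each RDS file in turn. The following is a rough sketch of such a loop, not a feature of JPMML-R itself; the function names pmml_name and convert_all and the JPMML_R_JAR and DRY_RUN variables are hypothetical, and only the --rds-input and --pmml-output options shown above are used:

```shell
# Path to the converter JAR; override via the environment if it lives elsewhere.
JPMML_R_JAR="${JPMML_R_JAR:-target/jpmml-r-executable-1.3-SNAPSHOT.jar}"

# Derive the PMML file name from an RDS file name (rf.rds -> rf.pmml).
pmml_name() {
  printf '%s\n' "${1%.rds}.pmml"
}

# Convert every *.rds file in a directory (default: current directory).
# With DRY_RUN=1 the commands are printed instead of executed.
convert_all() {
  model_dir="${1:-.}"
  for rds in "$model_dir"/*.rds; do
    [ -f "$rds" ] || continue   # skip the literal pattern when nothing matches
    pmml="$(pmml_name "$rds")"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "java -jar $JPMML_R_JAR --rds-input $rds --pmml-output $pmml"
    else
      java -jar "$JPMML_R_JAR" --rds-input "$rds" --pmml-output "$pmml" || return 1
    fi
  done
}
```

Running DRY_RUN=1 convert_all models first shows which commands would be issued for a hypothetical models directory without starting the JVM.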

Getting help:

java -jar target/jpmml-r-executable-1.3-SNAPSHOT.jar --help

The conversion of large files (1 GB and beyond) can be sped up by increasing the JVM heap size using -Xms and -Xmx options:

java -Xms4G -Xmx8G -jar target/jpmml-r-executable-1.3-SNAPSHOT.jar --rds-input rf.rds --pmml-output rf.pmml
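
A wrapper script could apply these options only when the input is large. The sketch below assumes a 1 GiB threshold and reuses the 4G/8G values from the example above; both are illustrative and should be tuned to the machine at hand. The names heap_opts_for_size, convert_with_heap and JPMML_R_JAR are hypothetical:

```shell
# Size threshold (assumption: 1 GiB) above which heap options are added.
LARGE_FILE_BYTES=1073741824

# Print JVM heap options for an input of the given size in bytes.
# Prints nothing for inputs below the threshold.
heap_opts_for_size() {
  if [ "$1" -ge "$LARGE_FILE_BYTES" ]; then
    echo "-Xms4G -Xmx8G"
  fi
}

# Hypothetical wrapper: convert one RDS file, adding heap options when it is large.
convert_with_heap() {
  rds="$1"
  pmml="$2"
  [ -f "$rds" ] || { echo "RDS file not found: $rds" >&2; return 1; }
  # wc -c may pad its output with spaces on some systems, so strip them.
  size="$(wc -c < "$rds" | tr -d ' ')"
  # The command substitution is intentionally unquoted so that the two
  # options split into separate arguments (or vanish when empty).
  java $(heap_opts_for_size "$size") \
    -jar "${JPMML_R_JAR:-target/jpmml-r-executable-1.3-SNAPSHOT.jar}" \
    --rds-input "$rds" --pmml-output "$pmml"
}
```

For example, convert_with_heap rf.rds rf.pmml runs the plain command for a small rf.rds and the -Xms4G -Xmx8G variant once the file reaches the threshold.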

License

JPMML-R is dual-licensed under the GNU Affero General Public License (AGPL) version 3.0 and a commercial license.

Additional information

JPMML-R is developed and maintained by Openscoring Ltd, Estonia.

Interested in using JPMML software in your application? Please contact info@openscoring.io