rstratx

The rstratx package provides an R interface to stratx, a Python library that implements "A Stratification Approach to Partial Dependence for Codependent Variables". Currently, only the StratPD algorithm is supported, and it applies only to numeric features.

WARNING: This package is under heavy development. The underlying Python code needs to be cleaned up, and Python imports are not yet handled gracefully on the R side. Use at your own risk.

Installation

# Install the development version from GitHub (installs remotes first if needed)
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}
remotes::install_github("bgreenwell/rstratx")
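Because rstratx calls Python through reticulate and imports are not yet handled gracefully, it can help to confirm that the Python side is reachable before running the example. The following is a minimal sketch, not part of rstratx: `stratx_available()` is a hypothetical helper, and it assumes the Python library is importable under the module name `stratx`.

```r
# Hypothetical helper (not part of rstratx): report whether a Python
# module can be imported through reticulate.
# Assumption: the Python library is importable under the name "stratx".
stratx_available <- function(module = "stratx") {
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    return(FALSE)  # reticulate itself is not installed
  }
  # TRUE only if the active Python environment can import the module
  reticulate::py_module_available(module)
}
```

Calling `stratx_available()` after pointing reticulate at a Python installation should return `TRUE` when the module can be imported. If it returns `FALSE`, the Python library probably needs to be installed into that environment first.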

Example

Here’s a basic example using the well-known Boston housing data set:

# Load required packages
library(pdp)      # for ordinary partial dependence
library(ranger)   # for random forest algorithm
#> Warning: package 'ranger' was built under R version 3.5.2
library(reticulate)  # for interfacing with Python
#> Warning: package 'reticulate' was built under R version 3.5.2
use_python("/Users/b780620/anaconda3/bin/python3", required = TRUE)  # FIXME: machine-specific path; point this at your own Python
library(rstratx)  # for stratified partial dependence

# Load the Boston housing data
data(boston, package = "pdp")

#
# Ordinary partial dependence
#

# Fit a random forest with default settings and construct a PDP for age
set.seed(1818)  # for reproducibility
rfo <- ranger(cmedv ~ ., data = boston)
partial(rfo, pred.var = "age", plot = TRUE)

#
# Stratified partial dependence
#

# Compute stratified partial dependence for age (a random forest is fit automatically)
spd <- stratpd(
  X = subset(boston, select = -cmedv),
  y = boston[, "cmedv", drop = FALSE],  # must be a one-column data frame (for now); drop = FALSE prevents collapsing to a vector
  feature_name = "age"
)

# Plot results
par(mar = c(4, 4, 1, 1) + 0.1)
plot(spd, type = "l", lwd = 2, las = 1, ylim = c(-10, 10))
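For context on the comparison above, ordinary partial dependence can be computed by brute force: fix the feature at each grid value for every row, predict, and average. The sketch below illustrates that idea on toy data in base R. It is not how pdp or stratx implement it, and `partial_dependence()`, `predict_fun`, and `toy` are hypothetical names introduced only for this example.

```r
# Brute-force partial dependence for one numeric feature (illustrative only).
# `predict_fun` is any function(newdata) that returns numeric predictions.
partial_dependence <- function(data, feature, predict_fun, grid_size = 20) {
  grid <- seq(min(data[[feature]]), max(data[[feature]]), length.out = grid_size)
  avg_pred <- vapply(grid, function(value) {
    modified <- data
    modified[[feature]] <- value   # force every row to the same feature value
    mean(predict_fun(modified))    # average the predictions over all rows
  }, numeric(1))
  data.frame(x = grid, yhat = avg_pred)
}

# Toy data: y depends linearly on x1 and x2, plus a little noise
set.seed(101)
toy <- data.frame(x1 = runif(200), x2 = runif(200))
toy$y <- 2 * toy$x1 - toy$x2 + rnorm(200, sd = 0.1)

# Fit a linear model and compute the partial dependence of y on x1
fit <- lm(y ~ x1 + x2, data = toy)
pd <- partial_dependence(
  toy, "x1",
  predict_fun = function(newdata) predict(fit, newdata = newdata)
)
head(pd)
```

For an additive linear model the resulting curve is a straight line whose slope equals the fitted coefficient of `x1`. Because every row is forced to each grid value regardless of its other features, this averaging can evaluate the model on unrealistic feature combinations when predictors are codependent, which is the situation the stratified approach targets.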
