R
Switch branches/tags
Nothing to show
Clone or download
1
Latest commit 3b877d5 Feb 2, 2017
Permalink
Failed to load latest commit information.
R refine Feb 1, 2017
README_files/figure-html refine README Jan 26, 2017
man refine Feb 1, 2017
tests add test Jun 12, 2016
vignettes refine vignette Jan 25, 2017
.Rbuildignore refine documents Aug 20, 2016
.gitignore fix Mar 28, 2016
.travis.yml refine documents Aug 20, 2016
DESCRIPTION refine documents Aug 20, 2016
LICENSE 2017 Jan 25, 2017
NAMESPACE fix Mar 28, 2016
README.Rmd refine README Jan 26, 2017
README.md refine README Jan 26, 2017
densratio.Rproj fix Mar 28, 2016

README.md

An R Package for Density Ratio Estimation

Koji MAKIYAMA (@hoxo_m)

Travis-CI Build Status CRAN Version CRAN Downloads Coverage Status

1. Overview

Density ratio estimation is described as follows: for given two data samples x and y from unknown distributions p(x) and q(y) respectively, estimate w(x) = p(x) / q(x), where x and y are d-dimensional real numbers.

The estimated density ratio function w(x) can be used in many applications such as anomaly detection [1] and covariate shift adaptation [2]. Other useful applications about density ratio estimation were summarized by Sugiyama et al. (2012) [3].

The package densratio provides a function densratio(). The function outputs an object that has a function to estimate density ratio.

For example,

set.seed(3)
x <- rnorm(200, mean = 1, sd = 1/8)
y <- rnorm(200, mean = 1, sd = 1/2)

library(densratio)
result <- densratio(x, y)

The function densratio() estimates the density ratio of p(x) to q(y), w(x) = p(x)/q(y) = Norm(1, 1/8) / Norm(1, 1/2), and provides a function to compute estimated density ratio. The result object has a function compute_density_ratio() that can compute the estimated density ratio w_hat(x) for any d-dimensional input x (now d=1).

new_x <- seq(0, 2, by = 0.06)
w_hat <- result$compute_density_ratio(new_x)

plot(new_x, w_hat, pch=19)

In this case, the true density ratio w(x) = p(x)/q(y) = Norm(1, 1/8) / Norm(1, 1/2) can be computed precisely. So we can compare w(x) with the estimated density ratio w_hat(x).

true_density_ratio <- function(x) dnorm(x, 1, 1/8) / dnorm(x, 1, 1/2)

plot(true_density_ratio, xlim=c(-1, 3), lwd=2, col="red", xlab = "x", ylab = "Density Ratio")
plot(result$compute_density_ratio, xlim=c(-1, 3), lwd=2, col="green", add=TRUE)
legend("topright", legend=c(expression(w(x)), expression(hat(w)(x))), col=2:3, lty=1, lwd=2, pch=NA)

2. Installation

You can install the densratio package from CRAN.

install.packages("densratio")

You can also install the package from GitHub.

install.packages("devtools") # if you have not installed "devtools" package
devtools::install_github("hoxo-m/densratio")

The source code for densratio package is available on GitHub at

3. Details

densratio() has method argument that you can pass "uLSIF" or "KLIEP".

  • uLSIF (unconstrained Least-Squares Importance Fitting) is the default method. This algorithm estimates density ratio by minimizing the squared loss. You can find more information in Hido et al. (2011) [1].

  • KLIEP (Kullback-Leibler Importance Estimation Procedure) is the another method. This algorithm estimates density ratio by minimizing Kullback-Leibler divergence. You can find more information in Sugiyama et al. (2007) [2].

There is a vignette for the package. For more detail, read it.

vignette("densratio")

You can also find it on CRAN.

4. Related work

We have also developed a Python package for density ratio estimation.

The package is available on PyPI (Python Package Index).

5. References

[1] Hido, S., Tsuboi, Y., Kashima, H., Sugiyama, M., & Kanamori, T. Statistical outlier detection using direct density ratio estimation. Knowledge and Information Systems 2011.

[2] Sugiyama, M., Nakajima, S., Kashima, H., von Bünau, P. & Kawanabe, M. Direct importance estimation with model selection and its application to covariate shift adaptation. NIPS 2007.

[3] Sugiyama, M., Suzuki, T. & Kanamori, T. Density Ratio Estimation in Machine Learning. Cambridge University Press 2012.