Stochastic Simulation for IR Evaluation Research: Effectiveness Scores

julian-urbano/simIReff

simIReff

Provides tools for the stochastic simulation of effectiveness scores to mitigate data-related limitations of Information Retrieval evaluation research. These tools include:

  • Fitting of continuous and discrete distributions to model system effectiveness.
  • Plotting of effectiveness distributions.
  • Selection of the distribution that best fits the given data.
  • Transformation of distributions towards a prespecified expected value.
  • A proxy for fitting copula models based on these distributions.
  • Simulation of new evaluation data from these distributions and copula models.

For details, please refer to Julián Urbano and Thomas Nagler, "Stochastic Simulation of Test Collections: Evaluation Scores", ACM SIGIR, 2018.

Installation

You may install the stable release from CRAN

install.packages("simIReff")

or the latest development version from GitHub

devtools::install_github("julian-urbano/simIReff", ref = "develop")

Usage

Fit a marginal AP distribution and simulate new data

x <- web2010ap[,10] # sample AP scores of a system
e <- effContFitAndSelect(x, method = "BIC") # fit and select based on BIC
plot(e) # plot pdf, cdf and quantile function
e$mean # expected value
y <- reff(50, e) # simulation of 50 new topics

and transform the distribution to have a pre-specified expected value.

e2 <- effTransform(e, mean = .14) # transform for expected value of .14
plot(e2)
e2$mean # check the result
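As a rough sanity check on the two steps above, the sketch below simulates many topics and compares their sample mean with e$mean, then transforms the fitted distribution towards a few target means. It only uses the functions already shown; the seed, the number of simulated topics and the target values are arbitrary choices for illustration, and it assumes effTransform converges for each of these targets.

```r
library(simIReff)

set.seed(1)                                   # make the simulation reproducible
x <- web2010ap[,10]                           # same sample AP scores as above
e <- effContFitAndSelect(x, method = "BIC")   # fit and select based on BIC

# With many simulated topics, the sample mean should approach e$mean
y_big <- reff(100000, e)
c(expected = e$mean, simulated = mean(y_big))

# Transform towards several target means and read back the achieved means
targets <- c(0.10, 0.14, 0.20)                # illustrative targets, chosen arbitrarily
achieved <- sapply(targets, function(m) effTransform(e, mean = m)$mean)
cbind(targets, achieved)
```

The two columns printed at the end should agree closely if the transformation succeeded for every target.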

Build a copula model of two systems

d <- web2010ap[,2:3] # sample AP scores
e1 <- effCont_norm(d[,1]) # force the first margin to follow a truncated Gaussian
e2 <- effCont_bks(d[,2]) # force the second margin to follow a beta kernel-smoothed distribution
cop <- effcopFit(d, list(e1, e2)) # fit the copula
y <- reffcop(1000, cop) # simulation of 1000 new topics
c(e1$mean, e2$mean) # expected means
colMeans(y) # observed means

and modify the model so both systems have the same distribution

cop2 <- cop # copy the model
cop2$margins[[2]] <- e1 # modify 2nd margin
y <- reffcop(1000, cop2) # simulation of 1000 new topics
colMeans(y) # observed means
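Replacing a margin changes the distribution of that system's scores but leaves the fitted copula untouched, so the dependence between the two systems should be preserved. The following sketch illustrates this by comparing means and Kendall's tau before and after the change. It reuses the calls shown above; the seed and the sample size of 5000 are arbitrary choices for illustration.

```r
library(simIReff)

set.seed(1)                          # make the simulation reproducible
d <- web2010ap[,2:3]                 # same sample AP scores as above
e1 <- effCont_norm(d[,1])
e2 <- effCont_bks(d[,2])
cop <- effcopFit(d, list(e1, e2))

cop2 <- cop
cop2$margins[[2]] <- e1              # both margins now share e1

y  <- reffcop(5000, cop)             # simulate from the original model
y2 <- reffcop(5000, cop2)            # simulate from the modified model

# Means: may differ under cop, should be nearly identical under cop2
rbind(original = colMeans(y), modified = colMeans(y2))

# Dependence: Kendall's tau in the data and in both simulations
c(data     = cor(d[,1], d[,2], method = "kendall"),
  original = cor(y[,1], y[,2], method = "kendall"),
  modified = cor(y2[,1], y2[,2], method = "kendall"))
```

Kendall's tau is used here because it is invariant to monotone transformations of the margins, which makes it a natural check that only the margins changed.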

Automatically fit a Gaussian copula to many systems,

d <- web2010p20[,1:20] # sample P@20 data from 20 systems
effs <- effDiscFitAndSelect(d, support("p20")) # fit and select margins
cop <- effcopFit(d, effs, family_set = "gaussian") # fit copula
y <- reffcop(1000, cop) # simulate 1000 new topics

compare observed vs. expected mean,

E <- sapply(effs, function(e) e$mean)
E.hat <- colMeans(y)
plot(E, E.hat)
abline(0:1)

compare observed vs. expected variance,

Var <- sapply(effs, function(e) e$var)
Var.hat <- apply(y, 2, var)
plot(Var, Var.hat)
abline(0:1)

and compare original vs. simulated distributions.

o <- order(colMeans(d))
boxplot(d[,o])
points(colMeans(d)[o], col = "red", pch = 4) # plot means
boxplot(y[,o])
points(colMeans(y)[o], col = "red", pch = 4) # plot means
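The plots above can be complemented with a few numbers. The sketch below defines a small helper, agreement, which is hypothetical and not part of simIReff, to summarize how closely observed statistics track the expected ones. The input vectors are stand-in values for illustration only; in the workflow above one would call agreement(E, E.hat) and agreement(Var, Var.hat) instead.

```r
# Hypothetical helper (not part of simIReff): summarize the error between
# expected and observed statistics across systems.
agreement <- function(expected, observed) {
  stopifnot(length(expected) == length(observed))
  err <- observed - expected
  c(bias    = mean(err),           # average signed error
    rmse    = sqrt(mean(err^2)),   # root mean squared error
    max_abs = max(abs(err)))       # worst-case absolute error
}

# Stand-in values for illustration only, not real results
expected <- c(0.20, 0.35, 0.50)
observed <- c(0.21, 0.34, 0.50)

res <- agreement(expected, observed)
print(res)
```

A bias near zero together with a small RMSE suggests that the simulated scores reproduce the fitted margins well on average.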

License

simIReff is released under the terms of the MIT License.

When using this package, please cite the above paper:

@inproceedings{urbano2018simulation,
  author = {Urbano, Juli\'{a}n and Nagler, Thomas},
  booktitle = {International ACM SIGIR Conference on Research and Development in Information Retrieval},
  title = {{Stochastic Simulation of Test Collections: Evaluation Scores}},
  pages = {695--704},
  year = {2018}
}
