Switch branches/tags
Nothing to show
Find file History
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
..
Failed to load latest commit information.
README.Rmd
mee6.pbs
simuls6.R
svabuRD.R
svabuRDm.R
svabu_link.R

README.Rmd

title author date output
Revisiting resource selection probability functions and single-visit methods: Clarification and extensions
Sólymos, P., and Lele, S. R.
2015-06-22
pdf_document

Revisiting resource selection probability functions and single-visit methods: Clarification and extensions

Introduction

Supporting Information for the paper Sólymos, P., and Lele, S. R., 2015. Revisiting resource selection probability functions and single-visit methods: clarification and extensions. Methods in Ecology and Evolution, in press.

Here we give rationale and code for the simulation used in the paper. The organization of this archive is as follows:

An R markdown version of this document can be found at GitHub.

Examples for the RSPF condition

The following figure from the paper demosntrates that the RSPF condition defines a class of functions that are fairly flexible. Here we are generating regression coefficients (cf) randomly but users can supply their own.

m <- 5 # how many lines to draw
x <- seq(-1, 1, by=0.01) # predictor
col <- 1:m

## Regression coefficients. 
set.seed(12345)
cf <- sapply(1:m, function(i) 
    c(rnorm(1, 0, 1), rnorm(1, 0, 1+i/2),
    rnorm(1, 0, 2), rnorm(1, 0, 4)))

## plot the response curves: 
## Linear or Quadratic or Cubic terms included 
## in the link (logit or cloglog).

pdf("fig-poly-123.pdf", width=8, height=12)
xlim <- 0.75
op <- par(mfrow=c(3,2), las=1, cex=1.2, mar=c(3,4,4,2)+0.1)

plot(0, type="n", ylim=c(0,1), xlim=c(-xlim,xlim),
    xlab="Predictor", ylab="Probability", main="logit, linear")
for (i in 1:m)
    lines(x, plogis(cf[1,i]+cf[2,i]*x), col=col[i], lwd=3)

plot(0, type="n", ylim=c(0,1), xlim=c(-xlim,xlim),
    xlab="Predictor", ylab="Probability", main="cloglog, linear")
for (i in 1:m)
    lines(x, binomial("cloglog")$linkinv(cf[1,i]+cf[2,i]*x), col=col[i], lwd=3)

plot(0, type="n", ylim=c(0,1), xlim=c(-xlim,xlim),
    xlab="Predictor", ylab="Probability", main="logit, quadratic")
for (i in 1:m)
    lines(x, plogis(cf[1,i]+cf[2,i]*x+cf[3,i]*x^2), col=col[i], lwd=3)

plot(0, type="n", ylim=c(0,1), xlim=c(-xlim,xlim),
    xlab="Predictor", ylab="Probability", main="cloglog, quadratic")
for (i in 1:m)
    lines(x, binomial("cloglog")$linkinv(cf[1,i]+cf[2,i]*x+cf[3,i]*x^2), 
        col=col[i], lwd=3)

plot(0, type="n", ylim=c(0,1), xlim=c(-xlim,xlim),
    xlab="Predictor", ylab="Probability", main="logit, cubic")
for (i in 1:m)
    lines(x, plogis(cf[1,i]+cf[2,i]*x+cf[3,i]*x^2+cf[4,i]*x^3), 
        col=col[i], lwd=3)

plot(0, type="n", ylim=c(0,1), xlim=c(-xlim,xlim),
    xlab="Predictor", ylab="Probability", main="cloglog, cubic")
for (i in 1:m)
    lines(x, binomial("cloglog")$linkinv(cf[1,i]+cf[2,i]*x+cf[3,i]*x^2+cf[4,i]*x^3), 
        col=col[i], lwd=3)

par(op)
dev.off()

In the following, we illustrate the estimation of the abundance and probability of detection when the logit link has cubic terms included in the model. The probability of detection does not reach 1 for any covariate value. This illustrates that the critical condition is that the model satisfy the RSPF condition and not that probability of detection is 1 for some combination of the covariates.

We simulate data under BP N-mixture with logit link for detection with cubic terms.

n <- 1000
k <- 3 # order of polynomial
link <- "logit"
beta <- c(1,1)
theta <- cf[1:(1+k),3]

x1 <- rnorm(n, 0, 1)
X <- model.matrix(~ x1)
lambda <- exp(X %*% beta)

z1 <- sort(runif(n, -1, 0.75))
z2 <- z1^2
z3 <- z1^3
Z <- model.matrix(~ z1 + z2 + z3)[,1:(1+k)]
p <- binomial(link)$linkinv(Z %*% theta)
## This shows that the probability of detection 
## does not reach 1 for any covariate value. 
summary(p)    
N <- rpois(n, lambda)
Y <- rbinom(n, N, p)
table(Y)

Now we estimate the coefficients for BP N-mixture. Things to observe:

  • AIC for the true model is the lowest, as it should be.
  • The best fitting model approximates the true probabilities of detection quite well.
library(detect)
m1 <- svabu(Y ~ x1 | z1, zeroinfl=FALSE, link.det=link)
m2 <- svabu(Y ~ x1 | z1 + z2, zeroinfl=FALSE, link.det=link)
m3 <- svabu(Y ~ x1 | z1 + z2 + z3, zeroinfl=FALSE, link.det=link)
maic <- AIC(m1, m2, m3)
maic$dAIC <- maic$AIC - min(maic$AIC)
## AIC for the true model is the lowest, as it should be. 
maic

## How do the fitted probabilities of detection compare with the true ones?
plot(z1, p, type="l", ylim=c(0,1))
lines(z1, m1$detection.probabilities, col=2)
lines(z1, m2$detection.probabilities, col=3)
lines(z1, m3$detection.probabilities, col=4)
legend("bottomright", lty=1, col=1:4,
    legend=paste(c("truth", "linear", "quadratic", "cubic"),
        "AIC =", c("NA",round(maic$AIC, 2))))

Based on this single run, we can also do simulations to plot response curves based on model selection.

S <- 100
theta_hat <- matrix(0, 4, S)
p_hat <- matrix(0, n, S)
for (i in 1:S) {
    cat(i, "\n");flush.console()
    Y <- rbinom(n, N, p)
    m <- list(m1 = svabu(Y ~ x1 | z1, zeroinfl=FALSE, link.det=link),
        m2 = svabu(Y ~ x1 | z1 + z2, zeroinfl=FALSE, link.det=link),
        m3 = svabu(Y ~ x1 | z1 + z2 + z3, zeroinfl=FALSE, link.det=link))
    m_best <- m[[which.min(sapply(m, AIC))]]
    theta_hat[1:length(coef(m_best, "det")), i] <- coef(m_best, "det")
    p_hat[,i] <- m_best$detection.probabilities
}
z1_Nmix <- z1
p_Nmix <- p

Now let us see if we can estimate the absolute probability of selection if the RSPF condition is satisfied. We generate the use-available data under a cubic model (same as the probability of detection model above). User can use any other model that satisfies the RSPF condition.

We simulate data under RSPF model. The simulateUsedAvail function generates the used points under probability sampling with replacement. The data frame dat also includes a sample of available points that is obtained using simple random sampling with replacement. (See Lele and Keim 2006: Animal as a sampler description in the discussion).

library(ResourceSelection)

n <- 3000
k <- 3 # order of polynomial
link <- "logit"
beta <- c(1,1)
theta <- cf[1:(1+k),3]

x1 <- rnorm(n, 0, 1)
X <- model.matrix(~ x1)
lambda <- exp(X %*% beta)

z1 <- sort(runif(n, -1, 0.75))
z2 <- z1^2
z3 <- z1^3

Z <- model.matrix(~ z1 + z2 + z3)[,1:(1+k)]
p <- binomial(link)$linkinv(Z %*% theta)
summary(p)     # Notice that the probability of selection never reaches 1. 

## This is the distribution of the covariates: The available distribution. 
z <- data.frame(z1=z1, z2=z2, z3=z3)[,1:k]   

dat <- simulateUsedAvail(z, theta, n, m=10, link=link)  

Estimate coefficients for RSPF. The best fitting model approximates the true probabilities quite well. As usual, larger the sample size, better will be the fit.

r1 <- rspf(status ~ z1, dat, m=0, B=0, link=link)
r2 <- rspf(status ~ z1 + z2, dat, m=0, B=0, link=link)
r3 <- rspf(status ~ z1 + z2 + z3, dat, m=0, B=0, link=link)
raic <- AIC(r1, r2, r3)
raic$dAIC <- raic$AIC - min(raic$AIC)
raic   # The best fitting model is the cubic model as it should be. 

## How do the estimated probability of selection compare 
## with the true probabilities of selection? 
plot(z1, p, type="l", ylim=c(0,1))
points(dat$z1, fitted(r1), col=2, pch=".")
points(dat$z1, fitted(r2), col=3, pch=".")
points(dat$z1, fitted(r3), col=4, pch=".")
legend("bottomright", lty=1, col=1:4,
    legend=paste(c("truth", "linear", "quadratic", "cubic"),
        "AIC =", c("NA",round(raic$AIC, 2))))

Now we write simulation after the single run, but keeping the available distribution identical:

theta_hat2 <- matrix(0, 4, S)
p_hat2 <- matrix(0, sum(dat$status==0), S)
for (i in 1:S) {
    cat(i, "\n");flush.console()
    dat1 <- simulateUsedAvail(z, theta, n, m=10, link=link)
    dat <- rbind(dat1[dat1$status==1,], dat[dat$status==0,])
    m <- list(m1 = rspf(status ~ z1, dat, m=0, B=0, link=link),
        m2 = rspf(status ~ z1 + z2, dat, m=0, B=0, link=link),
        m3 = rspf(status ~ z1 + z2 + z3, dat, m=0, B=0, link=link))
    m_best <- m[[which.min(sapply(m, AIC))]]
    theta_hat2[1:length(coef(m_best)), i] <- coef(m_best)
    p_hat2[,i] <- fitted(m_best, "avail")
}
z1_rspf <- z1
p_rspf <- p

Now let's plot both the N-mixture and RSPF simulation results with the true response curve that was used

## model selection frequencies for N-mixture
table(colSums(abs(theta_hat) > 0))
## model selection frequencies for RSPF
table(colSums(abs(theta_hat2) > 0))

pdf("fig-poly-fit.pdf", width=10, height=5)
op <- par(mfrow=c(1,2), las=1)

ii <- which(!duplicated(dat$z1[dat$status==0]))
ii <- ii[order(dat$z1[dat$status==0][ii])]
plot(z1_rspf, p_rspf, type="n", ylim=c(0,1),
    xlab="Predictor", ylab="Probability of selection",
    main="RSPF")
matlines(dat$z1[dat$status==0][ii], p_hat2[ii,], type="l", lty=1, col="grey")
lines(z1_rspf, p_rspf, lwd=3)

plot(z1_Nmix, p_Nmix, type="n", ylim=c(0,1),
    xlab="Predictor", ylab="Probability of detection",
    main="Single-visit N-mixture")
matlines(z1_Nmix, p_hat, type="l", lty=1, col="grey")
lines(z1_Nmix, p_Nmix, lwd=3)

par(op)
dev.off()

Quasi-Bayesian single-visit occupancy model

Generate data

library(dclone)
library(rjags)

OccData.fun = function(psi, p1){
    N <- length(p1)
    Y = rbinom(N, 1, psi)
    W = rbinom(N, 1, Y * p1)
    list(Y=Y, W=W)
}

Penalized estimation

This is basically a Bayesian approach but prior is based on the currently observed data. The prior on beta is centered at the naive estimator. This forces the estimator to not veer off too far from the naive estimator. This penalty function is somewhat easier and simpler than the convoluted penalty function used in Lele et al. (2012). We write it using the conditional distribution form. These estimators are stable but biased for small samples (as are all other estimators used here).

beta.naive are from the naive estimator. We want to keep it close to this estimator unless the information in the data forces us to move.

## Data list includes: W, beta.naive, penalty,X1,Z1,nS
OccPenal.est = function(){
    for (i in 1:nS){
        W[i] ~ dbin(p1[i] * Y[i], 1)
	logit(p1[i]) <- inprod(Z1[i,], theta)
	Y[i] ~ dbin(psi[i], 1)
	logit(psi[i]) <- inprod(X1[i,], beta)
    }
    for (i in 1:c1){
	beta[i] ~ dnorm(beta.naive[i], penalty)     
    }	
    for (i in 1:c2){
	theta[i] ~ dnorm(0, 0.01)
    }	
}

Data generation and analysis

There are two ways to increase the information in the data under the regression setting:

  • by increasing the sample size,
  • and by increasing the variability on the covariate space.

The other is much more cost effective and is under the control of the researcher.

N <- 1000
beta <- c(1,2)      # Occupancy parameters
theta <- c(-1,0.6)  # Detection parameters

X <- runif(N, -2,2)  # Occupancy covariate
Z <- runif(N, -2,2)  # Detection covariate

X1 <- model.matrix(~X)
Z1 <- model.matrix(~Z)

psi <- plogis(X1 %*% beta)
p1 <- plogis(Z1 %*% theta)

mean(psi)   # Check the average occupancy probability
mean(p1)    # Check the average detection probability

occ.data <- OccData.fun(psi, p1)

occ.data <- list (Y = occ.data$Y, W = occ.data$W, X1 = X1, Z1 = Z1)

Using only the regularized likelihood function

Penalty is the precision for the beta parameters. This should be at least as small as the 1/SE^2 for beta.naive but it could be substantially smaller than that (but not too small).

S <- 10  # Number of simulations
out1 <- matrix(0, S, 5)   # Parameter estimates
out2 <- matrix(0, S, 3)

nS <- 400    # Sample size from the population

penalty <- 0.5    

for (s in 1:S) {

    sample.index <- sample(seq(1:N), nS, replace = FALSE)
    beta.naive <- glm(occ.data$W[sample.index] ~ X1[sample.index,2], family="binomial")

    dat <- list(W = occ.data$W[sample.index], 
	X1 = occ.data$X1[sample.index,], 
	Z1 = occ.data$Z1[sample.index,], 
	c1 = ncol(occ.data$X1), 
	c2 = ncol(occ.data$Z1),
	nS = nS, 
	beta.naive = beta.naive$coef, 
	penalty = penalty)
    inits <- list(Y = rep(1, nS))
	
    OccPenalty.fit <- jags.fit(dat, c("beta","theta"), OccPenal.est, 
	inits=inits, n.adapt=10000, n.update=10000, n.iter=1000)
    diag <- gelman.diag(OccPenalty.fit)  # Check for convergence.  
	
    # Median value and the 95% credible interval.
    parms.est <- summary(OccPenalty.fit)$quantiles[,c(1,3,5)]   

    OccPenalty.fit <- jags.fit(dat, c("p1","psi"), OccPenal.est, 
	inits=inits, n.adapt=10000, n.update=10000, n.iter=1000)
	
	tmp <- summary(OccPenalty.fit)
	naive.medianOcc <- summary(beta.naive$fitted)[3]
	true.medianOcc <- summary(psi[sample.index])[3]
	Adj.medianOcc <- summary(tmp$quantiles[(nS+1):(2*nS), 3])[3]
	
	# Only spits out the median for simulation study.
	out1[s,] <- c(parms.est[,2],diag$mpsrf)   
	out2[s,] <- c(naive.medianOcc, true.medianOcc, Adj.medianOcc)

}

out1
out2

Bias in generalized N-mixture model

Simulation setup

The simulations were conducted using MPI cluster on WestGrid (http://www.westgrid.ca). The simuls6.R file was run using the portable batch system by running the file mee6.pbs. The end result is an Rdata file that is used below.

Summarizing results

options("CLUSTER_ACTIVE" = FALSE)
library(lattice)
library(MASS)

load("MEE-rev2-simul-4-constant.Rdata")
res1 <- res[!vals$overlap]
res2 <- res[vals$overlap]

vals <- vals[!vals$overlap,]
vals$overlap <- NULL
vals$cinv <- round(1/vals$cvalue, 1)
vals$omega <- round(vals$omega, 1)

## bias in lambda
ff <- function(zz) {
    tr <- zz[c("lam1","lam2","lam3"), "truth"]
    dm <- zz[c("lam1","lam2","lam3"), "DM"]
    sv <- zz[c("lam1","lam2","lam3"), "SV"]
    c(DM=(dm - tr) / tr, SV=(sv - tr) / tr)
}
fun_blam <- function(z, fun=mean) {
    apply(sapply(z, ff), 1, fun)
}

bias1 <- t(sapply(res1, fun_blam, fun=mean))
bias2 <- t(sapply(res2, fun_blam, fun=mean))

ylim <- c(-1, 2)
xlim <- c(0,1)
Cols <- c("DM.lam1","SV.lam1") #colnames(bias1)
om <- sort(unique(vals$omega))

fPal <- colorRampPalette(c("red", "blue"))
Col <- fPal(length(unique(vals$omega)))

op <- par(mfrow=c(2,2), mar=c(4,5,2,2)+0.1, las=1)
for (j in 1:2) { # setup
    for (i in 1:length(Cols)) { # method
        bias <- switch(as.character(j),
            "1" = bias1,
            "2" = bias2)
        main0 <- switch(as.character(j),
            "1" = "no overlap",
            "2" = "overlap")
        CC <- switch(Cols[i],
            "DM.lam1" = "DM (T=3, t=1)",
            "SV.lam1" = "SV (t=1)")
        main <- paste0(CC, " - ", main0)
        plot(vals$cinv, bias[, Cols[i]], type="n",
            main=main, ylim=ylim, xlim=xlim,
            xlab="1/c", ylab="Relative bias")
        abline(h=0, lty=1, col="grey")
        for (k in seq_len(length(om))) {
            ii <- vals$omega == om[k]
            lines(vals[ii, "cinv"], bias[ii, Cols[i]], col=Col[k],
            lwd=2)
        }
    }
}
for (jj in 1:100) {
    lines(c(0, 0.1), rep(0.5 + 0.01*jj, 2),
        col=fPal(100)[jj])
}
text(0.15, 1.75, expression(omega), cex=1.5)
text(rep(0.15, 6), 0.5 + c(0, 0.5, 1),
c("0.0","0.5","1.0"), cex=0.8)
par(op)

B-B-ZIP identifiability

The goal is to establish identifiability for the Binomial-Binomial-Poisson in the following cases:

  • p_i * q_i case when q_i varies from location to location,
  • p_i * q case when q is constant but data is binned according to distance bands.

B-B-ZIP simulation with variable q

We simulate under the single visit distance sampling model where observations are not binned in multiple distance bands but vary in terms of their truncation distance to prove identifiability (constant EDR situation shown).

R <- 100
n <- 2000
K <- 250

setwd(set_wd)
source("svabuRD.R")

set.seed(4321)
link <- "logit"
linkinvfun.det <- binomial(link=make.link(link))$linkinv

x1 <- runif(n,0,1)
x2 <- rnorm(n,0,1)
x3 <- runif(n,-1,1)

X <- model.matrix(~ x1)
ZR <- model.matrix(~ x3)
ZD <- model.matrix(~ x2)

beta <- c(-1,1)
thetaR <- c(0.9, -1.5) # for singing rate
edr <- 0.65

## point count radius
r <- sample(c(0.5,1), n, replace=TRUE)

q <- (edr / r)^2 * (1 - exp(-(r / edr)^2))

D <- exp(drop(X %*% beta))
lambda <- D * pi*r^2

p <- linkinvfun.det(drop(ZR %*% thetaR))

summary(D)
summary(p)
summary(q)

res2 <- list()
for (i in 1:R) {
cat(i, "\n");flush.console()

Y <- rpois(n, lambda * p * q)

m <- svabuRD.fit(Y, X, ZR, NULL, Q=NULL, zeroinfl=FALSE, r=r, N.max=K,
    inits=c(beta, thetaR, log(edr))+rnorm(5, 0, 0.2))

res2[[i]] <- coef(m)

}
save.image("out-sim-2.Rdata")

Results from B-B-ZIP with variable q

load("out-sim-2.Rdata")

True2 <- c(beta, thetaR, "logedr"=log(edr))
cf2 <- t(sapply(res2, unlist))
bias2 <- t(sapply(res2, unlist) - True2)

lamTrue <- mean(exp(X %*% beta))
pTrue <- mean(plogis(drop(X %*% thetaR)))
qTrue <- mean((edr / r)^2 * (1 - exp(-(r / edr)^2)))

lam_hat <- apply(cf2[,1:2], 1, function(z) {
    mean(exp(X %*% z))
})
p_hat <- apply(cf2[,3:4], 1, function(z) {
    mean(plogis(drop(X %*% z)))
})
q_hat <- sapply(cf2[,5], function(z) {
    mean((exp(z) / r)^2 * (1 - exp(-(r / exp(z))^2)))
})

rr <- 0:150 / 100
qq <- sapply(cf2[,5], function(z) 
    (exp(z) / rr)^2 * (1 - exp(-(rr / exp(z))^2)))
qq_true <- (edr / rr)^2 * (1 - exp(-(rr / edr)^2))

est <- cbind(lam=lam_hat, p=p_hat, q=q_hat)
bias <- cbind(lam=(lam_hat-lamTrue)/lamTrue, 
    p=(p_hat-pTrue)/pTrue, q=(q_hat-qTrue)/qTrue)

## figure
pdf("svabuRD.pdf", width=12, height=4)
par(mfrow=c(1,3), mar=c(5, 5, 4, 2) + 0.1)
boxplot(bias2,
    col="grey",
    ylab=expression(hat(theta)-theta),
    names=c(expression(alpha[0]),
      expression(alpha[1]), 
      expression(beta[0]),
      expression(beta[1]),
      expression(log(tau))))
abline(h=0, lty=2, col=1)
matplot(rr*100, qq, lty=1, col="grey", type="l",
    ylab="q", xlab="Distance from observer (m)")
lines(rr*100, qq_true, col=1, lwd=2)
boxplot(bias,
    col="grey",
    ylab="Relative bias",
    names=c(expression(bar(lambda)),
      expression(bar(p)), 
      expression(bar(q))))
abline(h=0, lty=2, col=1)
dev.off()

B-B-ZIP simulation with constant q

We simulate under the single visit distance sampling model where observations are binned in multiple distance bands to prove identifiability (constant EDR situation shown).

set.seed(1234)
library(detect)
library(unmarked)
source("svabuRDm.R")
n <- 200
T <- 3
K <- 50
B <- 100
link <- "logit"
linkinvfun.det <- binomial(link=make.link(link))$linkinv

x1 <- runif(n,0,1)
x2 <- rnorm(n,0,1)
x3 <- runif(n,-1,1)
x4 <- runif(n,-1,1)
x5 <- rbinom(n,1,0.6)
x6 <- rbinom(n,1,0.4)
x7 <- rnorm(n,0,1)

X <- model.matrix(~ x1)
ZR <- model.matrix(~ x3)
ZD <- model.matrix(~ x2)
Q <- model.matrix(~ x7)

beta <- c(0,1)
thetaR <- c(1, -1.5) # for singing rate
#thetaD <- c(-0.5, 1.2) # for EDR

edr <- 0.8 # exp(drop(ZD %*% thetaD))
Dm <- matrix(c(0.5, 1), n, 2, byrow=TRUE)
## truncation distance must be finite
r <- apply(Dm, 1, max, na.rm=TRUE)
Area <- pi*r^2
q <- (edr / r)^2 * (1 - exp(-(Dm / edr)^2))
q <- q - cbind(0, q[,-ncol(Dm), drop=FALSE])

## test case 1
D <- exp(drop(X %*% beta))
lambda <- D * Area
p <- linkinvfun.det(drop(ZR %*% thetaR))
delta <- cbind(p * q, 1-rowSums(p * q))
summary(delta)
summary(lambda)
summary(rowSums(q))
summary(p)

res_mn0 <- list()
res_mnp <- list()
res_sv <- list()
res_mn <- list()
for (i in 1:B) {
    cat("variable p, run", i, "of", B, ":\t");flush.console()

    N <- rpois(n, lambda)
    Y10 <- t(sapply(1:n, function(i)
        rmultinom(1, N[i], delta[i,])))
    Y <- Y10[,-ncol(Y10)]

    zi <- FALSE

    cat("mn_p,  ");flush.console()
    m0 <- try(svabuRDm.fit(Y, X, NULL, NULL, Q=NULL, zeroinfl=zi, D=Dm, N.max=K))
    res_mn0[[i]] <- try(cbind(est=unlist(coef(m0)), true=c(beta, mean(qlogis(p)), log(edr))))

    cat("mn_pi,  ");flush.console()
    m1 <- try(svabuRDm.fit(Y, X, ZR, NULL, Q=NULL, zeroinfl=zi, D=Dm, N.max=K))
    res_mnp[[i]] <- try(cbind(est=unlist(coef(m1)), true=c(beta, thetaR, log(edr))))

    cat("mn,  ");flush.console()
    umf <- unmarkedFrameDS(y=Y,
        siteCovs=data.frame(x1=x1,x2=x2,x3=x3),
        dist.breaks=c(0,50,100), unitsIn="m", survey="point")
    m <- distsamp(~1 ~x1, umf, output="abund")
    sig <- exp(coef(m, type="det"))
    ea <- 2*pi * integrate(grhn, 0, 100, sigma=sig)$value # effective area
    logedr <- log(sqrt(ea / pi)/100) # effective radius
    res_mn[[i]] <- cbind(est=c(coef(m)[1:2], logedr), true=c(beta, log(edr)))

    cat("b_pi.\n")
    yy <- rowSums(Y)
    m2 <- svabu.fit(yy, X, ZR, Q=NULL, zeroinfl=zi, N.max=K)
    res_sv[[i]] <- cbind(est=unlist(coef(m2)), true=c(beta, thetaR))

}
save.image("out-multinom_final.Rdata")

Results from B-B-ZIP with constant q

load("out-multinom_final.Rdata")
f <- function(res) {
    true <- res[[1]][,"true"]
    true[!is.finite(true)] <- 0
    est <- t(sapply(res, function(z)
        if (inherits(z, "try-error")) rep(NA, length(true)) else z[,"est"]))
    bias <- t(t(est) - true)
    list(true=true, est=est, bias=bias)
}
ylim <- c(-2,0.5)
toPlot <- cbind(f(res_mn0)$bias[,1],
    f(res_sv)$bias[,1]-log(pi))
colnames(toPlot) <- c("Multinomial", "SV")
boxplot(toPlot, col="grey", ylab="Bias")
abline(h=0)
abline(h=log(1/2), lty=2)
text(0.6, log(1/2)+0.08, "log(q)", cex=0.8)

Associated files

Session info

sessionInfo()

Functions referenced in the document

The svabuRDm.fit function:

source("~/repos/detect/extras/revisitingSV/svabuRDm.R")
dput(svabuRDm.fit)

The svabuRD.fit function:

source("~/repos/detect/extras/revisitingSV/svabuRD.R")
dput(svabuRD.fit)

Content of files referenced in the document

R code for simulations from simulas6.R file:

cat(readLines("~/repos/detect/extras/revisitingSV/simuls6.R"), sep="\n")

MPI batch script mee6.pbs used in to run the simulations:

cat(readLines("~/repos/detect/extras/revisitingSV/mee6.pbs"), sep="\n")