This repository has been archived by the owner on Jun 13, 2023. It is now read-only.

What to do with a species with 0% or 100% encounters in any year

Jim Thorson edited this page Feb 15, 2018 · 3 revisions

I'm getting an error during Data_Fn about "Some years and/or categories have either all or no encounters": what should I do?

As the message states, this error indicates that you have either 0% or 100% encounters in one or more years for at least one species. This interferes with the default model settings, so I have added an error with an informative message. Specifically, the default settings for SpatialDeltaGLMM and VAST include an intercept for each year in the encounter-probability component (when using a delta-model for continuous-valued data) or the zero-inflation probability (for count-valued data). With 0% or 100% encounters in a year, that intercept goes to +/-Inf, as explained below. In either case, the resulting Hessian matrix is at best positive semi-definite, so standard-error computations using the delta-method will fail. Conceptually, it makes sense that a model with a fixed-effect intercept for each year will fail for any year with 0% encounters: in this case, the best estimate for that year individually is that the species has zero density everywhere!

The easiest solution is to exclude any species for which any year has 0% or 100% encounters. If you don't want to do this, there are several alternative solutions, all of which are experimental.
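To see which species would be affected before calling Data_Fn, the encounter rate can be tabulated by year and category. The following is a minimal base-R sketch, not part of VAST or SpatialDeltaGLMM; the data frame `catch` and its column names (`year`, `category`, `biomass`) are hypothetical stand-ins for your own data.

```r
# Hypothetical catch data: one row per sample (column names are assumptions)
catch <- data.frame(
  year     = c(2001, 2001, 2001, 2002, 2002, 2002,
               2001, 2001, 2002, 2002,
               2001, 2001, 2002, 2002),
  category = c("A", "A", "A", "A", "A", "A",
               "B", "B", "B", "B",
               "C", "C", "C", "C"),
  biomass  = c(0, 1.2, 0, 0, 0, 0,
               3.1, 0.4, 0, 2.2,
               0, 1.5, 2.0, 0)
)

# Proportion of samples with a positive catch in each year-category combination
enc <- aggregate(biomass ~ year + category, data = catch,
                 FUN = function(x) mean(x > 0))
names(enc)[3] <- "prop_encounter"

# Flag combinations with no encounters (0%) or all encounters (100%)
enc$problem <- enc$prop_encounter == 0 | enc$prop_encounter == 1

# Categories with at least one problem year, and the data left after excluding them
drop_categories <- unique(enc$category[enc$problem])
keep <- catch[!(catch$category %in% drop_categories), ]
```

Here category A has 0% encounters in 2002 and category B has 100% encounters in 2001, so only category C would be retained under the "exclude the species" approach.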

Delta-model for continuous valued data

For the delta-model, this intercept will go to +Inf or -Inf for species-year combinations with 100% or 0% encounters.
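The behaviour of the intercept can be illustrated with a stripped-down case: a single logit-scale intercept for one year with n samples and no encounters. This is only a sketch of the mechanism using a plain Bernoulli likelihood (an assumption made for illustration), not the likelihood that VAST actually evaluates.

```r
# Negative log-likelihood for a logit-scale intercept `beta`
# when k of n samples are encounters (Bernoulli model)
nll <- function(beta, k, n) -(k * beta - n * log1p(exp(beta)))

# Analytic second derivative of the negative log-likelihood
# (the 1x1 Hessian): n * p * (1 - p), where p = plogis(beta)
hessian <- function(beta, n) {
  p <- plogis(beta)
  n * p * (1 - p)
}

n     <- 50
betas <- c(-2, -5, -10, -20)

# With k = 0 the objective keeps decreasing as beta decreases,
# so there is no finite minimum
nll_zero <- nll(betas, k = 0, n = n)

# The curvature shrinks toward zero along the same path,
# so the Hessian cannot be inverted to give a standard error
curv <- hessian(betas, n = n)
```

The same argument applies with the signs reversed for a year with 100% encounters.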

  1. If using VAST and some species-year combinations have 100% encounter rates, then you can use ObsModel[2]=3, e.g., ObsModel=c(1,3). This indicates that VAST should check for species-year combinations with 100% encounter rates and, for any such combination, fix the intercept for encounter probability at an extremely high value such that predicted encounter rates are essentially 100% for that year. This generally eliminates any leverage for data in that year on the random effect epsilon2_stp, although you may also want to add some temporal structure to this random effect.

  2. If using VAST and specifying that each spatio-temporal term is independent among species and categories (using FieldConfig = c("Omega1"="IID", "Epsilon1"="IID", "Omega2"="IID", "Epsilon2"="IID")), you can identify any year-category combination that has 0% encounters and change all data from that year-category combination to NA. VAST will then turn off intercepts for those year-category combinations, and SpatialDeltaGLMM::PlotIndex_Fn is designed to predict zero total abundance for those year-category combinations when calculating and plotting the abundance indices. This is useful during compositional expansion, when the other solutions are not feasible.

  3. You could make the intercept for encounter-probability/zero-inflation constant over time via the Data_Fn input RhoConfig=c("Beta1"=3,"Beta2"=0,"Epsilon1"=0,"Epsilon2"=0). You can then override the error message via Data_Fn( ..., "CheckForErrors"=FALSE).

  4. You could make the intercept for encounter-probability/zero-inflation a random effect that is independent among years, follows a random walk, or follows a first-order autoregressive process, using RhoConfig=c("Beta1"=1,"Beta2"=0,"Epsilon1"=0,"Epsilon2"=0), RhoConfig=c("Beta1"=2,"Beta2"=0,"Epsilon1"=0,"Epsilon2"=0), or RhoConfig=c("Beta1"=4,"Beta2"=0,"Epsilon1"=0,"Epsilon2"=0), respectively.

  5. You could use an alternative Poisson-link delta-model that ties together the encounter-probability and positive-catch-rate components, using ObsModel[2]=1 (instead of ObsModel[2]=0 as used by default). This may eliminate the issue if the problem is some years with 100% encounter probability and you restrict structure on the 2nd ("average-weight") component using RhoConfig=c("Beta1"=0,"Beta2"=3,"Epsilon1"=0,"Epsilon2"=0) and FieldConfig=c("Omega1"=1, "Epsilon1"=1, "Omega2"=0, "Epsilon2"=0). This will not help if the problem is some years with 0% encounter probability.

  6. In a multispecies model using VAST, you can implement one of these solutions for a single species (instead of for all species, as the options above do) by customizing the Map input. This involves building your model as follows:

# Make data
TmbData = Data_Fn(..., "CheckForErrors"=FALSE)  # where ... is your existing inputs to Data_Fn

# Build the model once to obtain the default `Map`
TmbList = VAST::Build_TMB_Fn( ... )  # where ... is your existing inputs to Build_TMB_Fn

# Extract pre-made `Map`
Map_customized = TmbList[["Map"]]

# Add custom edits for `Map`
Map_customized$beta1_ct <- ### Add structure here

# Rebuild the model using the customized `Map`
TmbList = VAST::Build_TMB_Fn("Map"=Map_customized, ...)  # where ... is your previous inputs to Build_TMB_Fn

This will build a TMB model with customized restrictions on parameters, but it likely requires understanding the structure of the model in detail, as well as how to use the map argument to TMB::MakeADFun.
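For readers unfamiliar with the map convention in TMB::MakeADFun: each entry is a factor with one element per parameter element, where NA fixes that element at its starting value and a repeated level forces elements to share a single estimate. The sketch below builds such a factor in base R. The dimensions (3 categories by 4 years) and the particular restrictions are hypothetical, and it assumes that the parameter is a category-by-year matrix, as the name beta1_ct suggests.

```r
# Hypothetical dimensions: 3 categories (rows) by 4 years (columns)
n_c <- 3
n_t <- 4

# Start with a distinct integer for every element, i.e. every element estimated freely
map_beta1 <- matrix(seq_len(n_c * n_t), nrow = n_c, ncol = n_t)

# Fix the intercept for category 2 in year 3 at its starting value
map_beta1[2, 3] <- NA

# Force the intercept for category 1 to be constant across years
map_beta1[1, ] <- map_beta1[1, 1]

# TMB expects a factor; factor() flattens the matrix in column-major order
beta1_map <- factor(map_beta1)

# The result would then be assigned to the extracted list, e.g.
# Map_customized$beta1_ct <- beta1_map
```

Because an element mapped to NA stays at its starting value, the corresponding entry of the starting parameters would also need to be set to the value you want it held at.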

Please note that none of these solutions are "conventional" (because the conventional delta-GLMM involves an intercept for each year) but they each could overcome the issue of having 0% or 100% encounters in any year.
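As an illustration of method #2, the following base-R sketch sets the response to NA for every sample in a year-category combination with 0% encounters. The data frame `catch` and its column names are hypothetical, and how the modified response is then passed to Data_Fn depends on your existing inputs.

```r
# Hypothetical catch data: one row per sample (column names are assumptions)
catch <- data.frame(
  year     = c(2001, 2001, 2002, 2002, 2001, 2001, 2002, 2002),
  category = c("A", "A", "A", "A", "B", "B", "B", "B"),
  biomass  = c(0, 1.2, 0, 0, 3.1, 0.4, 0, 2.2)
)

# Encounter indicator (1 = positive catch, 0 = no catch) for each sample
encounter <- as.numeric(catch$biomass > 0)

# Encounter rate of the year-category combination that each sample belongs to
prop_encounter <- ave(encounter, catch$year, catch$category, FUN = mean)

# Replace the response with NA wherever the combination has 0% encounters
catch$biomass_masked <- ifelse(prop_encounter == 0, NA, catch$biomass)
```

In this toy data set only category A in 2002 has no encounters, so its two samples become NA and all other values are unchanged.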

Zero-inflated model for whole-number valued data

For the zero-inflation model, the zero-inflation intercept will go to +Inf for any species-year combination with 0% encounters. Experimental solutions include:

  1. If including a species that has a 100% encounter rate in one or a few years (but not all years), you can impose some temporal structure on intercepts, using methods #3-4 above.

  2. You can customize the Map input following instructions in method #6 above.
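The RhoConfig codes used in methods #3-4 can be hard to remember. One way to keep them readable is a small lookup wrapper such as the sketch below; `make_rho_config` is a hypothetical helper written for this page, not a VAST function, and it only reproduces the four Beta1 settings listed above while leaving the other entries at 0.

```r
# Hypothetical helper (not part of VAST): build RhoConfig from a descriptive name
make_rho_config <- function(beta1_structure = c("constant", "iid", "random_walk", "ar1")) {
  beta1_structure <- match.arg(beta1_structure)

  # Codes as given in methods #3-4 above
  codes <- c(constant = 3, iid = 1, random_walk = 2, ar1 = 4)

  c("Beta1"    = unname(codes[beta1_structure]),
    "Beta2"    = 0,
    "Epsilon1" = 0,
    "Epsilon2" = 0)
}

# Example: intercept for the first component follows a random walk among years
RhoConfig <- make_rho_config("random_walk")
```

The resulting named vector can then be supplied as the RhoConfig input to Data_Fn alongside your existing inputs.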