Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specification of random effects for nested+longitudinal study design #32

Open
ChrKoenig opened this issue Dec 3, 2019 · 4 comments
Open

Comments

@ChrKoenig
Copy link

Hi!

I'd appreciate your advice regarding the specification of random effects in Hmsc. My dataset consists of standardized, annually repeated surveys across an evenly spaced grid of sampling plots. The exact location of individuals within each plot is recorded, allowing me to derive species occurrences at finer grain sizes by subdividing the the entire plot into a number of equally-sized sub-plots. I am mostly interested in the change in species association patterns at different grain sizes, but I'm not quite sure whether my specification of random effects is correct.
I need random effects for three reasons: (1) to make Hmsc estimate pairwise residual correlations, (2) to account for the fact that sub-plots are nested within sampling plots, and (3) to account for the repeated sampling in different years. From my understanding, the specification of study design and random effects at the coarsest level, i.e. for entire sampling plots, should be as follows:

sampling_design

sampling_plot | year
--------------|--------
1             | 2010
1             | 2011
2             | 2010
2             | 2011
...           | ...

random effects

rnd_eff = list("sampling_plot" = HmscRandomLevel(units = study_design$sampling_plot),   # obtain residual correlations
               "year" = HmscRandomLevel(units = study_design$year))   # account for temporal autocorrelation

However, when I want to look at the sub-plot level, I am not quite sure if the following is correct:

sampling_design

sampling_plot |  sub_plot  | year
--------------|------------|--------
1             |     1      | 2010
1             |     2      | 2010
1             |     3      | 2010
1             |     4      | 2010
1             |     1      | 2011
1             |     2      | 2011
1             |     3      | 2011
1             |     4      | 2011
2             |     5      | 2010
...           |     ...    | ...

random effects

rnd_eff = list("sampling_plot" = HmscRandomLevel(units = study_design$sampling_plot),   # account for nested sampling,
               "sub_plot" = HmscRandomLevel(units = study_design$sub_plot),   # obtain residual correlations
               "year" = HmscRandomLevel(units = study_design$year))   # account for temporal autocorrelation

Any thoughts on that?

@oysteiop
Copy link
Collaborator

oysteiop commented Dec 3, 2019

Hi, I have made a suggested setup below. Note use of unique() to avoid repeating the same plot several times:

rL1 = HmscRandomLevel(units = unique(study_design$sampling_plot))

For the subplots, you should first create unique sub-plot IDs, for example:

study_design$unique_sub_plot = paste(study_design$sampling_plot, study_design$sub_plot, sep="_")

rL2 = HmscRandomLevel(units = unique(study_design$unique_sub_plot))

rL3 = HmscRandomLevel(units = unique(study_design$year))

This implementation allows you to ask e.g. if some species tend to occur in the same years. Alternatively, to make the analysis temporally explicit, see the argument sData in HmscRandomLevel(). (Temporal data can be modelled the same way as spatial data, but with only one dimension.)

Finally, the complete list of random effects:
rnd_eff = list("sampling_plot" = rL1, # account for nested sampling,
"sub_plot" = rL2, # obtain residual correlations
"year" = rL3) # account for temporal autocorrelation

@ChrKoenig
Copy link
Author

Thanks, @oysteiop! In fact, I do have unique sub-plot IDs already, so it was basically just a matter of wrapping a unique() around the sampling-plot IDs.
Just to be sure: To contrast patterns in residual correlations at different grain sizes (e.g. sampling-plots, subplots-large, subplots-medium, subplots-fine), I would compare the species association matrices with rL2, except for the coarsest resolution (sampling-plots), where I would use rL1?

@oysteiop
Copy link
Collaborator

oysteiop commented Dec 5, 2019

Yes, this sounds reasonable.

@gtikhonov
Copy link
Member

Hi @ChrKoenig ,
Your design sounds quite elaborate and I believe that there is more than a just single right option to incorporate it to the Hmsc. So it is more about what research questions you have in mind.
For example, in the design that you've discussed above, you do not really estimate "the most residual" associations, as your subplot effect is assumed to be constant over years. Well, you have an extra expert of the year, but it is assumed to affect all samples in a given year equally, so it captures the synchronous variation across years. In my opinion, to get the residual associations, you should include the effect that corresponds to the pair of (subplot x year), for instance in a way as @oysteiop demonstrated above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants