Binding multiple simmat objects together #164

Closed
psolymos opened this issue Apr 16, 2016 · 8 comments

@psolymos
Contributor

psolymos commented Apr 16, 2016

@jarioksa and I have discussed the possibility of adding a function for binding multiple simmat objects. This is useful because:

  • vegan has no option for stratification in null models,
  • and parallel computing often returns a list of simmat objects that need to be coerced/bound.

To this end I created the bind-simmat branch and added the smbind function and its documentation. The name refers to simmatbind. I did not create a generic because the function can take multiple object classes as input, e.g. smbind(x, y) or smbind(list(x, y)), which I think are the two common use cases (similarly to AIC()).
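
To illustrate how one non-generic function can accept both call forms, here is a minimal base R sketch. The helper name collect_objects is hypothetical and this is not the code on the branch; it only assumes that a simmat object is an array (so is.list() is FALSE for it), while the second form passes a plain list.

```r
# Hypothetical helper, not vegan code: normalise both call forms
# f(x, y) and f(list(x, y)) into one flat list of objects.
collect_objects <- function(object, ...) {
  if (is.list(object)) {
    # A plain list of objects was supplied; append any further arguments.
    c(object, list(...))
  } else {
    # A single array-like object was supplied first.
    c(list(object), list(...))
  }
}

x <- array(0, dim = c(2, 2, 3))
y <- array(1, dim = c(2, 2, 3))
length(collect_objects(x, y))        # two objects
length(collect_objects(list(x, y)))  # also two objects
```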

@psolymos
Contributor Author

psolymos commented Apr 18, 2016

Start, end, thinning (~ts attributes), and number of simulated matrices need to be consistent for sequential algorithms. Here is the decision tree along which the current smbind implementation works:

  1. MARGIN != 3: stratification is intended, so start, end and thin must all be equal (dimensions are checked too).
  2. MARGIN == 3:
    1. start, end and thin are all equal: adequate for null model analyses, but the ts attributes are set to NA;
    2. thin is equal, and start and end are consistent with consecutive sampling: the ts attributes are updated accordingly.

In all other cases an error is produced, unless strict = FALSE.
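
The decision tree above can be sketched in base R roughly as follows. This is an illustration under stated assumptions, not the smbind source: bound_ts is a hypothetical helper, each input is reduced to a list with start, end and thin, "consistent" is assumed to mean that each object starts one thinning step after the previous one ends, and strict = FALSE is assumed to fall back to NA attributes.

```r
# Hypothetical helper, not vegan's smbind code: each element of `ts` is
# list(start = , end = , thin = ) describing one simmat object.
bound_ts <- function(ts, MARGIN, strict = TRUE) {
  start <- vapply(ts, function(z) z$start, numeric(1))
  end   <- vapply(ts, function(z) z$end,   numeric(1))
  thin  <- vapply(ts, function(z) z$thin,  numeric(1))
  same <- function(v) all(v == v[1])
  all_equal <- same(start) && same(end) && same(thin)
  na_ts <- list(start = NA, end = NA, thin = NA)

  if (MARGIN != 3) {
    # Case 1: stratification, every object must share the same ts attributes.
    if (all_equal)
      return(list(start = start[1], end = end[1], thin = thin[1]))
  } else {
    # Case 2.1: runs with identical settings, ts attributes become NA.
    if (all_equal)
      return(na_ts)
    # Case 2.2: each object continues where the previous one stopped
    # (assumed rule: next start equals previous end plus thin).
    n <- length(ts)
    if (same(thin) && all(start[-1] == end[-n] + thin[1]))
      return(list(start = start[1], end = end[n], thin = thin[1]))
  }
  if (strict)
    stop("start, end and thin are not consistent")
  na_ts  # assumption: with strict = FALSE the ts attributes are dropped
}

a <- list(start = 1, end = 10, thin = 1)
b <- list(start = 11, end = 20, thin = 1)
str(bound_ts(list(a, b), MARGIN = 3))  # one chain spanning both objects
```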

@jarioksa
Contributor

For most use cases we ignore the chains in sequential methods and treat the combined result only as a 3D array. However, for some diagnostic tools we would like to have the information on chains. That is, oecosimu can be ignorant of chains, but as.mcmc and the coda tools should have the information. It seems that coda is rather picky about multiple chains; ?mcmc.list says:

The list must be balanced: each chain in the list must have the same iterations and the same variables.

That means that start, end and thin (and hence length) must be equal, but are each repeated for each chain. Such data are generated when a sequential model is called chains times in parallel. Shouldn't such cases be handled? The only function that really must be aware of this is as.mcmc (and perhaps as.ts). This could work, e.g., so that smbind adds a chains argument when it combines sequential models with equal parallel start, end and thin, and then as.mcmc splits the result into an mcmc.list when needed. I don't think we need this even for permustats.oecosimu, or at least we have no tools for handling permustats.oecosimu results as sequential results from different chains. Here is an example of a hand-crafted mcmc.list object that resulted from parallel processing of the sipoo data using "swap" without burn-in or thinning.
[Figure: mcmc-chains]

@psolymos
Contributor Author

We can have a chains attribute. This would allow as.mcmc and as.mcmc.list to recognize the proper structure, and start, end, thin can be set (equality of nsim should follow, but it is checked nevertheless in smbind).

The print.simmat would then add the chain info when attr(x, "chains") is not NULL and >1.
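
As a rough illustration of how a chains attribute could drive the split into balanced per-chain pieces, here is a base R sketch. It is not the vegan or coda implementation: split_chains is a hypothetical helper, and it assumes the simulations of each chain are stored contiguously along the third dimension.

```r
# Hypothetical helper: `x` is a rows x cols x nsim array carrying a "chains"
# attribute, with the simulations of each chain stored contiguously.
split_chains <- function(x) {
  chains <- attr(x, "chains")
  # No chain information, or a single chain: nothing to split.
  if (is.null(chains) || chains <= 1)
    return(list(x))
  nsim <- dim(x)[3]
  stopifnot(nsim %% chains == 0)  # balanced chains only
  per_chain <- nsim / chains
  idx <- split(seq_len(nsim), rep(seq_len(chains), each = per_chain))
  # One 3D array per chain, all with the same number of simulations.
  lapply(unname(idx), function(i) x[, , i, drop = FALSE])
}

x <- array(seq_len(2 * 3 * 6), dim = c(2, 3, 6))
attr(x, "chains") <- 2
out <- split_chains(x)
lapply(out, dim)  # dimensions of each chain's array
```

Each element has the same dimensions, which is the kind of balance that ?mcmc.list asks for.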

@jarioksa
Contributor

The chains argument works fine. We need to take care of as.mcmc.oecosimu (and perhaps of as.ts.oecosimu) after this is merged to master.

@psolymos
Contributor Author

psolymos commented Apr 20, 2016

Yep. I was wondering: what is the use of the seed attribute when multiple objects are combined? Maybe it should be set to NULL (as is now the default in smbind) when no further use is intended. Alternatively, it could be a list of seeds for the sake of retaining all info. I think NULL is appropriate until we find a reason for retaining those; it can also be a signature of bound objects.

@jarioksa
Contributor

NULL or NA would be appropriate. At the moment I have a slight preference for NA (but it can change...).

@psolymos
Contributor Author

I had a look at as.ts and as.mcmc. When comm was supplied as a simmat object to oecosimu, burnin was 0; however, it should come from the start argument of the simmat object.

I also added an as.mcmc.list method to be able to handle multiple chains for coda tools. The as.ts and as.mcmc methods now return an error when the object is based on multiple chains.

@psolymos
Contributor Author

PR #175 closes this issue.
