"X" vs formula confusion #63

Closed
AdrianS85 opened this issue Nov 5, 2020 · 2 comments

@AdrianS85

Dear Amy,

First of all - many thanks to you and your team for producing great tools! : ]

As the title says, I'm having trouble understanding the X and formula parameters:

  1. In the divnet function description, you write that the X parameter is "The covariate matrix, with samples as rows and variables as columns". Yet in the vignette, the value actually passed to the parameter is a string (for a phyloseq object, a column name in the sample data). I understand that some subsetting is being done here, but still...

  2. From both the vignette and your answer to issue #53 (Understanding the use of "X" in divnet() function), it seemed to me that the X parameter was being used in the same way as a formula (i.e. you give the example X value: X = season + plot). Yet formula is a separate parameter...

  3. Reading the divnet function code, it seems to me that when formula is used, the X value is ignored. This is because, at the beginning of the function, you check:

  if (!is.null(formula)) {
    if ("phyloseq" %in% class(W)) {
      X <- data.frame(phyloseq::sample_data(W))
      X <- stats::model.matrix(object = formula, data = X)
    }

I think this means that if formula is provided, X is built only from W and formula. Are these two parameters mutually exclusive?
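
To make the pattern in the excerpt concrete, here is a minimal base-R sketch. The helper `resolve_design()` and the `season`/`plot` columns are hypothetical names made up for illustration and are not DivNet code; the sketch only mirrors the idea that a supplied formula replaces whatever was passed as X.

```r
# Hypothetical helper (NOT DivNet code): when a formula is given, the design
# matrix is rebuilt from the sample data and any supplied X is discarded.
resolve_design <- function(sample_data, X = NULL, formula = NULL) {
  if (!is.null(formula)) {
    X <- stats::model.matrix(object = formula, data = sample_data)
  }
  X
}

# Made-up sample data with two categorical covariates
samples <- data.frame(
  season = factor(c("dry", "wet", "dry", "wet")),
  plot   = factor(c("a", "a", "b", "b"))
)

# This X is passed in but overwritten, because a formula is also supplied
ignored_X <- matrix(0, nrow = 4, ncol = 1)
design <- resolve_design(samples, X = ignored_X, formula = ~ season + plot)

# Column names come from the formula terms, not from ignored_X
colnames(design)
```

In this sketch, passing only X (and leaving formula as NULL) would return X unchanged, which is the sense in which the two arguments look like alternatives.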

Perhaps clarification of these concepts would be beneficial for users such as myself - with poor statistical knowledge and mediocre coding skills : ]

Thank you and best wishes,
Adrian

@ailurophilia
Collaborator

Hi Adrian,

Thanks for your question! I believe the answer is yes: you can specify your model either as a model matrix (the X argument) or with a formula (a newer feature in DivNet).

More generally, in many cases, a formula is a convenient representation of the model matrix used in the linear (or generalized linear) model you are fitting. For more details, you might look into the documentation for lm() and model.matrix() in base R.
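
As a rough illustration of that relationship, the base-R sketch below builds a model matrix from a formula and checks that fitting a linear model from the formula and from the matrix gives the same coefficients. The data and the column names (`season`, `plot`, `y`) are made up for this example and have nothing to do with DivNet's own data.

```r
# Made-up data: four samples, two categorical covariates, one response
samples <- data.frame(
  season = factor(c("dry", "wet", "dry", "wet")),
  plot   = factor(c("a", "a", "b", "b")),
  y      = c(1.0, 2.5, 1.5, 3.5)
)

# The formula is shorthand for this matrix: samples as rows, one column per
# model term (an intercept plus dummy-coded factor levels)
X <- stats::model.matrix(~ season + plot, data = samples)
print(X)

# Fit once from the formula and once directly from the matrix
fit_formula <- stats::lm(y ~ season + plot, data = samples)
fit_matrix  <- stats::lm.fit(x = X, y = samples$y)

# Both routes should give the same coefficient estimates
same_fit <- isTRUE(all.equal(unname(coef(fit_formula)),
                             unname(fit_matrix$coefficients)))
same_fit
```

The matrix here has the shape the X description asks for (samples as rows, variables as columns), which is why a formula plus sample data can stand in for it.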

I hope this helps!

Best,
David

@paulinetrinh
Collaborator

We've added documentation to address this issue! Please refer to the updated Getting Started documentation.

This issue was closed.