Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added basic glance and tidy for lavaan models. #233

Merged
merged 6 commits into from Jun 14, 2018

Conversation

puterleat
Copy link
Contributor

@puterleat puterleat commented Aug 1, 2017

nb. the op column is left in the tidy data frame for ease of sorting/filtering.
This addresses issue #27

@puterleat puterleat mentioned this pull request Aug 1, 2017
@alexpghayes
Copy link
Collaborator

I'd love to get this merged in soon if you're still interested. If you are, can you:

  • Add tests for the methods you've implemented along the lines of those in test-lm.R. Ideal we're looking for fairly high coverage, with the tidiers tested for a decent variety of lavaan objects
  • Make all of your methods return tibbles
  • Add yourself as a contributor in DESCRIPTION

@benwhalley
Copy link

Cool. I've added some tests based on the test-lm.R file and amended the DESCRIPTION.
I have to admit, I don't fully understand the failures in Travis, which seem to be linked to rstanarm not installing. Do let me know if anything else is needed.

@alexpghayes
Copy link
Collaborator

Sorry for taking so long to get back to you. Thanks for adding the tests!

I've made a couple changes to align with current tidyverse style. These aren't fully documented at the moment, but we've started here. Couple followup requests:

  • The column names in tidy and glance output should be consistent with the rest of broom. Can you provide descriptions of the following so I can find the best match?
    • aic bic cfi chisq logl npar rmsea
    • std.lv std.all std.nox
  • The documentation should include examples of how use tidy.lavaan and glance.lavaan. It should also fully describe what each column in the output represents.
  • For glance I think it would be better to remove the fit.measures argument and always return the largest reasonable set of measures of fit. Users can then later remove the ones they aren't interested in with select.

The Travis failures are currently on our end, we're currently only asking PRs to pass the AppVeyor build.

@benwhalley
Copy link

benwhalley commented Jun 14, 2018

Just to respond inline below...

The column names in tidy and glance output should be consistent with the rest of broom. Can you provide descriptions of the following so I can find the best match? aic bic cfi chisq logl npar rmsea

logl is log likelihood
chi2 is the chi squared — a measure of discrepancy between data and model fit
aic, bic are measures of fit based on information theory
npar is number of params in the model
rmsea is root mean squared error of approximation — another fit measure

std.lv std.all std.nox: these are all different types of standardised coefficient. That is, the params are standardised with respect to the latent variables only (std.lv) or all, or just excluding exogenous variables (nox)

On reflection I wonder if it would be better to offer this dataframe unstandaqdized by default, and in long form, with an additional term variable if the user requests standardised coefs?

The documentation should include examples of how use tidy.lavaan and glance.lavaan. It should also fully describe what each column in the output represents.

I can add this

For glance I think it would be better to remove the fit.measures argument and always return the largest reasonable set of measures of fit. Users can then later remove the ones they aren't interested in with select.

The most common case here, I think, is to want to compare models on various common fit indices, and perhaps also check the number of params in the model, and the number of observations. The fit measures included here are based on recommendations/expectations when publishing a model.

On reflection, I think it might be best if I added the model N, and the estimator to this and always returned in long form. There's actually no point replicating lavaan::fitmeasures too closely... the trouble there is that one always has to subset the fitmeasures returned by default, and look elsewhere for the N and estimator. I'll the another look

@alexpghayes
Copy link
Collaborator

Thanks! I'll probably change some of the glance column names up before the release for consistency with the rest of broom, but this looks great! I appreciate your contribution!

@alexpghayes alexpghayes merged commit 445ee1b into tidymodels:master Jun 14, 2018
@github-actions
Copy link

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 12, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants