[REVIEW]: insight: Easy Access to Model Information for Various Model Objects #1412
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@alexpghayes, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @karthik know.
Review checklist for @alexpghayes
Conflict of interest
Code of Conduct
If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository, which means that with GitHub's default behaviour you will receive notifications (emails) for all reviews.
To fix this do the following two things:
For a list of things I can do to help you, just type:
This review is of the GitHub
My comments are largely about this interface. I think about this primarily in terms of complexity. In some sense
My primary concern is that
With such a broad interface, I had a hard time figuring out the entry point. Some questions I asked myself:
I strongly recommend shrinking this interface. In particular
providing nearly identical functionality. Also, I struggled to understand the difference between:
This meant I had to read the documentation for each of these methods before figuring out which one did what I wanted. These distinctions should be explained clearly on the front page of the pkgdown website. The diagram in the JOSS paper helps somewhat but still needs an accompanying narrative.
For additional discussion of interface design and hiding complexity, I highly recommend A Philosophy of Software Design by Ousterhout, which I've been enjoying recently.
While the documentation is very clear about function returns, I had difficulty understanding why behavior differs across model objects.
On the whole, I get the impression that model components that I conceive of as conceptually similar often end up in different parts of a return object. If I need to understand how
Examples: The documentation shows how a user can extract information from a model, but it doesn't show how the user can use this information to solve larger problems. Some applications showing how the various functions operate together (and how they are distinct) would be greatly beneficial.
My initial impression is that using
@alexpghayes Thanks for the detailed and comprehensive review! I think your comments are very helpful for making the package more accessible to users. We'll try to address your issues point by point and respond here once we think we have a revised version that is ready for re-review.
On behalf of the co-authors, let me thank you again for your thorough comments on our submission. We substantially revised our paper, the package documentation and the accompanying website, and revised package code and functions as suggested. Let me give you a more detailed response to your suggestions and our revisions.
Thank you for this comment. We have revised the paper accordingly, making it clearer for users where they can start. We applied these revisions both to the README on the package’s front page (https://github.com/easystats/insight) and to the online documentation (https://easystats.github.io/insight/articles/insight.html).
We have added more details to the paper, the README and the online documentation to make this essential distinction clearer to users. Furthermore, together with the figure showing how package functions are related to model components (Figure 1 in the paper), it should now be easier to grasp the differences between the get- and the find-functions.
See our previous point. After revising the paper and the online documentation, the need for both get- and find-functions should now be clearer. The data returned by one of the get-functions does not, without additional effort, give you the parameter names for specific model parts (like random or fixed effects, or the zero-inflated or conditional component). Conversely, knowing the names of parameters or predictors does not by itself give you the underlying data. We therefore see the need for having both find- and get-functions. There are several use cases where either one is needed, and getting the required information or data easily is one of the core aspects of the insight package.
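As a minimal sketch of this distinction, assuming the insight package is installed and using an illustrative `lm()` fit on the built-in `mtcars` data (this model is not an example taken from the review thread), the find-functions return names while the get-functions return the corresponding values. The exact structure of the returned objects may differ between package versions.

```r
library(insight)

# Illustrative model on a built-in dataset (an assumption for this sketch).
m <- lm(mpg ~ wt + cyl, data = mtcars)

# find-functions return names, grouped by model component.
predictor_names <- find_predictors(m)
response_name <- find_response(m)

# get-functions return the underlying values.
predictor_data <- get_predictors(m)
response_values <- get_response(m)

# Names from a find-function can be used to select columns elsewhere,
# for example from the full model data.
model_data <- get_data(m)
selected <- model_data[, predictor_names$conditional, drop = FALSE]
```

Here `predictor_names$conditional` holds the predictor names of the conditional (fixed effects) part of the model, which is the only component a plain linear model has.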
Thanks for pointing this out. We have added a completely new section on the definition of model components to the paper. Furthermore, we added two figures that help clarify what the differences between these terms are, and why it is necessary to have these distinctions. We also added the new section to the README on the package’s front page, as well as to the online documentation (including the new figures).
We agree with this point, and revised the affected package functions accordingly. Now there are fewer exceptions, making it clearer and more predictable for users what will be returned. Regarding your particular example, the functions now return the elements “conditional” and “random”, in line with the return values for other model objects.
We agree with this point. We implemented this function knowing that it is currently still a bit of a “work in progress”, and there is an open GitHub issue on this (easystats/insight#38). So this is something we plan to address. However, we still need and want to ask other people with relevant expertise for their opinion in order to adequately address this issue.
This is a design decision for which we think it is not easy to find a definitive answer. Our idea was to reflect the different parts of a model formula in the elements returned by the function. “lme()”, for instance, can have both a random-argument and a correlation-argument, and we wanted to preserve this distinction. For “feis”-objects (fixed effects regressions with individual slopes), we have a complicated formula structure with several components. Again, we wanted to preserve the distinction between these parts of the model formula. However, we are happy to reduce interface complexity here and are open to concrete suggestions that help users and developers.
Thanks for this suggestion. We have added a new section with examples of practical use cases (in different R packages) to our paper.
This is due to performance reasons. Since each function should work for (almost) every model object, we would need to refit all the models for each test if tests were organized by models. To reduce computational time, especially for CRAN, we decided to organize tests differently than the code files in the R-folder.
We have added explicit information on where to find information, get support and contribute to the README and website.
We have largely revised the paper, README and website (online documentation), providing a clearer narrative (also at the conceptual level) in order to make the idea and functioning of the package clearer for users.
One of the core aims of the insight package is to provide a unified syntax to access model information. Since this information is not easily accessible for many model objects, the idea is to work with as many models as possible. Against this background, it would run counter to the nature of this package to narrow the scope of supported models. We have added a section in the paper showing examples of use cases and hope that these examples explain why it makes sense to support many different models.
Again, thank you for this valuable review! We hope that we have addressed all your points sufficiently and are happy to receive your feedback on our revision.
@whedon generate pdf
If the paper.md does not yet compile, we probably can use a temporary PDF?
Response to revisions
Response based on comments by @strengejacke above and the GitHub master branch on 2019-06-16.
I'm impressed by the changes you've made, especially to the documentation. After reading more carefully, I agree with your comments that the broad interface is warranted.
Something clicked for me when I realized the importance of differentiating between names and values. Some functions return the names of columns of data, some functions return values of columns of data. The difference is especially clear with
I suspect users will have an easier time with this sort of naming strategy and encourage you to consider this scheme, or something similar.
I appreciated the new definitions of the various components of a model. Instead of presenting this material all at once, I recommend you group the definitions by components that are easy to confuse and set these groups visually apart.
Note that you have chosen to define
Finally, I recommend grouping
@alexpghayes Thanks for your review! Since some of your remaining suggestions would mean substantial rework or breaking changes, I would like to clarify some points and suggest how we would address your issues before we actually start working on them. Could you please let us know briefly whether our planned steps are OK with you?
Since insight is part of the larger "easystats" project, we have developed some coding guidelines for our packages (https://github.com/easystats/easystats#convention-of-code-style). Your suggestion would break these guidelines, so for consistency within and across packages, we would prefer to keep the current names. However, we may add aliases to the existing functions.
I'm not sure about this; I have not found clear definitions yet (if you have some references/links, I would be happy if you could share them). However, since these functions are currently only used by one package that depends on insight, we could flip the usage of term and variable here.
I'm not sure if this is possible with some pkgdown options. I think that we would have to combine the functions' Rd files, so both
This seems to be new in R 3.6 (at least, that is when it started appearing for me): once namespaces are referred to, R warns about overwritten S3 methods. ggplot2 depends on rlang, afaik, and it seems ggplot2 overwrites some S3 methods from rlang (or vice versa). In short, this issue seems unrelated to insight.
We would add some more comprehensive working examples that clearly show solutions to some problems to the README (and as a further (sub)section in the paper). Would that be sufficient?
If you don't want to change these, I understand. Since the difference between
Definitely swap. It looks like you found some references in easystats/insight#56 (comment)?
Yeah, you can do this all with
Great, wouldn't worry about it then.
Yep, that'd be ideal! The current examples show how to access certain kinds of information. The open question is when users would want to access this kind of information. What common problem does