-
-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Submission: treedata.table #367
Comments
👋 @uyedaj! Thanks for your submission and sorry for the delay. I'll assign an editor to this submission shortly, who'll get started on the editor checks next week at the latest. |
Hi @uyedaj! Editor checks:
Editor comments
goodpractice::gp output
───────────────────────────────────────────────────────────── I will look for reviewers after these small issues have been addressed. Feel free to let me know if you have any questions. |
Hey team, I have some free time to tackle this tomorrow! I'll push these changes and post an update here then. Thanks @jooolia for getting back with us! |
I think between my commit (7764ef2) and Scott's edit of the initial PR, we should have all these notes handled! |
Thanks @wrightaprilm! I will have a look tomorrow. Cheers, Julia |
In the interest of reducing load on reviewers and editors as we manage the COVID-19 crisis, rOpenSci is temporarily pausing new submissions for software peer review for 30 days (and possibly longer). Please check back here again after 17 April for updates. In this period new submissions will not be handled, nor new reviewers assigned. Reviews and responses to reviews will be handled on a 'best effort' basis, but no follow-up reminders will be sent. Other rOpenSci community activities continue. We express our continued great appreciation for the work of our authors and reviewers. Stay healthy and take care of one other. The rOpenSci Editorial Board |
In this period new submissions will not be handled, nor new reviewers assigned.
Reviews and responses to reviews will be handled on a 'best effort' basis, but
no follow-up reminders will be sent. Please check back here again after 25 May when we will be announcing plans to slowly start back up. We express our continued great appreciation for the work of our authors and reviewers. Stay healthy and take care of one other. The rOpenSci Editorial Board |
Hi @wrightaprilm and @uyedaj I will check the changes you made in March and get back to you. Then I will look for reviewers for the package. Thanks for your patience. |
Hi @wrightaprilm and @uyedaj, Thanks! Julia |
Reviewer: @karinorman |
Package Review
DocumentationThe package includes all the following forms of documentation:
Functionality
Final approval (post-review)
Estimated hours spent reviewing:
Review CommentsI've added some (optional) comments. These are not necessary for your package to follow the current best practices but standards are always evolving and I believe addressing these now will save you some trouble later down the road. It would be super useful for me if you could use a different commit for each the points raised in my review 👍. I hope everything makes sense. Please let me know if something is not clear. DESCRIPTION and other packaging filesI recommend you have a look at the packaging guidelines in rOpenSci's development guide. It lists a number of points that are currently missing from your package. Among others:
You can verify this with the
README
Code
should be if (!equal_T) { ... }
data(anolis)
td <- as.treedata.table(anolis$phy, anolis$dat)
p <- pull.treedata.table(td, type = "phy")
d <- pull.treedata.table(td, type = "dat")
td2 <- as.treedata.table(p, d)
# You would expect to get identical(td, td2) but this line actually throws an error
should read colnames(data) <- paste("trait", seq_len(length(data)), sep="") or better colnames(data) <- paste("trait", seq_along(data), sep="")
This will always surely cause issues upon CRAN submission if not corrected. See this detailed SO answer for more info about the differences between
message("this ", "is ", "a ", "test!")
#> this is a test!
Reproducible example showing the bug: library(treedata.table)
#> Loading required package: lazyeval
#> Loading required package: ape
#> Loading required package: geiger
#> Loading required package: data.table
data(anolis)
set.seed(20200723)
tre <- anolis$phy
dat <- anolis$dat
# shuffle columns
dat <- dat[, sample(ncol(dat), ncol(dat))]
as.treedata.table(tre, dat)
#> Phylo object detected
#> 100 tip(s) dropped from the original tree
#> 100 tip(s) dropped from the original dataset
#> Warning in ape::drop.tip(tree, tree_not_data): drop all tips of the tree:
#> returning NULL
#> $phy
#> 0 phylogenetic tree
#>
#> $dat
#> Empty data.table (0 rows and 11 cols): PCIV_lamella_num,tip.label,island,hostility,ecomorph,PCI_limbs... Created on 2020-07-23 by the reprex package (v0.3.0) This behaviour should be tested in tests.
This can be seen by using the two examples you provide data(anolis)
detectCharacterType(anolis$dat[,1])
#> [1] "discrete"
detectAllCharacters(anolis$dat)
#> [1] "continuous" "continuous" "continuous" "continuous" "continuous" "continuous" "continuous" "continuous"
#> [9] "continuous" "discrete" "discrete"
detectCharacterType(anolis$dat[,1]) == detectAllCharacters(anolis$dat)[1]
#> FALSE Regarding my later point about tests, this would make a good test: ensuring that you find the correct number of discrete/continuous characters in the anolis dataset.
data(anolis)
td <- as.treedata.table(anolis$phy, anolis$dat)
td_old <- td
td_new <- droptreedata.table(tdObject = td, taxa = c("chamaeleonides" ,"eugenegrahami"))
identical(td, td_old)
#> TRUE
identical(td, td_new)
#> FALSE
https://github.com/uyedaj/treedata.table/blob/7764ef2c4a0e56f1cb5fbcef6d910fb8ba0eea8c/R/tdt.R#L4
It's not obvious what's this OK is supposed to mean, I think it would be simpler if it was empty in this case.
Could you look into it please? I assume you meant Tests
that need to use
Documentation
but this is unnecessary since you don't use
I would change it to something like #' @param returnType Either discrete (the default) or continuous
extractVector(td, "SVL", "island")
Vignette
Misc
|
Hey @uyedaj and @cromanpa94, I'll start tackling these early next week. I'm devoting 100% of my time to class prep this week. |
Sounds perfect @wrightaprilm!! I'm also go over the @Bisaloo's comments during the weekend! |
BTW, feel of course free to challenge any of my comments. You have spent more time thinking about this package than me so it's entirely possible that some of my concerns are misplaced. |
Hi @cromanpa94 - I really enjoyed exploring your package. It seems like a really nice tool for making phylogeny data easier to work with and I'm sure it's going to get a lot of use! See below for my comments. Package ReviewPlease check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
DocumentationThe package includes all the following forms of documentation:
Functionality
Final approval (post-review)
Estimated hours spent reviewing:
Review CommentsThis is my first review for ROpenSci and I tried to approach it from the perspective of a naive user, so most feedback is related to things that may be difficult with initial use. Of course you are the expert on your users and this kind of data so I'll look forward to learning more from you about what does and doesn't make sense in this context. General Comments:
Functions:
|
Thanks a lot Kari and Hugo for your careful and thoughtful reviews.
…On Tue, 4 Aug 2020 at 21:40, Kari Norman ***@***.***> wrote:
Hi @cromanpa94 <https://github.com/cromanpa94> - I really enjoyed
exploring your package. It seems like a really nice tool for making
phylogeny data easier to work with and I'm sure it's going to get a lot of
use! See below for my comments.
Package Review
*Please check off boxes as applicable, and elaborate in comments below.
Your review is not limited to these topics, as described in the reviewer
guide*
- *Briefly describe any working relationship you have (had) with the
package authors.*
- As the reviewer I confirm that there are no conflicts of interest
<https://devguide.ropensci.org/policies.html#coi> for me to review
this work (If you are unsure whether you are in conflict, please speak to
your editor *before* starting your review).
Documentation
The package includes all the following forms of documentation:
- *A statement of need* clearly stating problems the software is
designed to solve and its target audience in README
- *Installation instructions:* for the development version of package
and any non-standard dependencies in README
- *Vignette(s)* demonstrating major functionality that runs
successfully locally
Consider putting a link in the README.
- *Function Documentation:* for all exported functions in R help
- *Examples* for all exported functions in R Help that run
successfully locally
- The data manipulation in as.treedata.table() example should be
explained, or you could do it behind the scenes and just import the objects
with an explanation of what they are/their structure.
- No example for print.treedata.table
- *Community guidelines* including contribution guidelines in the
README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and
Maintainer (which may be autogenerated via ***@***.***).
The maintainer denoted in ***@***.*** defferent from Maintainer: in
DESCRIPTION. I don't see an explaination of contribution guidlines, the
ROpenSci package documentation (
https://devguide.ropensci.org/collaboration.html#friendlyfiles) gives
good guidance on adding it.
Functionality
- *Installation:* Installation succeeds as documented.
Not necessarily something to be fixed, but so you're aware, I'm still
running R 3.6.3 and the dependency mnormt will not install from CRAN
unless you're running R >4.0.0, so I had to manually install from an old
github version (e.g. ***@***.***'))
before your install would work.
-
*Functionality:* Any functional claims of the software been confirmed.
-
*Performance:* Any performance claims of the software been confirmed.
-
*Automated tests:* Unit tests cover essential functions of the package
and a reasonable range of inputs and conditions. All tests pass on the
local machine.
goodpractice::gp() output:
- linebreaks under 80 characters
- poor unit test coverage (only 34%)
-
*Packaging guidelines*: The package conforms to the rOpenSci packaging
guidelines
- Need to flesh out the README, a list of important aspects can be
found in the ROpenSci package guidelines (section 1.5)
https://devguide.ropensci.org/building.html.
Final approval (post-review)
- *The author has responded to my review and made changes to my
satisfaction. I recommend approving this package.*
Estimated hours spent reviewing:
4 hrs
- Should the author(s) deem it appropriate, I agree to be acknowledged
as a package reviewer ("rev" role) in the package DESCRIPTION file.
------------------------------
Review Comments
This is my first review for ROpenSci and I tried to approach it from the
perspective of a naive user, so most feedback is related to things that may
be difficult with initial use. Of course you are the expert on your users
and this kind of data so I'll look forward to learning more from you about
what does and doesn't make sense in this context.
General Comments:
- The warnings for co-indexing calls (e.g. td[, head(.SD, 1), by =
"ecomorph"] from the vignette don't seem particularly informative, I
would suppress unless necessary.
- I think having the user confirm the changes for droptreedata.table
is unnecessary. For me at least it made me assume that the function was
modifying an object in place, which I realized it wasn't after some
exploration.
Functions:
- Consider changing the naming convention to *.td.table rather than
*.treedata.table for the sake of brevity. You could also add a td to
other exported functions (e.g. td.extractVector).
- The detectCharacter functions, filterMatrix, forceNames, and hasNames
seem like they could be helper functions that facilitate the other main
functions, but maybe don't need to be exported. If it makes sense I would
not export them. If not, a little more explanation of how they fit into a
workflow in the vignette would be helpful.
- forceNames and hasNames could use more detailed description and/or
justification, I'm not sure their functionality even after running the
example. data(anolis) and forceNames(anolis$dat, "row") return
seemingly the same object.
- It could be useful to make dropping tips or dataframe entries
optional when matching trees/dataframes instead of automatically dropping
from either to match, so that the user has the option to preserve all data.
- filterMatrix would be cleaner if charType wasn't required as an
argument but instead calculated within the function. I have a hard time
imagining a scenario in which you would want a vector of character types
that didn't match the matrix you were already giving the function.
- I'm curious about the pull.treedata.table function. Is there some
utitlity in having a function that mimics the $ operator? Or maybe I'm
missing some of the applications of this function?
detectCharacter Functions:
- Please explain the definition of continuous or discrete in the
descriptions of detectCharacterType, detectCharacterChanges, and
filterMatrix.
- I would change the language around "character" which is a specific
object type, whereas this function appears to perform on multiple vector
types.detectVectorType or detectColumnType may be more intuitive.
- There is strange behavior in the examples for these functions. For
example detectCharacterType(anolis$dat[,1]) returns "discrete", but
detectAllCharacters(anolis$dat[,1:3]) returns three "continuous"
entries. From my understanding of how the functions work I would expect the
first entry to be "discrete" to match
detectCharacterType(anolis$dat[,1]).
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#367 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAOPZSQJR3VGICVPQI5UUMTR7BP2BANCNFSM4KOKZIEQ>
.
|
Thank you so much @karinorman and @Bisaloo for your comments 😄 We have addressed your concerns in a new branch (https://github.com/uyedaj/treedata.table/tree/cristian). Our answers are presented under issues in our repository (ropensci/treedata.table#1) but we're more than happy to post the same information here if necessary. |
Thank you for your work! I think the package has much improved. The authors have addressed all my comment either by updating the package or making a receivable answer as to why they'd rather not, so I recommend approving this package. I have submitted a PR with a couple of very minor changes. I hope you don't mind. It seems like it will save time to everyone to make the changes directly instead of copy/pasting the relevant lines here. Two last minor comments:
|
Thank you so much @Bisaloo! We just addressed all the remaining comments int the following commits: ropensci/treedata.table@4917b4a We also included additional details on why to talk about Matrix in We're happy to make any other changes if necessary! |
@cromanpa94 Thank you for your response and especially for additionally explanation of domain-specific language. @jooolia The reviewers have addressed my comments and I am happy to recommend the the package for approval. |
Great! Thanks @cromanpa94 , @wrightaprilm and @uyedaj for the submission and @karinorman and @Bisaloo for the reviews. I am a guest editor so I need to get help creating the github team that will allow us to transfer the repository over to ropensci and the next steps. I will follow up soon. Thanks, Julia |
Thanks @jooolia, @Bisaloo, and @karinorman for all the help and feedback! |
Awesome, thanks everyone for your hard work, especially during such a challenging time! |
Thanks again @uyedaj, @wrightaprilm,and @cromanpa94 for submitting and @karinorman and @Bisaloo for your reviews! :) To-dos:
Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @stefaniebutland in your reply. She will get in touch about timing and can answer any questions. We've put together an online book with our best practice and tips, this chapter starts the 3d section that's about guidance for after onboarding. Please tell us what could be improved, the corresponding repo is here. Thanks a lot for your contribution and your patience with the process during this difficult time. Cheers, Julia |
Apologies @jooolia, can you resend the invitation to ropensci? It has just expired for me. |
No prob @uyedaj I will ask again. :) Thanks, Julia |
Hi @uyedaj, I sent a new invitation just now to juyeda@vt.edu. |
@cromanpa94 @uyedaj I re-sent an invitation to the ropensci GitHub organization. @wrightaprilm was already in the team and org. @stefaniebutland I think you mean the slack workspace, which is useful in any case! |
Yay! I made the three of you admin again. ✔️ |
I hope we have completed all of these satisfactorily:
Thank you for your help! |
Great thanks @uyedaj ! It looks good and I will close this issue. I think that you can also close your issue related to this review in your repo. |
Submitting Author: Josef C Uyeda (@uyedaj)
Repository: https://github.com/uyedaj/treedata.table
Version submitted: 0.1.0
Editor: @jooolia
Reviewer 1: @Bisaloo
Reviewer 2: @karinorman
Archive: TBD
Version accepted: TBD
Scope
Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):
Explain how and why the package falls under these categories (briefly, 1-2 sentences):
The package allows manipulation of data frames with
data.table
while maintaining a matched ordering to anape
phylogenetic tree. This allows rapid data munging without worrying about amismatch with the tree order.
Who is the target audience and what are scientific applications of this package?
Researchers doing phylogenetic analyses.
Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?
treeplyr
works similarly but is built on the packagedplyr
rather thandata.table
, which has some advantages todplyr
.If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.
Technical checks
Confirm each of the following by checking the box. This package:
Publication options
JOSS Options
paper.md
matching JOSS's requirements with a high-level description in the package root or ininst/
.MEE Options
Code of conduct
The text was updated successfully, but these errors were encountered: