Issues with paper #1
Comments
I thank the reviewer for his helpful feedback. I have updated the paper in response on branch "1-issues-with-paper". Below are my responses to each comment:
Some comments on the manuscript in https://github.com/marccanby/linguiphyr/blob/main/paper/paper.md?plain=1
L32-33: "graphical user interface (GUI)" - should be capitalised if you're using it as an acronym.
L42: "easily interpretable by linguists" - because linguists can't understand likelihood or Bayesian approaches? please reframe. Perhaps just "relatively simple interpretations" or "conceptually simple" (although I note that maximum parsimony is not as simple as it appears on the surface for many reasons).
L46: "tree search. Other concerns about fully parametric approaches have been raised as well, such as the suggestion that non-parametric methods like parsimony are more accurate [@barbancondiachronica2013; @tutorialNicholsWarnow; @holmes2003statistics]. " -- this is an unfair reading of the literature. I've argued in a number of papers that Bayesian methods are far superior to MP for language trees, but there are (more importantly) long-standing issues with MP in terms of it being statistically problematic (e.g. Felsenstein 1978 Syst. Zool, Steel & Penny 2000, Mol. Biol. Evol.). Nor is it correct to say that MP is non-parametric, it has an implicit model behind it (Steel and Penny again).
L56: "<!---Currently, the de-facto"... it's a shame this is commented out as it makes a good point. I wouldn't, however, frame this as a 'standard', just that there's a set of tools in use. Some are parsimony based, some are likelihood, or Bayesian. There are dedicated packages for Bayesian methods out there and tutorials for them (e.g. https://doi.org/10.1093/jole/lzab005 or your citation to IndoEuropeanphylogeneticswithR), but MP is harder to use.
L64: "Over-emphasis on technical ability often hinders this work." -- is "over-emphasis" the right term? Who is emphasising this?
L112: "which can be standard, irreversible, or custom" -- should explain these.
L118: this coding scheme would also work for structural/typological data which is often used. It would be good to mention this.
L124: "An abundance of literature discusses good methodology for doing this [@ringe2002indo; @tutorialNicholsWarnow]." -- would it be better to cite some standard historical linguistics references here?
L152: probably worth noting why this might not hold (e.g. if any one language in the clade has lost this cognate or evolved a new cognate then this will not hold).
In responding to this, I also realize that l. 189 should have a clarifying sentence:
L154: the commented out information here looks useful. Can it be added back?
L158: "integer" - given the emphasis of this package for non-computational uses, "integer" is probably better as "numeric"
L178 etc: it would be helpful to have citations for e.g. "compatibility" so users can track down what these mean if they want.
Bigger questions and remarks:
I would like to see some more justification of PAUP*. Many parsimony and distance algorithms etc are implemented directly in R packages like phangorn or ape -- is there a reason to shell out to PAUP*? I can see efficiency being a criterion (PAUP* will blow R out of the water) but the type of user using this package is unlikely to be using data where computational efficiency will become critical.
ok, I'm happy with all of these responses. Thanks!
Some comments on the manuscript in https://github.com/marccanby/linguiphyr/blob/main/paper/paper.md?plain=1
As a proviso, I'm a proponent of a very different framework (Bayesian approaches) and my comments should be read through this lens. I have, however, tried not to be too partisan in my suggestions below.
L30: "undertaken by statisticians" - Please reword, this is incorrect. I only know of one statistician publishing language phylogenies. If you look here most authors are linguists, followed by biologists or computer scientists.
L32-33: "graphical user interface (GUI)" - should be capitalised if you're using it as an acronym.
L42: "easily interpretable by linguists" - because linguists can't understand likelihood or Bayesian approaches? please reframe. Perhaps just "relatively simple interpretations" or "conceptually simple" (although I note that maximum parsimony is not as simple as it appears on the surface for many reasons).
L46: "tree search. Other concerns about fully parametric approaches have been raised as well, such as the suggestion that non-parametric methods like parsimony are more accurate [@barbancondiachronica2013; @tutorialNicholsWarnow; @holmes2003statistics]. " -- this is an unfair reading of the literature. I've argued in a number of papers that Bayesian methods are far superior to MP for language trees, but there are (more importantly) long-standing issues with MP in terms of it being statistically problematic (e.g. Felsenstein 1978 Syst. Zool, Steel & Penny 2000, Mol. Biol. Evol.). Nor is it correct to say that MP is non-parametric, it has an implicit model behind it (Steel and Penny again).
L56: "<!---Currently, the de-facto"... it's a shame this is commented out as it makes a good point. I wouldn't, however, frame this as a 'standard', just that there's a set of tools in use. Some are parsimony based, some are likelihood, or Bayesian. There are dedicated packages for Bayesian methods out there and tutorials for them (e.g. https://doi.org/10.1093/jole/lzab005 or your citation to IndoEuropeanphylogeneticswithR), but MP is harder to use.
L64: "Over-emphasis on technical ability often hinders this work." -- is "over-emphasis" the right term? Who is emphasising this?
L112: "which can be standard, irreversible, or custom" -- should explain these.
L118: this coding scheme would also work for structural/typological data which is often used. It would be good to mention this.
L124: "An abundance of literature discusses good methodology for doing this [@ringe2002indo; @tutorialNicholsWarnow]." -- would it be better to cite some standard historical linguistics references here?
L152: probably worth noting why this might not hold (e.g. if any one language in the clade has lost this cognate or evolved a new cognate then this will not hold).
L154: the commented out information here looks useful. Can it be added back?
L158: "integer" - given the emphasis of this package for non-computational uses, "integer" is probably better as "numeric"
L178 etc: it would be helpful to have citations for e.g. "compatibility" so users can track down what these mean if they want.
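For readers who want a concrete sense of what "compatibility" means for a single character, here is a minimal sketch in Python. It is an illustration only, not code from linguiphyr or PAUP*: the function names `fitch_score` and `is_compatible` are hypothetical, the tree is assumed to be binary and written as nested tuples, and every language is assumed to have exactly one state (no missing data or polymorphism). It uses Fitch's algorithm for unordered characters and treats a character as compatible with a tree when it can evolve on that tree with the minimum possible number of changes, i.e. one fewer than the number of distinct states.

```python
def fitch_score(tree, states):
    """Minimum number of state changes (Fitch parsimony) for one unordered character.

    tree:   nested 2-tuples of language names, e.g. (("A", "B"), ("C", "D")).
    states: dict mapping each language name to its state (e.g. a cognate class).
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):            # leaf: its state set is the observed state
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:                           # children agree: no change needed here
            return common
        changes += 1                         # children disagree: count one change
        return left | right

    visit(tree)
    return changes


def is_compatible(tree, states):
    """True if the character fits the tree without homoplasy (assumed definition)."""
    n_states = len(set(states.values()))
    return fitch_score(tree, states) == n_states - 1


tree = (("A", "B"), ("C", "D"))
shared = {"A": 1, "B": 1, "C": 0, "D": 0}    # cognate shared by the clade (A, B)
patchy = {"A": 1, "B": 0, "C": 1, "D": 0}    # same cognate scattered across clades

print(is_compatible(tree, shared))           # True
print(is_compatible(tree, patchy))           # False
```

The second character needs two changes on this tree where one would suffice on another tree, which is the kind of pattern that arises when a language in a clade has lost or replaced a cognate.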
Bigger questions and remarks:
Is there any reason you don't allow likelihood approaches? PAUP* handles these quite happily: `set criterion=likelihood;`
I would like to see some more justification of PAUP*. Many parsimony and distance algorithms etc are implemented directly in R packages like `phangorn` or `ape` -- is there a reason to shell out to PAUP*? I can see efficiency being a criterion (PAUP* will blow R out of the water) but the type of user using this package is unlikely to be using data where computational efficiency will become critical.