[REVIEW]: aPhyloGeo: a multi-platform Python package for analyzing phylogenetic trees with climatic parameters #6579
Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks. For a list of things I can do to help you, just type:
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
|
|
Software report:
Commit count by author:
|
Paper file info: 📄 Wordcount for ✅ The paper includes a |
License info: ✅ License found: |
@editorialbot check references |
|
@TahiriNadia - we've now started the "review" thread on github, ie, here. Please use this comment thread to ask questions and, once initial reviews are available, to respond to issues that the reviewers raise. |
Review checklist for @annazhukova
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
|
Review checklist for @mmore500
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
|
@mmore500 - how is the review going? Do you have any questions? |
Thanks for checking in. No issues so far! I have some time set aside shortly to sit down and complete my review. |
This sounds great, @mmore500 ! Please let me know if any questions arise. Thank you again! |
@annazhukova - how is the review going? do you have any questions yet? |
Some comments on the manuscript. Planning to follow up on the software content shortly.
Introduction: There is a grammar issue in “between a genetic of species and its habitat during the reconstruction”
Statement of Need: Address in more specific terms what specific scientific question(s) can be addressed through these analyses.
State of the Field
figure
Pipeline:
Multiprocessing: What windows are you referring to?
Dependencies: Citations to the software would be appropriate.
Conclusion:
Overall: A specific application example or case study would greatly benefit the clarity of the manuscript. |
I have filled in my checklist, and here are a few comments:
I have not ticked this item because, looking at the contributors page, I saw that the user my-linh-luu seemed to have contributed substantially to the software but does not seem to be on the authors’ list
From what I understood from the guidelines, the software needs to be already quite established (cited, used), which does not yet seem to be the case here.
I saw that the first tag was in June 2022, so one would expect a quite established software with quite some usages
I have counted about 600 on the contributors page
3 (on the paper, more on github)
I have assessed the LOC with
According to google-scholar the only citation is a self-citation
I think that a clear example of an analysis pipeline with the software would highly increase the chances of future citations (as people would know how to use the software for their data).
See this issue
I haven't managed to install it (see above)
I haven't managed to install it (see above)
See issue
See issue
There are guidelines but they could be better illustrated: see issue
From what I understood at first the goal of the software is to allow to analyse the correlation between the climate and species evolution. Though reading further and especially looking at the figure, it seems to me that the goal might be to select gene regions that have the most correlation (?)
There is the corresponding section, but it reads to me a bit too general. A more concrete example would help here too.
The authors only mention their own previous work in this section. I would expect here seeing what can be done with other packages, for instance used in classical phylogeography: Ancestral Character Reconstruction for geographic and or climate characters, GLM with climate as a factor etc.
Adding more information to the State of the field section would add more references too |
@TahiriNadia - it looks like @mmore500 and @annazhukova have offered comments on the submission. Do you have any questions on how to proceed? |
@mmore500 - when it's possible, please place check marks in your checklist above. It seems to be empty right now. Thanks again! |
@TahiriNadia - do you have any questions about the remaining tasks? Please let me know. Thanks again! |
No problem at all, and thanks for following up! Instead of using the dataset you originally suggested (due to some links being unavailable), we shifted our focus to the dataset from Uhlir et al. (2021) "Adding pieces to the puzzle: insights into diversity and distribution patterns of Cumacea (Crustacea: Peracarida) from the deep North Atlantic to the Arctic Ocean" [1]. Through this analysis, we derived substantial conclusions that demonstrate the scientific relevance and potential of the package. These findings offer compelling evidence of its utility for the research community. Additionally, we are currently in active communication with the lead author of the Uhlir et al. study to ensure the final validation of our results. You can find further details and examples of our study case in the documentation. I hope this provides clarity, and I'm happy to engage in further discussion if needed! [1] Uhlir, C., Schwentner, M., Meland, K., Kongsrud, J. A., Glenner, H., Brandt, A., Thiel, R., Svavarsson, J., Lörz, A.-N., & Brix, S. (2021). Adding pieces to the puzzle: insights into diversity and distribution patterns of Cumacea (Crustacea: Peracarida) from the deep North Atlantic to the Arctic Ocean. PeerJ, 9, e12379. |
Thank you, @fboehm. I’ve completed the revisions for the second tutorial as recommended in the review and am now addressing licensing and making a few final adjustments. I’ll present a new solution ASAP. |
Thank you, @fboehm. I will continue to work on it. |
Thank you @TahiriNadia ! I appreciate your continued work on this. I realize that you and your team have many duties, and I hope that I'm not pestering you. I'm hoping to keep communications happening, even in weeks when people might not have time to work on the remaining tasks. I'm also trying to respect the time and commitments of the reviewers. I appreciate your understanding. Thanks again! |
Thank you, @fboehm, for your thoughtful message. We appreciate your understanding, and we’ll continue to keep communication open even during busier weeks. Rest assured, we're working diligently to balance our ongoing responsibilities and will keep moving forward with the remaining tasks. We’re mindful of the reviewers’ time as well and are committed to maintaining progress. Thanks again for your patience and support! |
Dear @fboehm |
@editorialbot assign me as editor 👋 folks, @fboehm has asked me to help out with his submissions so I'm going to take this one over. @TahiriNadia – from your most recent comment here it seems like you feel like you have accommodated/responded to all of the reviewer feedback at this point? Reading the thread history, the most meaningful outstanding question seems to be that from @theosanderson here: tahiri-lab/aPhyloGeo#55 . @theosanderson do you feel like your questions/concerns have been adequately addressed? |
Assigned! @arfon is now the editor |
Hello, Unfortunately, my initial concerns persist. I was concerned that the approach taken by aPhyloGeo would not yield meaningful results about connections between climatic factors and evolution. I have not yet seen evidence that it can. The authors have provided an analysis of 35 sequences of 500bp of ribosomal RNA from shrimps, and claim that they can identify specific regions of this sequence which correlate with wind-speed and (separately) with oxygen concentration. It seems to me very unlikely that an analysis of 35 sequences would have adequate power to detect such effects at a statistically meaningful level, given the number of degrees of freedom (many windows of this sequence, many environmental variables). The authors' other example, of SARS-CoV-2, consists of just five sequences. (These concerns should not really be interpreted as a request for larger analyses: the convincing evidence would come from comparing to data with a known ground-truth relationship between evolution and climatic factors). |
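The degrees-of-freedom concern above can be made concrete with a small multiple-comparisons sketch. The window size, step, and number of environmental variables below are hypothetical (the thread does not state them), and this is not aPhyloGeo code; it only counts how many window-by-variable tests a sliding-window scan implies and what that does to the false-positive rate.

```python
# Hypothetical settings: the thread gives only the sequence length (500 bp).
# Window size, step, and the number of climatic variables are assumptions.

def n_sliding_windows(seq_len: int, window: int, step: int) -> int:
    """Number of full windows of length `window` placed every `step` positions."""
    if window > seq_len:
        return 0
    return (seq_len - window) // step + 1


def family_wise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance level that keeps the family-wise rate near alpha."""
    return alpha / n_tests


windows = n_sliding_windows(seq_len=500, window=50, step=10)
n_variables = 5  # assumed count of environmental variables
n_tests = windows * n_variables
expected_false_positives = n_tests * 0.05

print("windows:", windows)
print("tests:", n_tests)
print("expected false positives at alpha=0.05:", expected_false_positives)
print("family-wise error rate:", family_wise_error_rate(n_tests))
print("Bonferroni per-test threshold:", bonferroni_threshold(n_tests))
```

Under these assumed settings the scan runs a few hundred tests, so some nominally significant window-variable pairs are expected by chance alone. The independence assumption in `family_wise_error_rate` is a simplification, since overlapping windows are correlated.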
Thanks @theosanderson. If I'm understanding the concern here, the concern is whether this software can be used to generate meaningful scientific results? Typically we wouldn't try and pass judgement on this at JOSS (i.e., we're not looking to validate scientific results here), but if I understand correctly, this software has yet to be used to generate scientific outputs for any peer-reviewed papers @TahiriNadia? I note that in this issue, you point to this analysis https://github.com/tahiri-lab/aPhyloGeo/wiki/Study-case – is it correct to say that this analysis has not been published in a peer reviewed journal yet? |
Yes @arfon I think that's largely correct. My concerns relate to the claims that the tool can give insights into the role of climate in affecting evolution rather than into whether it performs the workflow in Fig. 1. I can confirm that it does (appear to) perform the workflow shown in Fig. 1, and if that's sufficient for an editorial decision that's great. I believe the authors have cited aPhyloGeo in some reviewed conference proceedings. |
Thanks @theosanderson. @TahiriNadia – I'd like to offer you the chance to provide a clear rebuttal here, but based on the review feedback, my current inclination is to reject this submission due to lack of (peer reviewed) evidence that the software is doing something with scientific merit. Please respond with any further information/justifications you have. |
Thank you, @arfon, for providing the opportunity to elaborate on our work. Our tool has been rigorously peer-reviewed and published in the SciPy proceedings over three consecutive years (2022, 2023, and most recently a few weeks ago in 2024), demonstrating a commitment to reproducible and scalable phylogenetic and phylogeographic analyses. This progression has consistently showcased improvements and new features that enhance the tool’s ability to address key challenges in host-virus relationships and pathogen evolution, as demonstrated in our recent work on coronavirus and bat host interactions (Li & Tahiri, 2024). Each publication highlights the practical utility and scientific relevance of our tool, addressing issues of data integration and reproducibility that are critical to advancing research in phylogeny. Moreover, our tool’s pertinence has been further validated through an active collaboration with Uhlir et al. (2021), specifically leveraging their extensive dataset on Cumacea diversity and distribution from the North Atlantic to the Arctic, published in PeerJ. This collaborative validation provided an ecological framework that underscored aPhyloGeo’s flexibility and robustness when applied to complex ecological data, establishing its value beyond viral phylogenies and into broader ecological contexts. Uhlir et al.’s involvement has been instrumental in ensuring the tool’s functionality aligns with both ecological and evolutionary research needs. In addition to these peer-reviewed outcomes, we are preparing a new manuscript that will present additional applications and validation results in a forthcoming journal submission. This new work will continue to build on our tool’s scientific impact, further solidifying its role as a versatile, scientifically robust resource for the research community. |
Thanks, @arfon. To clarify, the tool has indeed been validated and used to produce scientific results that have undergone peer review. Recently, our results were accepted in the SciPy proceedings as part of a peer-reviewed conference, where we detailed the scientific insights generated by the tool. Additionally, we have actively collaborated with the Uhlir team, using their ecological datasets to further validate the software’s relevance and applicability. |
@arfon: |
All the reviewers' requests have been addressed comprehensively, covering every detail, including the bin directory, tests, tutorial, Sphinx documentation, README, and wiki. |
The references provided are very useful, thanks for passing them along. In (Li & Tahiri, 2024), I do see how distance comparisons between trees (i.e., the bat host phylogenies and coronavirus phylogenies) were applied to shed light on co-evolution. Would you be able to clarify the role of biogeography, and more explicitly aPhyloGeo’s concept of climatic trees, in that work? Could you also clarify the role that the aPhyloGeo code itself played in that work? I looked through the workflow files linked at https://github.com/tahiri-lab/aPhyloGeo.sm/, and didn’t see aPhyloGeo used as an import or portions of the aPhyloGeo library code I recognized included directly. |
Dear @arfon, List of peer-reviewed evidence:
1️⃣ Gagnon, J. & Tahiri, N. (2024). Ecological and Spatial Influences on the Genetics of Cumacea (Crustacea: Peracarida) in the Northern North Atlantic. Proceedings of SciPy 2024, Tacoma, WA, USA
2️⃣ Li, W. (2023). New algorithm to assess the environmental influence of Coronavirus through phylogeographic analysis. MSc thesis, University of Sherbrooke, QC, Canada.
3️⃣ Li, W. & Tahiri, N. (2024). Host–Virus Cophylogenetic Trajectories: Investigating Molecular Relationships between Coronaviruses and Bat Hosts. Viruses, 16(7), 1133.
4️⃣ Li, W. & Tahiri, N. (2023). aPhyloGeo-Covid: A Web Interface for Reproducible Phylogeographic Analysis of SARS-CoV-2 Variation using Neo4j and Snakemake. Proceedings of SciPy 2023, Austin, TX, USA
5️⃣ Koshkarov, A., Li, W., Luu, M. L., & Tahiri, N. (2022). Phylogeography: Analysis of genetic and climatic data of SARS-CoV-2. Proceedings of SciPy 2022, Austin, TX, USA
Dear @mmore500, Thank you for the insightful question. For our study, we did indeed include the aPhyloGeo package via PyPI to support the analyses. The references you provided offer valuable context, particularly regarding the role of distance comparisons in understanding co-evolution patterns, as seen in (Li & Tahiri, 2024). In that work, biogeography, and more specifically the concept of climatic trees within aPhyloGeo, was essential for mapping phylogenetic relationships against climatic gradients, revealing potential ecological constraints on co-evolutionary dynamics. Best. |
Apologies if I'm missing something obvious, but could you link me to the lines in https://github.com/tahiri-lab/aPhyloGeo.sm/ where aPhyloGeo is imported and used so I can better understand the role that it played? I cloned a copy of that repository and performed a project search for "aphylogeo" (case insensitive) but wasn't able to locate the parts of the pipeline it was integrated into. |
Looking closely through (Li & Tahiri, 2024), I do see some mentions of the bat populations being sampled from a specific region (China). I did also see a brief note in the discussion that suggests an analogy between vicariance/biogeography and host range:
Could you more directly point me to which analyses reported in https://doi.org/10.3390/v16071133 explicitly incorporate climatic or geographic data? I also looked in the supplement at https://www.mdpi.com/article/10.3390/v16071133/s1 but did not immediately see where climatic or geographic data was incorporated into analyses. |
Thank you, @mmore500, for your detailed review and questions. In (Li & Tahiri, 2024), the primary focus was indeed on the cophylogenetic patterns between coronaviruses (CoVs) and bats rather than on explicit climatic or geographic variables. The geographic scope—bat samples from China—was contextualized primarily to leverage the region’s diverse bat populations as a rich source of CoV diversity and host-virus interactions. Specific geographic origins of the bat species were referenced through metadata in GenBank and the original studies; however, these were not analyzed with climate data or mapped with explicit spatial models (but indirectly with geographical conditions). The study does make an analogy to vicariance biogeography in discussing host-parasite relationships, but this analogy serves more as a conceptual framework rather than being operationalized in a climate or geography-based model. The analyses center on genetic data, phylogenetic congruence, geographic, and recombination regions, focusing on the molecular dynamics between hosts and CoVs rather than incorporating climatic variables directly. If further elaboration on these regional influences is valuable, incorporating spatial or climate-focused data could indeed provide additional context to the observed host-virus associations. |
Thank you for reaching out, and I appreciate your attention to detail in reviewing the repository. In the pipeline, aPhyloGeo is referenced within the broader analytical workflow rather than as a distinct importable package (sorry for this confusion). It is embedded through specific scripts that address co-phylogenetic analyses, including tree construction and congruence metrics between bat hosts and CoVs. These scripts, which use RAxML for phylogeny inference and PACo for congruence analysis, do not explicitly use "aPhyloGeo" as a named module but rather reference this process as part of the complete aPhyloGeo pipeline. For a direct view of this integration, I recommend checking the workflow scripts (e.g., Snakefile) for pipeline steps involving phylogenetic alignment, tree reconciliation, and the Robinson–Foulds distance. These components represent the core functions associated with aPhyloGeo's intended cophylogenetic analysis, although the term itself may not appear in a standalone form in the code. |
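For readers unfamiliar with the Robinson–Foulds distance mentioned above, here is a minimal pure-Python sketch. It is not taken from aPhyloGeo or the aPhyloGeo.sm workflow; the nested-tuple tree representation and the function names are assumptions made for illustration. Each tree is reduced to its set of non-trivial leaf bipartitions, and the distance is the number of bipartitions found in exactly one of the two trees.

```python
# Illustrative only: trees are nested tuples with string leaf names, e.g.
# ((("A", "B"), "C"), ("D", "E")). Both trees must share the same leaf set.

def _collect_clades(tree, clades):
    """Return the leaf set under `tree`, appending each internal node's leaf set."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(_collect_clades(child, clades) for child in tree))
    clades.append(leaves)
    return leaves


def bipartitions(tree):
    """Non-trivial splits of the leaf set, each stored as the side without a reference leaf."""
    clades = []
    all_leaves = _collect_clades(tree, clades)
    reference = min(all_leaves)  # fixed leaf used to pick a canonical side
    splits = set()
    for clade in clades:
        side = clade if reference not in clade else all_leaves - clade
        # Skip trivial splits: empty side, single leaf, or all-but-one leaf.
        if 1 < len(side) < len(all_leaves) - 1:
            splits.add(side)
    return splits


def robinson_foulds(tree_a, tree_b):
    """Count the bipartitions present in exactly one of the two trees."""
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


genetic_tree = ((("A", "B"), "C"), ("D", "E"))
climatic_tree = ((("A", "C"), "B"), ("D", "E"))
print(robinson_foulds(genetic_tree, climatic_tree))
```

The variable names `genetic_tree` and `climatic_tree` only echo the thread's terminology; established libraries such as DendroPy or ete3 provide tested implementations for real analyses.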
The most recent peer-reviewed study directly addressing Cumacea is as follows: 4️⃣ Gagnon, J. & Tahiri, N. (2024). Ecological and Spatial Influences on the Genetics of Cumacea (Crustacea: Peracarida) in the Northern North Atlantic. Proceedings of SciPy 2024, Tacoma, WA, USA |
aPhyloGeo is included as a step in our analysis, as shown in the pipeline provided in this paper. |
Dear Editor @arfon, We sincerely thank the reviewers, @mmore500 and @theosanderson, for their valuable feedback, which has greatly improved the quality and clarity of our manuscript. Following their insightful comments, we have carefully addressed each requested modification. We removed self-references to enhance objectivity, and we simplified the pipeline figure. We also reviewed and ensured the precision of all decimal points in statistical tests to maintain rigorous accuracy and also the path. The licensing issue for the package was resolved by removing the bin directory, which was deemed unnecessary. To further demonstrate the tool's value and originality, we conducted additional analyses showcasing its utility in unique ways, confirming that no existing package matches the specific functionalities of aPhyloGeo. Furthermore, we illustrated the tool’s application in peer-reviewed publications, underscoring its relevance and robustness in research contexts. Finally, we carefully revised the manuscript to reduce its length without compromising the quality of the content, ensuring a concise and focused presentation. Sincerely, |
Submitting author: @TahiriNadia (Nadia Tahiri)
Repository: https://github.com/tahiri-lab/aPhyloGeo
Branch with paper.md (empty if default branch): joss-journal
Version: v1.0.0
Editor: @arfon
Reviewers: @annazhukova, @mmore500, @theosanderson
Archive: Pending
Status
Status badge code:
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@annazhukova & @mmore500, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review.
First of all you need to run this command in a separate comment to create the checklist:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @fboehm know.
✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨
Checklists
📝 Checklist for @annazhukova
📝 Checklist for @mmore500
📝 Checklist for @theosanderson