[JOSS] Paper feedback #117

Closed
ianfhunter opened this issue May 13, 2023 · 3 comments

Comments

@ianfhunter

Feedback from JOSS review

Overall, the paper is well written; I appreciate being able to understand it without too much prior knowledge. I have a few comments to be addressed, though:

  • line 27: "~10^6+" - The claim is unclear. This should be either "~10^6" OR "10^6+". Can the training routine scale to approximately that many points, or to more than that amount?

  • line 50: You should probably cite lmdb

  • line 54: the acronym UQ is not expanded.

  • You claim that AMPTorch can support larger numbers of data points than the base AMP and other existing codes (lines 37-38). It would be helpful to name some examples other than AMP and state their limits for comparison, to highlight the impact of this work versus the state of the art (SOTA).

@ajmedford
Collaborator

@ianfhunter Thanks for the feedback! We have made some changes and address each concern point by point below. Let us know if you have further recommendations.

Comment: line 27: "~10^6+" - The claim is unclear. This should be either "~10^6" OR "10^6+". Can the training routine scale to approximately that many points, or to more than that amount?

Solution: Removed the "+", so it should appear as "~10^6". In theory it is possible to use more points, but we have not tested beyond a few million, so this seems like the correct way to write it.

Comment: line 50: You should probably cite lmdb

Solution: The LMDB package does not have a specific DOI or article to cite, so we credited the author and another contributor (http://www.lmdb.tech/doc/). We hope this will suffice.
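For context on the reference: LMDB is a B-tree-based, memory-mapped key-value store, which is the kind of backend that can let large fingerprint datasets be read lazily from disk rather than held in memory. Below is a minimal sketch assuming the `lmdb` Python bindings are installed; the file name, keys, and record layout are purely illustrative, not AMPTorch's actual storage format.

```python
# Minimal sketch using the `lmdb` Python bindings (pip install lmdb).
# File name, keys, and record layout are illustrative only.
import pickle

import lmdb

# Open (or create) a memory-mapped, B-tree-based key-value store.
env = lmdb.open("fingerprints.lmdb", map_size=1 * 1024**3)  # 1 GB address space

# Write one serialized record per training structure.
with env.begin(write=True) as txn:
    txn.put(b"structure-0", pickle.dumps({"fingerprint": [0.1, 0.2, 0.3], "energy": -1.23}))

# Read records back lazily; because the file is memory-mapped, datasets
# larger than RAM can be sampled without loading everything at once.
with env.begin() as txn:
    record = pickle.loads(txn.get(b"structure-0"))
    print(record["energy"])

env.close()
```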

Comment: line 54: the acronym UQ is not expanded.

Reply: It is expanded in line 71: "... statistically-rigorous uncertainty quantification (UQ) during ..."

Comment: You claim that AMPTorch can support larger numbers of data points than the base AMP and other existing codes (lines 37-38). It would be helpful to name some examples other than AMP and state their limits for comparison, to highlight the impact of this work versus the state of the art (SOTA).

Solution: Added atom-centered symmetry functions (ACSF) as a comparison, noting that the number of fingerprint dimensions scales with the number of chemical elements in the training data. Due to the word limit in JOSS, it is hard for us to expand on this point in greater detail.
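To make the scaling comparison concrete, here is a rough back-of-the-envelope sketch; the parameter counts are assumptions for illustration, not the values used in the paper or in AMPTorch.

```python
# Rough sketch with assumed parameter counts (not AMPTorch's defaults),
# illustrating how ACSF-style fingerprint length grows with the number of
# chemical elements in the training set.
from math import comb

def acsf_feature_count(n_elements: int, n_radial: int = 8, n_angular: int = 4) -> int:
    """Per-atom feature length: G2 terms per neighbor element, G4 terms per element pair."""
    radial_terms = n_radial * n_elements
    # Unordered element pairs, including same-element pairs: n * (n + 1) / 2
    angular_terms = n_angular * (comb(n_elements, 2) + n_elements)
    return radial_terms + angular_terms

for n in (1, 2, 4, 8):
    print(f"{n} element(s): {acsf_feature_count(n)} features per atom")
```

The point is that the per-element radial terms grow linearly and the per-pair angular terms grow roughly quadratically with the number of elements, which is the ACSF limitation the revised text contrasts against.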

@ml-evs
Contributor

ml-evs commented May 19, 2023

Just so you have it all in one place, I'll also add my comments on the paper here. The paper is well-written and the package is well-motivated, so my suggestions are mostly minor. (Feel free to tick the boxes yourself.)

  • 38: boiler plate -> boilerplate

  • 47: SingleN -> SingleNN

  • 50: Btree-based -> B-tree-based

  • 54: I think the UQ section should cite your own paper on UQ (https://arxiv.org/abs/2208.08337), at the very least!

  • 58: statistically-rigorous -> statistically rigorous

  • 84: the URL reference is mangled, presumably should be just https://doi.org/10.1021/acscatal.0c04525

  • I would suggest adding URLs for the LMDB and Skorch references

  • Minor stylistic thing, but it seems like most other JOSS papers include a space before the reference, e.g., blah blah [@Evans2023] rather than blah blah[@Evans2023].

  • This sentence irks me slightly:

    Thus, AMPTorch is currently the only feature-based ML force field code capable of training on an arbitrary number of elements and training points

    as unfortunately, by the time of publication, these claims are almost always out of date (I take at least some of the blame for that!). I would suggest rephrasing it to something like "AMPTorch fills a gap in the ecosystem for feature-based ML force fields with capability for handling arbitrary numbers of elements and training points". I'm aware this is bordering on personal preference (maybe it's just the italics...), though, so I'll leave it up to you.

nicoleyghu added a commit that referenced this issue Jun 5, 2023
* Fixing paper/

* Add CONTRIBUTING.md file

* Adapt readme.md for contributing file.

* Minor fix.

* Fix readme.md and check the default values with AtomsTrainer class.

* Fix issues in paper.md per Issue #117

* Usage documentation update.

* Fix format via black.

* Updated usage.rst for rtd

* Add image to docs/

* Add to 2D water example.
@ajmedford
Collaborator

@ml-evs I think we have addressed all the issues here, thanks for the suggestions! Please take a look and let us know if you have any further comments.
