Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW]: Graph Transliterator: a graph-based transliteration tool #1717

Closed
38 tasks done
whedon opened this issue Sep 7, 2019 · 74 comments
Closed
38 tasks done

[REVIEW]: Graph Transliterator: a graph-based transliteration tool #1717

whedon opened this issue Sep 7, 2019 · 74 comments
Assignees
Labels
accepted published recommend-accept review

Comments

@whedon
Copy link

@whedon whedon commented Sep 7, 2019

Submitting author: @seanpue (A. Sean Pue)
Repository: https://github.com/seanpue/graphtransliterator
Version: v1.0.4
Editor: @gkthiruvathukal
Reviewer: @rlskoeser, @vc1492a
Archive: 10.5281/zenodo.3558365

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/a9ae921a0402eadf367b888c96d6cfca"><img src="https://joss.theoj.org/papers/a9ae921a0402eadf367b888c96d6cfca/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/a9ae921a0402eadf367b888c96d6cfca/status.svg)](https://joss.theoj.org/papers/a9ae921a0402eadf367b888c96d6cfca)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@rlskoeser & @vc1492a, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @gkthiruvathukal know.

Please try and complete your review in the next two weeks

Review checklist for @rlskoeser

Conflict of interest

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the repository url?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@seanpue) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

Review checklist for @vc1492a

Conflict of interest

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the repository url?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@seanpue) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?
@whedon
Copy link
Author

@whedon whedon commented Sep 7, 2019

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @rlskoeser, @vc1492a it looks like you're currently assigned to review this paper 🎉.

Important

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

watching

  1. You may also like to change your default settings for this watching repositories in your GitHub profile here: https://github.com/settings/notifications

notifications

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf

@whedon
Copy link
Author

@whedon whedon commented Sep 7, 2019

Attempting PDF compilation. Reticulating splines etc...

@whedon
Copy link
Author

@whedon whedon commented Sep 7, 2019

@vc1492a
Copy link

@vc1492a vc1492a commented Sep 13, 2019

Overview: Of the JOSS reviews I have been a part of, this project has (to date) been the most well-documented and polished in terms of code readability and test coverage which is great to see. I think the author(s) did a good job on the software and in providing examples and documentation in order to get started easily. The software is easy to use and I have even shared the software with some of my colleagues who will be exploring its usability in some of our team's projects - I think it could have some utility! Very much appreciated the thorough documentation in the code and tests written to verify functionality as well as with how to contribute, and this will go a long way towards communicating the work but also inviting open-source contributions. I have included some commentary on checkboxes below.

Installation and Installation Instructions: The project documentation provide instructions for installation via PyPi or by cloning the Github repository. After examining both setup.py, requirements.txt, and the source code from the project repository and attempting an installation in my environment, the installation completed successfully however when using import graphtransliterator the import failed due to marshmallow being a undefined import. It is one of the project's dependencies and is noted in requirements.txt, but it was not installed during installation as it was missing from a list of requirements in setup.py. I have corrected this minor error and have provided the fix in this pull request. It may help to also have the project's dependencies noted somewhere in the documentation beyond requirements.txt in the project repository, but I don't think it is a high priority. It may also be nice to list which specific versions of Python 3 have support and are tested for in readme.md and/or in the documentation - in this case, Python versions 3.5 through 3.7. The version numbers are listed in the contribution guidelines, but that's the only location I see the Python versions mentioned outside of configuration files.

Functionality: I have verified the functionality presented through the examples presented in the documentation, and the software's functionality operates as the author describes. The ambiguity checking is a particularly nice feature which could perhaps be used as a verification check if automatically generated transliteration rules. I also liked how the author provided a means to expose the underlying graph of the transliteration rules to the user for examination - whether through visualization our outside analyses. I currently do not have my own use case in which to test the software outside of the examples presented in the documentation - the editor is free to determine whether a more in-depth look at the functionality would be beneficial as part of the review.

Automated Tests: I was able to run the automated tests easily, and the tests are well documented which allowed me to quickly understand their purpose and which functionality in the software they are designed to test. I did notice a bit of unused tests which I removed in the aforementioned pull request just to clean things up a little bit.

References: The author did a good job of pointing out related software with citation and noting the differences between GraphTransliterator and those software. I did however expect some citations for the following sentences:

It [transliteration] enables the standardized organization and search of resources, as in library
systems. It also permits the encoding of additional information, which enables disambiguation
and advanced linguistic analysis, including natural language processing tasks, that are often
not possible in the original script.

A reader which is new to the field may want to explore works in which transliteration is used in these areas - citations would provide this background and ground the paper/software under review with some real-world applications. Providing some references here would provide a more complete paper around the software and its use cases.

@seanpue
Copy link

@seanpue seanpue commented Sep 17, 2019

@whedon generate pdf

@whedon
Copy link
Author

@whedon whedon commented Sep 17, 2019

Attempting PDF compilation. Reticulating splines etc...

@whedon
Copy link
Author

@whedon whedon commented Sep 17, 2019

@seanpue
Copy link

@seanpue seanpue commented Sep 17, 2019

Thanks @vc1492a for your generous and helpful review!

I have made the following changes to the oode and documentation:

  • Fixed installation issues (using your pull request)
  • Added list of dependencies to docs/installation.rst under subsection "Required Modules"
  • Added badge for supported Python versions to README.rst

I have also added references and made changes to the first paragraph of the paper following your suggestion to better elaborate the use cases of transliteration.

I hope that resolves the issues. Thanks so much. (P.S. Would love to hear later about the potential usages!)

@rlskoeser
Copy link

@rlskoeser rlskoeser commented Sep 19, 2019

I have reviewed the software, documentation, and paper. The only checklist item I'm not able to sign off on is a statement of need included with the project documentation. As I understand it, a statement of need should be included in the readme or project documentation, but the only place I could find this was in the article. I suggest adapting the summary information from the beginning of the article.

Other things I noted:

  • I was surprised not to find installation instructions included in the readme (at least the most common version); would be better if people didn't have to click through to read the docs to find that. Based on the reviewing guidelines it sounds like this should be included in the readme, but I checked it off since installation is well-documented in the read the docs documentation.
  • The detailed tutorial is well-written and very helpful, but it's difficult to cut and paste examples from the python console because of the ... included on the beginning of continuing lines. I tried to generate a file I could load using the from_yaml_file method based on the annotated sections you provide in a tutorial. First I got an error because I included the short set of rules, which looks like it has a typo (PR opened). Now that I've corrected that, I'm getting a validation error that I'm not sure how to troubleshoot. (I can't tell if this is a discrepancy between the versions in the tutorial or something I've done wrong.)
marshmallow.exceptions.ValidationError: {'rules': ['Invalid token "," in tokens of rule TransliterationRule(production=\',\', prev_classes=None, prev_tokens=None, tokens=[\',\'], next_tokens=None, next_classes=None, cost=0.5849625007211562)']}

Perhaps you could include the complete yaml file as a link that could be downloaded for use with the tutorial, and then people could compare your working version with the annotated samples in the tutorial? That would also allow you to omit the full lists of tokens rules from the tutorial, which might make it easier to follow.

  • In your contributing doc, the link on GitHub is broken but it works fine on readthedocs. I'm not sure if there's a way to make it work in both places
  • The directions for contributing say to use the Black code formatter, but it isn't included in the dev requirements. You have a note about pip installing them, but it seems like it would be better to add black to the dev requirements and update the contributing instructions to install development dependencies from that file.

@rlskoeser
Copy link

@rlskoeser rlskoeser commented Sep 19, 2019

Forgot to say: this is a very well-documented and interesting tool! I haven't had the chance to work much with languages that require transliteration, but we're starting to work on a project involving Ethiopian Ge`ez. I haven't learned much about the language yet, but will keep this in mind if a use case comes up. It's important to put a statement of need on the readme so that people like me, who don't usually work with languages that require transliteration, will be able to understand immediately the use and power of your tool.

@seanpue
Copy link

@seanpue seanpue commented Sep 19, 2019

Thank you for these excellent corrections and comments, @rlskoeser! I will work on fixing them over the weekend and will let you know when I'm done.

Do let me know how it goes with Ethiopian Ge`ez. I am working on developing a standard for Urdu and some other South Asian languages that will generate the Perso-Arabic text, with or without the Arabic diacritics, and also output in the scholarly transliteration that we use in academic writing as well as in library of congress romanization format, which I need for an archive project.

I also want to figure out a good way to bundle YAML files to the module, too, so that people can contribute them. I might add a transliterators module that reads from a directory of YAML files (or JSON) and can produce an accessible list and GraphTransliterator objects, maybe by setting __all__ programmatically in __init__.py. That will require setting some standards for the metadata field. I may model it on the fields in setup.py. I am not sure if there is any standard for that sort of thing. If you have any thoughts about that, please let me know. Thanks!

@rlskoeser
Copy link

@rlskoeser rlskoeser commented Sep 19, 2019

Bundled transliterator configurations sounds valuable. Maybe you could define a subclass of the GraphTransliterator object that looks for a named config file in a specified path (or possibly multiple paths).

If/when you add that, you'll probably also want to update your contributing document to explain how to add a new transliterator configuration. You'll also want think about what kind of testing a bundled transliterator needs (probably at least as some sanity checking?).

@seanpue
Copy link

@seanpue seanpue commented Sep 30, 2019

I'm close to finishing these corrections. It will probably take at least another week.

I have added bundled transliterators requiring tests that cover all nodes and edges of graphs, as well as OnMatchRules, if applicable. I will have to update the essay, as well.

I will also be cleaning up the documentation and other points following the excellent recommendations of @rlskoeser.

This review process has been so helpful! Thanks, @vc1492a and @rlskoeser!

@gkthiruvathukal
Copy link

@gkthiruvathukal gkthiruvathukal commented Sep 30, 2019

@seanpue: Thanks for your follow up and willingness to address all review feedback.

@vc1492a and @rlskoeser: Thanks for your thorough reviews. This feels like a model of how reviews should be done. Even though the work is clearly well-done and polished, there were minor issues raised to ensure the submission leads to a great outcome.

I feel optimistic this is heading toward acceptance and look forward to recommending same when @seanpue finishes making the minor revisions. Separately, I rarely jump in with any feedback before the reviewers do, but I felt pretty optimistic about this submission when I first saw it. It's nice to see this research area represented within JOSS.

@kthyng
Copy link

@kthyng kthyng commented Oct 22, 2019

Hi @seanpue how are your corrections coming along?

@seanpue
Copy link

@seanpue seanpue commented Oct 24, 2019

Hi @kthyng Almost done! Hopefully today.

@seanpue
Copy link

@seanpue seanpue commented Oct 24, 2019

@whedon generate pdf

@whedon
Copy link
Author

@whedon whedon commented Oct 24, 2019

Attempting PDF compilation. Reticulating splines etc...

@whedon
Copy link
Author

@whedon whedon commented Oct 24, 2019

@seanpue
Copy link

@seanpue seanpue commented Oct 24, 2019

Hi @vc1492a @rlskoeser @gkthiruvathukal

Ok, I think it's ready to go. I have added the statement of need, updated the article, and also added bundled transliterators, a tutorial on adding bundled transliterators, and a command line interface. I have used jupyter-sphinx so that the docs automatically generate the sample code, and the code can now be cut and pasted or downloaded.

@ooo
Copy link

@ooo ooo bot commented Oct 24, 2019

👋 Hey @seanpue...

Letting you know, @gkthiruvathukal is currently OOO until Sunday, October 27th 2019. ❤️

@seanpue
Copy link

@seanpue seanpue commented Nov 8, 2019

Hi @vc1492a @rlskoeser @gkthiruvathukal Wanted to check to see if you've had a chance to review the updates. Thanks.

@gkthiruvathukal
Copy link

@gkthiruvathukal gkthiruvathukal commented Nov 8, 2019

@seanpue It looks like we are nearing a conclusion here! It would be great if @vc1492a and @rlskoeser can chime in and take a final look. If I don't hear any new issues, say, by Monday, I think we are ready to move toward acceptance.

@vc1492a
Copy link

@vc1492a vc1492a commented Nov 8, 2019

@seanpue must have missed the notification previously, apologies! I took a look at the repository and the updates look good to me. I feel that this work is ready for publication in JOSS.

@seanpue
Copy link

@seanpue seanpue commented Nov 21, 2019

Thanks again @vc1492a @rlskoeser Do you think you'll get a chance to review the changes? I believe I have addressed your concerns.

@Kevin-Mattheus-Moerman
Copy link
Member

@Kevin-Mattheus-Moerman Kevin-Mattheus-Moerman commented Nov 30, 2019

@whedon set 10.5281/zenodo.3558365 as archive

@whedon
Copy link
Author

@whedon whedon commented Nov 30, 2019

OK. 10.5281/zenodo.3558365 is the archive.

@Kevin-Mattheus-Moerman
Copy link
Member

@Kevin-Mattheus-Moerman Kevin-Mattheus-Moerman commented Nov 30, 2019

@whedon set version as v1.0.1

@Kevin-Mattheus-Moerman
Copy link
Member

@Kevin-Mattheus-Moerman Kevin-Mattheus-Moerman commented Nov 30, 2019

@whedon set v1.0.1 as version

@whedon
Copy link
Author

@whedon whedon commented Nov 30, 2019

OK. v1.0.1 is the version.

@Kevin-Mattheus-Moerman
Copy link
Member

@Kevin-Mattheus-Moerman Kevin-Mattheus-Moerman commented Nov 30, 2019

@seanpue can you please change the Zenodo archive metadata such that the title and author list matches your paper?
Can you confirm the version is now v1.0.1?

@seanpue
Copy link

@seanpue seanpue commented Nov 30, 2019

@whedon set v1.0.4 as version

@whedon
Copy link
Author

@whedon whedon commented Nov 30, 2019

I'm sorry @seanpue, I'm afraid I can't do that. That's something only editors are allowed to do.

@seanpue
Copy link

@seanpue seanpue commented Nov 30, 2019

@whedon generate pdf

@whedon
Copy link
Author

@whedon whedon commented Nov 30, 2019

Attempting PDF compilation. Reticulating splines etc...

@whedon
Copy link
Author

@whedon whedon commented Nov 30, 2019

@seanpue
Copy link

@seanpue seanpue commented Nov 30, 2019

@Kevin-Mattheus-Moerman I have fixed the Zenodo metadata. Please update the version to 1.0.4 Thanks!

@Kevin-Mattheus-Moerman
Copy link
Member

@Kevin-Mattheus-Moerman Kevin-Mattheus-Moerman commented Dec 1, 2019

@whedon set v1.0.4 as version

@whedon
Copy link
Author

@whedon whedon commented Dec 1, 2019

OK. v1.0.4 is the version.

@Kevin-Mattheus-Moerman
Copy link
Member

@Kevin-Mattheus-Moerman Kevin-Mattheus-Moerman commented Dec 1, 2019

@whedon accept

@whedon
Copy link
Author

@whedon whedon commented Dec 1, 2019

Attempting dry run of processing paper acceptance...

@whedon
Copy link
Author

@whedon whedon commented Dec 1, 2019


OK DOIs

- 10.1007/s12046-018-0828-8 is OK
- 10.18653/v1/w18-2409 is OK

MISSING DOIs

- None

INVALID DOIs

- None

@whedon
Copy link
Author

@whedon whedon commented Dec 1, 2019

Check final proof 👉 openjournals/joss-papers#1134

If the paper PDF and Crossref deposit XML look good in openjournals/joss-papers#1134, then you can now move forward with accepting the submission by compiling again with the flag deposit=true e.g.

@whedon accept deposit=true

@Kevin-Mattheus-Moerman
Copy link
Member

@Kevin-Mattheus-Moerman Kevin-Mattheus-Moerman commented Dec 1, 2019

@seanpue

@Kevin-Mattheus-Moerman The PDF is not showing my initials in the footer, ...

This is normal

@Kevin-Mattheus-Moerman
Copy link
Member

@Kevin-Mattheus-Moerman Kevin-Mattheus-Moerman commented Dec 1, 2019

@whedon accept deposit=true

@whedon whedon added the accepted label Dec 1, 2019
@whedon
Copy link
Author

@whedon whedon commented Dec 1, 2019

Doing it live! Attempting automated processing of paper acceptance...

@whedon
Copy link
Author

@whedon whedon commented Dec 1, 2019

🐦🐦🐦 👉 Tweet for this paper 👈 🐦🐦🐦

@whedon
Copy link
Author

@whedon whedon commented Dec 1, 2019

🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨

Here's what you must now do:

  1. Check final PDF and Crossref metadata that was deposited 👉 openjournals/joss-papers#1135
  2. Wait a couple of minutes to verify that the paper DOI resolves https://doi.org/10.21105/joss.01717
  3. If everything looks good, then close this review issue.
  4. Party like you just published a paper! 🎉🌈🦄💃👻🤘

Any issues? notify your editorial technical team...

@Kevin-Mattheus-Moerman
Copy link
Member

@Kevin-Mattheus-Moerman Kevin-Mattheus-Moerman commented Dec 1, 2019

@openjournals/dev Can you check why this DOI is still not resolving?

@danielskatz
Copy link

@danielskatz danielskatz commented Dec 2, 2019

This DOI resolves, so I'm going to close this issue
Thanks to @gkthiruvathukal for editing and @rlskoeser & @vc1492a for reviewing, and congratulations to @seanpue

@whedon
Copy link
Author

@whedon whedon commented Dec 2, 2019

🎉🎉🎉 Congratulations on your paper acceptance! 🎉🎉🎉

If you would like to include a link to your paper from your README use the following code snippets:

Markdown:
[![DOI](https://joss.theoj.org/papers/10.21105/joss.01717/status.svg)](https://doi.org/10.21105/joss.01717)

HTML:
<a style="border-width:0" href="https://doi.org/10.21105/joss.01717">
  <img src="https://joss.theoj.org/papers/10.21105/joss.01717/status.svg" alt="DOI badge" >
</a>

reStructuredText:
.. image:: https://joss.theoj.org/papers/10.21105/joss.01717/status.svg
   :target: https://doi.org/10.21105/joss.01717

This is how it will look in your documentation:

DOI

We need your help!

Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us please consider doing either one (or both) of the the following:

@whedon whedon added published recommend-accept labels Mar 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted published recommend-accept review
Projects
None yet
Development

No branches or pull requests

8 participants