Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW]: OrNet - a Python Toolkit to Model the Diffuse Structure of Organelles as Social Networks #1983

Open
whedon opened this issue Dec 22, 2019 · 10 comments
Assignees
Labels

Comments

@whedon
Copy link
Collaborator

@whedon whedon commented Dec 22, 2019

Submitting author: @Marcdh3 (Marcus Hill)
Repository: https://github.com/quinngroup/ornet-JOSS
Version: v0.1
Editor: @akeshavan
Reviewer: @serine, @vc1492a
Archive: Pending

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/6fbce182256576a0dd3d35c26f83a9ef"><img src="https://joss.theoj.org/papers/6fbce182256576a0dd3d35c26f83a9ef/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/6fbce182256576a0dd3d35c26f83a9ef/status.svg)](https://joss.theoj.org/papers/6fbce182256576a0dd3d35c26f83a9ef)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@serine & @vc1492a, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @akeshavan know.

Please try and complete your review in the next two weeks

Review checklist for @serine

Conflict of interest

  • I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the repository url?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@Marcdh3) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

Review checklist for @vc1492a

Conflict of interest

  • I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the repository url?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@Marcdh3) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?
@whedon

This comment has been minimized.

Copy link
Collaborator Author

@whedon whedon commented Dec 22, 2019

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @serine, @vc1492a it looks like you're currently assigned to review this paper 🎉.

⭐️ Important ⭐️

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

watching

  1. You may also like to change your default settings for this watching repositories in your GitHub profile here: https://github.com/settings/notifications

notifications

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf
@whedon

This comment has been minimized.

Copy link
Collaborator Author

@whedon whedon commented Dec 22, 2019

Attempting to check references...
@whedon

This comment has been minimized.

Copy link
Collaborator Author

@whedon whedon commented Dec 22, 2019

Attempting PDF compilation. Reticulating splines etc...
@whedon

This comment has been minimized.

Copy link
Collaborator Author

@whedon whedon commented Dec 22, 2019


OK DOIs

- 10.1101/522763 is OK

MISSING DOIs

- None

INVALID DOIs

- None
@whedon

This comment has been minimized.

Copy link
Collaborator Author

@whedon whedon commented Dec 22, 2019

@vc1492a

This comment has been minimized.

Copy link
Collaborator

@vc1492a vc1492a commented Jan 7, 2020

Overview: This was an interesting piece of software to review as its use case it outside of my typical area, but makes use of many Python software packages I use in my work and side projects. I appreciate the authors commitment to providing docstring throughout the code base which allowed me to dive in more deeply and provide some suggestions for further improvement.

Below are my comments on particular sections, of which most (if not all) are relatively minor. When the authors strictly meet the requirements laid out in the review checklist, I have marked these items as complete. For those where the requirement is not strictly met, I have left the checklist entry empty. Either way, I have provided some comments below with actions that should be taken (in the case of unchecked entries) and actions that could be taken to improve the work further but that aren't strictly required for submission to JOSS (in the case of checked entries).

Repository URL: While the software is available at the repository URL, the URL listed in this issue as part of the review textually does not match the destination (https://github.com/quinngroup/ornet-JOSS vs https://github.com/quinngroup/ornet). This should be noted and corrected prior to publication to JOSS by the editor.

Installation and Installation Instructions: While the installation proceeds without issue according to the instructions provided in readme.md and has clearly listed dependencies in both readme.md and setup.py, I did not see a requirements.txt file that is typical with Python software that can be used to install the necessary dependencies in the following way:

pip install -r requirements.txt

It is recommended that the authors include a requirements.txt file in their repository, and I have provided a version of this file as part of this pull request (which can be generated using the command pip freeze > requirements.txt). The authors should check this file and make any adjustments as necessary.

In checking the agreement between the dependencies in readme.md and setup.py, I noticed that the opencv-python dependency in setup.py is commented and not included as part of a fresh install. If it's possible to install using setup.py, then the line should be uncommented such that the installation can proceed in that way. I noticed that it is a required package for running the tests, so I have fixed the issue in setup.py as part of this pull request.

Lastly, it would be beneficial for the authors to consider using semantic versioning if not doing so already, consider releasing their software on PyPi for easy installation using a command like pip install ornet, and tagging releases on git and uploading those releases to Github. It may be beneficial to tag the releases that are uploaded to Github and/or PyPi versioned using semantic versioning such that users can download the software directly from Github via a release as opposed to having to clone or pull the repository.

Functionality, Functionality Documentation, and Example Usage: Running the samples took quite a bit of time on the machine in which I was using to review the software (greater than 11 hours, almost all in video processing), which is energy-efficient and not designed for large workloads (e.g. 1.1 Ghz Intel Core m3). It may be beneficial for the authors to note the expected duration of running OrNet on the sample videos and masks or provide suggested hardware requirements.

The authors did include sample (example) files for understanding how to use the functionality of OrNet and included the necessary commands needed to use the software (e.g. functionality documentation). However, it would be good to explicitly note in the documentation (e.g. in readme.md) that there are examples in the samples directory in which to try out the software's functionality. This is implied by the existence of the samples directory but it isn't explicit in the documentation.

Automated Tests: After ensuring opencv-python was installed, I was able to run the commands that execute the tests according to the instructions provided by the authors. However, one of the 7 tests fail when running on my machine (macOS version 10.14.6 with Python version 3.7.2), specifically line 58 in ornet_tests.py. A subsequent run however showed that this test succeeded, so there is some inconsistency that may want to be examined. The authors should examine this prior to publication in JOSS. The results show 7 tests with 1 failure, but the readme states that there are 6 tests - this discrepancy ought to be addressed as well.

If the authors are looking to have others contribute to the project more easily in the future, they may want to consider setting up a continuous integration pipeline that will automatically run tests and report the test results (e.g. using Travis CI) and unit test coverage (e.g. using Coveralls) with any pushes to master or dev and for any pull requests opened by open-source contributors. This pipeline would ease the burden on making sure new additions and/or changes to software work as intended, and would also allow maintainers to formally test against every version of Python 3 that should be supported. This would allow the authors to state "this software works in Python versions 3.7 - 3.8" as opposed to simply stating something along the lines of "works in Python >=3.7" in the readme. I see that this is an open issue already.

State of the Field: The paper does a great job referencing how the software has been built on top of already-existing and well-utilized Python libraries such as scikit-learn or matplotlib. In addition, the paper makes reference to a C software called ITK-SNAP to generate masks for analysis by OrNet. However, the software paper does not explicitly state whether OrNet represents a new capability in the field or whether it ought to be compared with any existing software packages. The authors may want to consider being more explicit in the wording of the paper in this area - if its a new capability, explicitly make the claim. Otherwise, compare OrNet to any existing software packages that allow researchers to perform a similar task. This perhaps could be as simple as stating that previously researchers were required to build their own scripts for this task manually and on their own.

References: The two references provided are appropriately cited and relevant to this work and the software paper that is included in this submission, and are in the proper citation syntax. However, I feel that additional references would make the supplied software paper much stronger, especially for readers / reviewers that are not in the authors' field of study. For example, the first paragraph states "Fluorescent tagging has shed light on mitochondria's spatial orientation...", but does not provide a reference that indicates the first occurrence of this or any seminal works. Additionally, while it may seem odd at first, the Python language was originally presented at a conference and can thus be referenced (I have supplied this reference in paper.bib as part of this pull request, but paper.md should be amended by the authors). Similarly, scikit-learn should be referenced as well as software in the scipy ecosystem (I have supplied the scikit-learn reference in paper.bib as part of this pull request, but paper.md should be amended by the authors for this as well as scipy prior to publication).

Community Guidelines: The authors have a section dedicated in the readme for users of the software to request changes to the software, report any issues during use, and ask any questions (to be performed by opening issues within the Github repository). However, the section does not outline specifically how users and/or developers may contribute to the software for future use. It may benefit the authors / developers to include more specific contribution instructions, such as asking users to first fork the repository and then create a new branch with a name that describes the issue or new feature before submitting a pull request. This may result in some additional organization and encourage others to contribute to the software.

Other: Running the samples takes a very long time on hardware that isn't optimized for video processing, like the hardware I used in my review (just over 11 hours to normalize the input video). There are perhaps a few areas in the code base that could take advantage of multi-processing or parallelism to speed up computation, which would be helpful in a practical sense, and I highly suggest introducing progress bars or other console output that will help inform the user of how much progress has been made when using OrNet. I've performed some code formatting and cleanup that is consistent with PEP8 formatting as part of the aforementioned pull request. Other small tweaks which could be made: the gamma parameter is not included in the docstring for the function multivariate_hellinger in measure.py (as well as other parameters in other files such as run_gmm.py that are not in dosctring), using typing for functions which was introduced in Python 3.5, and removing unused code in extract_cells.py.

I very much look forward to seeing this software mature, and it was fun to review!

@vc1492a vc1492a mentioned this issue Jan 7, 2020
@Marcdh3

This comment has been minimized.

Copy link

@Marcdh3 Marcdh3 commented Jan 9, 2020

@vc1492a Thank you so much for the detailed review of our software! We really appreciate the effort you made into making that pull request. As of now, that request has been merged, and we are working on incorporating all of the other changes you pointed out. Also, we have improved the median normalization code so it no longer takes hours but minutes, so if you ever get free time and want to play around with it again it should be a much better experience. We will definitely update everyone once all of those changes have been made.

@serine

This comment has been minimized.

Copy link
Collaborator

@serine serine commented Jan 16, 2020

Functionality

  • Installation: Does installation proceed as outlined in the documentation?

In the current form software doesn't install. It is missing:

  • opencv-python module is missing, had to pip install opencv-python
    I've noticed that setup.py has it, but the line is commented out

  • sudo apt-get install python3-tk This is annoying because it requires system wide installation, could be hard for many users. I recommend looking into packaging with conda

  • Functionality: Have the functional claims of the software been confirmed?

Work in progress as I haven't been able to get results yet..

  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

There doesn't appears to be any claims regarding software performance, so this one is ok

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?

From reading a README.md in the repo it is hard to get an idea what the tool does. In general I'm still not clear if the tool is for some pre-processing or it is meant to yield some biologically meaningful / interpretable results.

One thing that concerns me is a statement in the paper.md that this is a "prototype" software and there some light hints that a new software might come to exists? I'll try to provide more comments in the next session Sortware paper but one of the main purposes of this review is for you to be able to say that you have a functioning software/tool that people in the world can use and we (JOSS reviews) have tested it out and thumbed up. So if you are planing to abundant this tool in the near future, the effort seems wasteful.

  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.

for linux (ubuntu at least) this additional package is required python3-tk system install

It is probably good idea list all python packages under "dependencies" section

  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).

There is an example folder, however little description given about the video files. It would be nice if there was more explanation about the video files and what the tool will do to the them in terms of processing and the expected results. In addition author seems to assume that the user will know how to do the masking and generate vtk file. I think it would be helpful to include a reference link to a external resource on how to generating vtk files if one doesn't know. Also some indication of the runtime would be very helpful.

  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?

  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?

I'm confused about stdout/stderr from the tests. I think those unclosed file should be fixed..?

2
unclosed file <_io.BufferedWriter name=4>
unclosed file <_io.BufferedReader name=5>
unclosed file <_io.BufferedReader name=7>
...unclosed file <_io.BufferedWriter name=4>
unclosed file <_io.BufferedReader name=5>
unclosed file <_io.BufferedReader name=7>
.unclosed file <_io.BufferedReader name=7>
...
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?

  • State of the field: Do the authors describe how this software compares to other commonly-used packages?

  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
    References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

Other

It would be nice to add detection of the existing output folder or simply warn users that results will be overwritten

There probably need some mention of the resources required for tool to run. i.e 1 cpu or several. Also RAM footprint. I've noticed that on mdivi sample, ornet was taking just over 5Gb. I'm guessing RAM usage will go up with the file (video) size. What is your "typical" amount of raw data and processed? I guess folks in the field should know, however I still feel that this information will be useful. Also mention runtime.. seems like it is taking a long time.
Also is there a need for large tmp/scratch space? Just thinking about running this scrip on the cluster/HPC ..

Does the video format matters? you should probably state file format dependency either way

This is being rather picky, but those spaces in file names are rather annoying for command line tool? Your sample videos have brackets and spaces :).

When you are working with image data, do you get a "movie" file e.g avi, from the instrument or you get bunch of images that you then convert into a "movie" file, which you then use with your tool?
It would be nice to have a more clear view of the process.

I think python version is too stick? The tool seems to work on python version 3.5 and 3.6.7.. It doesn't really matter for the review process, but you sort of limiting amount of people that are potentially could use your tool

Do you need this print statement src/extract_cells.py:52: print(number_of_segments) ?

It gives a number on a console without much context.. If this is a meaningful message you should probably prefix a short message i.e

print("MSG: Number of segments %s" % number_of_segments, file = sys.stderr)

or something similar, or just comment out for debugging purposes later

I think from usability point of view it would be great to change

python -m ornet.Pipeline

to simply ornet. I've send you PR

@Marcdh3

This comment has been minimized.

Copy link

@Marcdh3 Marcdh3 commented Jan 16, 2020

Hi @serine. Thank you for the detailed review as well! Many of the issues you have pointed out are actively being fixed as we speak, and we will update everyone once the last fixes are applied. As of now, I believe the last things necessary will be clearing up ambiguity in our documentation and paper, and improving our test suite. A major point needing clarification is that this software will not be abandoned in the near future. We apologize for the confusion caused by the prototype statement. It was intended to indicate that we already have a usable package and we plan on constantly improving many areas of it, but we see how it was misleading.

One interesting issue you discovered was a missing python3-tk package. After receiving feedback from @vc1492a, we've setup continuous integration using travis ci and has been able to successfully test our code against an Ubuntu environment for different python versions using our install instructions. Whenever you find free time, could you please run our latest install instructions without the python3-tk package and let us know if the problem is now resolved.

Once again, we thank everyone for their feedback, because this review feedback has been amazing in terms of improving our work. We are working hard on pushing these changes quickly, so the review process can continue smoothly.

@serine

This comment has been minimized.

Copy link
Collaborator

@serine serine commented Jan 17, 2020

@Marcdh3 yep, so python3-tk isn't required. I've removed that module, followed installed instructions and it installed fine. This is the original issue that I faced

(ins)(env) [biostation2]~/.../OrNet/ornet (master)$ python tests/ornet_tests.py                                                                                                                                    
Traceback (most recent call last):                                                                                                                                                                                 
  File "/usr/lib/python3.5/tkinter/__init__.py", line 36, in <module>                                                                                                                                              
    import _tkinter                                                                                                                                                                                                
ImportError: No module named '_tkinter'                                                                                                                                                                            
                                                                                                                                                                                                                   
During handling of the above exception, another exception occurred:                                                                                                                                                
                                                                                                                                                                                                                   
Traceback (most recent call last):                                                                                                                                                                                 
  File "tests/ornet_tests.py", line 9, in <module>                                                                                                                                                                 
    import ornet.Pipeline as pipeline                                                                                                                                                                              
  File "/home/kirill/projects/JOSS-reviews/OrNet/env/lib/python3.5/site-packages/ornet/Pipeline.py", line 19, in <module>                                                                                          
    from ornet.gmm.run_gmm import skl_gmm                                                                                                                                                                          
  File "/home/kirill/projects/JOSS-reviews/OrNet/env/lib/python3.5/site-packages/ornet/gmm/run_gmm.py", line 1, in <module>                                                                                        
    import matplotlib.pyplot as plt                                                                                                                                                                                
  File "/home/kirill/projects/JOSS-reviews/OrNet/env/lib/python3.5/site-packages/matplotlib/pyplot.py", line 2372, in <module>                                                                                     
    switch_backend(rcParams["backend"])                                                                                                                                                                            
  File "/home/kirill/projects/JOSS-reviews/OrNet/env/lib/python3.5/site-packages/matplotlib/pyplot.py", line 207, in switch_backend                                                                                
    backend_mod = importlib.import_module(backend_name)                                                                                                                                                            
  File "/home/kirill/projects/JOSS-reviews/OrNet/env/lib/python3.5/importlib/__init__.py", line 126, in import_module                                                                                              
    return _bootstrap._gcd_import(name[level:], package, level)                                                                                                                                                    
  File "/home/kirill/projects/JOSS-reviews/OrNet/env/lib/python3.5/site-packages/matplotlib/backends/backend_tkagg.py", line 1, in <module>                                                                        
    from . import _backend_tk                                                                                                                                                                                      
  File "/home/kirill/projects/JOSS-reviews/OrNet/env/lib/python3.5/site-packages/matplotlib/backends/_backend_tk.py", line 5, in <module>                                                                          
    import tkinter as Tk                                                                                                                                                                                           
  File "/usr/lib/python3.5/tkinter/__init__.py", line 38, in <module>                                                                                                                                              
    raise ImportError(str(msg) + ', please install the python3-tk package')                                                                                                                                        
ImportError: No module named '_tkinter', please install the python3-tk package

using python 3.6.7 fixed that issue.

However you should provide more explicit installation steps. This is what I had to do

virtualenv -p /opt/python-3.6.7/bin/python3 env
source env/bin/activate
git clone https://github.com/quinngroup/ornet
cd ornet
pip install -r requirements.txt
pip install .
cd tests
python test_pipeline.py

Yep as you've mentioned some work on the docs and you should be all good.

Cheers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.