
[REVIEW]: Local clustering #960

Closed
18 tasks done
whedon opened this issue Sep 17, 2018 · 32 comments
Labels: accepted, published, recommend-accept, review

whedon commented Sep 17, 2018

Submitting author: @volfpeter (Péter Volf)
Repository: https://github.com/volfpeter/localclustering
Version: v0.10
Editor: @pjotrp
Reviewer: @DannyArends
Archive: 10.5281/zenodo.1443551


Status badge code:

HTML: <a href="http://joss.theoj.org/papers/c6859cfde227c60d4d1836bc24ade7b4"><img src="http://joss.theoj.org/papers/c6859cfde227c60d4d1836bc24ade7b4/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/c6859cfde227c60d4d1836bc24ade7b4/status.svg)](http://joss.theoj.org/papers/c6859cfde227c60d4d1836bc24ade7b4)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@DannyArends, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.theoj.org/about#reviewer_guidelines. Any questions/concerns please let @pjotrp know.

Please try to complete your review within the next two weeks.

Review checklist for @DannyArends

Conflict of interest

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the repository url?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Version: Does the release version given match the GitHub release (v0.10)?
  • Authorship: Has the submitting author (@volfpeter) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems)?
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Authors: Does the paper.md file include a list of authors with their affiliations?
  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?

whedon commented Sep 17, 2018

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @DannyArends it looks like you're currently assigned as the reviewer for this paper 🎉.

⭐ Important ⭐

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository, which means that with GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' at https://github.com/openjournals/joss-reviews

  2. You may also like to change your default settings for watching repositories in your GitHub profile: https://github.com/settings/notifications

For a list of things I can do to help you, just type:

@whedon commands


whedon commented Sep 17, 2018

Attempting PDF compilation. Reticulating splines etc...


whedon commented Sep 17, 2018


pjotrp commented Sep 17, 2018

@volfpeter to expedite the review process, do you mind going through the above list of checkboxes and making sure they can be ticked (you can't tick them yourself)? Ping us here when you are done.

@volfpeter

@pjotrp I've gone through the checklist. I think everything is fine, but here are a couple of notes:

  • Version: The latest release is v0.10 and the code itself has not changed since that release, but obviously the paper is not included in that release. If the published release must contain the paper, then I will have to update the version.
  • Automated tests: There are only manual "tests" in the form of the IPython notebook and demo.py. I hope that will be enough.
  • References: Do I need to look up the DOI of all citations? If so, I will get to it.


pjotrp commented Sep 18, 2018

Just create a new release when we are ready. Tests should be fine. DOIs, please add.

@DannyArends you can start review. Thanks!


DannyArends commented Sep 18, 2018

Ticked all the boxes I could; in general the software is in a good state. However, how to use the library, what its advantages are, and the use cases of this clustering algorithm compared to already available methods remain a mystery to me.

I would advise the author to additionally take a close look at the language used in the paper (although this is outside the scope of the review); I added some comments in the additional section below.

Furthermore I was not able to install the software yet, I will do that after the issues listed below are taken care of.

Functionality - Installation

The software only provides a short list of dependencies; no clear install guide is provided. Please provide an install guide outlining the steps a new user needs to take to install the software, even if it is as simple as:

    pip install graphscraper
    git clone https://github.com/volfpeter/localclustering.git

Then what do I need to do? Do I need to include the library? Does it need compilation/installation? Do I need to update my path variable?

Additionally, it is not specified whether this requires a certain OS: is it Windows, Linux, and macOS compatible? Has it been tested on multiple platforms?
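As a sketch, the requested install guide might look something like the steps below. These exact commands are an assumption (whether `pip install .` works depends on the repository's setup.py) and would need to be confirmed against the repository:

```shell
# Hypothetical install walkthrough (steps are assumptions; confirm
# against the repository's setup.py before adopting them).
pip install graphscraper                                   # dependency
git clone https://github.com/volfpeter/localclustering.git
cd localclustering
pip install .                                              # the library itself
python demo.py                                             # smoke-test the demo
```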

Documentation - A statement of need

No statement of need is provided; please make clear in the readme for whom the software in this repository is useful: e.g. is it aimed at biologists, mathematicians, or physicists?

Documentation - Installation instructions

There is no clear list of dependencies, and the dependencies for the library, the demo and the Python notebook are different. Furthermore, it is unclear which version of the "GraphScraper library" is required, since the readme file links both to the PyPI version (https://pypi.python.org/pypi/graphscraper) and to the author's fork of the same software (https://github.com/volfpeter/graphscraper).

A solution would be to split the dependencies section into three parts outlining the requirements for the library, the demo and the IPython notebook, and then give instructions on how to install the dependencies.

Documentation - Automated tests

No automated tests are provided, and no instructions are given in the readme on how to run manual tests of the library.

Documentation - Community guidelines

Point 2) Report issues or problems with the software, and point 3) Seek support, still need to be addressed.

Software paper - Authors

The author is listed, but the affiliation is given as "None"; if no affiliation exists, it would be more professional to mark the author as "independent developer".

Software paper - A statement of need

Same comment as before in the documentation section: please provide a statement of need.

Software paper - References

No DOIs are provided for the references

Additional

The text in the paper is of low quality in some areas, and the author might want to improve the text before publication, for example:

"Here are some ideas for future work:"
"These are the main resources besides the source code:"

Furthermore, no comparison is made between the current clustering algorithm and known clustering algorithms: how is this algorithm different from K-Means, Mean-Shift, DBSCAN, EM Clustering with Gaussian Mixture Models (GMM) or Agglomerative Hierarchical Clustering?

To me it is not clear what advantages this clustering algorithm has, and it is not clear what
the author means by improving the "cluster's quality". How is this measured?

Furthermore, the phrase "they define a set of requirements cluster definitions must fulfill" is
very vague, and no explanation (or link to an example) is given how the user must do this.

@volfpeter

Thanks for the comments! I'll start fixing these issues and get back to you when it's done.


pjotrp commented Sep 26, 2018

Hi @volfpeter, you had great momentum. Don't let it slip :). I think @DannyArends has some good points, but they are not intrusive. I think most of the work is to provide a clear install path (we require that) and a bit more contextual information on other clustering algorithms. I don't think a full comparison is required. Also tests are optional.

@volfpeter

Hi @pjotrp, I didn't have time last week, but I started working on it this week. I decided to completely rewrite and significantly extend the readme of the project, because it was really lacking in some areas and only this review made that obvious to me (it was OK for someone who knew how to use the software, i.e. me, but it was not very useful for others)...
Regarding the comparison, I haven't made a decision on that part yet. All the algorithms mentioned in the review (and the ones referenced in the paper) are global clustering algorithms that group all the nodes of a graph into clusters, while this one is local in the sense that it aims to find the cluster of a small set of nodes (usually one) without even looking at the whole graph. It's a bit like comparing apples and oranges. Anyway, I'll think of something, maybe clustering the Zachary karate club graph, which is well-known and analyzed by global algorithms.
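To make the local-vs-global distinction concrete, here is a minimal illustrative sketch of greedy cluster expansion from a seed node. This is not the localclustering library's algorithm (its engine and cluster definitions differ); it only shows how a cluster can be grown around a seed without visiting the whole graph:

```python
# Illustrative greedy *local* clustering: grow a cluster from a seed by
# repeatedly absorbing the frontier node with the most edges into the
# cluster, stopping when the cluster's edge density would drop.
# NOTE: a toy sketch, NOT the algorithm implemented in localclustering.

def density(adj, cluster):
    """Edge density of the induced subgraph (1.0 for trivial clusters)."""
    n = len(cluster)
    if n < 2:
        return 1.0
    internal = sum(1 for u in cluster for v in adj[u] if v in cluster) // 2
    return internal / (n * (n - 1) / 2)

def local_cluster(adj, seed, max_size=10):
    cluster = {seed}
    frontier = set(adj[seed])
    while frontier and len(cluster) < max_size:
        # Candidate with the most edges into the cluster (ties: lowest id).
        best = max(sorted(frontier),
                   key=lambda n: sum(1 for m in adj[n] if m in cluster))
        if density(adj, cluster | {best}) < density(adj, cluster):
            break  # absorbing the best candidate would dilute the cluster
        cluster.add(best)
        frontier |= set(adj[best]) - cluster
        frontier.discard(best)
    return cluster

# Two triangles joined by a single bridge edge (2-3): the expansion
# stays on the seed's side of the bridge.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [2, 4], 4: [3, 5, 6], 5: [4, 6], 6: [4, 5]}
print(local_cluster(adj, 0))  # cluster grown around node 0
```

Note how only the seed's neighbourhood is ever inspected, which is the sense in which such algorithms are "local".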


pjotrp commented Sep 26, 2018

Excellent. That is what review is for - to improve the paper :)

@volfpeter

@DannyArends I've updated the readme. Hopefully it's OK now, although I'm not sure you will be satisfied with the "statement of need" paragraph.
I'll update the paper soon; it's not ready yet.


DannyArends commented Oct 1, 2018

@volfpeter Looks a lot better already. I checked all the boxes, except "automated tests", since @pjotrp mentioned that "Also tests are optional." I could just tick it if needed.

A couple of minor things:
In the README.md, please specify the language for the example code blocks, so that GitHub does code highlighting, by specifying the language after the opening ``` fence, i.e. changing ``` to ```python:

from localclustering.definitions.connectivity import ConnectivityClusterDefinition
from localclustering.localengine import LocalClusterEngine

compared with (not the best example, but I think you'll get the idea):

```python
from localclustering.definitions.connectivity import ConnectivityClusterDefinition
from localclustering.localengine import LocalClusterEngine
```

There is a missing space in the README.md: "a detailed descriptionand". There is also a minor wording error: change 'in' to 'by' in the following sentence: "contact the repository owner in email".

Anyway, it looks good to me, just update the paper (if you want I could proofread it for you) and if @pjotrp confirms his "tests are optional" statement I will check the last box.

Danny

P.S. I noticed there is still one reference without a DOI; however, this paper from 2000 does not have a DOI assigned to it. Is this OK, @pjotrp?

@volfpeter

I've fixed the readme.

The reference that doesn't have a DOI is a technical report. I have found DOIs for related papers, but I wanted to cite the technical report because that's what I read back in the day.

Regarding the tests, the only related thing in the repository is the demo module. It's easy to run if you went through the Getting started section of the readme. You will receive a very detailed error message saying you need to install the cairo library and its Python binding (pycairo); then it will run.

One more thing: I've released a new version (0.11) that fixes the setup.py file, so that's the latest version right now.

Hopefully I will have time to finalize and commit the paper today.


DannyArends commented Oct 1, 2018

@volfpeter I think it's more about regression/CI tests, e.g. when using Travis CI.

These kinds of tests let you know when a change to the code breaks the results compared to a previous version. I use them for the CTLmapping repository: every time I push to the master branch, the code is checked to make sure it can still install, and tests are run to see if the mapping algorithm still produces the same output as before. See for example:

https://github.com/DannyArends/CTLmapping/blob/master/Rctl/tests/test_openmp.R

In this code my own OpenMP correlation algorithm is compared to the built-in correlation algorithm from R. If my algorithm produces different results than the R version:

if(!all(res.ref == res.cor)) stop("ref != cor\n") # When results differ, stop

the test is stopped (stop is R lingo for raising an error), and Travis CI will email me that the latest changes have broken the algorithm.

You could do something like this for the clustering, in which you calculate a base case, which you then test.
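In Python, such a regression test could be sketched along these lines; `run_clustering` here is a hypothetical stand-in, not the localclustering API, and the baseline would be recorded from a known-good run:

```python
# Sketch of a regression test: compare the clustering output against a
# stored baseline from a known-good run. `run_clustering` is a
# hypothetical stand-in, NOT the localclustering API.

def run_clustering(graph):
    # Placeholder for the real library call: label every node with the
    # smallest node id of its (precomputed) component.
    return {node: min(component) for node, component in graph.items()}

def test_clustering_matches_baseline():
    graph = {0: {0, 1}, 1: {0, 1}, 2: {2}}
    baseline = {0: 0, 1: 0, 2: 2}  # recorded from a known-good run
    result = run_clustering(graph)
    # Fail loudly when the output drifts from the baseline, so CI
    # (e.g. Travis) flags the breaking change.
    assert result == baseline, "clustering output changed: %r" % result

test_clustering_matches_baseline()
```

Hooked into CI, the assertion failure is what turns "the algorithm silently changed" into a red build.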

@pjotrp Is this a requirement or a nice-to-have feature for JOSS?

Additionally, it gives you the opportunity to add the nice green "build passing" icon to your repository.


pjotrp commented Oct 1, 2018

tests are optional.

@DannyArends

OK, thanks for the confirmation; I checked all the boxes.
After the paper is updated, everything should be good to go.

Danny


pjotrp commented Oct 1, 2018

Yup. If there is no DOI there is none ;)

@volfpeter

@DannyArends I've committed the updated version of the paper and regenerated the PDF. Please review it and let me know if further updates or fixes are needed. Thanks!

@DannyArends

Submitted some minor changes to the text; seems good to me.
@pjotrp After regenerating the paper, it's ready for publication in my opinion.

@volfpeter

@DannyArends Thanks a lot for the fixes! I've merged the pull request.
@DannyArends @pjotrp Thanks for all the help during the pre-review and review process! It was a much better experience than I had expected.


pjotrp commented Oct 3, 2018

@whedon generate pdf


whedon commented Oct 3, 2018

Attempting PDF compilation. Reticulating splines etc...


whedon commented Oct 3, 2018


pjotrp commented Oct 3, 2018

Excellent. The submission is slightly over 1000 words at 1200 words, but in this case I think it serves its purpose (@arfon is that OK?).

We need you to deposit a copy of your software repository (including any revisions made during the JOSS review process) with a data-archiving service. To do so:

  1. Create a GitHub release of the current version of your software repository
  2. Deposit that release with Zenodo, figshare, or a similar DOI issuer.
  3. Post a comment here with the DOI for the release.


arfon commented Oct 3, 2018

Excellent. The submission is slightly over 1000 words at 1200 words, but in this case I think it serves its purpose (@arfon is that OK?).

👍 LGTM

@volfpeter

I've submitted the repository to Zenodo:

  • DOI for "paper" release: 10.5281/zenodo.1443551
  • DOI for all releases: 10.5281/zenodo.1443550

The repository version of the "paper" release is 0.12.1.


pjotrp commented Oct 3, 2018

@whedon set 10.5281/zenodo.1443551 as archive


whedon commented Oct 3, 2018

OK. 10.5281/zenodo.1443551 is the archive.


pjotrp commented Oct 3, 2018

@arfon ready to R&R

@arfon arfon added the accepted label Oct 3, 2018

arfon commented Oct 3, 2018

@DannyArends - many thanks for your review here and to @pjotrp for editing this submission ✨

@volfpeter - your paper is now accepted into JOSS and your DOI is https://doi.org/10.21105/joss.00960 ⚡ 🚀 💥

@arfon arfon closed this as completed Oct 3, 2018

whedon commented Oct 3, 2018

🎉🎉🎉 Congratulations on your paper acceptance! 🎉🎉🎉

If you would like to include a link to your paper from your README use the following code snippets:

Markdown:
[![DOI](http://joss.theoj.org/papers/10.21105/joss.00960/status.svg)](https://doi.org/10.21105/joss.00960)

HTML:
<a style="border-width:0" href="https://doi.org/10.21105/joss.00960">
  <img src="http://joss.theoj.org/papers/10.21105/joss.00960/status.svg" alt="DOI badge" >
</a>

reStructuredText:
.. image:: http://joss.theoj.org/papers/10.21105/joss.00960/status.svg
   :target: https://doi.org/10.21105/joss.00960

This is how it will look in your documentation:


We need your help!

Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us, please consider doing either one (or both) of the following:

@whedon whedon added published Papers published in JOSS recommend-accept Papers recommended for acceptance in JOSS. labels Mar 2, 2020