New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[REVIEW]: Clustergram: Visualization and diagnostics for cluster analysis #5240
Comments
Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks. For a list of things I can do to help you, just type:
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
|
|
Wordcount for |
|
👋🏼 @martinfleis, @csadorf, @gagolews - this is the review thread for the submission. All of our communications will happen here from now on. Please check the post at the top of the issue for instructions on how to generate your own review checklist. As you go over the submission, please check any items that you feel have been satisfied. There are also links to the JOSS reviewer guidelines. The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues directly in the software repository. If you do so, please mention this thread so that a link is created (and I can keep an eye on what is happening). Please also feel free to comment and ask questions in this thread. It is often easier to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package. Please feel free to ping me (@csoneson) if you have any questions or concerns. Thanks! |
I'll complete the review in 2-3 weeks. |
Review checklist for @gagolewsConflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
CommentsAn interesting contribution, but with a "but". Clustergrams are rather simple – merely a dozen or so lines of plotting code https://github.com/martinfleis/clustergram/blob/main/clustergram/clustergram.py – therefore the scope of the package is quite limited... I am unsure if it is not a "missed opportunity" for something more substantial. I wonder if the Python community would benefit more from the current author's contribution to the seaborn package, for example? Or adding more diagnostic tools for clustering so that its scope is not limited to a single visualisation type? It seems that there is an implementation of the clustergrams in the According to this journal's scope and submission requirements, I read that "Minor 'utility' packages, including 'thin' API clients, and single-function packages are not acceptable." – I am not completely convinced that this is exactly the case here... Other remarks:
|
Thank you for the comments @gagolews! I am fully aware that the size of the submission is borderline and I was assuming it will undergo a scope query prior actual review. I am copying below the note I made in the submission regarding this.
So I am happy if @csoneson requests a scope check from the editorial team here. As said above, I believe this is borderline on the side of large enough, but I may be wrong.
I believe that the whole package should be considered when assessing the scope, not only the plotting part. It is not that straightforward to create the data that needs to be plotted from various clustering engines. I think that this processing part is equally valuable as the plotting segments (either static or interactive).
You'll probably agree that seaborn is not the ideal candidate. I'd be happy to contribute the code elsewhere but I haven't found a good home for it. Hence it lives alone. If someone has good suggestions, I am all ears. It would certainly be better than maintaining this as a small package on its own.
As much as I'd love to, my time I can spend on this is fairly limited, so if the current size of the package does not fulfil the scope limits, it will just not be published.
Thanks, I wasn't aware of it. It is a bit tough to assess it though, I haven't found any documentation or repository apart from what is on CRAN.
One of the points on evaluation of substantive scholarly effort is Whether the software has already been cited in academic papers.,. This shows an evidence for that point. |
@martinfleis I'm not going to be too stubborn, the package itself is quite nice. Let's see what others say about it... |
I am sorry, but I was a bit swamped the past two weeks, I plan on doing the review mid next week. |
👋🏻 @csadorf - just wanted to check whether you had time to take a first look at this submission. Thanks! |
@csoneson Very sorry about the delay, I'll have a look at it this week-end. |
@csadorf - no worries, thank you! |
Review checklist for @csadorfConflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
|
👋🏻 @csadorf - could you provide an update on the progress of your review? Thanks! |
The paper is very well written and illustrated, and the references are almost complete. The code is professionally developed, very well documented, tested, and distributed. I think the paper reaches the level of scholarly effort required for a publication in JOSS, however – as acknowledged by the author themself – just barely, because the majority of the code are thin wrappers around packages that readily provide clustering algorithms (notably sklearn, scipy, and RAPIDS cuML) whereas the core function of the package is to generate plots (the clustergrams), which is achieved in a handful of functions and very few lines of code. It is the substantial amount of testing and documentation that is really providing the net value of this package. The author claims to embrace scikit-learn’s API style, however that is unfortunately not entirely achieved in my view. While the main class does provide a estimator-like interface, accepting hyper parameters as constructor arguments and offering a Since the author is explicitly referencing scikit-learn’s API style, I would suggest to cite Buitinck et al. (arXiv:1309.0238 [cs.LG]) as recommended by the scikit-learn's citation guide. As a bit of a minor point, however since plotting is the core function of this package I think it warrants mentioning, I would recommend that the ticks on the x-axis on all clustergram diagrams are ensured to be natural numbers since there are no fractions of numbers of clusters. There are no instructions on how to contribute to the package. I would recommend that the author adopt some of the recommendations on making the class more modular and independent, consistently follow the scikit-learn API guidelines, and then to consider pursuing a contribution to the scikit-learn package. I recommend the paper for acceptance with minor revisions, because despite the scholarly effort being marginal, I believe it provides benefit to the community and would simplify reference in future publications. |
Thanks for the comment! I'll have a deeper dive to see if the API can follow the style more closely. I suppose the issue will be in the difference between sklearn's estimators producing clustering labels for a fixed
Will do, thanks!
Good point. Will fix.
Contributing guide is available in the documentation: https://clustergram.readthedocs.io/en/stable/contributing.html I can eventually copy its content to CONTRIBNUTING.md and store it at the root of the repo but I didn't think the duplication is necessary here. Thanks you for your comment! |
@editorialbot generate pdf |
Done! version is now v0.8.0 |
@martinfleis I have gone through the paper and it looks good - one reference (Pedregosa et al) is missing a DOI but I don't think it actually has one. Before I recommend acceptance, could you please update the title and author list of the Zenodo archive to match those of the paper, and also specify the license in the Zenodo archive? |
@csadorf Thanks! I have updated Zenodo as requested. DOI of the latest fixed versions is
That was my understanding as well. All tasks from the author checklist above (checks etc) have also been done. |
@editorialbot set 10.5281/zenodo.8202396 as archive |
Done! archive is now 10.5281/zenodo.8202396 |
Looks good! I'm handing over to the associate EiC for the last steps. |
@editorialbot recommend-accept |
|
|
Element doi: [facet 'pattern'] The value '10/ggj45f' is not accepted by the pattern '10\.[0-9]{4,9}/.{1,200}'.
Element doi: [facet 'pattern'] The value '10/ghh97z' is not accepted by the pattern '10\.[0-9]{4,9}/.{1,200}'. |
@martinfleis It seems the editorial bot is not happy with the two short DOIs (although the links in the bibliography go to the right place). There are long forms for both these DOIs - would you mind switching to those? |
@openjournals/dev - I'm not sure whether this is an expected error, or if these short DOIs should be accepted ☝🏻. The links in the bibliography are https://doi.org/ghh97z and https://doi.org/ggj45f, which do point to the right places. |
@editorialbot check references |
|
@csoneson should be fixed in the bib file now. |
@editorialbot generate pdf |
@editorialbot recommend-accept |
|
|
👋 @openjournals/dsais-eics, this paper is ready to be accepted and published. Check final proof 👉📄 Download article If the paper PDF and the deposit XML files look good in openjournals/joss-papers#4520, then you can now move forward with accepting the submission by compiling again with the command |
@editorialbot accept |
|
Ensure proper citation by uploading a plain text CITATION.cff file to the default branch of your repository. If using GitHub, a Cite this repository menu will appear in the About section, containing both APA and BibTeX formats. When exported to Zotero using a browser plugin, Zotero will automatically create an entry using the information contained in the .cff file. You can copy the contents for your CITATION.cff file here: CITATION.cff
If the repository is not hosted on GitHub, a .cff file can still be uploaded to set your preferred citation. Users will be able to manually copy and paste the citation. |
🐘🐘🐘 👉 Toot for this paper 👈 🐘🐘🐘 |
🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨 Here's what you must now do:
Any issues? Notify your editorial technical team... |
Submitting author: @martinfleis (Martin Fleischmann)
Repository: https://github.com/martinfleis/clustergram
Branch with paper.md (empty if default branch):
Version: v0.8.0
Editor: @csoneson
Reviewers: @csadorf, @gagolews
Archive: 10.5281/zenodo.8202396
Status
Status badge code:
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@csadorf & @gagolews, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review.
First of all you need to run this command in a separate comment to create the checklist:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @csoneson know.
✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨
Checklists
📝 Checklist for @csadorf
📝 Checklist for @gagolews
The text was updated successfully, but these errors were encountered: