Skip to content

[REVIEW]: GeneNetwork: framework for web-based genetics #25

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
14 of 17 tasks
whedon opened this issue May 30, 2016 · 34 comments
Closed
14 of 17 tasks

[REVIEW]: GeneNetwork: framework for web-based genetics #25

whedon opened this issue May 30, 2016 · 34 comments
Labels
accepted published Papers published in JOSS recommend-accept Papers recommended for acceptance in JOSS. review

Comments

@whedon
Copy link

whedon commented May 30, 2016

Submitting author: @pjotrp (Pjotr Prins)
Repository: https://github.com/genenetwork/genenetwork2
Version: 2.0-a8fcff4
Editor: @arfon
Reviewer: @krother, @agarie
Archive: 10.5281/zenodo.53740

Status

status

Status badge code:

HTML: <a href="http://joss.theoj.org/papers/84b27b28948be80a47268a5baa2701ad"><img src="http://joss.theoj.org/papers/84b27b28948be80a47268a5baa2701ad/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/84b27b28948be80a47268a5baa2701ad/status.svg)](http://joss.theoj.org/papers/84b27b28948be80a47268a5baa2701ad)

Reviewer questions

Conflict of interest

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (such as being a major contributor to the software).

General checks

  • Repository: Is the source code for this software available at the repository url?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Version: Does the release version given match the GitHub release (2.0-a8fcff4)?
  • Archive: Does the software archive resolve?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: Have the performance claims of the software been confirmed?

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the functionality of the software documented to a satisfactory level (e.g. API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

Paper PDF: 10.21105.joss.00025.pdf

  • Authors: Does the paper.md file include a list of authors with their affiliations?
  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • References: Do all archival references that should have a DOI list one (e.g. papers, datasets, software)?
@whedon whedon added the review label May 30, 2016
@arfon
Copy link
Member

arfon commented May 30, 2016

/ cc @openjournals/joss-reviewers - would anyone be willing to review this submission?

If you would like to review this submission then please comment on this thread so that others know you're doing a review (so as not to duplicate effort). Something as simple as :hand: I am reviewing this will suffice.

Reviewer instructions

  • Please work through the checklist at the start of this issue.
  • If you need any further guidance/clarification take a look at the reviewer guidelines here http://joss.theoj.org/about#reviewer_guidelines
  • Please make a publication recommendation at the end of your review

Any questions, please ask for help by commenting on this issue! 🚀

@krother
Copy link

krother commented May 30, 2016

✋ I am reviewing this

@krother
Copy link

krother commented Jun 1, 2016

This is how software papers I like to read should look like!
I have a number of small suggestions:

  • 1st par: "experimental data" could be fleshed out in 2-3 words for the biologist reader (is it SNPs, microarray data, NGS, Y2H or manually counted individuals?)
  • 2nd par: "can be found in ___ guix-bioinformatics", please insert 'the article', 'the package', 'the lab fridge labeled'.. whichever is aproppriate ;-)
  • last par: "Python" is spelled with a capital 'P'
  • REST API looks like a very good idea.

@pjotrp
Copy link

pjotrp commented Jun 1, 2016

Thanks @krother for reviewing this very complex project! We'll clarify the text.

Also @agarie promised to review. Do we need a second reviewer?

@krother
Copy link

krother commented Jun 1, 2016

Not necessarily. I think a few of the points above (contribution, mini-summary) could be spelled out in the README.

Apart from that, I'm looking forward to see the software in action. At the moment, I'm trying to install guix (binary tarball is 404, Github issue pending).

@pjotrp
Copy link

pjotrp commented Jun 1, 2016

We can support you on the #genenetwork IRC channel (irc.freenode.net - there is a web interface) if you like. Guix binary reproducible installs can be a bit daunting the first time because there are quite a few newish concepts.

@agarie
Copy link

agarie commented Jun 1, 2016

Ooops, sorry for not posting before, I'm reviewing this.

  • You could add a CONTRIBUTING.md file, as explained in Setting guidelines for repository contributors. Specifically, what are your requirements re: testing and documentation?
  • How do I run the tests?
  • As @krother said, these points could be added to the README.
  • I think you could move that huge IRC log to another file inside doc/ and leave some highlights or brief quotes in the installation docs.

I read the installation instructions on doc/README.org, but I'm on a MacBook right now. I'll try to install GN2 on my linux desktop at home tonight and report back my impressions and suggestions about the process.

pjotrp added a commit to pjotrp/genenetwork2 that referenced this issue Jun 2, 2016
1st par: "experimental data" could be fleshed out in 2-3 words for the biologist reader (is it SNPs, microarray data, NGS, Y2H or manually counted individuals?)
pjotrp added a commit to pjotrp/genenetwork2 that referenced this issue Jun 2, 2016
2nd par: "can be found in ___ guix-bioinformatics", please insert 'the article', 'the package', 'the lab fridge labeled'.. whichever is aproppriate ;-)
pjotrp added a commit to pjotrp/genenetwork2 that referenced this issue Jun 2, 2016
last par: "Python" is spelled with a capital 'P'
@pjotrp
Copy link

pjotrp commented Jun 2, 2016

@krother I have addressed these in the following commits:

1st par: "experimental data" could be fleshed out in 2-3 words for the biologist reader (is it SNPs, microarray data, NGS, Y2H or manually counted individuals?)

genenetwork/genenetwork2@e241d27

2nd par: "can be found in ___ guix-bioinformatics", please insert 'the article', 'the package', 'the lab fridge labeled'.. whichever is aproppriate ;-)

genenetwork/genenetwork2@9669b09

last par: "Python" is spelled with a capital 'P'

genenetwork/genenetwork2@3788a3a

REST API looks like a very good idea.

Yes, the code is being developed in this repo: https://github.com/genenetwork/gn_server

pjotrp added a commit to genenetwork/genenetwork2 that referenced this issue Jun 2, 2016
@pjotrp
Copy link

pjotrp commented Jun 2, 2016

@krother, @agarie I think I addressed most JOSS requirements by updating the README with:
genenetwork/genenetwork2@e3fc8d1

That includes a statement of work, contact and how to contribute. We do not prescribe guidelines for coding standards etc. Maybe that will come in the future.

Note: the testing framework is new and still limited in scope and @Brainiarc7 is working on that. We want to achieve full coverage in the coming month(s).

Note: the user level documentation is mostly from GN1 days. We need to update them for GN2, this is a slow work in process. Meanwhile the GN1 docs are still usable.

Note: Example use of GN can be found in the many publications by users mentioning GN on pubmed and Google Scholar. We never had a publication on the software itself that did justice to the work of the contributing developers.

@krother
Copy link

krother commented Jun 2, 2016

I find the README.md much more informative now. Thanks!

@pjotrp
Copy link

pjotrp commented Jun 2, 2016

@krother and @agarie: when it comes to installation, GN2 installation reflects that it is a rather complex product with a huge number of dependencies. It includes a recent R with packages. It includes Python with packages. It includes C++ modules. It will include OpenCL dependencies soon. Even so, GNU Guix provides a reliable byte-identical installation path with source and binary deployment.

If GN2 was part of the GNU Guix project proper it would simply be one command:

guix package -i genenetwork2

and all software would be installed on any Linux.

Unfortunately genenetwork2 will never make it into GNU Guix. This is because some software components are dated (one python package had the last update in the 90s!) and we include large data files.

Guix, however, allows you to pull in your own packages using a GUIX_PACKAGE_PATH, so we created a git repo for that named https://github.com/genenetwork/guix-bioinformatics/tree/gn-latest (the gn-latest branch is the one we deploy).

Additionally, to ascertain byte-identical deployment we also have to fixate the GNU Guix package tree itself. This is done in yet another git repository named https://github.com/genenetwork/guix/tree/gn-latest (again the gn-latest branch).

To run these trees, after instaling GNU Guix itself, some environment variables need to be set properly
and the test database has to be added to whatever mysql daemon is running.

This is all described in the installation docs as a step-by-step instruction, but I am just giving the overview here to explain why it is non-trivial at this point. Even so it is a formal deployment path and we and others have successfully installed GN2 and are happily developing against it. We include the IRC log to show it is doable - I would like to keep it there until we have a single install command. To achieve the latter we are working with the GNU Guix project to make installation easier.

I am not sure the reviewers want to go through the full installation (the website also runs online), but if you do, please tell me where you are stuck and we will resolve it. It is also an opportunity for me to improve the docs ;). One thing you may gain in this process: GNU Guix is a great way of deploying bioinformatics software!

To make things easier we could deploy GN2 inside a Docker container, but we don't think that is the right way of going about deployment. Docker containers are opaque and a security risk.

@agarie
Copy link

agarie commented Jun 3, 2016

The improved README is good!

I tried to build everything but couldn't download the tarball for AnnotationDbi. After reading that IRC log, the best way to install GN2 is apparently to download the substitutes from http://guix.genenetwork.org:8080, right? Given the complexity of the software, I'm fine with needing a bit of help during installation, but if you could put a public key on the docs ("gpg add this key to use substitute-urls for a binary installation"), that'd be great.

I navigated through the live site in order to experiment with the platform and I think both functionality and performance are OK. The documentation and the other checks are also OK. I would say that the API-level docs are sparse, but the software is meant to be used on a browser, not as a library. Even then, this documentation can come in handy for new contributors and anyone experiencing problems (or just being curious), but this is mostly a suggestion for the coming months, not a necessary revision for acceptance.

The JOSS reviewer guidelines explicitly say that reviewers are expected to be able to install the software. If my understanding of guix is correct, I expect to be able to install GN2 after I get a key for the caching server (guix archive --authorize, right?); can you send that key to me? :)

@pjotrp
Copy link

pjotrp commented Jun 4, 2016

@agarie, thanks, building from source can fail when upstream package maintainers remove a tar source ball. In Guix recently infrastructure was added to cache source tar balls too - so this will soon be a thing from the past.

The use of our binary caching guix server makes installation a lot easier and faster - because you don't have to build everything. This is a limited guix server since it only contains GN2 related binaries. The public key to trust this http://guix.genenetwork.org:8080 server is:

(public-key
 (ecc
  (curve Ed25519)
  (q #11217788B41ADC8D5B8E71BD87EF699C65312EC387752899FE9C888856F5C769#)
  )
 )

I'll add the access key for our caching server to the docs.

On the API: we are building up the API as a new Elixir based REST service with https://github.com/genenetwork/gn_server. Documentation will be generated from both the API and the included unit tests. There exists an old REST API which is documented at http://www.genenetwork.org/CGIDoc.html, but we want to discourage the use of that. We are adding documentation, in particular on database layout etc. to help people get started.

@agarie
Copy link

agarie commented Jun 4, 2016

I got genenetwork2 working on my own machine, using the test database (linked from here), so the installation instructions are correct. :)

With that out of the way, @arfon, my recommendation is for acceptance.

@pjotrp
Copy link

pjotrp commented Jun 5, 2016

Thanks @agarie! I am really glad you succeeded because there are quite a few steps. Not surprising for a web framework with a host of dependencies. Even so, I think I can turn it into a cookbook approach - line by line.

During the installation, what were the hurdles? So we can focus on those and improve the documentation. The obvious one is the use of the Guix key for the binary caching server.

I realise the Guix installation is a bit involved. With time it should get better because GNU Guix is getting better. Now we are using git checkouts to fixate the dependency graph. This is something that can be handled transparently in the future.

@krother do you want have a go this week? I, myself, have time to help. To get GN installed it may take half an hour up to two hours of your time (if we hit an unexpected snag, such as a proxy server).

@agarie
Copy link

agarie commented Jun 6, 2016

Yeah, the key was the biggest problem — this was my first time trying out guix, but the documentation explaining how to setup guix, your notes, and the installation instructions from genenetwork2 did the job. Not that it was easy, but it's definitely doable... maybe an installation "from scratch" beginning with a quick guix install + setup? Or at least links to a tutorial with just that?

@krother
Copy link

krother commented Jun 6, 2016

Yes I'd like to give it a try as well.

@pjotrp
Copy link

pjotrp commented Jun 8, 2016

@krother I wrote a 'quick installation recipe' which does not require building the software.

See https://github.com/pjotrp/genenetwork2/tree/paper/doc#quick-installation-recipe

Only alternative I can think of is a binary tarball or a Docker image. But I don't think that is in the spirit of JOSS.

@pjotrp
Copy link

pjotrp commented Jun 8, 2016

BTW we also have a plant database available now (3GB). I should add that to the paper.

@pjotrp
Copy link

pjotrp commented Jun 8, 2016

The GTK+ fix in the recipe is actually a known bug in Guix: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=21888. Mentioned workaround does fix it.

@pjotrp
Copy link

pjotrp commented Jun 9, 2016

To get an idea of the complexity of GN2, see the graph of dependencies at http://biobeat.org/gn2.svg. I added the graph to the README.

@pjotrp
Copy link

pjotrp commented Jun 10, 2016

@krother I updated the docs and reran the installation with success in a fresh and basic Debian VM. If you follow the instructions it takes a few minutes of keyboard actions (copy-paste only). Please try
https://github.com/genenetwork/genenetwork2/tree/master/doc#quick-installation-recipe

Downloading the software takes an hour on my 5 Mbs connection. Mind, the download activity is mesmerizing, so you may spend more time looking at the scrollling screen than is considered healthy ;).

@krother
Copy link

krother commented Jun 13, 2016

Hi, I have to report that I am stuck with the installation of guix (got a 'unknown locale' at step 7). Given the amount of operations on root level, I am also concerned about the health of my system. Is it possible to install the whole thing in some kind of sandbox?
I don't want to block the review process - what are our options?

@pjotrp
Copy link

pjotrp commented Jun 13, 2016

@krother you can ignore the locale error. It is just a warning and harmless.

Your system is safe because guix itself and its packages are installed in /gnu/store - which is isolated from the rest of your system. A database is installed in /var/guix and an access key in /etc/acl. Nothing else gets installed.

One of the great things about guix is that it runs on top of any existing Linux system.

Even so, if you prefer, you can install in a VM, such as KVM or VirtualBox. We do that all the time.

I would suggest to continue where you are now. If you come on IRC we can help you. Use https://webchat.freenode.net/ and use the #genenetwork channel. You can use any nick name. We are there.

@roelj
Copy link

roelj commented Jun 13, 2016

If I may jump in..

@krother:
Your system is safe. The only thing Guix does, is install things to /gnu/store and to /var/guix/. By default, commands you type are looked for in directories specified by a PATH variable. Guix creates a system on top of a system, so to use programs installed with Guix, you have to add an entry to the PATH variable. Removing it, and removing /gnu/store and /var/guix brings you back to the original state of your system.

The locale error is not a problem actually. Some special characters may not display in the output of programs, which does not hinder the actual programs (it's only an output-to-shell problem). You could fix that by updating the LANG and LC_ALL environment variables after installing locales with Guix (Guix programs are self-contained).

I am also in the #genenetwork channel on freenode.net, available to help you there.

@krother
Copy link

krother commented Jun 13, 2016

Thanks for the quick reply. I'll try VirtualBox .. have been looking for an excuse to try that for some time :-)

@pjotrp
Copy link

pjotrp commented Jun 13, 2016

In general a wise idea when installing and testing software from other parties, though you are safe with Guix. If you get stuck I'll try to document those VirtualBox steps too.

@arfon
Copy link
Member

arfon commented Jun 13, 2016

Hi @krother, given that this installation seems to be causing significant trouble for you and that @agarie has managed to install the application already I'm happy to stop here if that makes things easier?

Also, I would note that the submitting authors have gone to significant lengths to improve installation documentation ⚡

If you agree, we can (I believe) move forward to accept.

@krother
Copy link

krother commented Jun 13, 2016

Yes, I have seen a lot improving, and in my opinion the review process has worked well. I just checked the examples in the documentation and ticked the two remaining boxes in 'Documentation'. I agree with your suggestion to move forward to accept.

@arfon
Copy link
Member

arfon commented Jun 13, 2016

Yes, I have seen a lot improving, and in my opinion the review process has worked well. I just checked the examples in the documentation and ticked the two remaining boxes in 'Documentation'. I agree with your suggestion to move forward to accept.

👍 thanks @krother.

@pjotrp - do you want to make a new Zenodo release with your changes or has this already been done?

@pjotrp
Copy link

pjotrp commented Jun 13, 2016

@krother thanks!

@arfon no changes in the release - we only added to the documentation to clarify the installation process. Happy to go public on the current Zenodo doi.

@arfon arfon added the accepted label Jun 14, 2016
@arfon
Copy link
Member

arfon commented Jun 14, 2016

👍 thanks for the review @krother & @agarie.

@pjotrp - your DOI is http://dx.doi.org/10.21105/joss.00025 ⚡ 🚀 💥

@arfon arfon closed this as completed Jun 14, 2016
@pjotrp
Copy link

pjotrp commented Jun 16, 2016

@arfon thanks for the great journal! In the paper there is a typo in Prof. Wiliams' name and I amended a URL to point to the master branch. Can you fix that, please?

See genenetwork/genenetwork2@75c7a08

@arfon
Copy link
Member

arfon commented Jun 17, 2016

Sure thing. That should be fixed now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted published Papers published in JOSS recommend-accept Papers recommended for acceptance in JOSS. review
Projects
None yet
Development

No branches or pull requests

6 participants