Add a badge for GitHub/GitLab repository, and or issue tracker? #72

Open
krassowski opened this issue Oct 10, 2020 · 3 comments

krassowski commented Oct 10, 2020

For a long time I have been considering contributing to the existing Bioconductor packages that I use every day. I believe in incremental improvements and would prefer to support the work of the original creator rather than create a new package just because the existing one has bugs or imperfections.

However, I find it difficult to navigate the space from a developer's point of view. As I just described in a tweet storm here, it is often difficult to know whether a package has a GitHub/GitLab repository, and I believe that this is valuable information. I see GitHub/GitLab (and other similar hubs) as extremely valuable because, in addition to hosting the git repository, they provide:

  • bug tracking (with the bug status clearly defined and easier to explore than searching through the Bioconductor support forum)
  • other tools fostering collaboration, such as forking, pull-requests, automated actions
  • easy to use interface for code exploration

and finally, they encourage best coding practices by allowing easy integration with linters, security scanners and other automated code intelligence tools.

I was wondering whether it would be possible to gently encourage the maintainers of existing packages to create a GitHub/GitLab (or similar) repository, and to make navigation from one to the other easier?

GitHub and Bioconductor badges

One idea would be to provide a badge linking from the package website to GitHub, and a second one from the repository's README to the package website. The former could be autogenerated, and the latter could be promoted by recommending it in tutorials for maintainers (it could use an existing solution, e.g. badger, though I would prefer the badge to show the published version rather than the number of downloads).

For example, let's look at the top 5 Bioconductor packages:
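A minimal sketch of what the two badges could look like as README-style markdown. The function names are hypothetical, and it assumes the shields.io static badge URL scheme (`https://img.shields.io/badge/<label>-<message>-<color>`) and the short landing-page URL `https://bioconductor.org/packages/<package>`; the label text and colours are arbitrary choices, not an existing Bioconductor convention.

```shell
# Hypothetical helper: markdown badge linking a README to the package's
# Bioconductor landing page.
# Assumes the shields.io static badge scheme and the short landing-page URL.
bioc_badge() {
    pkg="$1"   # Bioconductor package name, e.g. DESeq2
    printf '[![Bioconductor](https://img.shields.io/badge/Bioconductor-%s-blue)](https://bioconductor.org/packages/%s)\n' "$pkg" "$pkg"
}

# Hypothetical counterpart: badge a landing page could show to point back
# at the source repository on GitHub.
repo_badge() {
    slug="$1"  # repository in owner/repo form
    printf '[![GitHub](https://img.shields.io/badge/GitHub-repository-black)](https://github.com/%s)\n' "$slug"
}
```

Calling `bioc_badge DESeq2` prints a line that can be pasted into a README; `repo_badge owner/repo` prints the reverse link for a placeholder repository.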

  1. BiocVersion: package, GitHub - no link from one to the other, either way
  2. BiocGenerics: package, GitHub - no link from package to GitHub
  3. S4Vectors: package, GitHub - no link from package to GitHub
  4. IRanges: package, GitHub - no link from package to GitHub
  5. BiocGenerics: package, GitHub - no link from package to GitHub

And at the top 30 packages which do not constitute the core infrastructure:

  1. zlibbioc - core
  2. AnnotationDbi - core
  3. XVector - core
  4. BiocParallel - core
  5. GenomeInfoDb - core
  6. DelayedArray - core
  7. GenomicRanges - core
  8. SummarizedExperiment - core
  9. limma: package, no GitHub/GitLab etc - in this vacuum people either refer to a gravely outdated CRAN limma mirror (a 13-year-old version!) or create their own mirrors, e.g. gangwug/limma
  10. Biostrings - core
  11. Rsamtools - core
  12. biomaRt: package, GitHub - no link from one to the other, despite the issue tracker containing important information about the state of the package
  13. annotate - core
  14. genefilter - core
  15. GenomicAlignments - core
  16. Rhtslib - core
  17. graph - core
  18. rtracklayer: package, GitHub - no link from one to the other, either way; moreover the search also returns an older mirror from @mtmorgan which might confuse at first; the issue tracker contains useful information that the user should be aware of
  19. edgeR: package - no collaborative platform like GitHub or GitLab, or I could not find any
  20. GenomicFeatures - core
  21. BiocFileCache - core
  22. DESeq2: package, GitHub - this is exceptional, because the maintainer uses the URL fields in both GitHub and the package description to create a superb experience linking the two pages together; moreover, the maintainer explains when to post an issue on the GitHub repo and when to ask a question on the Bioconductor support forum
  23. Rhdf5lib: package, GitHub - another great example; both URL and BugReports fields are utilised
  24. geneplotter - core
  25. rhdf5 (same as Rhdf5lib)

Having two badges, one from the package website to the GitHub repo and one the other way round, would help greatly here!

Contributions friendly?

A variation of this proposal would be to have a badge saying "welcoming contributors" or "contributor friendly" on the Bioconductor package site; this would signify that the maintainer has opted in to provide the repository address and will consider PRs with bug fixes and improvements.

Issues count badge?

The final variation is to have a badge showing the number of open issues. I believe that this is very important, because issues can be discovered after a release and users should know what the limitations of the package are; they should not have to read through all the support forum questions and answers to discover that there is a bug that changes the result - this is not what I would expect a typical user to do just after installing a package. However, if they saw a badge saying [7 issues], they might be inclined to check.

I would emphasise that this badge would have a different purpose from the existing "posts" badge, which counts the questions and answers on the support website; a popular package might have thousands of usage questions, but only a few bugs at any given time. It is not important for the integrity of the research that the users read the 1000s of usage questions, but it is important that they are aware of the few bugs which might or might not affect their use case.
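As a rough sketch, such a badge could be built on the shields.io GitHub issue counters. The function name is hypothetical; the `https://img.shields.io/github/issues/<owner>/<repo>` endpoint and its per-label variant (`.../<owner>/<repo>/<label>`) are assumptions that should be checked against the shields.io documentation, and `bug` is only GitHub's default label name, which a given repository may not use.

```shell
# Hypothetical helper: markdown badge showing the number of open issues of a
# GitHub repository, optionally restricted to one label (e.g. "bug").
# Assumes the shields.io endpoints /github/issues/<owner>/<repo>[/<label>].
issues_badge() {
    slug="$1"        # repository in owner/repo form
    label="${2:-}"   # optional label; must not need URL-encoding here
    url="https://img.shields.io/github/issues/${slug}"
    link="https://github.com/${slug}/issues"
    if [ -n "$label" ]; then
        url="${url}/${label}"
        # Link to the matching filtered issue list on GitHub.
        link="${link}?q=is%3Aopen+label%3A${label}"
    fi
    printf '[![open issues](%s)](%s)\n' "$url" "$link"
}
```

`issues_badge owner/repo` counts all open issues, while `issues_badge owner/repo bug` narrows the count to issues carrying that label, which is closer to "the few bugs" than a raw issue count.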

Apologies if this is not an appropriate place to post this idea.

krassowski commented Oct 10, 2020

To give a concrete example, yesterday I mis-directed this post SamGG/ropls#2 attempting to describe a performance issue in ropls and asking if PRs would be considered (they would not as it is not the repo of the author - despite the author showing up as they do have a GitHub account!).

Another piece of anecdotal evidence: there are other, often trivial issues, like mislabelled figures in vignettes, which I would have fixed had there been an easy way to submit a fix; I even caught myself analysing the same mislabelled figure twice in the same year, which was a bit of a waste of time (for me, but most certainly for others too!).

@mtmorgan
Contributor

My not quite current clones of Bioconductor software packages show

~/b/git$ find . -maxdepth 2 -name DESCRIPTION|xargs grep "^URL:" |wc -l
     845
~/b/git$ find . -maxdepth 2 -name DESCRIPTION|xargs grep "^BugReports:" |wc -l
     599

so maybe 1/3rd of Bioconductor packages do reference an external location; the URL is available in the Details: section of package landing pages. Certainly adding BugReports: to this section would be helpful.
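A sketch of the same survey as reusable functions, which also reports the share of packages rather than only raw counts. The function names are hypothetical. It counts DESCRIPTION files containing the field (`grep -l`) instead of matching lines, which gives the same number as long as each file declares the field at most once, and it keeps the non-POSIX `-maxdepth` option from the commands above.

```shell
# Number of DESCRIPTION files, one directory level below root, that declare
# the given field (e.g. URL or BugReports).
count_field() {
    field="$1"
    root="${2:-.}"
    find "$root" -maxdepth 2 -name DESCRIPTION -exec grep -l "^${field}:" {} + | wc -l | tr -d ' '
}

# Total number of DESCRIPTION files, i.e. packages, below root.
count_packages() {
    find "${1:-.}" -maxdepth 2 -name DESCRIPTION | wc -l | tr -d ' '
}

# One summary line per field, with an integer (truncated) percentage.
field_report() {
    report_root="${1:-.}"
    total=$(count_packages "$report_root")
    for f in URL BugReports; do
        n=$(count_field "$f" "$report_root")
        if [ "$total" -gt 0 ]; then pct=$((100 * n / total)); else pct=0; fi
        printf '%s: %s of %s packages (%s%%)\n' "$f" "$n" "$total" "$pct"
    done
}
```

Running `field_report .` from the directory holding the clones prints one line for URL and one for BugReports.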

The core packages could be updated with URL / BugReports links via pull requests.

I'm not entirely sure about adding buttons, because I think there is value in users discussing usage problems in a central location -- the support site. It seems like issues on repositories should really be limited to bug reports.

It seems like the way to 'encourage' this, at least in new package submissions, is to add a BiocCheck, perhaps generating a NOTE encouraging use of URL / BugReports. Again a pull request (on https://github.com/Bioconductor/BiocCheck) would be a good step.
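The logic of such a check is small. The sketch below is shell, not BiocCheck code (BiocCheck itself is an R package), and the function name and the NOTE wording are made up for illustration; it only shows the intended behaviour of noting each missing field without failing the check.

```shell
# Hypothetical check, not part of BiocCheck: print a NOTE for every
# recommended field that a DESCRIPTION file does not declare.
check_description_links() {
    desc="$1"   # path to a DESCRIPTION file
    for f in URL BugReports; do
        if ! grep -q "^${f}:" "$desc"; then
            printf 'NOTE: consider adding a %s: field to %s\n' "$f" "$desc"
        fi
    done
    # A NOTE is advisory, so the check itself always succeeds.
    return 0
}
```

For a package that declares both fields the function prints nothing; otherwise it prints one NOTE line per missing field.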

@jorainer
Contributor

I think a link to an existing GH (or similar) repository is certainly a good thing. Encouraging this in new package submissions as Martin suggested is an excellent idea.

However, IMHO, having a badge for the number of issues might be misleading, as an issue is by no means always related to a bug. I usually also add issues to my own (or other) repositories for feature requests or ideas to improve functionality - mostly to have the idea written down and implement it at a later time point. So, using issues as a way to count bugs in a piece of software might be totally misleading. Also, not having any issues does not mean that the software is free of bugs - in most cases

Finally, from personal experience, I have found the Bioconductor community extremely open to collaborations and contributions. Even when I could not find a GitHub repo to make a pull request (as in the case of ggbio), an old-fashioned, kind email to the maintainer worked.
