Genome Annotation for the Masses

Afra: crowdsourcing gene feature annotation

Genomes of emerging model organisms are now being sequenced at low cost. However, obtaining accurate gene predictions remains challenging. Even the best gene prediction algorithms make substantial errors, leading to further erroneous analysis. Therefore, many predicted genes need to be visually inspected and manually curated (Yandell & Ence); this can be infeasible when working with thousands of genes from multiple organisms.

Inspired by crowdsourcing approaches and platforms including Foldit, Galaxy Zoo and Crowdflower, we are developing Afra to recruit additional gene feature curators. This should help dramatically increase the quality of gene curations available for newly sequenced genomes. In the long term, we aim to recruit contributors from the general public. However, gene curation requires a large amount of specialist knowledge and has a steep learning curve. While we are working to flatten that curve with interactive tutorials and support forums, genome curation is not yet easily accessible to all. Thus, in the first instance, we are recruiting curators among biology students, who perform curations as part of courses that aim to teach gene structure and/or the challenges of gene identification and gene prediction.

Current status

Users log in with their Facebook account and land on their dashboard, where they are presented with documentation, guided tutorial exercises, and curation challenges, each of which has a "Curate" button. Each curation challenge invites the user to contribute to a different curation project.

user dashboard

Clicking 'Curate' sends the user to a JBrowse-derived WebApollo-like curation interface focusing on a single gene model and showing all available tracks of evidence for this gene model. The user starts by dragging one of these models (typically the consensus gene model) to the edit track and can then edit this gene model.

curation interface

Users may refer to the tutorials or seek help on our forum using the 'Help & Support' link at the top. A simple step-by-step guide to curation is always available in a sidebar that folds to the right.

Behind the scenes

Afra imports a GFF file of predicted gene models and creates a prioritized list of "curation tasks" based on expected curation difficulty; the administrator can additionally prioritize specific genes for a specific curation project. Each gene prediction is presented to four independent users/curators. Each curator independently examines the gene model and may propose revisions or add comments (e.g., if there is insufficient evidence to curate).
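The import step described above can be illustrated with a minimal sketch. It is not Afra's importer: the names `Task` and `gene_tasks` are hypothetical, gene length is used only as a stand-in for "expected curation difficulty" (the README does not say how difficulty is estimated), and putting easier genes first is likewise an assumption. Only the administrator override and the GFF input come from the text.

```ruby
# Hypothetical sketch; none of these names come from Afra's code base.
Task = Struct.new(:gene_id, :difficulty, :boosted)

# Builds a prioritized task list from GFF3 text.
# boosted_ids: gene IDs an administrator wants curated first.
def gene_tasks(gff_text, boosted_ids = [])
  tasks = []
  gff_text.each_line do |line|
    next if line.start_with?('#') || line.strip.empty?
    cols = line.chomp.split("\t")
    # A GFF3 feature line has nine tab-separated columns; column 3 is the type.
    next unless cols.length == 9 && cols[2] == 'gene'
    id = cols[8][/(?:\A|;)ID=([^;]+)/, 1]
    next unless id
    # ASSUMPTION: gene length as a crude proxy for curation difficulty.
    length = cols[4].to_i - cols[3].to_i + 1
    tasks << Task.new(id, length, boosted_ids.include?(id))
  end
  # Administrator-prioritized genes first, then lower difficulty first.
  tasks.sort_by { |t| [t.boosted ? 0 : 1, t.difficulty, t.gene_id] }
end
```

Calling `gene_tasks(File.read('predictions.gff'), ['g3'])` (the file name is a placeholder) would return the tasks with gene `g3` at the front. In this sketch each task would then be handed to four independent curators, as described above.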

For each gene prediction, the submitted gene models are then automatically compared: if all users propose the same changes to a gene model, these changes are considered to be correct. If the gene models proposed by different curators disagree, the different gene predictions are shown to several more experienced curators, who submit their own curations in turn. If the gene models proposed by the more experienced curators also disagree, all predictions are shown to an even more senior curator, who gives a final verdict.
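The escalation rule above can be sketched as follows. This is an illustration under stated assumptions rather than Afra's implementation: the names `TIERS`, `normalize` and `resolve` are hypothetical, and a gene model is assumed to be representable as a list of exon `[start, end]` pairs, so that two submissions agree when their normalized exon lists are equal.

```ruby
# Hypothetical sketch; tier names and model representation are assumptions.
TIERS = %i[curator experienced senior].freeze

# Puts exon coordinates in canonical order so equal models compare equal.
def normalize(exons)
  exons.map { |s, e| [s, e].minmax }.sort
end

# Returns [:accepted, model] when every submission at this tier agrees,
# or [:escalate, next_tier] when the submissions disagree.
def resolve(tier, submissions)
  raise ArgumentError, 'no submissions' if submissions.empty?
  raise ArgumentError, "unknown tier: #{tier}" unless TIERS.include?(tier)

  models = submissions.map { |exons| normalize(exons) }.uniq
  return [:accepted, models.first] if models.length == 1

  next_tier = TIERS[TIERS.index(tier) + 1]
  # The senior curator gives the final verdict, so exactly one model is expected.
  raise ArgumentError, 'senior tier must submit a single model' if next_tier.nil?
  [:escalate, next_tier]
end
```

For example, four identical submissions at the `:curator` tier are accepted, while a single differing submission returns `[:escalate, :experienced]`. Comparing whole models for equality is the simplest reading of "propose the same changes"; a real comparison might need to tolerate differences that are not biologically meaningful.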

Roadmap

at work

  • Annotation editing.
  • Prioritized, redundant task distribution.
  • Basic user dashboard.
  • Simple, non-interactive tutorials.
  • Obtain curations from eight QMUL MSc students.
  • Obtain contributions from 20 undergraduate students.
  • December 2014: Simple editor synchronization between two tabs/windows.
  • December 2014: Improve annotation editing experience. Make it more intuitive.
  • December 2014: Basic automated testing of annotation editing functionality.

Todos:

  • Improve page load times.
  • Genome dashboard (partially done): overview of contributions per genome, including how many curations were submitted and how many pass the auto-check.
  • Comments on curations.
  • Extensive automated testing of annotation editing functionality.
  • Improve annotation editing performance.
  • Interactive tutorial.
  • Roll out to 200 first-year students learning about gene structure ... and the inadequacies of bioinformatics algorithms.

Contributions are welcome

We welcome contributions of code, curations, or documentation. Find us on Gitter to discuss how you could best help.

Our Wiki details setting up a development environment using Docker.

Contact

Please email if you:

  • would like a demo
  • would like to use Afra in your institution to help teach students
  • have any other questions

Afra is Copyright (©) 2013 Queen Mary, University of London.
Parts of Afra are a derivative work of JBrowse and WebApollo, which are respectively copyright (c) 2000-2006 The Perl Foundation and copyright (c) 2010 Regents of the University of California.