What is Gee Fu?
Gee Fu is an application that holds Gene Feature data. It has been designed with the needs of researchers wanting to keep, share and annotate sequence and feature data. Gee Fu is a Ruby on Rails based RESTful web-service application that stores and serves sequence assembly and genome feature data on request.
GeeFu can be used as a base that can be easily extended into fuller customised web-applications using the powerful Rails framework.
Gee Fu is ideally suited to serving large amounts of data such as those from high-throughput sequencing experiments via bio-samtools and BAM files.
Gee Fu is capable of receiving and handling requests from AnnoJ , a web service based viewing engine for genomic data. It can return JSON data which AnnoJ is able to render. We anticipate being able to serve up data in formats suitable for different applications as development progresses and we become aware of other rendering engines and web services that request data.
Setting up Gee Fu
- Install rvm using:
\curl -L https://get.rvm.io | bash -s stable --rails --autolibs=enabled --ruby=1.9.3-p286
- Install redis (on OSX with homebrew use)
brew install redis
- Follow instructions given by
brew info redisto have launchd start redis at login
- Install git (on OSX with homebrew use)
brew install git
- Install postgres (on OSX use PostgresApp)
- Clone the GeeFU repo
git clone git://github.com/danmaclean/gee_fu.git .
- Accept the .rvmrc as trusted
- Install bundler
gem install bundler
- Setup the repo
- Signup for a Mandrill account and grab an API key
- Create a .env file in the root folder with:
EMAIL_SENDER=My Name <firstname.lastname@example.org> MANDRILL_USERNAMEemail@example.com MANDRILL_APIKEY=[insert API key]
- Create a Postgres user called gee_fu with no password
- Change the attributes of Postgres user gee_fu to allow database creation
ALTER USER gee_fu CREATEDB;
- Follow the instructions at README_FOR_DATA.mdown
- Start the gee-fu server with
- Visit http://localhost:5000
- Once you've signed up as the first user use
rake admin:set firstname.lastname@example.org set your account as an admin to add Organisms
Setting up AnnoJ
The AnnoJ view requires a JSON format configuration file, to tell it about the genome and the service etc. Gee Fu allows you to maintain a YAML file and convert it to the
RAILS_ROOT/config/config.yml to set up the AnnoJ view.
The most important are the tracks stanzas, this is where you tell AnnoJ the tracks that should be available in your browser.
There are 3 track types:
- ModelsTrack, for aggregated transcript style gene models
- RepeatsTrack, for strandless block objects like repeats and
ReadsTrack, for high throughput se- quencing reads that will be be shown at various zoom-levels. The reference sequence Gee Fu provides is sent to AnnoJ as if it is a big read. Here are the different configuration options for the rest
id : a numeric unique id for the track name: free text name type : one of ModelsTrack , RepeatsTrack , ReadsTrack data : /features/[annoj|chromosome]/experiment_id showControls : 'true' to see the individual lane control button or leave absent height : 80 start up height of the track in pixels minHeight : 20 height of track when minimised maxHeight : 60 # height of track when maximised single : # 'true' if the track is strandless otherwise absent
AnnoJ requires certain metadata to understand what it is that it is supposed to be rendering. This is required at a genome and individual track level. When you create an experiment or genome you have the option to add a yml metadata file. As a minimum you should include the following information for your genome:
−−− institution : name: Your Institute logo : images/ institute_logo .png url : http ://www.your_place.etc service : format : Unspecified title : Sample Gee Fu served browser (TAIR 9 Gene Models) server : Unspecified version : Vers 1 access: public description: Free text description of the service engineer: name: Mick Jagger email : email@example.com genome: version : TAIR9 description : Chr1 from TAIR9 species : Arabidopsis thaliana
The schema is very straightforward and easily extended. It consists of a central Features table and a many to many join table Parents that indicates which features are parents (according to their gff records) of which other features.
Extending the database and creating new functionality
You can extend the database exactly as if it were any other Rails application. See the Rails documentation for conventions for creating and naming new tables, Rails prefers convention over configuration so you should pay attention to these. Adding new functionality to the web app is covered by the same documentation.
Where to find more info
If you get really frustrated with the software, feel free to complain to me
firstname.lastname@example.org. A lot of your initial problems will be answered in the Rails community pages, please look there if your problem looks like it might be related to Rails more directly than this particular instance of a Rails app.