averissimo/gene-extractor

GeneExtractor

Searches independent terms against different databases and retrieves gene sequences from:

  • KEGG
  • GenBank (NCBI)

Requirements

How to Use

  1. Run bundle install --path vendor/bundle to install dependencies (currently only BioRuby)
  2. Create a keys.txt file (either by copying keys.txt.example or creating a blank one)
     • Add query terms to keys.txt (one term per line)
  3. Create a config.yml file (either by copying config.yml.example or creating a blank one)
     • Open the file and change the options if needed
  4. Run bundle exec ruby script.rb to search for and download all the associated genes
     • If you did not install the gems locally, just run ruby script.rb
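As a rough illustration of step 2, the sketch below shows one way a newline-separated keys.txt could be turned into a list of query terms. The helper name parse_terms is hypothetical, and skipping blank lines and trimming whitespace are assumptions; the actual script.rb may read the file differently.

```ruby
# Hypothetical helper (not taken from script.rb): turn the contents of
# keys.txt into a list of query terms, one term per line.
def parse_terms(text)
  text.each_line        # iterate over the lines of the file contents
      .map(&:strip)     # assumption: surrounding whitespace is ignored
      .reject(&:empty?) # assumption: blank lines are skipped
end

# With a real file this would be: parse_terms(File.read("keys.txt"))
terms = parse_terms("lipase\n\n  chalcone synthase \n")
puts terms.inspect
```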

Config.yml options

YAML syntax is used to configure GeneExtractor. The file is hierarchical and uses indentation to define child attributes or lists.

  • email: the user's valid email address, required to use the NCBI REST API
  • output:
      • dir: parent folder in which GeneExtractor places its results
      • date_prefix: adds an additional folder level named with the date and time at which GeneExtractor was executed
      • kegg: folder name for KEGG results
      • ncbi: folder name for NCBI results
  • search:
      • ncbi: list of fields that should be searched in NCBI (one list entry per field)
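To make the search.ncbi option more concrete, here is a minimal sketch of how a single term might be combined with the configured fields into an Entrez-style query string of the form "term"[Field]. The helper name ncbi_query is hypothetical, and both the phrase quoting and the OR combination are assumptions; the source does not say how GeneExtractor builds its queries.

```ruby
# Hypothetical helper (not taken from script.rb): restrict one search term
# to each configured NCBI field and join the alternatives.
def ncbi_query(term, fields)
  # Assumption: with no fields configured, the bare term is searched everywhere.
  return term if fields.nil? || fields.empty?

  # Assumption: the term is quoted as a phrase and the fields are OR-ed.
  fields.map { |field| "\"#{term}\"[#{field}]" }.join(" OR ")
end

puts ncbi_query("chalcone synthase", ["Protein name", "Gene name", "Title"])
```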

Example config.yml

email: gene.extractor@mailinator.com
output:
  dir: queries
  date_prefix: true
  kegg: kegg
  ncbi: ncbi

search:
  ncbi:
    - Protein name
    - Gene name
    - Title

Acknowledgements

This tool was created as part of FCT grant SFRH/BD/97415/2013 and the European Commission research project BacHBerry (FP7-613793).

Developer
