Google Summer of Code 2015 #28

Closed
bmpvieira opened this issue Apr 4, 2015 · 1 comment
@bmpvieira

Bionode submitted projects for GSoC 2015 together with BioJS. Unfortunately, BioJS wasn't accepted as an organisation this year (Google turned down many previously accepted organisations and accepted new ones, which is fair).
However, those projects can still be carried out by anyone interested in them, or by students looking for a project.
The following is a copy of the Bionode projects listed on the BioJS page. If you are interested, please reply here or at gitter.im/bionode/bionode. You can also send me an email at mail@bmpvieira.com.

Bionode Pipeline Building GUI

Rationale & Approach

An easy-to-use graphical user interface for building interactive pipelines would lower the barrier to entry for people who are not bioinformaticians or programmers. This could be achieved through integration with projects like Galaxy, but a more interactive and advanced interface such as Node-RED is what we are aiming for. The NoFlo project would be another good source of interface inspiration. Node-RED or any other open source project can and should be used or adapted as much as possible instead of writing a new interface from scratch.

The resulting interface should output a descriptive text-file representation of the pipeline that can run on the command line without requiring the GUI, for example as a Gasket, datscript, hackfile or Makefile definition.
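As a rough illustration of what such a text representation could look like, the sketch below turns a pipeline description into Makefile text. The step objects, file names, and commands (`fetch-reads`, `align`, `count`) are hypothetical placeholders, not an existing Bionode or Node-RED export format.

```javascript
// Hypothetical pipeline description, as a GUI might export it.
// File names and commands are illustrative placeholders only.
const pipelineSteps = [
  { output: 'reads.fastq', inputs: [], command: 'fetch-reads > reads.fastq' },
  {
    output: 'aligned.sam',
    inputs: ['reads.fastq', 'ref.fa'],
    command: 'align ref.fa reads.fastq > aligned.sam'
  },
  { output: 'counts.tsv', inputs: ['aligned.sam'], command: 'count aligned.sam > counts.tsv' }
];

// Render each step as a Makefile rule: "target: prerequisites",
// followed by a tab-indented recipe line.
function toMakefile(steps) {
  // Final targets are outputs that no other step consumes.
  const finalTargets = steps
    .filter((step) => !steps.some((other) => other.inputs.includes(step.output)))
    .map((step) => step.output);
  const rules = steps.map(
    (step) => `${step.output}: ${step.inputs.join(' ')}`.trimEnd() + `\n\t${step.command}\n`
  );
  return [`all: ${finalTargets.join(' ')}\n`, ...rules].join('\n');
}

const makefileText = toMakefile(pipelineSteps);
console.log(makefileText);
```

Because the output is plain text, it can be committed to version control and run with `make` on a machine that has no GUI installed.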

Challenges

  • Integration between available interfaces and bionode pipelines
  • Producing a simple text format representation of those pipelines for easy versioning, distribution and collaboration.

Involved Tools / Libraries

  • Node-RED
  • NoFlo (for ideas)
  • Galaxy (for ideas)
  • Gasket, Datscript, Hackfiles, Makefiles (for text representation of pipeline).

Needed Skills

  • Backend JavaScript/Node.js
  • Frontend JavaScript
  • Bash
  • CoffeeScript (for NoFlo)

Mentors

Bionode team (contact: Bruno Vieira)

  • Boris Adryan: Scientist: @Flyjedi, genome gazer. Geek: Founder of @thingslearn, #IoT tinkerer
  • Bruno Vieira: Bioinformatics PhD student at Queen Mary University of London and Node.js Web Developer. Working on population genomics, bionode.io and dat-data.com
  • Dave C-J: Node-RED developer
  • Karissa McKelvey: Programmer and idea jockey based in Oakland, CA. Former academic experienced in building interactive data visualization and collaboration tools
  • Mathias Buus: Programmer based in Copenhagen, Denmark. Co-creator of node-modules.com and co-founder of ge.tt. open mouth, open source
  • Max Ogden: Programmer based in Portland, OR. Max works on or has worked on things like CSVConf, Code for America, JavaScript for Cats, and Voxel.js
  • Nicholas O'Leary: IBM Emerging Technologies geek. All things MQTT and IoT. Creator of @nodered and one of the @BeardyDads
  • Steve Moss: Passionate Computational and Data Scientist specialising in Bioinformatics, DevOps and SysAdmin
  • Yannick Wurm: Population Genomics, Bioinformatics, Evolution of Social Insects. Senior Lecturer at Queen Mary University of London

Bionode integration

Rationale & Approach

Bionode focuses on modular pipelines for data manipulation and analysis, while BioJS focuses on visualisation. It would be interesting to combine both tools to solve a biologically relevant problem while testing and fixing issues with the integration between the two projects.
For example, one interesting use case is to use Bionode to get transcriptomic data from the Sequence Read Archive (SRA) for any species or experiment and visualise the expression levels of genes with BioJS. During your project you should be able to work on at least three different use cases.
As some files (e.g. SAM/BAM) can become very large, it should be possible to use streams to communicate with Bionode modules.
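To show why streams help with large files, here is a minimal sketch using only Node.js core streams. It counts alignments per reference in SAM text without loading the whole input into memory. The helper names `splitLines` and `countByReference` are made up for this example and are not part of any Bionode or BioJS module.

```javascript
const { Readable, Transform } = require('node:stream');
const { pipeline } = require('node:stream/promises');

// Splits incoming text chunks into lines, buffering any partial trailing line.
// Assumes ASCII input, which holds for SAM text.
function splitLines() {
  let remainder = '';
  return new Transform({
    readableObjectMode: true,
    transform(chunk, _encoding, callback) {
      const lines = (remainder + chunk.toString()).split('\n');
      remainder = lines.pop();
      for (const line of lines) this.push(line);
      callback();
    },
    flush(callback) {
      if (remainder) this.push(remainder);
      callback();
    }
  });
}

// Counts alignments per reference name (SAM column 3), skipping "@" header lines.
async function countByReference(source) {
  const counts = {};
  await pipeline(source, splitLines(), async function (lines) {
    for await (const line of lines) {
      if (line === '' || line.startsWith('@')) continue;
      const referenceName = line.split('\t')[2];
      counts[referenceName] = (counts[referenceName] || 0) + 1;
    }
  });
  return counts;
}

// Tiny in-memory stand-in for a large file; note the record split across two chunks.
const samChunks = ['@HD\tVN:1.6\n', 'r1\t0\tchr1\t100\n', 'r2\t0\tchr2\t5', '0\nr3\t16\tchr1\t7\n'];
countByReference(Readable.from(samChunks)).then((counts) => console.log(counts));
```

In a real integration the source would be something like `fs.createReadStream` or an HTTP response, and the counts could be handed to a visualisation component once the stream ends.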

Challenges

  • Getting several modules from both projects to work together
  • Might require some architectural changes to those modules.

Involved Tools / Libraries

  • Bionode
  • BioJS

Needed Skills

  • Frontend JavaScript
  • Backend JavaScript/Node.js

Mentors

Bruno Vieira (Bionode) and Miguel Pignatelli (BioJS)

Bionode distribution on HPC Grid

Rationale & Approach

Bionode pipelines can currently only run on one machine, but we would like them to scale and be distributed across the nodes of a high-performance computing (HPC) cluster. There are several ways to distribute Node apps across CPUs or machines using native Node.js or libraries. However, where the user does not have administrative access to the cluster and must rely on established queuing tools (e.g., Sun Grid Engine), integrating or wrapping Bionode around those tools might be the best approach.
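One way the wrapping approach could look, as a sketch only: build the argument list for Sun Grid Engine's `qsub` from a pipeline step, using job names to express dependencies. The step objects and script names are hypothetical, and the parallel environment name `smp` is an assumption, since those names are configured per cluster. Nothing is submitted here; the commands are only printed.

```javascript
// Builds the argument list for `qsub` from one pipeline step.
// -N names the job, -cwd runs it in the current directory,
// -b y treats the command as a binary rather than a job script,
// -hold_jid delays the job until the listed jobs have finished.
function qsubArgs(step) {
  const args = ['-N', step.name, '-cwd', '-b', 'y'];
  // "smp" is an assumed parallel environment name; clusters define their own.
  if (step.slots) args.push('-pe', 'smp', String(step.slots));
  if (step.after && step.after.length > 0) args.push('-hold_jid', step.after.join(','));
  return [...args, ...step.command];
}

// Hypothetical two-step pipeline: "count" waits for "align".
const steps = [
  { name: 'align', slots: 4, command: ['node', 'align.js', 'reads.fastq'] },
  { name: 'count', after: ['align'], command: ['node', 'count.js', 'aligned.sam'] }
];

for (const step of steps) {
  console.log(['qsub', ...qsubArgs(step)].join(' '));
}
```

On a cluster, each argument list could be passed to `child_process.execFile('qsub', args, callback)`, which needs no administrative access beyond permission to submit jobs.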

Challenges

  • Development will require access to a cluster of several machines or a simulated environment. We already have a Docker container that provides Sun Grid Engine.
  • If the student is interested in using Node.js queuing/distribution libraries, it will require a review of the existing options and adapting to bionode pipelines.
  • If the student has more interest or experience with other queuing systems, it will require wrapping those systems with bionode/node.js code.
  • We only expect the student to pursue one approach, but a very skilled student could do both.

Involved Tools / Libraries

  • Node queuing systems
  • Other queuing systems (e.g., SGE)

Needed Skills

  • Node.js/JavaScript
  • HPC experience
  • Docker (could be useful for development)

Mentors

Steve Moss and Max Ogden

Bionode modules

Rationale & Approach

There are several modules that would be useful for bionode. They can be grouped into:

  • Data access (from web APIs)
  • Data parsing/wrangling
  • Tool wrappers

The student could work on improving an existing module or on writing a requested module from scratch. If the student is interested in several small modules, improving their architecture and their integration with each other and with other UNIX tools could have a huge impact on the usability of the project.
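For a sense of scale, a small data parsing module might be little more than the sketch below, which parses FASTA text and prints one JSON object per line so the output composes with other UNIX tools. The function name `parseFasta` and the record shape are assumptions made for this example and do not reflect the API of any existing bionode module.

```javascript
// Parses FASTA text into { id, description, sequence } records.
// Sequence lines belonging to one record are concatenated.
function parseFasta(text) {
  const records = [];
  let current = null;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '') continue;
    if (line.startsWith('>')) {
      // Header line: the id runs up to the first whitespace, the rest is description.
      const header = line.slice(1);
      const spaceAt = header.search(/\s/);
      current = {
        id: spaceAt === -1 ? header : header.slice(0, spaceAt),
        description: spaceAt === -1 ? '' : header.slice(spaceAt + 1).trim(),
        sequence: ''
      };
      records.push(current);
    } else if (current) {
      current.sequence += line;
    }
  }
  return records;
}

// Emit newline-delimited JSON, one record per line.
const exampleFasta = '>seq1 example record\nACGT\nTTGA\n>seq2\nGGCC\n';
for (const record of parseFasta(exampleFasta)) {
  console.log(JSON.stringify(record));
}
```

Newline-delimited JSON is used here because line-oriented tools such as `grep`, `head`, or `wc -l` can then operate on individual records.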

Challenges

The challenges will depend on the module(s) the student is interested in, but there are enough options to adapt to a very diverse range and level of skills.

Involved Tools / Libraries

  • Depends on the module, but anything from web APIs (e.g., NCBI) to command line tools (e.g., SAMtools).
  • Node.js/JavaScript

Needed Skills

JavaScript/Node.js

Mentors

Bionode team (contact: Bruno Vieira)

@bmpvieira

Bionode pipelines got a follow-up in GSoC 2016, with @thejmazz working as a student on bionode-watermill, and this year for GSoC 2017 he's looking forward to being a mentor for another student!

@bmpvieira bmpvieira self-assigned this Apr 5, 2017