CSBiology/BioFSharp

BioFSharp is an open-source bioinformatics and computational biology toolbox written in F#. https://csbiology.github.io/BioFSharp/

Core functionality

In its core namespace, BioFSharp contains the basic data structures for common biological objects and their modification. Our type modelling starts at chemical elements, combines those into formulas, and builds up to molecules of high biological relevance such as amino acids and nucleotides. Sequences of these molecules are modelled by BioCollections, which provide extensive functionality for investigating their real-life counterparts.

Data model
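
To make the layering described above more concrete, here is a minimal, self-contained sketch of how elements, formulas, monomers, and sequences can build on each other. All type and function names (`Element`, `Formula`, `AminoAcid`, `residueFormula`, `peptideFormula`, and so on) are hypothetical and chosen for illustration only; they are not the BioFSharp definitions, the masses are rounded average atomic masses, and only three amino acids are covered.

```fsharp
// Hypothetical, simplified types for illustration; NOT the BioFSharp API.

/// Chemical elements needed for the small example below.
type Element = C | H | N | O | S

/// Rounded average atomic mass in Da.
let elementMass (el: Element) : float =
    match el with
    | C -> 12.011
    | H -> 1.008
    | N -> 14.007
    | O -> 15.999
    | S -> 32.06

/// A formula maps each element to the number of atoms of that element.
type Formula = Map<Element, int>

/// Sum of the atomic masses of all atoms in a formula.
let formulaMass (f: Formula) : float =
    f |> Map.fold (fun acc el n -> acc + elementMass el * float n) 0.0

/// Add the element counts of two formulas.
let addFormula (a: Formula) (b: Formula) : Formula =
    let addCount (acc: Formula) el n =
        let current = acc |> Map.tryFind el |> Option.defaultValue 0
        Map.add el (current + n) acc
    Map.fold addCount a b

/// A deliberately tiny amino acid alphabet.
type AminoAcid = Gly | Ala | Ser

/// Residue formulas, i.e. the free amino acid minus one water.
let residueFormula (aa: AminoAcid) : Formula =
    match aa with
    | Gly -> Map.ofList [ C, 2; H, 3; N, 1; O, 1 ]
    | Ala -> Map.ofList [ C, 3; H, 5; N, 1; O, 1 ]
    | Ser -> Map.ofList [ C, 3; H, 5; N, 1; O, 2 ]

let water : Formula = Map.ofList [ H, 2; O, 1 ]

/// Formula of a linear peptide: all residues plus one water for the termini.
let peptideFormula (peptide: AminoAcid list) : Formula =
    peptide |> List.map residueFormula |> List.fold addFormula water

// A sequence is modelled here as a plain list of monomers.
let examplePeptide = [ Gly; Ala; Ser ]
let exampleMass = formulaMass (peptideFormula examplePeptide)
printfn "Average mass of Gly-Ala-Ser: %.3f Da" exampleMass
```

The point of the sketch is the direction of the dependencies: sequence-level properties such as mass are derived from formulas, which in turn are derived from elements, so a change at the element level propagates upwards without touching the sequence code.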

Additionally, core algorithms for biological sequences, such as alignment and pattern matching, are implemented.

Besides the core functionality, BioFSharp has several namespaces as sub-projects with different scopes:

IO functionality

The IO namespace aims to make data available and ease further processing. It contains read/write functions for a diverse set of biological file formats such as Fasta, FastQ, GenBank or GFF, as well as helper functions for searching or transforming the input data. Wrappers for commonly used command line tools like NCBI's Blast ensure interoperability with an array of existing bioinformatic workflows.
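
As an illustration of what reading such a format involves, the following is a minimal hand-written sketch of a Fasta parser. It is not the reader shipped in the IO namespace; `FastaRecord` and `parseFasta` are hypothetical names, and the sketch assumes the simplest form of the format (a header line starting with `>` followed by one or more sequence lines).

```fsharp
// Hypothetical minimal Fasta parser for illustration; NOT the BioFSharp.IO reader.

/// One Fasta entry: the header text without the leading '>' and the joined sequence.
type FastaRecord = { Header: string; Sequence: string }

let parseFasta (lines: seq<string>) : FastaRecord list =
    // Turn the pending header and its sequence chunks into a record, if there is one.
    // Sequence lines that appear before the first header are dropped.
    let flush (header: string option) (chunks: string list) (acc: FastaRecord list) =
        match header with
        | Some h -> { Header = h; Sequence = String.concat "" (List.rev chunks) } :: acc
        | None -> acc
    // Fold step: a header line closes the previous record and opens a new one,
    // any other line is appended to the current record's sequence.
    let step (header, chunks, acc) (line: string) =
        if line.StartsWith(">") then
            (Some(line.Substring(1).Trim()), [], flush header chunks acc)
        else
            (header, line :: chunks, acc)
    let (header, chunks, acc) =
        lines
        |> Seq.map (fun l -> l.Trim())
        |> Seq.filter (fun l -> l <> "")
        |> Seq.fold step (None, [], [])
    // Emit the last pending record and restore input order.
    flush header chunks acc |> List.rev

// Usage on in-memory lines; for a file, System.IO.File.ReadLines could supply the lines.
let records =
    parseFasta [ ">seq1 example"; "ACGT"; "TTGA"; ""; ">seq2"; "GGCC" ]

for r in records do
    printfn "%s: %s" r.Header r.Sequence
```

Keeping the parser a function over `seq<string>` means it does not care whether the lines come from a file, a web response, or a test fixture.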

BioDB functionality

The BioDB namespace offers API access to popular databases like GEO and EBI (including SwissProt/Expasy). We additionally provide API access for FATool, a web service by our workgroup for querying functional annotations of proteins.

This project is .NET Framework only and has a new home here: https://github.com/CSBiology/BioFSharp.BioDB

BioContainers functionality

The BioContainers namespace is our newest BioFSharp project and we are very excited about it! It is all about making common bioinformatics tools programmatically accessible from F#. This is realized by making the containerized tool accessible via the Docker daemon. We wrap some functionality from Docker.DotNet to communicate with the Docker API while providing extensive, type-safe bindings for 9 tools so far, including Blast, ClustalO, and TMHMM.

ML functionality

Make your workflow ML-ready with BioFSharp.ML. It currently contains helper functions for CNTK and a pre-trained model we used in our publication about predicting peptide observability.

Stats functionality

The Stats namespace contains statistical functions with a clear biological focus such as functions for calculating Gene Ontology Enrichments.

Documentation

Functions, types, and classes contained in BioFSharp come with short explanatory descriptions, which can be found in the API Reference.

More in-depth explanations, tutorials, and general information about the project can be found here.

The documentation and tutorials for this library are automatically generated (using F# Formatting) from *.fsx and *.md files in the docs folder. If you find a typo, please submit a pull request!

Contributing

Please refer to the Contribution guidelines.

Community/Social

Want to get in touch with us? We recently joined the Twitter crowd:

Twitter Follow