OpenOmics: Presubmission Inquiry #30

Submitting Author: Nhat Tran (@JonnyTran)
Package Name: OpenOmics
One-Line Description of Package: Library for integration of multi-omics, annotation, and interaction data.
Repository Link (if existing): https://github.com/BioMeCIS-lab/OpenOmics


Description

OpenOmics is a Python library that assists with the integration of heterogeneous multi-omics bioinformatics data. By providing an API of data manipulation tools as well as a web interface (WIP), OpenOmics simplifies the common coding tasks involved in preparing data for bioinformatics analysis. It features support for:

  • Genomics, Transcriptomics, Proteomics, and Clinical data.
  • Harmonization with 20+ popular annotation, interaction, and disease-association databases (e.g., GENCODE, Ensembl, RNAcentral, BioGRID, DisGeNET).

OpenOmics also has an efficient data pipeline that bridges popular data manipulation libraries such as Pandas and distributed processing frameworks such as Dask to the Dash web dashboard interface. With an intuitive web interface and an easy-to-use API, OpenOmics addresses the following use cases:

  • A standard pipeline for dataset indexing, table joining, and querying, all of which are transparent to users.
  • Multiple data types that support both interaction and sequence data, and that allow users to export to NetworkX graphs or to downstream machine learning.
  • An easy-to-use API that works seamlessly with the Dash web interface.
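
To illustrate the kind of table joining described above, here is a minimal sketch that uses plain Pandas rather than the OpenOmics API. The table contents, the column names (`gene_id`, `gene_name`, `gene_type`), and the helper `annotate_features` are all hypothetical and chosen only for illustration; they are not taken from the package.

```python
import pandas as pd

# Hypothetical expression matrix: rows are samples, columns are gene IDs.
expression = pd.DataFrame(
    {"ENSG01": [5.1, 3.2], "ENSG02": [0.0, 1.7], "ENSG03": [2.4, 2.9]},
    index=["sample_A", "sample_B"],
)

# Hypothetical annotation table, standing in for a public database export.
annotations = pd.DataFrame(
    {
        "gene_id": ["ENSG01", "ENSG02"],
        "gene_name": ["GENE1", "GENE2"],
        "gene_type": ["protein_coding", "lncRNA"],
    }
)


def annotate_features(expr, annot, key="gene_id"):
    """Left-join annotation attributes onto the features (columns) of expr.

    Features without a matching annotation row are kept, with missing values.
    """
    features = pd.DataFrame({key: expr.columns})
    return features.merge(annot, on=key, how="left").set_index(key)


annotated = annotate_features(expression, annotations)
print(annotated)
```

A left join is used here so that every measured feature is retained even when the annotation source has no entry for it; whether OpenOmics makes the same choice internally is not stated in this inquiry.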

Scope

  • Please indicate which category or categories this package falls under:

    • Data retrieval
    • Data extraction
    • Data munging
    • Data visualization
    • Data deposition
    • Reproducibility
    • Geospatial
    • Education
    • Unsure/Other (explain below)
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences). Please note any areas you are unsure of:
    OpenOmics' core functionality is a suite of tools for data preprocessing, data integration, and public database retrieval. Its main goal is to maximize transparency and reproducibility in the process of multi-omics data integration.

  • Who is the target audience and what are scientific applications of this package?
    OpenOmics' primary target audience is computational bioinformaticians, and its scientific application is to facilitate data preparation tasks in multi-omics data integration in a reproducible manner. We are also developing interfaces to the Galaxy Tool Shed to make the tool available to biologists without a programming background.

  • Are there other Python packages that accomplish the same thing? If so, how does yours differ?
    Existing Python packages on PyPI within the scope of multi-omics data analysis are "pythomics" and "omics". They appear to lack support for manipulating integrated multi-omics datasets, for retrieving public databases, and for an extensible OOP design. OpenOmics aims to follow modern software best practices and package publishing standards.

  • Any other questions or issues we should be aware of?:

P.S. Have feedback/comments about our review process? Leave a comment here
