Description
Submitting Author: Jonny Tran (@JonnyTran)
All current maintainers: @JonnyTran
Package Name: openomics
One-Line Description of Package: Library for integration of multi-omics, annotation, and interaction data
Repository Link: https://github.com/JonnyTran/OpenOmics
Version submitted: 0.8.4
Editor: @NickleDave
Reviewer 1: @gawbul
Reviewer 2: @ksielemann
Archive:
JOSS DOI:
Version accepted: v 0.8.8
Date accepted (month/day/year): 04/17/2021
Description
OpenOmics is a Python library to assist integration of heterogeneous multi-omics bioinformatics data. By providing an API of data manipulation tools as well as a web interface (WIP), OpenOmics facilitates the common coding tasks when preparing data for bioinformatics analysis. It features support for:
- Genomics, Transcriptomics, Proteomics, and Clinical data.
- Harmonization with 20+ popular annotation, interaction, and disease-association databases (e.g. GENCODE, Ensembl, RNA Central, BioGRID, DisGeNet etc.)
OpenOmics also has an efficient data pipeline that bridges the popular data manipulation Pandas library and Dask distributed processing to address the following use cases:
- Provides a standard pipeline for dataset indexing, table joining and querying, which are transparent and customizable for end-users.
- Efficient disk storage for large multi-omics datasets with Parquet data structures.
- Multiple data types that support both interaction and sequence data, and allow users to export to NetworkX graphs or downstream machine learning.
- An easy-to-use API that works seamlessly with the external Galaxy tool interface or the built-in Dash web interface (WIP).
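As a rough illustration of the indexing and table-joining steps listed above, the following sketch uses plain pandas rather than the OpenOmics API. The table contents, the column names, and the `annotate` helper are all hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical expression table: rows are samples, columns are gene IDs.
expression = pd.DataFrame(
    {"ENSG01": [5.1, 3.2], "ENSG02": [0.0, 7.4], "ENSG03": [2.2, 1.1]},
    index=pd.Index(["sample_a", "sample_b"], name="sample_id"),
)

# Hypothetical annotation table, as might be derived from an annotation database.
annotations = pd.DataFrame(
    {
        "gene_id": ["ENSG01", "ENSG02", "ENSG04"],
        "gene_name": ["GENE1", "GENE2", "GENE4"],
        "gene_biotype": ["protein_coding", "lncRNA", "protein_coding"],
    }
)


def annotate(expr: pd.DataFrame, annot: pd.DataFrame) -> pd.DataFrame:
    """Left-join per-gene annotations onto the genes measured in `expr`."""
    genes = pd.DataFrame({"gene_id": expr.columns})
    # A left join keeps every measured gene, even those without an annotation.
    return genes.merge(annot, on="gene_id", how="left").set_index("gene_id")


gene_table = annotate(expression, annotations)
print(gene_table)
```

The left join is a deliberate choice in this sketch: genes that are measured but missing from the annotation source stay in the table with empty annotation fields, so the mismatch remains visible instead of being silently dropped.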
Scope
- Please indicate which category or categories this package falls under:
- Data retrieval
- Data extraction
- Data munging
- Data deposition
- Reproducibility
- Geospatial
- Education
- Data visualization*
* Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see notes on categories of our guidebook.
- Explain how and why the package falls under these categories (briefly, 1-2 sentences):
OpenOmics' core functionalities are to provide a suite of tools for data preprocessing, data integration, and public database retrieval. Its main goal is to maximize the transparency and reproducibility in the process of multi-omics data integration.
- Who is the target audience and what are scientific applications of this package?
OpenOmics' primary target audience is computational bioinformaticians, and its scientific application is scalable, ad-hoc data-frame manipulation for reproducible multi-omics data integration. We are also developing an interactive web dashboard and interfaces to the Galaxy Tool Shed to make the tool available to biologists without a programming background.
- Are there other Python packages that accomplish the same thing? If so, how does yours differ?
Existing PyPI Python packages within the scope of multi-omics data analysis are "pythomics" and "omics". They appear to lack support for manipulating integrated multi-omics datasets, retrieving public databases, and an extensible OOP design. OpenOmics aims to follow modern software best practices and package publishing standards.
Aside from multi-omics integration tools, several specialized Python packages exist for single-omics data, such as ScanPy's "AnnData" and "Loom" files. They provide an intuitive data structure for expression arrays and side annotations, and Loom files even allow out-of-core data-frame processing. However, they do not yet provide mechanisms for multi-omics data integration, where each omics dataset may have overlapping samples or varying row/column sizes.
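To make the integration problem concrete, here is a minimal sketch of two omics tables that have different shapes and share only some of their samples. It uses plain pandas, not the OpenOmics API, and the sample IDs, feature names, and values are hypothetical.

```python
import pandas as pd

# Hypothetical mRNA matrix: 3 samples x 2 genes.
mrna = pd.DataFrame(
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    index=["s1", "s2", "s3"],
    columns=["GENE1", "GENE2"],
)

# Hypothetical protein matrix: 2 samples x 3 proteins, only partly overlapping.
protein = pd.DataFrame(
    [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    index=["s2", "s4"],
    columns=["PROT1", "PROT2", "PROT3"],
)

# Samples measured in both omics layers.
shared = mrna.index.intersection(protein.index)

# Prefixes record which layer each feature column came from.
mrna_cols = mrna.add_prefix("mrna_")
protein_cols = protein.add_prefix("protein_")

# Inner join keeps only the shared samples.
integrated = mrna_cols.join(protein_cols, how="inner")

# Outer join keeps every sample and leaves NaN where a layer is missing.
union = mrna_cols.join(protein_cols, how="outer")

print(integrated)
print(union)
```

Whether to keep only shared samples or the full union depends on the downstream analysis. The sketch shows both so the trade-off between completeness and missing values is visible.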
- If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted:
Technical checks
For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:
- does not violate the Terms of Service of any service it interacts with.
- has an OSI approved license.
- contains a README with instructions for installing the development version.
- includes documentation with examples for all functions.
- contains a vignette with examples of its essential functions and uses.
- has a test suite.
- has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.
Publication options
- Do you wish to automatically submit to the Journal of Open Source Software? If so:
JOSS Checks
- The package has an obvious research application according to JOSS's definition in their submission requirements. Be aware that completing the pyOpenSci review process does not guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
- The package is not a "minor utility" as defined by JOSS's submission requirements: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
- The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
- The package is deposited in a long-term repository with the DOI: 10.5281/zenodo.4441167
Note: Do not submit your package separately to JOSS
Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?
This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.
- Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.
Code of conduct
- I agree to abide by pyOpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.
P.S. *Have feedback/comments about our review process? Leave a comment here
Editor and Review Templates