forked from raamana/pyradigm

Python class defining a machine learning dataset ensuring key-based correspondence and maintaining integrity

Pyradigm: PYthon based data structure to improve Dataset's InteGrity in Machine learning workflows


A common problem for machine learning developers is keeping track of the source of the extracted features and ensuring the integrity of the dataset (e.g. not mixing up data from different subjects and/or classes). This becomes very hard as the number of projects grows or when personnel change frequently. Both can break the chain of hyper-local information about a dataset: where the original data came from, how it was processed or quality controlled, how it was put together and by whom, what particular columns in the table mean, and so on.

This package provides a Python data structure that encapsulates a machine learning dataset together with this key information. It is well suited to neuroimaging applications (or any other domain) where each sample needs to be uniquely identified by a subject ID (or something similar). Key-level correspondence across data, labels (e.g. 1 or 2), class names (e.g. 'healthy', 'disease') and related information helps maintain data integrity, and offers an easy way to trace features back to the sources from which they were originally derived.
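The idea of key-level correspondence can be illustrated with a minimal sketch. The `KeyedDataset` class below and all of its method names are hypothetical and written only for this illustration (in plain Python 3, with no dependencies); it is not pyradigm's actual API, for which the example notebook is the reference. It shows how keying features, labels and class names by the same sample ID keeps them from drifting apart.

```python
class KeyedDataset:
    """Hypothetical illustration of a key-based dataset; NOT pyradigm's API."""

    def __init__(self, description=""):
        # Free-text provenance note: where the data came from, how it was processed.
        self.description = description
        # All three mappings share the same keys (sample IDs).
        self._features = {}
        self._labels = {}
        self._classes = {}

    def add_sample(self, sample_id, features, label, class_name):
        """Add one sample; refuse duplicates and inconsistent feature lengths."""
        if sample_id in self._features:
            raise ValueError("sample {!r} already exists".format(sample_id))
        features = [float(value) for value in features]
        if self._features:
            expected = len(next(iter(self._features.values())))
            if len(features) != expected:
                raise ValueError("expected {} features, got {}".format(
                    expected, len(features)))
        self._features[sample_id] = features
        self._labels[sample_id] = label
        self._classes[sample_id] = class_name

    @property
    def sample_ids(self):
        """Sample IDs in insertion order."""
        return list(self._features)

    def get_class(self, class_name):
        """Return a new dataset holding only the samples of one class."""
        subset = KeyedDataset("{} [class={}]".format(self.description, class_name))
        for sample_id in self._features:
            if self._classes[sample_id] == class_name:
                subset.add_sample(sample_id, self._features[sample_id],
                                  self._labels[sample_id], class_name)
        return subset

    def as_arrays(self):
        """Return (features, labels, ids) as parallel lists in the same order."""
        ids = self.sample_ids
        return ([self._features[i] for i in ids],
                [self._labels[i] for i in ids],
                ids)


# Made-up values for three hypothetical subjects.
dataset = KeyedDataset("cortical thickness, hypothetical study")
dataset.add_sample("sub001", [2.5, 3.1], label=1, class_name="healthy")
dataset.add_sample("sub002", [2.1, 2.9], label=2, class_name="disease")
dataset.add_sample("sub003", [2.6, 3.0], label=1, class_name="healthy")

healthy = dataset.get_class("healthy")
features, labels, ids = healthy.as_arrays()
```

Because every lookup goes through the sample ID, subsetting by class cannot misalign a feature vector with another subject's label, and the returned IDs make it possible to trace each row back to its source.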

For users of pandas, some elements of pyradigm's API/interface may look familiar. However, the aim of this data structure is not to offer an alternative to pandas, but to ease the machine learning workflow for neuroscientists by 1) offering several well-knit methods and useful attributes specifically geared towards neuroscience research, 2) offering utilities that combine multiple or advanced patterns of routine dataset handling, and 3) using more accessible language (compared to the hard-to-read pandas docs, which are aimed at an econometric audience) to better cater to neuroscience developers (especially novices).

Thanks for checking it out. Your feedback will be appreciated.

Installation

pip install pyradigm

Usage

This Pyradigm Example notebook illustrates the usage.

Requirements

  • Packages: numpy
  • Python versions: I plan to support all the popular versions soon. Only 2.7 is tested for support at the moment.

Support on Beerpay

Hey dude! Help me out for a couple of 🍻!

