buchananja/dpyp

dpyp

A convenience tool for small-scale data pipelines in Python

About

dpyp is a data-pipeline convenience tool containing functionality for reading and writing batches, cleaning data, diagnosing pipelines, manipulating text, and calculating fields in Python.

Usage

  • dpyp consists of seven modules: 'calculate', 'clean', 'diagnose', 'read', 'text', 'write', and 'transform'.
  • Designed for small-scale Python pipelines, with an emphasis on batch processing via 'data-dictionaries'.
  • Batch processing via dictionaries lets a single function iterate over every dataset in a batch, which improves readability and ease of use.
  • Built using a combination of base Python and pandas for writing robust small-scale pipelines with text manipulation capabilities.
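
The 'data-dictionary' pattern described above can be sketched with plain pandas. This is a minimal illustration, not dpyp's own API: it assumes a data-dictionary is an ordinary dict mapping dataset names to DataFrames, and the function name `strip_text_columns` and the sample data are hypothetical.

```python
import pandas as pd

# Hypothetical batch: a plain dict mapping dataset names to DataFrames.
# (Assumption: this is what a 'data-dictionary' looks like.)
data = {
    "customers": pd.DataFrame({"name": [" Ann ", "Bob"], "age": [31, 45]}),
    "orders": pd.DataFrame({"item": ["pen ", " ink"], "qty": [2, 5]}),
}


def strip_text_columns(batch):
    """Return a new batch with whitespace stripped from every string column.

    Illustrative helper only; it is not a dpyp function.
    """
    cleaned = {}
    for name, df in batch.items():
        df = df.copy()  # leave the caller's DataFrames untouched
        for col in df.columns:
            if pd.api.types.is_string_dtype(df[col]):
                df[col] = df[col].str.strip()
        cleaned[name] = df
    return cleaned


# One call cleans every dataset in the batch.
cleaned = strip_text_columns(data)
```

Because each step takes a dictionary and returns a dictionary, steps like this can be chained without writing a separate loop for every dataset.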

Dependencies

  • pandas
  • pyarrow
  • numpy

Installation

pip install dpyp

License

See LICENSE.md

Contributing

See CONTRIBUTING.md