Open-source data wrangling library for Python. Standardizes string data, tracks data frame changes, and identifies and corrects data entry errors with NLP.

jaimiles23/pywrangle

PyWrangle

About

PyWrangle is an open-source Python library for data wrangling. Wikipedia defines data wrangling as "the process of transforming and mapping data from one 'raw' data form into another format with the intent of making it more appropriate and valuable for a variety of downstream purposes such as analytics."

Functions

PyWrangle currently supports:

  • cleaning strings
  • tracking dataframe changes
  • identifying data entry errors
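
To make these three tasks concrete, here is a minimal sketch using only the Python standard library. It is not PyWrangle's API: the function names `clean_string` and `find_similar_pairs` are hypothetical, and `difflib` stands in for the fuzzy-matching dependencies listed under Requirements.

```python
import difflib


def clean_string(value):
    """Trim the ends, collapse runs of whitespace, and lowercase (hypothetical helper)."""
    return " ".join(value.split()).lower()


def find_similar_pairs(values, threshold=0.8):
    """Return pairs of distinct values that look alike (hypothetical helper).

    Near-identical strings are often the same entry typed two ways,
    so they are worth a manual check.
    """
    unique = sorted(set(values))
    pairs = []
    for i, left in enumerate(unique):
        for right in unique[i + 1:]:
            ratio = difflib.SequenceMatcher(None, left, right).ratio()
            if ratio >= threshold:
                pairs.append((left, right))
    return pairs


raw = ["  New York", "new york ", "New  Yrok", "Boston", "boston"]

# 1. Clean strings.
cleaned = [clean_string(v) for v in raw]

# 2. Track what the cleaning step changed.
n_unique_before = len(set(raw))
n_unique_after = len(set(cleaned))
print(f"unique values: {n_unique_before} -> {n_unique_after}")

# 3. Flag likely data entry errors among the cleaned values.
suspects = find_similar_pairs(cleaned)
print(suspects)
```

Cleaning first matters here: once case and whitespace differences are gone, the values that remain similar but not equal are the ones most likely to be typos.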

Documentation & Distribution

Documentation is available here

Distribution is available here

Install

Requirements

  • Python >= 3.6
  • numpy >= 1.14.4
  • pandas >= 1.0.3
  • fuzzywuzzy >= 0.18.0
  • python-levenshtein >= 0.12.0
  • metaphone >= 0.6

Pip Install

To install pywrangle, use pip:

pip install pywrangle

Import

Following the convention of Python data science libraries, import pywrangle under a short alias:

>>> import pywrangle as pw

Contributing

Like all developers, I love open source. Please see the contributing guidelines here.
