PyWrangle is an open-source Python library for data wrangling. Wikipedia defines data wrangling as follows:
Data wrangling is the process of transforming and mapping data from one "raw" data form into another format with the intent of making it more appropriate and valuable for a variety of downstream purposes such as analytics.
PyWrangle currently supports:
- cleaning strings
- tracking dataframe changes
- identifying data entry errors
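To make the first and last items concrete, here is a minimal sketch of what string cleaning and data entry error detection can look like in plain Python. It is not PyWrangle's API: the function names `clean_string` and `likely_entry_errors`, the `threshold` parameter, and the sample data are all hypothetical, and the standard-library `difflib` stands in for the fuzzy-matching dependencies so the example runs without extra installs.

```python
from difflib import SequenceMatcher
from itertools import combinations


def clean_string(value):
    """Trim surrounding whitespace, collapse inner whitespace runs, and lowercase.

    Hypothetical helper for illustration only; not part of PyWrangle.
    """
    return " ".join(value.split()).lower()


def likely_entry_errors(values, threshold=0.8):
    """Return (a, b, score) for each pair of distinct cleaned values that look alike.

    Hypothetical helper for illustration only. Similarity is difflib's
    SequenceMatcher ratio, a stand-in for a dedicated fuzzy-matching library.
    """
    unique = sorted({clean_string(v) for v in values})
    pairs = []
    for a, b in combinations(unique, 2):
        score = SequenceMatcher(None, a, b).ratio()
        if score >= threshold:
            pairs.append((a, b, round(score, 2)))
    return pairs


# Made-up sample data: inconsistent casing, stray whitespace, and typos.
cities = ["  New York", "new york ", "New Yrok", "Boston", "Bostn", "Chicago"]
suspects = likely_entry_errors(cities)

for a, b, score in suspects:
    print(f"{a!r} and {b!r} may be the same value (similarity {score})")
```

Cleaning first removes the differences that are only formatting, so the two spellings of "new york" collapse into one value. The similarity check then flags the remaining near-duplicates, here "bostn" with "boston" and "new york" with "new yrok", for manual review.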
Documentation is available here
Distribution is available here
PyWrangle requires:
- Python >= 3.6
- numpy >= 1.14.4
- pandas >= 1.0.3
- fuzzywuzzy >= 0.18.0
- python-levenshtein >= 0.12.0
- metaphone >= 0.6
To install pywrangle, use pip:
pip install pywrangle
Following the convention of Python data science libraries, import pywrangle under the alias pw:
>>> import pywrangle as pw
Like all developers, I love open source. Please see the contributing guidelines here.