| 1 | +## Data Processing |
| 2 | + |
| 3 | +1. pandas |
| 4 | +pandas is a package providing fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python. |
| 5 | +Project Source: https://github.com/pydata/pandas |
| 6 | +Project Homepage: http://pandas.pydata.org/ |
| 7 | + |
| 8 | +1. Faker |
| 9 | +Faker is a package that generates fake data for you. Whether you need to bootstrap your database, create good-looking XML documents, fill in your persistence layer to stress-test it, or anonymize data taken from a production service, Faker is for you. |
| 10 | +Project Source: https://github.com/joke2k/faker |
| 11 | +Project Documentation: http://fake-factory.readthedocs.org/en/latest/ |
| 12 | + |
| 13 | +1. tablib |
| 14 | +Tablib is a format-agnostic tabular dataset library, written in Python. |
| 15 | +Project Source: https://github.com/kennethreitz/tablib |
| 16 | +Project Documentation: http://docs.python-tablib.org/en/latest/ |
| 17 | + |
| 18 | +1. TextBlob |
| 19 | +TextBlob is a Python library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) |
| 20 | +tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more. |
| 21 | +Project Source: https://github.com/sloria/TextBlob |
| 22 | +Project Homepage: http://textblob.readthedocs.org/en/dev/ |
| 23 | + |
| 24 | +1. jieba |
| 25 | +jieba is a Python module for Chinese text segmentation. |
| 26 | +Project Source: https://github.com/fxsjy/jieba |
| 27 | +Online Demo Address: http://jiebademo.ap01.aws.af.cm/ |
| 28 | + |
| 29 | +1. nltk |
| 30 | +NLTK is a suite of open source Python modules, data sets and tutorials supporting research and development in Natural Language Processing. |
| 31 | +Project Source: https://github.com/nltk/nltk |
| 32 | +Project Homepage: http://www.nltk.org/ |
| 33 | + |
| 34 | +1. newspaper |
| 35 | +News extraction, article extraction, and content curation in Python. |
| 36 | +Project Source: https://github.com/codelucas/newspaper |
| 37 | +Project Homepage: http://newspaper.readthedocs.org/en/latest/ |
| 38 | + |
| 39 | +1. Pillow |
| 40 | +Pillow is the friendly fork of PIL, the Python Imaging Library. |
| 41 | +Project Source: https://github.com/python-imaging/Pillow |
| 42 | +Project Homepage: http://python-imaging.github.io/ |
| 43 | + |
| 44 | +1. gensim |
| 45 | +Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. |
| 46 | +Project Source: https://github.com/piskvorky/gensim |
| 47 | +Project Homepage: http://radimrehurek.com/gensim/ |
| 48 | + |