An all-in-one Python 3 data science package: easy visualisation, data mining, data preparation and machine learning.
Please check the Jupyter Notebook for instructions on how to use it. You can also check sciblox out on https://danielhanchen.github.io/
https://pypi.python.org/pypi/sciblox
Install:
[sudo] pip install sciblox
NOTE: If you intend to remove linearly dependent rows or use KNN or SVD imputation:
[sudo] pip install fancyimpute sympy theano
If the fancyimpute install fails, please install a C++ compiler (for example MinGW on Windows).
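To confirm that the optional packages installed correctly, a quick check like the one below can help. It is a minimal sketch using only the standard library; the helper name missing_modules is made up for this example and is not part of sciblox.

```python
import importlib.util

# Optional packages named in the install note above.
OPTIONAL = ("fancyimpute", "sympy", "theano")


def missing_modules(names):
    """Return the module names that cannot be found on the current Python path.

    Hypothetical helper for illustration only; not part of sciblox.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(OPTIONAL)
if missing:
    print("Missing optional packages:", ", ".join(missing))
else:
    print("All optional packages are importable.")
```

If anything is reported missing, rerun the pip command above for those packages.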
WHAT'S NEW?
- FASTER (x10) BPCA fill
- Better analyser
- NEW modules - Machine Learning
Some features explained include:
- MICE, BPCA missing data imputation with Random Forests, XGBoost and Linear Regression support
- Automatic Data Plotting
- Word extraction and frequency plots
- Sequential text processing
- CARET-like processes including ZeroVarCheck, FreqRatios, etc.
- Discretization and Continuisation
- Easy data structure changes like Hcat, Vcat, reversing etc.
- Easy CARET-like Machine Learning modules
- Automatic Best Graphs Plotting
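To give a rough idea of what the CARET-like checks in the list above measure, here is a plain Python sketch of a zero-variance check and a frequency ratio. It is not sciblox's own code or API: the function names freq_ratio and zero_var_columns and the demo data are assumptions made up for this example. See the Jupyter Notebook for the real calls.

```python
from collections import Counter


def freq_ratio(values):
    """Count of the most common value divided by the count of the second most common.

    A column with fewer than two distinct values gets an infinite ratio here,
    which is an arbitrary choice of this sketch.
    """
    top = Counter(values).most_common(2)
    if len(top) < 2:
        return float("inf")
    return top[0][1] / top[1][1]


def zero_var_columns(table):
    """Names of columns that hold at most one distinct value."""
    return [name for name, values in table.items() if len(set(values)) <= 1]


# Made-up demo data: column name -> list of values.
demo = {
    "constant": [1, 1, 1, 1, 1, 1],
    "skewed": ["a", "a", "a", "a", "a", "b"],
    "balanced": [0, 1, 0, 1, 0, 1],
}

zero_var = zero_var_columns(demo)
ratios = {name: freq_ratio(values) for name, values in demo.items()}
print("Zero-variance columns:", zero_var)
print("Frequency ratios:", ratios)
```

A high frequency ratio marks a column dominated by a single value, which is often a candidate for removal before modelling.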
IN CONSTRUCTION:
- Advanced text extraction methods
- Automatic Machine Learning methods
For easier calling:
from sciblox import *
%matplotlib notebook
If you are not installing with pip, copy sciblox.py into your main Python 3 directory, then import it the same way as above.
Some screenshots: