This tool can save you a considerable amount of time. It includes automatic data preparation, training of 10 machine learning models (supervised and unsupervised), and visualizations that help you choose and configure a model. All of these steps are designed specifically for outlier detection.
- Linear model
- XGBoost
- LightGBM
- Random Forest
- Naive Bayes
- KMeans
- Mean Shift
- Gaussian Mixture
- Bayesian Gaussian Mixture
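To give an idea of how one of the unsupervised models listed above can be used for outlier detection, here is a minimal sketch. It assumes scikit-learn's `GaussianMixture` and synthetic data; the variable names and the 2.5% threshold are illustrative choices, not necessarily what this tool does internally.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic data (assumption): 200 inliers near the origin, 5 points far away.
inliers = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
outliers = rng.normal(loc=8.0, scale=0.5, size=(5, 2))
X = np.vstack([inliers, outliers])

# Fit a single Gaussian and score each sample by its log-density.
gmm = GaussianMixture(n_components=1, random_state=0).fit(X)
log_density = gmm.score_samples(X)

# Flag the lowest-density samples as outliers (illustrative 2.5% cut-off).
threshold = np.quantile(log_density, 0.025)
is_outlier = log_density < threshold

print("Number of flagged samples:", int(is_outlier.sum()))
```

The cut-off is the part you would normally tune with the help of a visualization: a lower quantile flags fewer points, a higher one flags more.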
- Your dataset must be in CSV format.
- Install the Python libraries:
pip install -r requirements.txt
Run the following command in a terminal: python main.py
Then open the following link in your web browser:
-
Next, choose the learning mode. If your dataset contains the feature to predict, click on supervised. Otherwise, click on unsupervised.
-
If you chose supervised learning, select the feature to predict:
- To see the changes made to your dataset, select diagnostic.
- To launch the data preparation, select automatic.
To fit and visualize an unsupervised model, you first have to reduce the dimensionality of your data. PCA is the most basic and reliable method.
If you choose to keep more than 2 dimensions after PCA, you have to select another dimensionality reduction algorithm to visualize your data in 2D.
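The PCA step described above can be sketched as follows. This is only an illustration that assumes scikit-learn's `PCA` and `StandardScaler` on synthetic data; the choice of 3 components and the `needs_second_reduction` flag are hypothetical and simply mirror the rule stated above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic dataset (assumption): 150 samples, 6 features, one of them correlated.
X = rng.normal(size=(150, 6))
X[:, 3] = X[:, 0] + 0.1 * rng.normal(size=150)

# PCA is sensitive to feature scale, so standardize first.
X_scaled = StandardScaler().fit_transform(X)

# Keep 3 principal components (illustrative choice).
pca = PCA(n_components=3)
X_pca = pca.fit_transform(X_scaled)

print("Explained variance ratio:", pca.explained_variance_ratio_)

# With more than 2 dimensions left, a second reduction is needed for a 2D plot.
needs_second_reduction = X_pca.shape[1] > 2
print("Second reduction needed:", needs_second_reduction)
```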
The following algorithms are implemented:
- t-SNE
- Locally Linear Embedding
- Multi-dimensional Scaling (MDS)
- Isomap
To better understand these algorithms, click on this link
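As a rough illustration of this second reduction step, the sketch below projects PCA output to 2D with each of the four algorithms. It assumes the scikit-learn implementations in `sklearn.manifold`; the random input `X_pca` stands in for real PCA output, and the parameter values (`perplexity`, `n_neighbors`) are illustrative, not the tool's actual settings.

```python
import numpy as np
from sklearn.manifold import TSNE, MDS, Isomap, LocallyLinearEmbedding

rng = np.random.default_rng(0)

# Stand-in for PCA output with more than 2 dimensions (assumption).
X_pca = rng.normal(size=(60, 4))

# One 2D reducer per algorithm; parameter values are illustrative.
reducers = {
    "t-SNE": TSNE(n_components=2, perplexity=15, random_state=0),
    "Locally Linear Embedding": LocallyLinearEmbedding(
        n_components=2, n_neighbors=10, random_state=0
    ),
    "MDS": MDS(n_components=2, random_state=0),
    "Isomap": Isomap(n_components=2, n_neighbors=10),
}

# Each embedding has shape (n_samples, 2) and can be drawn as a scatter plot.
embeddings = {name: reducer.fit_transform(X_pca) for name, reducer in reducers.items()}

for name, embedding in embeddings.items():
    print(name, embedding.shape)
```

The four methods preserve different properties of the data (local neighborhoods for t-SNE and Locally Linear Embedding, pairwise distances for MDS, geodesic distances for Isomap), so the same dataset can look quite different from one projection to the next.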