85 changes: 84 additions & 1 deletion README.md
@@ -1,6 +1,89 @@
# NiaAML

NiaAML is an automated machine learning Python framework based on nature-inspired algorithms for optimization. The name comes from the automated machine learning method of the same name [[1]](#1). Its goal is to efficiently compose the best possible classification pipeline for the given task from the components on the input. The components are divided into three groups: feature selection algorithms, feature transformation algorithms and classifiers. The framework uses nature-inspired optimization algorithms to choose the best set of components for the output classification pipeline and to optimize their parameters. We use the <a href="https://github.com/NiaOrg/NiaPy">NiaPy framework</a>, a popular Python collection of nature-inspired algorithms, for the optimization process. The NiaAML framework is easy to use and to customize or expand to suit your needs.

## Components

Below you can see a list of currently implemented components divided into three groups: classifiers, feature selection algorithms and feature transformation algorithms.

### Classifiers

* AdaBoost,
* Bagging,
* Extremely Randomized Trees,
* Linear SVC,
* Multi Layer Perceptron,
* Random Forest Classifier.

### Feature Selection Algorithms

* Select K Best,
* Select Percentile,
* Variance Threshold.

#### Nature-Inspired

* Bat Algorithm,
* Differential Evolution,
* Self-Adaptive Differential Evolution (jDEFSTH),
* Grey Wolf Optimizer,
* Particle Swarm Optimization.

### Feature Transformation Algorithms

* Normalizer,
* Standard Scaler.

## Examples

### Example of Usage

Load data and try to find the optimal pipeline for the given components.

```python
from niaaml import PipelineOptimizer
from niaaml.data import CSVDataReader

data_reader = CSVDataReader(src='path_to_csv_file.csv', contains_classes=True, has_header=False)

pipeline_optimizer = PipelineOptimizer(
    data=data_reader,
    classifiers=['AdaBoost', 'Bagging', 'MultiLayerPerceptron', 'RandomForestClassifier'],
    feature_selection_algorithms=['SelectKBest', 'SelectPercentile', 'ParticleSwarmOptimization'],
    feature_transform_algorithms=['Normalizer', 'StandardScaler']
)
final_pipeline = pipeline_optimizer.run('Accuracy', 20, 20, 400, 400, 'ParticleSwarmAlgorithm', 'ParticleSwarmAlgorithm')
```

You can save a result of the optimization process as an object to a file for later use.

```python
final_pipeline.export('pipeline.ppln')
```

And also load it from a file and use the pipeline.

```python
import numpy
from niaaml import Pipeline

loaded_pipeline = Pipeline.load('pipeline.ppln')

# numpy array containing features
x = numpy.array([[0.35, 0.46, 5.32], [0.16, 0.55, 12.5]], dtype=float)

y = loaded_pipeline.run(x)
```

You can also save a user-friendly representation of a pipeline to a text file.

```python
final_pipeline.export_text('pipeline.txt')
```

### Example of a Pipeline Component Implementation

The NiaAML framework is easily expandable: you can implement your own components by overriding the base classes' methods. To implement a classifier, inherit from the [Classifier](niaaml/classifiers/classifier.py) class; the same applies to the [FeatureSelectionAlgorithm](niaaml/preprocessing/feature_selection/feature_selection_algorithm.py) and [FeatureTransformAlgorithm](niaaml/preprocessing/feature_transform/feature_transform_algorithm.py) classes. All of the mentioned classes inherit from the [PipelineComponent](niaaml/pipeline_component.py) class.

For more information, take a look at the [Classifier](niaaml/classifiers/classifier.py) class and the implementation of the [AdaBoost](niaaml/classifiers/ada_boost.py) classifier that inherits from it.
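To illustrate the pattern without pulling in the framework itself, here is a minimal, self-contained sketch. The `MajorityClassifier` name, the toy logic, and the stand-in base class are all hypothetical and exist only for illustration; a real component would inherit from `niaaml.classifiers.Classifier` and would typically wrap a scikit-learn estimator, as AdaBoost does.

```python
from collections import Counter

class Classifier:
    # Stand-in for niaaml.classifiers.Classifier, sketched here so the
    # example runs on its own; the real base class lives in the framework.
    def set_parameters(self, **kwargs): ...
    def fit(self, x, y, **kwargs): ...
    def predict(self, x, **kwargs): ...

class MajorityClassifier(Classifier):
    """Toy component that always predicts the most frequent training class."""
    Name = 'MajorityClassifier'

    def __init__(self, **kwargs):
        self._params = dict()  # no tunable parameters in this sketch
        self.__majority = None

    def set_parameters(self, **kwargs):
        pass  # nothing to set for this toy component

    def fit(self, x, y, **kwargs):
        # Remember the most common class seen during training.
        self.__majority = Counter(y).most_common(1)[0][0]

    def predict(self, x, **kwargs):
        # Predict that class for every sample (row) in x.
        return [self.__majority for _ in x]

clf = MajorityClassifier()
clf.fit([[0.1], [0.2], [0.3]], ['a', 'b', 'a'])
print(clf.predict([[0.5], [0.6]]))  # ['a', 'a']
```

The key point is the interface: `fit` learns from samples and classes, `predict` returns one class per sample, and `set_parameters` forwards tuned hyperparameters, so the optimizer can treat every component uniformly.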

## Licence

20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
59 changes: 59 additions & 0 deletions docs/conf.py
@@ -0,0 +1,59 @@
import sphinx_rtd_theme

# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Path setup --------------------------------------------------------------

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))


# -- Project information -----------------------------------------------------

project = 'NiaAML'
copyright = '2020, Luka Pečnik'
author = 'Luka Pečnik'

# The full version, including alpha/beta/rc tags
release = '0.0.1'


# -- General configuration ---------------------------------------------------

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    'sphinx.ext.autodoc',
    'sphinx_rtd_theme'
]

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']


# -- Options for HTML output -------------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'sphinx_rtd_theme'

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
4 changes: 4 additions & 0 deletions docs/getting_started.rst
@@ -0,0 +1,4 @@
Getting Started
===============

TODO
22 changes: 22 additions & 0 deletions docs/index.rst
@@ -0,0 +1,22 @@
NiaAML's documentation!
==================================

.. automodule:: niaaml

NiaAML is an automated machine learning Python framework based on nature-inspired algorithms for optimization. The name comes from the automated machine learning method of the same name [1]. Its goal is to efficiently compose the best possible classification pipeline for the given task from the components on the input. The components are divided into three groups: feature selection algorithms, feature transformation algorithms and classifiers. The framework uses nature-inspired optimization algorithms to choose the best set of components for the output classification pipeline and to optimize their parameters. We use the NiaPy framework, a popular Python collection of nature-inspired algorithms, for the optimization process. The NiaAML framework is easy to use and to customize or expand to suit your needs.

* **Free software:** MIT license
* **Github repository:** https://github.com/lukapecnik/NiaAML
* **Python versions:** 3.8.x

The main documentation is organized into a couple of sections:

* :ref:`user-docs`

.. _user-docs:

.. toctree::
:maxdepth: 2
:caption: User Documentation

getting_started
35 changes: 35 additions & 0 deletions docs/make.bat
@@ -0,0 +1,35 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
	set SPHINXBUILD=poetry run sphinx-build
)
set SOURCEDIR=.
set BUILDDIR=_build

if "%1" == "" goto help

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
	echo.
	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
	echo.installed, then set the SPHINXBUILD environment variable to point
	echo.to the full path of the 'sphinx-build' executable. Alternatively you
	echo.may add the Sphinx directory to PATH.
	echo.
	echo.If you don't have Sphinx installed, grab it from
	echo.http://sphinx-doc.org/
	exit /b 1
)

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
5 changes: 3 additions & 2 deletions niaaml/__init__.py
@@ -1,7 +1,7 @@
from niaaml import classifiers
from niaaml import data
from niaaml import preprocessing
from niaaml.utilities import float_converter
from niaaml import fitness
from niaaml.utilities import MinMax
from niaaml.utilities import ParameterDefinition
from niaaml.utilities import Factory
@@ -13,10 +13,11 @@
'classifiers',
'data',
'preprocessing',
'float_converter',
'fitness',
'get_bin_index',
'MinMax',
'ParameterDefinition',
'OptimizationStats',
'Factory',
'PipelineOptimizer',
'Pipeline',
92 changes: 49 additions & 43 deletions niaaml/classifiers/ada_boost.py
@@ -7,54 +7,60 @@
__all__ = ['AdaBoost']

class AdaBoost(Classifier):
    r"""Implementation of AdaBoost classifier.

    Date:
        2020

    Author:
        Luka Pečnik

    License:
        MIT

    See Also:
        * :class:`niaaml.classifiers.Classifier`
    """
    Name = 'AdaBoost'

    def __init__(self, **kwargs):
        r"""Initialize AdaBoost instance.
        """
        self._params = dict(
            n_estimators = ParameterDefinition(MinMax(min=10, max=111), np.uint),
            algorithm = ParameterDefinition(['SAMME', 'SAMME.R'])
        )
        self.__ada_boost = AdaBoostClassifier()

    def set_parameters(self, **kwargs):
        r"""Set the parameters/arguments of the algorithm.
        """
        self.__ada_boost.set_params(**kwargs)

    def fit(self, x, y, **kwargs):
        r"""Fit AdaBoost.

        Arguments:
            x (numpy.ndarray[float]): n samples to classify.
            y (Iterable[any]): n classes of the samples in the x array.
        """
        self.__ada_boost.fit(x, y)

    def predict(self, x, **kwargs):
        r"""Predict class for each sample (row) in x.

        Arguments:
            x (numpy.ndarray[float]): n samples to classify.

        Returns:
            Iterable[any]: n predicted classes.
        """
        return self.__ada_boost.predict(x)

    def to_string(self):
        r"""User friendly representation of the object.

        Returns:
            str: User friendly representation of the object.
        """
        return Classifier.to_string(self).format(name=self.Name, args=self._parameters_to_string(self.__ada_boost.get_params()))