![](https://github.com/arthurpaulino/miraiml/raw/master/docs/img/MiraiML.svg?sanitize=true) 

MiraiML: asynchronous, autonomous and continuous Machine Learning in Python https://miraiml.readthedocs.io


MiraiML
=======
    Mirai: `future` in Japanese.

MiraiML is an asynchronous engine for continuous & autonomous machine learning,
built for real-time usage.

Usage
-----

1. Install: ``$ pip install miraiml``
2. Now, inside a Python environment, you can import the main components:

>>> from miraiml import SearchSpace, Config, Engine

You might want to `Read the Docs`_ for a better understanding of MiraiML.

Contributing
------------

Please follow the guidelines_ if you want to contribute to this project.

.. _examples: https://github.com/arthurpaulino/miraiml/tree/master/examples
.. _Read the Docs: https://miraiml.readthedocs.io/en/latest/
.. _guidelines: https://github.com/arthurpaulino/miraiml/blob/master/CONTRIBUTING.md

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here are several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# Any results you write to the current directory are saved as output.

In [None]:
!pip install miraiml

In [None]:
# Read the data
data = pd.read_csv('/kaggle/input/house-prices-advanced-regression-techniques/train.csv')
data = data[['LotArea', 'OverallQual', 'YearBuilt', 'TotRmsAbvGrd', 'SalePrice']]

In [None]:
from sklearn.model_selection import train_test_split

train_data, test_data = train_test_split(data, test_size=0.2)

# Building the search spaces
Let's compare (and ensemble) a ``KNeighborsRegressor`` and a pipeline composed of a ``StandardScaler`` and a ``LinearRegression``.

In [None]:
from sklearn.neighbors import KNeighborsRegressor
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

from miraiml import SearchSpace
from miraiml.pipeline import compose

# Step names must match the prefixes used in parameters_values below
Pipeline = compose(
    [('scaler', StandardScaler), ('lin_reg', LinearRegression)]
)

search_spaces = [
    SearchSpace(
        id='k-NeighborsRegressor',
        model_class=KNeighborsRegressor,
        parameters_values=dict(
            n_neighbors=range(2, 9),
            weights=['uniform', 'distance'],
            p=range(2, 5)
        )
    ),
    SearchSpace(
        id='Pipeline',
        model_class=Pipeline,
        parameters_values=dict(
            scaler__with_mean=[True, False],
            scaler__with_std=[True, False],
            lin_reg__fit_intercept=[True, False]
        )
    )
]
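
To get a feel for grids like the ones above, the sketch below counts the combinations a grid contains and checks that every ``step__parameter`` key names an existing pipeline step. It uses only the standard library. ``grid_size`` and ``unknown_step_keys`` are hypothetical helpers written for this illustration, not part of MiraiML, and the double-underscore convention is assumed to behave as it does in scikit-learn pipelines.

```python
from math import prod  # available in Python 3.8+


def grid_size(parameters_values):
    """Number of distinct parameter combinations in a grid."""
    return prod(len(values) for values in parameters_values.values())


def unknown_step_keys(step_names, parameter_keys):
    """Return the keys whose 'step__' prefix matches no pipeline step."""
    return [key for key in parameter_keys
            if key.split('__', 1)[0] not in step_names]


# Same grid as the k-NN search space above
knn_grid = dict(
    n_neighbors=range(2, 9),
    weights=['uniform', 'distance'],
    p=range(2, 5)
)
print(grid_size(knn_grid))  # 7 * 2 * 3 = 42

# Hypothetical step names and keys, used only to exercise the check
print(unknown_step_keys(
    ['scaler', 'lin_reg'],
    ['scaler__with_mean', 'scaler__with_std', 'lin_reg__fit_intercept']
))  # [] means every key points at a known step
```

A non-empty list from ``unknown_step_keys`` signals a prefix that does not correspond to any step name passed to ``compose``.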

# Configuring the Engine
For this demonstration, let's use ``r2_score`` to evaluate our modeling.

In [None]:
from sklearn.metrics import r2_score

from miraiml import Config

config = Config(
    local_dir='miraiml_local',
    problem_type='regression',
    score_function=r2_score,
    search_spaces=search_spaces,
    ensemble_id='Ensemble'
)
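
For reference, ``r2_score`` computes the coefficient of determination, 1 - SS_res / SS_tot, where higher is better and 1.0 is a perfect fit. The function below is a minimal pure-Python sketch of that formula for a single target, not scikit-learn's implementation, and the sample values are illustrative only.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined (division by zero) when y_true is constant.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Illustrative values, not taken from the house-prices data
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(round(r2(y_true, y_pred), 3))  # 0.949
```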

# Triggering the Engine
Let's also print the scores every time the Engine finds a better solution.

In [None]:
from miraiml import Engine

def on_improvement(status):
    scores = status.scores
    for key in sorted(scores.keys()):
        print('{}: {}'.format(key, round(scores[key], 3)), end='; ')
    print()

engine = Engine(config=config, on_improvement=on_improvement)
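
To preview the line the callback prints without running the Engine, the sketch below applies the same formatting to a stand-in object. The scores are made up for illustration, and ``SimpleNamespace`` only mimics the one attribute (``scores``) that the callback reads; it is not MiraiML's status class. ``format_scores`` is a hypothetical helper that returns the line instead of printing it.

```python
from types import SimpleNamespace


def format_scores(status):
    """Build the same 'id: score; ' line that on_improvement prints."""
    scores = status.scores
    return ''.join(
        '{}: {}; '.format(key, round(scores[key], 3))
        for key in sorted(scores)
    )


# Made-up scores keyed by the ids used in this notebook
fake_status = SimpleNamespace(scores={
    'k-NeighborsRegressor': 0.70149,
    'Pipeline': 0.65,
    'Ensemble': 0.71234,
})
print(format_scores(fake_status))
# Ensemble: 0.712; Pipeline: 0.65; k-NeighborsRegressor: 0.701;
```

Note that ``sorted`` places uppercase ids before lowercase ones, which is why ``k-NeighborsRegressor`` comes last.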

Now we're ready to load the data.

In [None]:
engine.load_train_data(train_data, 'SalePrice')
engine.load_test_data(test_data)

Let's start the Engine, let it run briefly, shuffle the train data (which restarts the Engine), and then interrupt it.

In [None]:
from time import sleep

engine.restart()

sleep(1)

print('\nShuffling train data')
engine.shuffle_train_data(restart=True)

sleep(1)

engine.interrupt()
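
The calls above return immediately because the work happens in the background. The sketch below illustrates that general pattern with a toy class built on ``threading``; ``TinyEngine`` is a hypothetical stand-in written for this notebook, and MiraiML's internals may differ.

```python
import threading


class TinyEngine:
    """Toy stand-in for an engine that works in a background thread."""

    def __init__(self):
        self._stop = threading.Event()
        self._thread = None
        self.iterations = 0

    def _run(self):
        # Keep "searching" until an interruption is requested.
        while not self._stop.is_set():
            self.iterations += 1
            self._stop.wait(0.01)  # pause briefly between iterations

    def restart(self):
        self.interrupt()  # stop a previous run, if any
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()  # returns immediately

    def interrupt(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()  # wait for the worker to finish
            self._thread = None

    def is_running(self):
        return self._thread is not None and self._thread.is_alive()


toy = TinyEngine()
toy.restart()
running_after_restart = toy.is_running()
toy.interrupt()
print(running_after_restart, toy.is_running())  # True False
```

Using an ``Event`` rather than a plain boolean lets the worker wake up as soon as an interruption is requested instead of finishing its pause first.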

# Engine’s status analysis

In [None]:
status = engine.request_status()

In [None]:
print(status.build_report(include_features=True))

# Final