This repository contains the code for FRUITS, a Python package implementing a collection of transformations that extract features from univariate or multivariate time series. Additionally, we provide the following:
- Documentation for FRUITS, which can be built using sphinx.
- A number of unit tests for FRUITS.
- An extensive suite of algorithms to compare and analyze FRUITS pipelines, called corbeille.
- The explicit pipelines (or just fruits) used in our paper, see fruit_general.py, fruit_reduced.py and fruit_twi.py.
- An ipynb notebook containing code that analyzes FRUITS on some datasets in the UCR archive, using corbeille and classically.
Install FRUITS by cloning the repository to your local machine and using poetry to install the package and all dependencies in a new virtual environment. Alternatively, use pip to install FRUITS inside an existing environment (*).
$ git clone https://github.com/irkri/fruits
$ cd fruits
$ poetry install (or: $ python -m pip install -e .)
(*) If an error occurs, try commenting out the line
# corbeille = {path = "experiments/corbeille/", optional = true, develop = true}
in the file pyproject.toml.
To install FRUITS without cloning the repository, use:
$ pip install git+https://github.com/alienkrieg/fruits
The HTML documentation is available as a zipped folder in the latest release of FRUITS.
The documentation can be built by calling make html in the docs folder. This should create a local directory docs/build. Open the file docs/build/index.html in a browser to access the documentation. The Python dependencies needed (sphinx, sphinx-rtd-theme) are listed in the toml file as development dependencies.
FRUITS implements the class fruits.Fruit. A Fruit consists of at least one slice (fruits.FruitSlice). A single slice consists of the following building blocks.
- Preparateurs: Preprocess the input time series.
- ISS: Calculate iterated sums for different semirings, weightings and words.
  For example:
  <[11], ISS(X)> = numpy.cumsum([x^2 for x in X])
  is the result of
  fruits.ISS([fruits.words.SimpleWord("[11]")]).fit_transform(X)
  The definition and applications of the iterated sums signature ISS can be found in this paper by Diehl et al.
- Sieves: Extract single numerical values (i.e. the final features) from the arrays calculated in the previous step.
All features of each fruit slice will be concatenated at the end of the pipeline.
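To make the iterated sums above more concrete, here is a minimal pure-Python sketch for a univariate series. It is not the FRUITS implementation. The function names are hypothetical, and the two-letter word [1][1] is assumed to follow the usual strict-index definition (sum over i < j <= t of x_i * x_j), which may differ from what a given ISS mode in FRUITS computes.

```python
from itertools import accumulate


def iss_word_11(x):
    """Iterated sum for the word [11]: cumulative sum of x_i ** 2."""
    return list(accumulate(v * v for v in x))


def iss_word_1_1(x):
    """Iterated sum for the word [1][1] (assumed strict indices):
    at time t, the sum of x_i * x_j over all i < j <= t."""
    result = []
    prefix = 0  # x_0 + ... + x_{j-1}, the inner sum for the first letter
    total = 0   # running outer sum for the second letter
    for v in x:
        total += prefix * v
        result.append(total)
        prefix += v
    return result


x = [1, 2, 3]
print(iss_word_11(x))   # [1, 5, 14]
print(iss_word_1_1(x))  # [0, 2, 11]
```

Both functions return an array of the same length as the input, which is the kind of array a sieve would then reduce to single numbers.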
A simple example could look like this:
import numpy
import fruits
# time series dataset: 200 time series of length 100 in 3 dimensions
X_train = numpy.random.sample((200, 3, 100))
# create a fruit
fruit = fruits.Fruit("My Fruit")
# add preparateurs (optional)
fruit.add(fruits.preparation.INC)
# configure the type of Iterated Sums Signature being used
iss = fruits.ISS(
    fruits.words.of_weight(2, dim=3),
    mode=fruits.ISSMode.EXTENDED,
)
fruit.add(iss)
# choose from a variety of sieves for feature extraction
fruit.add(fruits.sieving.NPI(q=(0.5, 1.0)))
fruit.add(fruits.sieving.END)
# cut a new fruit slice without the INC preparateur
fruit.cut()
fruit.add(iss.copy())
fruit.add(fruits.sieving.NPI)
fruit.add(fruits.sieving.END)
# fit the fruit to the data and extract all features
fruit.fit(X_train)
X_train_features = fruit.transform(X_train)
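The two-slice structure of the example can be sketched conceptually without the library. The following is only an illustration in plain Python: the helper names are hypothetical stand-ins, and it assumes that an increments preparateur computes x_i - x_{i-1} with a leading zero and that an "end" sieve takes the last value of the array. The actual FRUITS classes may behave differently.

```python
from itertools import accumulate


def increments(x):
    # Stand-in preparateur (assumption): differences with a leading zero.
    return [0] + [b - a for a, b in zip(x, x[1:])]


def cumsum_squares(x):
    # Stand-in for the iterated sum of the word [11].
    return list(accumulate(v * v for v in x))


def last_value(x):
    # Stand-in sieve (assumption): keep only the final entry.
    return x[-1]


def slice_features(series, with_increments):
    # One slice: optional preparation, then iterated sums, then sieving.
    prepared = increments(series) if with_increments else list(series)
    return [last_value(cumsum_squares(prepared))]


series = [1, 3, 2, 5]
# Features of both slices are concatenated into one feature vector.
features = slice_features(series, True) + slice_features(series, False)
print(features)  # [14, 39]
```

Each slice contributes its own features, and the final vector is their concatenation, mirroring the role of fruit.cut() in the example.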
Have a look at the instructions for more information on how to execute some experiments with FRUITS.
A number of unit tests are available for FRUITS. To run them, enter the command
$ python -m pytest tests
in a terminal/command line from the main directory of this repository.