pyts: a Python package for time series classification
pyts is a Python package for time series classification. It aims to make time series classification easily accessible by providing preprocessing and utility tools, and implementations of state-of-the-art algorithms. Most of these algorithms transform time series, so pyts also provides several tools to perform these transformations.
pyts requires:
- Python (>= 3.5)
- NumPy (>= 1.15.4)
- SciPy (>= 1.3.0)
- Scikit-Learn (>= 0.20.4)
- Joblib (>= 0.12)
- Numba (>= 0.45.1)
To run the examples, Matplotlib (>= 2.0.0) is required.
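Before installing, it can help to check which of the dependencies listed above are already present. The following is a minimal sketch using only the standard library; it is not part of pyts. The names `REQUIREMENTS`, `parse_version` and `check_requirements` are hypothetical, the version parsing is deliberately simplified (numeric components only), and `importlib.metadata` needs Python 3.8 or later, which is newer than the minimum Python version listed above.

```python
from importlib import metadata

# Distribution names and minimum versions copied from the dependency list.
REQUIREMENTS = {
    "numpy": "1.15.4",
    "scipy": "1.3.0",
    "scikit-learn": "0.20.4",
    "joblib": "0.12",
    "numba": "0.45.1",
}


def parse_version(text):
    """Turn '1.15.4' into (1, 15, 4), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_at_least(installed, minimum):
    """Compare two version strings after padding them to the same length."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements(requirements=REQUIREMENTS):
    """Return {name: status}, where status is 'ok', 'missing' or 'too old'."""
    report = {}
    for name, minimum in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if is_at_least(installed, minimum) else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(name, status)
```

Running the script prints one line per dependency, so anything reported as missing or too old can be installed or upgraded first.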
If you already have a working installation of numpy, scipy, scikit-learn, joblib and numba, you can easily install pyts using pip:
pip install pyts
or using conda via the conda-forge channel:
conda install -c conda-forge pyts
You can also get the latest version of pyts by cloning the repository:
git clone https://github.com/johannfaouzi/pyts.git
cd pyts
pip install .
After installation, you can launch the test suite from outside the source directory using pytest:
pytest pyts
See the changelog for a history of notable changes to pyts.
The development of this package follows the practices of the scikit-learn community, so you can refer to their Development Guide. One difference is that Numba is used instead of Cython for optimization.
The section below gives some information about the implemented algorithms in pyts. For more information, please have a look at the HTML documentation available via ReadTheDocs.
pyts consists of the following modules:
approximation: This module provides implementations of algorithms that approximate time series. Implemented algorithms are Piecewise Aggregate Approximation, Symbolic Aggregate approXimation, Discrete Fourier Transform, Multiple Coefficient Binning and Symbolic Fourier Approximation.
bag_of_words: This module consists of a class BagOfWords that transforms time series into bags of words. This approach is quite common in time series classification.
datasets: This module provides utilities to make or load toy datasets, as well as to fetch datasets from the UEA & UCR Time Series Classification Repository.
decomposition: This module provides implementations of algorithms that decompose a time series into several time series. The only implemented algorithm is Singular Spectrum Analysis.
multivariate: This module provides utilities to deal with multivariate time series. MultivariateTransformer and MultivariateClassifier respectively transform and classify multivariate time series using tools designed for univariate time series. JointRecurrencePlot and WEASEL+MUSE are also available.
preprocessing: This module provides most of the scikit-learn preprocessing tools but applied sample-wise (i.e. to each time series independently) instead of feature-wise, as well as an imputer of missing values using interpolation. More information is available at the pyts.preprocessing API documentation.
transformation: This module provides implementations of algorithms that transform a data set of time series with shape (n_samples, n_timestamps) into a data set with shape (n_samples, n_features). Implemented algorithms are BOSS, ShapeletTransform and WEASEL.
utils: This module provides simple utility functions.
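Two ideas from the module list can be illustrated with a short standard-library sketch: sample-wise preprocessing (each time series is scaled independently) and Piecewise Aggregate Approximation (each series is summarized by the means of consecutive windows). This is not the pyts implementation and does not use the pyts API. The function names `standardize_samplewise` and `paa` are hypothetical, and the sketch assumes the population standard deviation and a window size that divides the series length.

```python
from statistics import mean, pstdev


def standardize_samplewise(X):
    """Scale each time series (row) independently to zero mean, unit variance."""
    result = []
    for series in X:
        center = mean(series)
        spread = pstdev(series)
        # Assumption: a constant series is only centered, to avoid dividing by zero.
        scale = spread if spread > 0 else 1.0
        result.append([(value - center) / scale for value in series])
    return result


def paa(series, window_size):
    """Piecewise Aggregate Approximation: mean of non-overlapping windows."""
    if window_size < 1 or len(series) % window_size != 0:
        raise ValueError("window_size must be positive and divide len(series)")
    return [
        mean(series[start:start + window_size])
        for start in range(0, len(series), window_size)
    ]


# A toy data set with shape (n_samples, n_timestamps) = (2, 4).
X = [
    [1.0, 3.0, 5.0, 7.0],
    [10.0, 10.0, 20.0, 20.0],
]

X_scaled = standardize_samplewise(X)           # shape stays (2, 4)
X_paa = [paa(series, window_size=2) for series in X]  # shape becomes (2, 2)

print(X_paa)  # [[2.0, 6.0], [10.0, 20.0]]
```

The statistics are computed per row, so the second series is scaled by its own mean and standard deviation rather than by column-wise statistics shared across samples, which is the difference between sample-wise and feature-wise preprocessing.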