PyNumDiff

Python methods for numerical differentiation of noisy data, including multi-objective optimization routines for automated parameter selection.

Introduction

PyNumDiff is a Python package that implements various methods for computing numerical derivatives of noisy data, which can be a critical step in developing dynamic models or designing controllers. Four families of methods are implemented in this repository: smoothing followed by finite difference calculation, local approximation with linear models, Kalman-filtering-based methods, and total variation regularization methods. Most of these methods have multiple parameters that must be tuned. We take a principled approach and propose a multi-objective optimization framework that chooses parameters by minimizing a loss function that balances the faithfulness and smoothness of the derivative estimate. For more details, refer to the optimization paper listed under Citation.
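To make the first family concrete, here is a minimal sketch of "smoothing followed by finite difference calculation" in plain Python. It is an illustration only, not the package's implementation: the function name smooth_then_differentiate, the moving-average smoother, and the window parameter are all assumptions chosen for brevity. It does mirror the (x_hat, dxdt_hat) return convention used in the snippets under Usage.

```python
def smooth_then_differentiate(x, dt, window=5):
    """Hypothetical helper (not part of PyNumDiff): moving-average
    smoothing followed by finite differences.

    Returns (x_hat, dxdt_hat): the smoothed signal and its derivative estimate.
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two samples")
    half = window // 2

    # Step 1: smooth. Near the edges the window shrinks symmetrically,
    # so it always stays centered on sample i.
    x_hat = []
    for i in range(n):
        h = min(half, i, n - 1 - i)
        segment = x[i - h:i + h + 1]
        x_hat.append(sum(segment) / len(segment))

    # Step 2: differentiate. Central differences in the interior,
    # one-sided differences at the two end points.
    dxdt_hat = []
    for i in range(n):
        if i == 0:
            d = (x_hat[1] - x_hat[0]) / dt
        elif i == n - 1:
            d = (x_hat[-1] - x_hat[-2]) / dt
        else:
            d = (x_hat[i + 1] - x_hat[i - 1]) / (2 * dt)
        dxdt_hat.append(d)
    return x_hat, dxdt_hat


# Usage: a noise-free ramp with slope 2 should yield a derivative of about 2.
dt = 0.1
x = [2.0 * i * dt for i in range(20)]
x_hat, dxdt_hat = smooth_then_differentiate(x, dt, window=5)
```

The window size plays the role of the tunable parameter here: a wider window suppresses more noise but also flattens real features, which is the trade-off the optimization framework is meant to resolve.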

Structure

  • .github/workflows contains the .yaml file that configures our GitHub Actions continuous integration (CI) runs.
  • docs/ contains make files and .rst files to govern the way sphinx builds documentation, either locally by navigating to this folder and calling make html or in the cloud by readthedocs.io.
  • examples/ contains Jupyter notebooks that demonstrate some usage of the library.
  • pynumdiff/ contains the source code. For a full list of modules and further navigation help, see the readme in this subfolder.
  • .editorconfig ensures tabs are displayed as 4 characters wide.
  • .gitignore ensures files generated by local pip installs, Jupyter notebook runs, caches from code runs, virtual environments, and more are not picked up by git and accidentally added to the repo.
  • .pylintrc configures pylint, a tool for autochecking code quality.
  • .readthedocs.yaml configures readthedocs and is necessary for documentation to get auto-rebuilt.
  • CITATION.cff is citation information for the Journal of Open-Source Software (JOSS) paper associated with this project.
  • LICENSE.txt allows free usage of this project.
  • README.md is the text you're reading, hello.
  • linting.py is a script to run pylint.
  • pyproject.toml governs how this package is set up and installed, including dependencies.

Citation

See the CITATION.cff file as well as the following references.

PyNumDiff Python package:

@article{PyNumDiff2022,
  doi = {10.21105/joss.04078},
  url = {https://doi.org/10.21105/joss.04078},
  year = {2022},
  publisher = {The Open Journal},
  volume = {7},
  number = {71},
  pages = {4078},
  author = {Floris van Breugel and Yuying Liu and Bingni W. Brunton and J. Nathan Kutz},
  title = {PyNumDiff: A Python package for numerical differentiation of noisy time-series data},
  journal = {Journal of Open Source Software}
}

Optimization algorithm:

@article{ParamOptimizationDerivatives2020,
  doi = {10.1109/ACCESS.2020.3034077},
  author = {F. {van Breugel} and J. {Nathan Kutz} and B. W. {Brunton}},
  journal = {IEEE Access},
  title = {Numerical differentiation of noisy data: A unifying multi-objective optimization framework},
  year = {2020}
}

Getting Started

Prerequisites

PyNumDiff requires common packages like numpy, scipy, and matplotlib. For a full list, check the file pyproject.toml.

Select functions need additional packages, though these are not required for a successful install of PyNumDiff:

  • Total Variation Regularization methods: cvxpy
  • Unit tests: pytest

Installing

The code is compatible with Python >= 3.5. It can be installed using pip or directly from the source code. Basic installation options include:

  • From PyPI using pip: pip install pynumdiff.
  • From source using pip git+: pip install git+https://github.com/florisvb/PyNumDiff
  • From local source code: run pip install . from inside this directory.

Call pip install pynumdiff[advanced] to automatically install CVXPY along with PyNumDiff. Note: Some CVXPY solvers require a license, like ECOS and MOSEK. The latter offers a free academic license.

Usage

PyNumDiff uses Sphinx for code documentation; see the generated documentation for more details about API usage.

Code snippets

  • Basic Usage: you provide the parameters
from pynumdiff.submodule import method

x_hat, dxdt_hat = method(x, dt, param1=val1, param2=val2, ...)     
  • Intermediate usage: automated parameter selection through multi-objective optimization
from pynumdiff.optimize import optimize

params, val = optimize(method, x, dt, search_space={'param1':[vals], 'param2':[vals], ...},
                       tvgamma=tvgamma, # hyperparameter, defaults to None if dxdt_truth given
                       dxdt_truth=None) # or give ground truth data, in which case tvgamma unused
print('Optimal parameters: ', params)
x_hat, dxdt_hat = method(x, dt, **params)

If no search_space is given, a default one is used.

  • Advanced usage: automated parameter selection through multi-objective optimization using a user-defined cutoff frequency
import numpy as np

# cutoff_frequency: estimate by (a) counting the number of true peaks per second in the data or (b) inspecting the power spectrum and choosing a cutoff
log_gamma = -1.6*np.log(cutoff_frequency) - 0.71*np.log(dt) - 5.1 # see: https://ieeexplore.ieee.org/abstract/document/9241009
tvgamma = np.exp(log_gamma)

params, val = optimize(method, x, dt, search_space={'param1':[options], 'param2':[options], ...},
                       tvgamma=tvgamma)
print('Optimal parameters: ', params)
x_hat, dxdt_hat = method(x, dt, **params)
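
To build intuition for what tvgamma weighs, the sketch below evaluates a loss of the general shape described in the Introduction: a faithfulness term plus tvgamma times a smoothness term. This is an assumption-laden illustration, not the code in pynumdiff.optimize: the function name tradeoff_loss is hypothetical, faithfulness is taken to be the RMSE between the data and the trapezoid-rule integral of the derivative estimate, smoothness is taken to be the mean absolute change (total variation) of the derivative, and the library's actual normalization may differ.

```python
import math


def tradeoff_loss(x, dxdt_hat, dt, tvgamma):
    """Hypothetical loss (assumed form): RMSE(x, integral of dxdt_hat)
    + tvgamma * total variation of dxdt_hat."""
    n = len(x)

    # Faithfulness: integrate the derivative estimate with the trapezoid
    # rule, then compare the reconstruction against the data.
    x_rec = [0.0]
    for a, b in zip(dxdt_hat[:-1], dxdt_hat[1:]):
        x_rec.append(x_rec[-1] + 0.5 * (a + b) * dt)

    # Integration loses the constant offset; use the best-fit constant.
    offset = sum(xi - ri for xi, ri in zip(x, x_rec)) / n
    rmse = math.sqrt(
        sum((xi - ri - offset) ** 2 for xi, ri in zip(x, x_rec)) / n
    )

    # Smoothness: mean absolute change between consecutive derivative samples.
    tv = sum(abs(b - a) for a, b in zip(dxdt_hat[:-1], dxdt_hat[1:])) / (n - 1)

    return rmse + tvgamma * tv


# Usage: a ramp with slope 2, scored against a smooth and a jittery estimate.
dt = 0.1
x = [1.0 + 2.0 * i * dt for i in range(50)]
smooth_estimate = [2.0] * 50
jittery_estimate = [2.0 + (-1) ** i for i in range(50)]

loss_smooth = tradeoff_loss(x, smooth_estimate, dt, tvgamma=0.1)
loss_jittery = tradeoff_loss(x, jittery_estimate, dt, tvgamma=0.1)
```

Under this assumed form, raising tvgamma penalizes the jittery estimate more heavily, which is consistent with larger values of tvgamma favoring smoother derivatives.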

Notebook examples

We will frequently update simple examples for demo purposes. The currently existing ones are the Jupyter notebooks in the examples/ folder.

Important notes

  • Larger values of tvgamma produce smoother derivatives
  • The value of tvgamma is largely universal across methods, making it easy to compare method results
  • The optimization is not fast. Run it on subsets of your data if you have a lot of data. It will also be much faster with faster differentiation methods, like savgoldiff and butterdiff.
  • The following heuristic works well for choosing tvgamma, where cutoff_frequency is the highest frequency content of the signal in your data, and dt is the timestep: tvgamma=np.exp(-1.6*np.log(cutoff_frequency)-0.71*np.log(dt)-5.1)
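
The heuristic in the last note can be wrapped in a small helper, sketched here with only the standard library. The function name heuristic_tvgamma and the input validation are assumptions added for illustration. The units (cutoff frequency in Hz, dt in seconds) are also assumed, based on the "peaks per second" guidance in the Usage section.

```python
import math


def heuristic_tvgamma(cutoff_frequency, dt):
    """Hypothetical wrapper around the heuristic quoted above.

    cutoff_frequency: highest frequency content of the signal (assumed Hz)
    dt: timestep between samples (assumed seconds)
    """
    if cutoff_frequency <= 0 or dt <= 0:
        raise ValueError("cutoff_frequency and dt must be positive")
    log_gamma = -1.6 * math.log(cutoff_frequency) - 0.71 * math.log(dt) - 5.1
    return math.exp(log_gamma)


# Usage: a slowly varying signal gets a larger tvgamma (smoother derivative)
# than a rapidly varying one sampled at the same rate.
tvgamma_slow = heuristic_tvgamma(cutoff_frequency=0.5, dt=0.01)
tvgamma_fast = heuristic_tvgamma(cutoff_frequency=5.0, dt=0.01)
```

The resulting value can then be passed as the tvgamma argument in the optimize call shown under Usage.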

Running the tests

We use GitHub Actions for continuous integration testing.

To run tests locally, type:

> pytest pynumdiff

Add the flag --plot to see plots of the methods against test functions. Add the flag --bounds to print log error bounds (useful when changing method behavior).

License

This project uses the MIT license. It is 100% open source; feel free to use the code however you like.
