Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bspline.py dependency on old scipy.stats.models #601

Closed
raddy opened this issue Jan 5, 2013 · 4 comments
Closed

bspline.py dependency on old scipy.stats.models #601

raddy opened this issue Jan 5, 2013 · 4 comments

Comments

@raddy
Copy link

raddy commented Jan 5, 2013

Does anyone have a workaround for this dependency? Is the LSQunivariate interpolate function a good proxy?

@josef-pkt
Copy link
Member

There is no direct workaround. The old stats.models bspline was based on a c extension that crashed (segfaulted).
The old code has been left in the sandbox in case someone wants to reuse parts for a new implementation.
I never tried to figure out the details of the old bspline code.

scipy.interpolate splines work (reasonably) well, but they are not designed or don't have wrappers for some of the typical smoothing spline usages in statistics. But I think they can be reused for many cases,

(I started to work on pure python smoothing splines and penalized splines but haven't looked at it in more than a year.)

What's your use case?
Maybe I have some recipes or can tell how easy it is to write them.

@raddy
Copy link
Author

raddy commented Jan 6, 2013

My use case is doing some functional domain analysis for a dynamic volatility-surface model:

Currently, I use the scipy.interpolate splines to do this, but there doesn't seem to be a way to get the correct basis matrix, plus, well there are some issues with some of the interpolate methods.

I have essentially a time series of unevenly spaced observations along the X axis, and at each time step, I'd like to use knots located in a fixed space to spline-smooth over. (So the knots aren't necessarily located at the observation points, but I could maybe make do with uniform spacing)

My thought was to use bayesian filters to control the evolution of the coefficients.

Assuming that all of this performs to my (overly optimistic) expectations, I need to rebuild or at least supervise the reconstruction of this in C++, so I was particularly attracted to the fact that this implementation did not wrap fortran. (As fortran is indecipherable to me)

@josef-pkt
Copy link
Member

My general impression after having worked on splies for a bit

Building a general purpose spline tool box is a lot of work. A lot of it is numerical optimization for a large number of knots, for example linear algebra to take account of the banded structure, and functions to add and drop knots without recalculating everything.
If you need this performance and you finally want to end up in C++, then I would look at C++ libraries, maybe Boost.

What's a rough count for the number of knots that you would be using?
If your number of knots is not too large, then, I think, using specialised code (for banded matrices) doesn't matter so much.

The scipy wrappers have several low level function. You can set your own knots (if you don't do it already) and Chuck posted a recipe how to get the basis functions out of it (although that didn't sounded to be the most efficient way).

The simple (non-optimized) case is actually not difficult: get basis functions then run a Bayesian/Ridge Regression with informative penalization matrix, for example penalize difference between neighboring coefficients.

The more difficult parts:
imposing constraints (endpoint, periodic, and similar) on the splines makes it into a non-linear optimization problem
searching for the location of knots needs efficient updating (adding and dropping knots)

to the last point: from what I have read in the literature there are two main camps for smoothing splines: one emphasises the search for the best location of a smallish number of knots, the other uses a large number of knots and emphasises penalization.

scipy.signal also has uniformly spaced splines, but I don't remember if observations also have to be on a uniform time grid.

I haven't looked at the functional time series case yet.
If you keep the number of knots constant and not too large, then almost all the work will be in updating the (penalized) least squares parameter estimate, especially if you have a long time series. And finding the smoothing/penalization parameter or Bayes prior.

@josef-pkt
Copy link
Member

closing this

there are some interesting enhancements to do, but I don't see anyone trying to rescue the old code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants