ENH: interpolators, smoothers and splines #2361

Open
josef-pkt opened this issue Apr 15, 2015 · 1 comment

@josef-pkt
Member

Models with only a few basis functions (few enough to keep memory consumption manageable) can be estimated and used through our regular models, or by working with the (nobs, nobs) "kernel" matrix.

Examples:
Kernel Ridge Regression in sandbox
scikit smooth
http://mail.scipy.org/pipermail/scipy-dev/2015-April/020617.html with link to code on scipy-central
(has a nice class structure that could be used in a similar way for subclassing with choice of basis function)
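
To illustrate the two routes mentioned at the top (regular least squares on a small basis versus the (nobs, nobs) "kernel" matrix), here is a minimal NumPy-only sketch. The cubic polynomial basis, the sample size and the simulated data are assumptions chosen for illustration; they are not taken from any of the linked projects.

```python
import numpy as np

# Hypothetical setup: 50 observations and a 4-column cubic polynomial basis.
nobs = 50
x = np.linspace(0, 1, nobs)
rng = np.random.default_rng(0)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(nobs)

# Small basis: a (nobs, 4) design matrix that is cheap to store.
basis = np.vander(x, 4, increasing=True)

# Route 1: ordinary least squares on the basis columns.
params, *_ = np.linalg.lstsq(basis, y, rcond=None)
fitted_ols = basis @ params

# Route 2: the (nobs, nobs) "kernel" (hat) matrix maps y directly to fitted values.
hat = basis @ np.linalg.pinv(basis)
fitted_hat = hat @ y

print(hat.shape)                            # (50, 50)
print(np.allclose(fitted_ols, fitted_hat))  # True
```

The trace of `hat` equals the number of basis columns here, which is one way to read off the effective degrees of freedom of such a linear smoother.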

Creating a projection matrix based on splines.
Use case (Matthews on scipy-dev): a single 1d exog x, but many endog y.

patsy's B-splines don't handle boundaries properly: there are problems with extrapolation. That will need more work.
(Also, using the B-spline in a formula with OLS works fine for fitting, but I got an exception in predict. The traceback will follow in the next comment.)

>>> import numpy as np
>>> import patsy
>>> import matplotlib.pyplot as plt
>>> spb2 = patsy.dmatrix("bs(x, df=10, degree=3, include_intercept=True) - 1", {'x': np.arange(0,10, 0.25)})
>>> project = spb2.dot(np.linalg.pinv(spb2[::4]))
>>> y = np.sin(np.linspace(-np.pi, np.pi, 10))
>>> y2 = np.sin(2*np.linspace(-np.pi, np.pi, 10))
>>> y3 = np.sin(1.5*np.linspace(-np.pi, np.pi, 10))
>>> pred = project.dot(np.column_stack((y,y2,y3)))
>>> plt.plot(np.arange(0,10, 0.25)[::4], np.column_stack((y,y2,y3)), "o")
[<matplotlib.lines.Line2D object at 0x000000000D032D30>, <matplotlib.lines.Line2D object at 0x000000000D032B38>, <matplotlib.lines.Line2D object at 0x000000000D032438>]
>>> plt.plot(np.arange(0,10, 0.25), pred, "+-")
[<matplotlib.lines.Line2D object at 0x000000000D080D68>, <matplotlib.lines.Line2D object at 0x000000000D080518>, <matplotlib.lines.Line2D object at 0x000000000D083080>]
>>> plt.ylim(-1.2, 1.2)
(-1.2, 1.2)
>>> plt.show()
@josef-pkt
Member Author

Here is the traceback for predict with the patsy formula B-spline (no clean example yet):

>>> from statsmodels.regression.linear_model import OLS
>>> y = np.sin(np.linspace(-np.pi, np.pi, 10))
>>> resf = OLS.from_formula("y ~ bs(x, df=10, degree=3, include_intercept=True) - 1", {'x': np.arange(10)}).fit()
>>> predictedf = resf.predict({'x': np.arange(0.25,10, 0.25)})
Traceback (most recent call last):
  File "C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\patsy\compat.py", line 117, in call_and_wrap_exc
    return f(*args, **kwargs)
  File "C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\patsy\eval.py", line 127, in eval
    + self._namespaces))
  File "<string>", line 1, in <module>
  File "C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\patsy\splines.py", line 239, in transform
    basis = _eval_bspline_basis(x, self._all_knots, self._degree)
  File "C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\patsy\splines.py", line 39, in _eval_bspline_basis
    raise NotImplementedError("some data points fall outside the "
NotImplementedError: some data points fall outside the outermost knots, and I'm not sure how to handle them. (Patches accepted!)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<pyshell#202>", line 1, in <module>
    predictedf = resf.predict({'x': np.arange(0.25,10, 0.25)})
  File "m:\josef_new\eclipse_ws\statsmodels\statsmodels_py34\statsmodels\base\model.py", line 745, in predict
    exog)
  File "C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\patsy\highlevel.py", line 278, in dmatrix
    NA_action, return_type)
  File "C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\patsy\highlevel.py", line 156, in _do_highlevel_design
    return_type=return_type)
  File "C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\patsy\build.py", line 947, in build_design_matrices
    value, is_NA = evaluator.eval(data, NA_action)
  File "C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\patsy\build.py", line 71, in eval
    result = self.factor.eval(self._state, data)
  File "C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\patsy\eval.py", line 485, in eval
    return self._eval(memorize_state["eval_code"], memorize_state, data)
  File "C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\patsy\eval.py", line 468, in _eval
    code, inner_namespace=inner_namespace)
  File "C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\patsy\compat.py", line 124, in call_and_wrap_exc
    exec("raise new_exc from e")
  File "<string>", line 1, in <module>
patsy.PatsyError: Error evaluating factor: NotImplementedError: some data points fall outside the outermost knots, and I'm not sure how to handle them. (Patches accepted!)
    y ~ bs(x, df=10, degree=3, include_intercept=True) - 1
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
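
The exception above occurs because the basis was built from x in [0, 9] while the prediction points run up to 9.75, beyond the outermost knots. One possible workaround, assuming the prediction range is known when the basis is built, is to pass explicit `lower_bound` and `upper_bound` to patsy's `bs`. The sketch below uses plain least squares instead of `OLS.from_formula` to keep it short; the bounds 0 and 10 are an assumption chosen to cover the prediction points.

```python
import numpy as np
import patsy

x_fit = np.arange(10.0)              # fitting points, range [0, 9]
x_new = np.arange(0.25, 10, 0.25)    # prediction points, up to 9.75
y = np.sin(np.linspace(-np.pi, np.pi, 10))

# Assumption: the range [0, 10] is fixed up front so that all
# prediction points lie inside the outermost knots.
formula = ("bs(x, df=10, degree=3, include_intercept=True, "
           "lower_bound=0, upper_bound=10) - 1")
basis_fit = patsy.dmatrix(formula, {"x": x_fit})

# Reuse the stored knots to evaluate the same basis at the new points.
(basis_new,) = patsy.build_design_matrices(
    [basis_fit.design_info], {"x": x_new})

# Plain least squares on the basis columns, then predict on the new grid.
params, *_ = np.linalg.lstsq(np.asarray(basis_fit), y, rcond=None)
predicted = np.asarray(basis_new) @ params
print(predicted.shape)
```

Predictions between 9 and 9.75 still lie outside the fitted data, so this only avoids the exception; it does not settle how extrapolation should behave.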
