ENH numdiff vectorized for scalar parameter functions #1783

Closed

josef-pkt opened this issue Jun 24, 2014 · 10 comments

@josef-pkt (Member)

I'm using a vectorized derivative for a scalar parameter and get a diagonal matrix back instead of the 1d vectorized version. Using diag(deriv) as a workaround works.

We need a way to disambiguate how 1d parameters are treated.

There is no standalone example; this came up while debugging the derivatives of the variance and link functions in PR #1781:
josef-pkt@6433db8

@josef-pkt (Member, Author)

import numpy as np
from statsmodels.tools.numdiff import approx_fprime, approx_fprime_cs

def func(x):
    return (x - 1)**2

p = np.linspace(0., 5, 5)
print(approx_fprime(p, func).shape)
print(approx_fprime_cs(p, func).shape)
# (5, 5)
# (5, 5)
# >>> approx_fprime(p, func)
#array([[-2.00000003,  0.        ,  0.        ,  0.        ,  0.        ],
#       [ 0.        ,  0.50000002,  0.        ,  0.        ,  0.        ],
#       [ 0.        ,  0.        ,  3.00000004,  0.        ,  0.        ],
#       [ 0.        ,  0.        ,  0.        ,  5.50000006,  0.        ],
#       [ 0.        ,  0.        ,  0.        ,  0.        ,  8.0000001 ]])
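For reference, the workaround mentioned above, extracting the diagonal, recovers the elementwise derivative (continuing the snippet; the analytic derivative 2 * (x - 1) is shown only for comparison):

deriv = np.diag(approx_fprime(p, func))
print(deriv)
# approximately [-2. ,  0.5,  3. ,  5.5,  8. ], i.e. 2 * (p - 1)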

@josef-pkt (Member, Author)

bump
I'm running into the same issue while testing new transformation functions.

josef-pkt changed the title from "ENH numdiff vectorized for scalar function" to "ENH numdiff vectorized for scalar parameter functions" on Jan 6, 2021
@josef-pkt (Member, Author)

maybe trivial

I copied the approx_fprime code and needed to change n = len(x) to n = x.shape[-1].
This works for the example:

approx_fprime(p, func)
array([[-2.00000003,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.50000002,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  3.00000004,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  5.50000006,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  8.0000001 ]])
approx_fprime_scalar(p[:,None], func)
array([-2.00000003,  0.49999997,  2.99999982,  5.49999952,  8.00000191])

The unit tests in test_numdiff almost exclusively check the correctness of numbers; there is one type test to allow for a list.
AFAICS, all cases are for one function w.r.t. one vector parameter.
From looking at the test code, I don't see whether there are any cases with vector returns of the objective function.
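A hypothetical test sketch for the scalar-parameter case (not one of the existing test_numdiff cases), checking against the analytic derivative:

import numpy as np
from numpy.testing import assert_allclose
from statsmodels.tools.numdiff import approx_fprime

def test_scalar_param_vectorized():
    # current behavior: the result is a diagonal matrix, so compare its diagonal
    p = np.linspace(0., 5, 5)
    deriv = np.diag(approx_fprime(p, lambda x: (x - 1)**2))
    assert_allclose(deriv, 2 * (p - 1), rtol=1e-6)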

@josef-pkt (Member, Author)

Making the change is not backwards compatible, but the case is not mentioned in the docs, so it should count as a bugfix.
The docstring implies that we have only one params vector.
The only user of the vectorized params case might be our genmod Link classes.

Behavior for a vector return of the objective function is well defined for 1-d and 2-d returns.

From the Notes section of approx_fprime:

If f returns a 1d array, it returns a Jacobian. If a 2d array is returned
    by f (e.g., with a value for each observation), it returns a 3d array
    with the Jacobian of each observation with shape xk x nobs x xk. I.e.,
    the Jacobian of the first observation would be [:, 0, :]

The new case is that the function has only a single parameter, i.e. R -> R or R -> R^k, but we can evaluate it at many different params. The vectorized params need to be a column array.

A 2-d return by the function also seems to work, but params needs extra dimensions to broadcast directly. Broadcasting needs to be handled in the user function, so that can get messy.
A 2-d function return looks too tricky to support and explain in general.

E.g., vectorized over params p for a function that returns results corresponding to the shape of a; func broadcasts between params p and "data" a (a hypothetical func with this behavior is sketched after the examples below).

The caveat is that the results are transposed.

res = func(p[:,None,None] *0.1, a=(np.arange(1, 7).reshape(1, -1, 2)))
res.shape
(6, 3, 2)

approx_fprime_scalar(p[:,None, None] *0.1, func, args=(np.arange(1, 7).reshape(1, -1, 2),)).shape
(2, 3, 6)

scalar return

1-dim params keep the current behavior with the diag result:

approx_fprime_scalar(p, func, args=(1,)).shape
(6, 6)

with 2-dim column params, we get the new behavior, the derivative at each point of params:

func(p, a=1).shape
(6,)
approx_fprime_scalar(p[:,None], func, args=(1,)).shape
(6,)

1-dim return by function

the vectorized params need to broadcast within the function, and need p.shape[-1] == 1 in the p-vectorized fprime usage:

func(p[:,None], a=(np.arange(1, 5))).shape
(6, 4)

approx_fprime_scalar(p[:,None], func, args=(np.arange(1, 5),)).shape
(4, 6)
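For reference, a hypothetical function with the broadcasting behavior used in the snippets above could look like the following (an assumption: the actual func from the session is not shown, and p is assumed to have 6 elements here):

import numpy as np

def func(x, a=1):
    return a * (x - 1)**2

p = np.linspace(0., 5, 6)

print(func(p, a=1).shape)                          # (6,): scalar "data"
print(func(p[:, None], a=np.arange(1, 5)).shape)   # (6, 4): 1-d data, params as a column
print(func(p[:, None, None] * 0.1,
           a=np.arange(1, 7).reshape(1, -1, 2)).shape)   # (6, 3, 2): 2-d data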

@josef-pkt (Member, Author)

Timing: a huge speedup, but more importantly, the old diag version uses (len(p), len(p)) sized arrays, with len(p) = nobs in our applications.

The loop is also slow because the example function does little work and loop overhead dominates.

It looks like params needs to be an atleast_1d array in approx_fprime.

p_ = np.linspace(0, 5, 1000)
%timeit approx_fprime_scalar(p_[:,None], func, args=(1,)).shape
26.6 µs ± 1.16 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit approx_fprime(p_, func, args=(1,)).shape
13.9 ms ± 253 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

# loop in old version
%timeit np.asarray([approx_fprime(np.array([pi]), func, args=(1,)) for pi in p_]).shape
15.6 ms ± 256 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
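To put the memory point above in numbers, an illustrative sketch (arithmetic only, not a statsmodels API):

import numpy as np

nobs = 1000
print(np.zeros((nobs, nobs)).nbytes / 1e6)   # 8.0 MB for the (nobs, nobs) diag-style result
print(np.zeros(nobs).nbytes / 1e3)           # 8.0 kB for the 1-d vectorized result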

@josef-pkt (Member, Author)

I haven't tried approx_fprime_cs, but it looks like the same change from len(x) to x.shape[-1] should work.

Note
Using the .shape attribute requires numpy and is more restrictive than len.
So there might still be problems with this fix for non-numpy array parameters; I didn't check.
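A minimal illustration of the difference, and of the atleast_1d coercion mentioned above:

import numpy as np

x = [0.0, 1.0, 2.0]               # a plain list has len() but no .shape
x = np.atleast_1d(np.asarray(x))  # coerce so that x.shape[-1] is available
print(len(x), x.shape[-1])        # both are 3 for a 1-d array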

(I'm back to other parts)

@josef-pkt (Member, Author)

Correction to backwards compatibility:
Our cases in the links cdflink code using np.diag(approx_fprime.. do not break.
If the parameter array is 1-dim, then there is no change because len(x) == x.shape[-1] for 1-dim arrays.

Consequently, we do not have failing unit test cases to find the places that need refactoring.

The change will be included in the cdflink PR #7287.

@josef-pkt (Member, Author)

approx_fprime_cs

Changing only len(x) is not enough.
get_epsilon uses an elementwise maximum: h = EPS**(1. / s) * np.maximum(np.abs(x), 0.1)

Trying to fix it so that it also handles the new case gets messy. Instead, I special-case the code for the vectorized scalar parameter case with if x.ndim == 2 and n == 1:.

This now has epsilon varying by row, depending on the parameter.

The simple fix to approx_fprime does not adjust the parameter-specific epsilon correctly; it now just takes the epsilon of the first element of the vectorized parameter.

My new t(1)/cauchy test in #7287 fails and has less precision, a tol around 1e-5 instead of 1e-8, using the vectorized approx_fprime compared to the diag version.
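A sketch of a per-row step size following the get_epsilon formula quoted above (the exponent s = 2, i.e. roughly sqrt(EPS) for forward differences, is an assumption here):

import numpy as np

EPS = np.finfo(float).eps
x = np.linspace(0., 5, 6)[:, None]               # vectorized scalar params as a column, shape (6, 1)
s = 2.0                                          # assumed exponent for forward differences
h = EPS**(1. / s) * np.maximum(np.abs(x), 0.1)
print(h.shape)                                   # (6, 1): one epsilon per row / parameter value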

@josef-pkt (Member, Author)

I didn't refactor approx_fprime further. I created a new function _approx_fprime_scalar that only handles the scalar parameter case, but in vectorized form. It also treats 1-dim params as a vectorized scalar param, not as a single vector parameter the way approx_fprime does.

More generalizations are left for the future when or if we need them.

The t(1)/cauchy cdflink test passes again with epsilon scaled by the individual parameters.
approx_fprime_cs cannot be used for the cdflinks because much of scipy.stats.distributions does not support complex numbers.
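A minimal sketch of the idea behind _approx_fprime_scalar, a forward difference vectorized over scalar parameter values (simplified and hypothetical, not the actual statsmodels implementation; the function name and default step are illustrative):

import numpy as np

def fprime_scalar_sketch(x, f, epsilon=None, args=(), kwargs=None):
    # treat each element of x as a separate scalar parameter value
    kwargs = {} if kwargs is None else kwargs
    x = np.atleast_1d(np.asarray(x, dtype=float))
    if epsilon is None:
        # per-parameter step, varying by row as discussed above
        epsilon = np.sqrt(np.finfo(float).eps) * np.maximum(np.abs(x), 0.1)
    f0 = f(*((x,) + args), **kwargs)
    return (f(*((x + epsilon,) + args), **kwargs) - f0) / epsilon

def func(x, a=1):
    return a * (x - 1)**2

p = np.linspace(0., 5, 6)
print(fprime_scalar_sketch(p, func, args=(1,)))   # approximately 2 * (p - 1)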

@josef-pkt (Member, Author)

Closing,
the scalar param case has been added in #7287.

The general 2-d params case is open in #7680.
