Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pass or store smoothed coordinate values #13

Closed
Jacob-Stevens-Haas opened this issue Jun 17, 2022 · 3 comments
Closed

Pass or store smoothed coordinate values #13

Jacob-Stevens-Haas opened this issue Jun 17, 2022 · 3 comments
Assignees

Comments

@Jacob-Stevens-Haas
Copy link
Collaborator

Jacob-Stevens-Haas commented Jun 17, 2022

Some derivative methods smooth the coordinate points while calculating the derivatives, and these smoothed coordinates can be useful for SINDy. I think the list of methods that smooth coordinates are:

  • savitsky golay
  • spline
  • trend-filtered
  • kalman

There's several options for the interface, either

  1. functional: return tuple of x, x_dot from Derivative.compute_for() and Derivative.d(), and possibly dxdt().
  2. object-oriented: Derivative.x set when calling Derivative.compute_for(). Subclasses that do not smooth x should probably still set the value to maintain interface. Could be set in Derivative._global(), but not sure how to deal with savitsky golay, since local methods don't need to compute the derivative for all points.

As I see it, the pros/cons boil down to explicit is better than implicit for the functional pattern vs backwards compatibility for the object-oriented pattern. Setting a new object attribute after initialization is implicit, changing return type is explicit. The object oriented pattern requires derivative objects remain public, and users like pysindy would need a more complicated calling convention.

return dxdt(x, t, axis=0, **self.kwargs)

in favor of

deriv = derivative.methods[kind](**self.kwargs)
x_dot = deriv.d(x, t, axis)
x = deriv.x
return x, x_dot

That said, backwards compatibility is always important. I'm happy to build a PR of either.

Jacob-Stevens-Haas added a commit to Jacob-Stevens-Haas/derivative that referenced this issue Jun 18, 2022
@andgoldschmidt
Copy link
Owner

I agree: x_dot = f(x_est) should be available to users. An explicit return of x_est, x_dot sounds best. An explicit return makes it easier to handle the local methods. Any changes we make to this will force an adjustment to the wrappers insindy_derivative.py in PySINDy.

Looking over the existing methods, I didn't see any case where the derivative was computed from an implicit x_est (that is, a situation where it was computationally or conceptually inefficient to get x_est during or after the computation of x_dot). For example,

  • SavitzkyGolay: return the value from the fit polynomial
  • Spectral: we'd just run the same filtered Fourier transform, but without any multiplication by $ i\omega$.
  • TrendFiltered: we can apply the integration operator to the Lasso solution, i.e. $A u^*$

The takeaway is that whether global or local, we want to compute x_est and x_dot together to avoid repeating calculations.

As for implementation, I think we want to modify the contract for derivative.compute() and derivative.compute_for() so they return x_est, x_dot. Then there are two options:

  1. We could have a separate derivative.d and derivative.x to return the two components. That could be efficient for global methods, but probably not for local methods, where we'd end up repeating calculations.
  2. We have derivative.d return x_est and x_dot.

Looks to me like option 2 is better.

@Jacob-Stevens-Haas
Copy link
Collaborator Author

Jacob-Stevens-Haas commented Jun 20, 2022

DISREGARD - I forgot the arguments to @lru_cache must be hashable, which **kwargs blocks. No it should work. I'll test it out tomorrow

I understand that the wrapper in sindy_derivative.py currently calls dxdt(), which doesn't provide access to the on-the-fly generated Derivative object. In either option 1 or 2 for Derivative.compute()/compute_for(), we can use cacheing to both keep the functional API backwards compatible and avoid recomputation. e.g. (omitting kind is None default case for brevity)

def dxdt(x, t, kind, axis, **kwargs):
    method = gen_method(kind, x, t, axis)
    return method.d(x, t, axis=axis) # if option 2, then method.d(x,t,axis)[1]

# new function
def smooth_x(x, t, kind, axis, **kwargs):
    method = gen_method(kind, x, t, axis)
    return method.x(x, t, axis=axis) # if option 2, then method.d(x,t,axis)[0]

@lru_cache
def gen_method(kind, x, t, axis, **kwargs):
    return methods.get(kind)(**kwargs)

The cache means that calling dxdt() and smooth_x() with the same parameters will result in the same object, preventing recomputation, so long as it's prevented within a Derivative object.

There's probably a pattern/cacheing we could use that keeps Derivative.d backwards compatible as well as keeps from recomputing, lm think about it...

@Jacob-Stevens-Haas
Copy link
Collaborator Author

Ok, so that was much harder than expected, and I ended up implementing not just cacheing for Arrays, but also having to implement reference counting and garbage collection inside the cache. Ended up throwing 90% of it out with a simpler but just as effective solution. The benefit is that we can memoize ALL calls to the expensive methods, e.g. _global()

I'm going to add a draft PR with the changes. This follows option (1) above and doesn't even modify the contract for compute() and compute_for(). It really is about just demonstrating this feature in a fully backwards-compatible way and adding tests for the new feature. I hoped that having tests would help isolate the decision around breaking backwards compatibility.

My thoughts on backwards compatibility: We can certainly modify the contracts for compute() and compute_for(). This could remove some need for internal memoization, but the functional interface will still need it. More generally, the Derivative subclasses are thin classes whose __init__() method really only sets parameters for later calls via the d() method (and now, the x() method). I'd lean towards removing classes in favor of a purely functional interface, but I understand that's a very theoretical argument and I don't have that strong of an opinion. To me, it really only matters in debugging when classes grow too big.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants