functionality for applying a given basis to new data #3

Closed
2 tasks done
fabian-s opened this issue Oct 3, 2019 · 9 comments · Fixed by #78

@fabian-s
Contributor

fabian-s commented Oct 3, 2019

  • tfb_fpc:
    • ... difficult to do exactly -- score computation depends on internals of method (do PACE or simple ridge estimate or just LS etc)
  • tfb_spline
    • ... easy for unpenalized fits
    • terrible otherwise -- use same sp, fitting args etc. probably need to preserve more of that in tfb objects
  • tfb_wavelet
fabian-s added the enhancement and basis representation labels Oct 3, 2019
fabian-s self-assigned this Oct 3, 2019
@fabian-s
Contributor Author

fabian-s commented Oct 8, 2019

first step would probably be to add sp (the smoothing parameter) to the tfb_spline class, similar to score_variance for tfb_fpc in tidyfun/tidyfun#86

fabian-s transferred this issue from tidyfun/tidyfun May 10, 2022
@fabian-s
Contributor Author

common operation that should be easy: apply given FPC (or spline!) basis to NEW data

@fabian-s
Contributor Author

interface:

1a: just write predict methods with first arg the fitted tfb/tfb_fpc, newdata
1b: write basis-extractor and predict methods with first arg the basis info and newdata
2a/b: same as 1a/b, but use tfb(<new data>, old_tfb) or tfb(<new data>, tf_basis(old_tfb)) instead of predict

so: implicit basis extraction step or force users to make it explicit.

  • conceptually weird: tfb-vectors are not models (so predict seems weird), but "contain" a model that was used to create them

  • do we need to separate the modeling/fitting and the representation more clearly? can we boil down all these different estimation methods to "this is a (ridge-penalized & weighted) GLM, these are the smoothing params & variances"? --> unification on some low level in the code, mgcv call would use X from basis constructor and paraPen, likelihood, weights arg from fitting method. (but: how to do hat for the initial fit ?!?)

  • complication: all this also needs to work for all the different fpca.Whatever in the future
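
A rough sketch of the unification idea in the second bullet, under assumptions that are not from the thread: Gaussian likelihood, a design matrix $B$ obtained by evaluating the stored basis at the new curve's argument values, a penalty matrix $P$, a weight matrix $W$, and a smoothing parameter $\lambda$ kept from the original fit. Applying the given basis to a new curve $y_{\text{new}}$ then needs no re-estimation of $\lambda$:

```math
\hat{\theta}_{\text{new}}
  = \arg\min_{\theta}\; (y_{\text{new}} - B\theta)^\top W (y_{\text{new}} - B\theta) + \lambda\, \theta^\top P\, \theta
  = (B^\top W B + \lambda P)^{-1} B^\top W\, y_{\text{new}}
```

For an FPC basis with $W = I$, taking $B = \Phi$ (eigenfunctions on the new grid), $P = \Lambda^{-1}$ (inverse score variances), $\lambda = \sigma^2$ (error variance) and replacing $y_{\text{new}}$ by $y_{\text{new}} - \mu$ gives the usual conditional-expectation (PACE-type) scores, which is why the score variances would have to be stored alongside the basis.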

@jeff-goldsmith
Contributor

seems like a general tfb class that takes a known "basis representation" (including things like tuning parameters) and a corresponding method for estimating coefficients would work. this would work for splines and FPCA-based methods, as long as everything for the coefficient estimation was included in the "basis representation".

there's probably a way for this to go along with the second bullet, although i dunno that we'd want to require that every coefficient estimating approach boil down to mgcv.

one way to make this concrete is to use something like

tfb_vec = tfb(tfd_vec, basis_object)

(which assumes that basis_object implies a coefficient estimating function).

a problem is that right now we use tfb to do spline smoothing when called directly; that creates an implicit basis object, which stays invisible to users -- but we'd be asking users to know it's explicit when applying the basis representation to new data.

two options:

  • write a method that extracts the basis object from a tfb vector created by tfb()
  • separate an initial smoothing from using a known basis representation on new data (e.g. keep tfb() as-is and add tfb_known_rep() or something with a less stupid name).
    • (could we repurpose tf_smooth() to do what tfb() does now, and use tfb() for the new case? or is that just worse all around)?

@fabian-s
Contributor Author

fabian-s commented Jul 15, 2022

  • every tfb needs "basis set up" and "coefficient finding"
  • some basis set up functions will also return penalty params etc for the coef finding (FPCA) some won't.
    in some cases, "basis set up" and "coeff finding" will be iterative (at least when called the first time for new data)
  • coef finding functions can get penalty params etc as inputs, but don't have to. difficulty here: calling coef fitting function the 2nd time to reapply same method to new data will need penalty params from first fit.
    --> try to define very general tfb constructor that just takes these two functions as inputs. users responsible that the 2 mesh. (terrible UX, but high flex) EDIT: no, this works anyway for now - but need cleaner fpc implementations -- for v2.0
    --> define tfb_spline and tfb_fpc constructors with lots of guard rails and easy-to-understand arguments (like we have now)

can we then do away with tfb_spline, tfb_fpc subclasses? EDIT: no
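
The "constructor that just takes these two functions" idea from the bullets above can be sketched as follows. This is a language-neutral illustration in plain Python, not the tf implementation: all names (`Basis`, `BasisRep`, `setup_quadratic`, `find_coefs_ridge`, `make_rep`) are hypothetical, the quadratic basis stands in for a real spline or FPC basis, and the rule used to pick `lam` is an arbitrary placeholder for whatever the first fit would estimate.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Curve = Tuple[List[float], List[float]]  # (arg values, observed values)


@dataclass
class Basis:
    design_row: Callable[[float], List[float]]  # basis functions evaluated at one arg
    params: Dict[str, float]                    # penalty params kept from the first fit


@dataclass
class BasisRep:
    basis: Basis
    coefs: List[List[float]]  # one coefficient vector per curve


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def setup_quadratic(curves: List[Curve]) -> Basis:
    """'Basis set up': quadratic basis plus a data-dependent ridge parameter."""
    values = [v for _, vals in curves for v in vals]
    lam = 0.01 * sum(v * v for v in values) / len(values)  # arbitrary placeholder rule
    return Basis(design_row=lambda t: [1.0, t, t * t], params={"lam": lam})


def find_coefs_ridge(basis: Basis, curve: Curve) -> List[float]:
    """'Coefficient finding': ridge least squares using the stored lam.

    The penalty is applied to all coefficients, including the intercept.
    """
    args, vals = curve
    rows = [basis.design_row(t) for t in args]
    k = len(rows[0])
    lam = basis.params["lam"]
    xtx = [[sum(r[i] * r[j] for r in rows) + (lam if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, vals)) for i in range(k)]
    return solve(xtx, xty)


def make_rep(curves, setup, find_coefs, basis: Optional[Basis] = None) -> BasisRep:
    """General constructor: run the set-up step only if no basis is supplied."""
    if basis is None:
        basis = setup(curves)
    return BasisRep(basis, [find_coefs(basis, c) for c in curves])


grid = [i / 10 for i in range(11)]
old = make_rep([(grid, [1 + 2 * t for t in grid])], setup_quadratic, find_coefs_ridge)
# Re-apply the old basis (including its lam) to new data: no second set-up step.
new = make_rep([(grid, [5 * t * t for t in grid])], setup_quadratic, find_coefs_ridge,
               basis=old.basis)
```

Because `new` reuses `old.basis`, its coefficients are computed with the penalty parameter from the first fit; calling `make_rep` on the new curves without `basis=` would re-run the set-up step and pick a different `lam`. As noted above, the caller remains responsible for passing two functions that fit together.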

fabian-s added this to the put it on CRAN milestone Jan 8, 2024
@fabian-s
Contributor Author

fabian-s commented Feb 18, 2024

implement this in tf_rebase(object, basis_from) : see 3-tfb-basistransfer branch

  • tfd, tfd: just tf_evaluate

  • tfd, tfb_spline: simple constructor call with inherited args, smoothing params

  • tfd, tfb_fpc: more messy but works. not for tfd_irreg though (no good FPCA implementation yet)

  • tfb, tfb:

  • tfb_spline, tfb_fpc

  • tfb_spline, tfb_fpc

  • replace/use in (which) vec_cast methods? -- we could make c() much more tolerant!

  • use to make Ops more type-tolerant? (at least: for tfd_irreg!) see #5 (More careful handling of NA in tf) and #10 (make tfd_irreg operations more tolerant)

  • tfb_fpc will need lots of refactoring based on a proper FPCA implementation. FPCA result needs a predict method, and maybe a "preprocess_data" method. workaround right now is messy and hard to maintain.

@fabian-s
Contributor Author

@jeff-goldsmith
if you have time, would appreciate your thoughts on this new feature tf_rebase (see test-rebase.R for some examples).
does this cover what we need for practical applications / what's missing?
is the documentation clear enough?

@jeff-goldsmith
Contributor

this is awesome! tf_rebase worked like a charm and the help file is clear. Maybe we should add a (possibly contrived) example to the conversion vignette -- something like processing DTI$cca separately for male and female participants, then combining in various ways using tf_rebase?

@fabian-s
Copy link
Contributor Author

great, glad you like it -- see tidyfun/tidyfun#161 for your point about examples for tf_rebase
