Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#8 [3/4]: Implement model fitting with mgcv #20

Merged
merged 24 commits into from
Aug 29, 2024

Conversation

zsusswein
Copy link
Collaborator

@zsusswein zsusswein commented Jun 30, 2024

Important

This PR a trial for using a stacked PR workflow. Do not merge.

Extends #18 to fit a GAM using the supplied formula and data. There are two implemented backends, mgcv::gam() and mgcv::bam(). Dispatch to the appropriate backend is handled via an internal S3 class -- the methods are exported but the generic is private so the methods are still internal to the package. Decent default arguments are supplied, with optional user-supplied arguments passed via a list.

There's also a new warning for legal but unwise parameterization.

Copy link

codecov bot commented Jun 30, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (5ee9a2c) to head (0283df9).

Additional details and impacted files
@@            Coverage Diff             @@
##              main       #20    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files            3         6     +3     
  Lines           87       268   +181     
==========================================
+ Hits            87       268   +181     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

R/fit_model.R Outdated Show resolved Hide resolved
R/fit_model.R Outdated Show resolved Hide resolved
R/fit_model.R Outdated Show resolved Hide resolved
R/fit_model.R Outdated Show resolved Hide resolved
R/formula.R Outdated Show resolved Hide resolved
R/formula.R Show resolved Hide resolved
Copy link
Collaborator

@seabbs seabbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks pretty good so far.

Having everything based on main makes it quite a pain. Are you just doing this to get the CI? Why not just update the CI to trigger on PRs against branches that aren't main.

My only concern here is about how opinionated this is getting - i.e. very. We might want that and in that case its fine but I do think we should be taking choices here that still allow people to use it for other things even if the default behaviour is very opinionated.

My other concern is there are a few notes about implementing other backends etc but what is coded here doesn't really seem set up for success there (unless success is lots of internal character string based switches which I assume it isn't). Is there an issue scoping out how extensions would be implemented?

@zsusswein
Copy link
Collaborator Author

Having everything based on main makes it quite a pain. Are you just doing this to get the CI? Why not just update the CI to trigger on PRs against branches that aren't main.

Yep this is just for CI. I'll switch the base before opening the PR. I think before updating CI rules we want to check in on whether the whole stacked PR thing is working.

My only concern here is about how opinionated this is getting - i.e. very. We might want that and in that case its fine but I do think we should be taking choices here that still allow people to use it for other things even if the default behaviour is very opinionated.

Agreed -- I think the push and pull here is really helpful. I don't want to be overly rigid with my view of the target user and target use-case, but also I want to avoid stepping into the territory of EpiNow2/Epinowcast/EpiAware. Those are closer to custom model foundries than I think this package should be. That doesn't we should prevent any/all customization (and the current implementation is likely too constraining), but I do want to hear a convincing use-case for e.g. custom model formula and arguments. What would the user flow be? Who is this user? Does this package provide enough benefit over handrolling in mgcv or jumping right to EpiNow2 for us to optimize for their use-case?

My other concern is there are a few notes about implementing other backends etc but what is coded here doesn't really seem set up for success there (unless success is lots of internal character string based switches which I assume it isn't). Is there an issue scoping out how extensions would be implemented?

The current flow is here is that RtGam() fits the model and is the class constructor. I'd strongly prefer we avoid the public data constructor -> public model constructor flow, but perhaps an internal S3 class that switches would be possible? Issue to discuss is #21

@seabbs
Copy link
Collaborator

seabbs commented Jul 2, 2024

I think before updating CI rules we want to check in on whether the whole stacked PR thing is working.

Agree but from a review perspective the current setup of having to wade through all the old commits is very bad which makes it hard to know if stacked PRs are working

Those are closer to custom model foundries than I think this package should be. That doesn't we should prevent any/all customization (and the current implementation is likely too constraining), but I do want to hear a convincing use-case for e.g. custom model formula and arguments. What would the user flow be? Who is this user? Does this package provide enough benefit over handrolling in mgcv or jumping right to EpiNow2 for us to optimize for their use-case?

Yes I agree. I think the main user is NNH right? The question is do we want to ever on the fly do things like use a Poisson observation model because something isn't working or try some settings tweaks in bam because again something is a problem.

If you have to leave the package to make any of these tiny tweaks you are using all the benefits of the good defaults and the nice pre and post-processing

constructor flow, but perhaps an internal S3 class that switches would be possible? Issue to discuss is #21

@zsusswein
Copy link
Collaborator Author

Sure, but this PR isn't open for review yet as I note at the top?

@seabbs
Copy link
Collaborator

seabbs commented Jul 2, 2024

Sure, but this PR isn't open for review yet as I note at the top?

Fair enough. @seabbs out for now then!

@zsusswein zsusswein changed the base branch from main to 08-model-fit-helpers July 6, 2024 03:26
@zsusswein zsusswein marked this pull request as ready for review July 6, 2024 03:40
@zsusswein zsusswein requested a review from kgostic as a code owner July 6, 2024 03:40
@zsusswein zsusswein requested a review from seabbs July 6, 2024 03:40
R/RtGam.R Outdated Show resolved Hide resolved
R/fit_model.R Outdated Show resolved Hide resolved
R/fit_model.R Outdated Show resolved Hide resolved
Copy link
Collaborator

@seabbs seabbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. It looks like the conclusion of #21 is to allow for them in the future which I think makes sense.

A few points for discussion.

  • ... vs a list + does this allow actual default args to fit_model vs an internal list?
  • Having the class check be in fit_model.default vs when preparing the class. Doing so allows for user extension at the cost of control and potentially some opaque errors (though not at the moment).

@zsusswein
Copy link
Collaborator Author

... vs a list + does this allow actual default args to fit_model vs an internal list?

Addressing here

Having the class check be in fit_model.default vs when preparing the class. Doing so allows for user extension at the cost of control and potentially some opaque errors (though not at the moment).

I think the counterpoint to consider is what to do if/when we need to move to S3 for other steps (I'm thinking predict and maybe formula would be needed). Centralizing the check for backends into one place adds clarity I'd think, at the cost of extensibility as you noted.

@seabbs
Copy link
Collaborator

seabbs commented Jul 8, 2024

(I'm thinking predict and maybe formula would be needed)

If you did do this and backends didn't have methods these would by default throw an informative error (but not a incredibly pretty one) if you just had no default method

It makes documentation slightly more straightforward
@zsusswein zsusswein requested a review from seabbs July 10, 2024 18:25
Copy link
Collaborator

@seabbs seabbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looking good a few points to clear up about method dispatch etc. Can you summarise some of your responses/decisions to prior comments? I have a few times just flagged the same thing and looking back at your answers I am not clear that the decisions have all been justified.

R/RtGam.R Outdated Show resolved Hide resolved
R/fit_model.R Outdated Show resolved Hide resolved
R/fit_model.R Outdated Show resolved Hide resolved
R/formula.R Show resolved Hide resolved
Copy link
Collaborator

@kgostic kgostic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No major comments. Documentation is pretty verbose, but I think that's probably a good thing. My only comment is that if you're writing that much, might be a good idea to refer back to a reference in case people want even more, or in case the methods/best practices evolve.

R/formula.R Show resolved Hide resolved
As suggested by @seabbs. This allows ... to be evaluated and handles
dispatch.
Copy link
Collaborator

@seabbs seabbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

zsusswein and others added 11 commits August 29, 2024 10:24
Has basic info but still needs model fit diagnostics
Co-authored-by: Sam Abbott <contact@samabbott.co.uk>
By pointing the public function to the slot with the stored diagnostic
list. Re-order `RtGam()` to use the public function for diagnostic
reporting.
@zsusswein zsusswein merged commit eed7b48 into 08-model-fit-helpers Aug 29, 2024
1 of 2 checks passed
@zsusswein zsusswein deleted the 08-fit-model branch August 29, 2024 14:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants