Documentation for Generalized Additive Models (GAM) could be improved #7200

Closed
exalate-issue-sync bot opened this issue May 11, 2023 · 8 comments

@exalate-issue-sync

In general, we feel the documentation for some statistical models could be improved, particularly for GAM. Sometimes we have to read the source code to understand what a parameter does, how it relates to the more common names in the literature, or what range of values it accepts.

As a concrete example, the parameter ‘bs’, which selects the spline basis, has a default value of 0, which corresponds to a cubic spline basis. No other values are listed, but the source code (h2o-3/GAMModel.java at jenkins-3.34.0.3 in h2oai/h2o-3 on github.com) gives a few hints about its usage, for example at lines 215 and 216.

As developers ourselves, we understand that this kind of issue can be tricky to fix, but fixing it would vastly improve the usability of these models.

PS: In general, I would say that models that are more traditional in statistics than in data science (e.g. GLM with less typical distributions, GAM, and Cox models, as opposed to Random Forests / Gradient Boosting) are the ones that could use better documentation.

@exalate-issue-sync
Author

Arun Aryasomayajula commented: Any updates [~accountid:5d1185d4f46aa30c271c7cc6] ?

@exalate-issue-sync
Author

hannah.tillman commented: Hey [~accountid:5fa438f822f3990076aa232d] ! I am starting this ticket now 🙂 Currently, I am going through the schemas and python docs for these estimators to find the needed information and gathering it all in a spreadsheet. I’ll mark this ticket as officially “in progress” when I start adding the gathered information to the user guide.

@exalate-issue-sync
Author

Narasimha Durgam commented: Hello [~accountid:5d1185d4f46aa30c271c7cc6]! I think adding more context on the parameters “spline_orders” and “bs” would be helpful.
For example:

The spline_orders parameter specifies the order of the polynomials used in monotone splines. For example, spline_orders=3 means that polynomials of order 3 are used in the splines.

The bs parameter (mentioned in the ticket description) selects the spline type. Accepted values currently range from 0 to 2: 0 for cubic splines, 1 for thin-plate splines, and 2 for monotone splines.

Please feel free to add or edit information as needed. Thank you!
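
To make the mapping above concrete, here is a minimal sketch of how the documented values could be written down as a lookup. The helper `describe_gam_smoother` and the `BS_TYPES` table are hypothetical names used only for illustration and are not part of H2O. The rule that a spline order is rejected unless bs=2 is an assumption drawn from the description above, not confirmed behavior of the library.

```python
# Hypothetical helper, not part of H2O: it only encodes the bs codes
# listed in the comment above (0, 1, 2).
BS_TYPES = {
    0: "cubic spline",
    1: "thin-plate spline",
    2: "monotone spline",
}


def describe_gam_smoother(bs=0, spline_order=None):
    """Return a readable description of one GAM smoother setting."""
    if bs not in BS_TYPES:
        raise ValueError(f"bs must be one of {sorted(BS_TYPES)}, got {bs!r}")
    if spline_order is not None and bs != 2:
        # Assumption: spline_orders is described only for monotone splines,
        # so this sketch rejects it for the other spline types.
        raise ValueError("spline_order only applies when bs=2 (monotone splines)")
    text = BS_TYPES[bs]
    if spline_order is not None:
        text += f" with polynomial order {spline_order}"
    return text


print(describe_gam_smoother())      # default bs=0
print(describe_gam_smoother(2, 3))  # monotone spline, spline_orders=3
```

A table like this in the user guide would let readers see the accepted values without opening GAMModel.java.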

@exalate-issue-sync
Author

Wendy Wong commented: [~accountid:557058:04659f86-fbfe-4d01-90c9-146c34df6ee6] asked me to ask [~accountid:6335b09597148a8301fd22dc] to help with this one.

@exalate-issue-sync
Author

Arun Aryasomayajula commented: [~accountid:5d1185d4f46aa30c271c7cc6] any updates on this JIRA?

@exalate-issue-sync
Author

hannah.tillman commented: [~accountid:5fa438f822f3990076aa232d] New PR currently being reviewed: https://github.com//pull/6422

@h2o-ops-ro
Collaborator

JIRA Issue Details

Jira Issue: PUBDEV-8461
Assignee: amin.sedaghat
Reporter: Arun Aryasomayajula
State: Resolved
Fix Version: 3.42.0.1
Attachments: N/A
Development PRs: Available

@h2o-ops-ro
Collaborator

Linked PRs from JIRA

#6251
#6422
#6724
