Documentation for Generalized Additive Models (GAM) could be improved #7200
Comments
Arun Aryasomayajula commented: Any updates [~accountid:5d1185d4f46aa30c271c7cc6]?
hannah.tillman commented: Hey [~accountid:5fa438f822f3990076aa232d]! I am starting this ticket now 🙂 Currently, I am going through the schemas and Python docs for these estimators to find the needed information and gathering it all in a spreadsheet. I’ll mark this ticket as officially “in progress” when I start adding the gathered information to the user guide.
Narasimha Durgam commented: Hello [~accountid:5d1185d4f46aa30c271c7cc6]! I think adding more context on the parameters “spline_orders” and “bs” would be helpful! The spline_orders parameter specifies the order of the polynomials used in monotone splines. For example, spline_orders=3 means a polynomial of order 3 will be used in the splines. The bs parameter (as suggested in the ticket description) selects the spline type. The acceptable values range from 0 to 2: currently, 0 is for cubic splines, 1 for thin-plate splines, and 2 for monotone splines. Please feel free to add or edit information as needed. Thank you!
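As a rough illustration of the bs codes described in the comment above, here is a minimal sketch in plain Python. It does not call h2o. The helper name `describe_gam_splines` and the lookup table `BS_SPLINE_TYPES` are hypothetical, and treating `bs` and `spline_orders` as lists with one entry per smoothed column is an assumption made for this example.

```python
# Hypothetical helper, NOT part of the h2o package. It only encodes the
# bs codes listed in the comment above (0, 1, 2).
BS_SPLINE_TYPES = {
    0: "cubic splines",
    1: "thin-plate splines",
    2: "monotone splines",
}


def describe_gam_splines(bs, spline_orders=None):
    """Return a readable description for each spline-type code in bs.

    bs            -- list of spline-type codes (assumed: one per smoothed column)
    spline_orders -- optional list of polynomial orders, same length as bs;
                     only reported for monotone splines (bs=2), since the
                     comment ties spline_orders to monotone splines
    """
    if spline_orders is not None and len(spline_orders) != len(bs):
        raise ValueError("bs and spline_orders must have the same length")

    descriptions = []
    for i, code in enumerate(bs):
        if code not in BS_SPLINE_TYPES:
            raise ValueError(f"bs={code} is outside the range 0 to 2")
        text = BS_SPLINE_TYPES[code]
        if code == 2 and spline_orders is not None:
            text += f" (polynomial order {spline_orders[i]})"
        descriptions.append(text)
    return descriptions


if __name__ == "__main__":
    # One cubic spline column and one monotone spline column of order 3.
    for line in describe_gam_splines([0, 2], spline_orders=[3, 3]):
        print(line)
```

A lookup like this could sit next to the parameter description in the user guide so that readers do not have to consult the source code to interpret a bs value.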
Wendy Wong commented: [~accountid:557058:04659f86-fbfe-4d01-90c9-146c34df6ee6] asked me to ask [~accountid:6335b09597148a8301fd22dc] to help with this one.
Arun Aryasomayajula commented: [~accountid:5d1185d4f46aa30c271c7cc6] Any updates on this JIRA?
hannah.tillman commented: [~accountid:5fa438f822f3990076aa232d] New PR currently being reviewed: https://github.com//pull/6422
JIRA Issue Details
Jira Issue: PUBDEV-8461
We feel that, in general, the documentation for some statistical models could be improved, particularly for GAM in this case. Sometimes we need to read the source code to understand what a parameter does, how it relates to the more common names in the literature, or what range of values it accepts.
As a concrete example, the parameter ‘bs’, which sets the basis spline fit, has a default value of 0, which corresponds to a cubic basis spline representation. No other values are listed, but in the source code, h2o-3/GAMModel.java at jenkins-3.34.0.3 · h2oai/h2o-3 (github.com), we can find a few hints about its usage (for example, lines 215 and 216).
As developers ourselves, we understand how this one can be trickier to fix, but it would vastly improve the use of these models.
PS: In general, I would say that models that are more traditional in statistics than in data science (e.g. GLM with less typical distributions, GAM, and Cox models, as opposed to Random Forests or Gradient Boosting) are the ones that could use better documentation.