Provide StatsAPI interface for regression #171

Open
wildart opened this issue Jan 17, 2022 · 6 comments

@wildart
Collaborator

wildart commented Jan 17, 2022

Currently, regression algorithms are implemented as stand-alone functions, while other methods use the StatsAPI interface, i.e. fit/predict.

We should have types properly derived from StatsAPI.RegressionModel, with the corresponding interface implemented for the various regression algorithms.

@kescobo
Contributor

kescobo commented Jan 18, 2022

Responding to the question in #109 - @wildart I'd be happy to take a stab at it, if there's a well-defined API / clear instructions for implementation. I'm afraid I'm not that familiar with most of the methods in this package or with StatsAPI, but if there's a regular structure, I can probably figure it out.

@wildart
Collaborator Author

wildart commented Jan 18, 2022

Basically, every algorithm in this package has a fit method for building a model (see StatisticalModel) and a predict method for predicting the response of a model (see RegressionModel). These two methods are the bare minimum required for a regression implementation. The rest of the interface can be approached later.

So, a type derived from RegressionModel needs to be defined to hold the model parameters, i.e. the regression coefficients. The fit method would call the existing implementation, e.g. ridge, and construct an object of the model type. The predict method should form a prediction given the model parameters. You can look at other algorithms' implementations for guidance, e.g. PCA.
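A minimal sketch of that description, wrapping the existing ridge function, might look like the following. The type name RidgeModel, its fields, and the positional r argument on fit are assumptions made for illustration, not an agreed design for this package; observations are assumed to be in rows of X.

```julia
import StatsAPI: RegressionModel, fit, predict
using MultivariateStats: ridge

# Hypothetical model type: the name and field layout are illustrative only.
struct RidgeModel{T<:Real} <: RegressionModel
    coefs::Vector{T}   # one slope coefficient per feature (column of X)
    bias::T            # intercept term
end

# fit delegates to the existing stand-alone `ridge` function.
# X has observations in rows; r is the regularization strength.
function fit(::Type{RidgeModel}, X::AbstractMatrix{T}, y::AbstractVector{T},
             r::Real) where {T<:Real}
    sol = ridge(X, y, r; bias=true)   # solution is [coefficients; intercept]
    return RidgeModel(sol[1:end-1], sol[end])
end

# predict applies the stored parameters to new observations (rows of X).
predict(m::RidgeModel, X::AbstractMatrix) = X * m.coefs .+ m.bias
```

With these two methods defined, `fit(RidgeModel, X, y, 0.1)` returns a model object and `predict(model, Xnew)` returns the predicted responses.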

@kescobo
Contributor

kescobo commented Jan 18, 2022

That makes sense. As I said, I'm happy to take a stab, though realistically it's unlikely to be in the next week or two - I'm teaching this semester and need to get a lot more prep done. If there's not a rush on it, I can definitely tackle it by ~mid February.

@wildart
Collaborator Author

wildart commented Jan 19, 2022

Any help is appreciated at any time.

@kescobo
Contributor

kescobo commented Feb 7, 2022

Looking at this a bit more closely today, I do not think I'm the right person for this job, sorry! I feel like if I had a strong handle on the package interface OR the statistical methods, I could use one thing to reason about the other. But being a novice on both, even using your hints above, I'm not sure how to get started :-(

@wildart
Collaborator Author

wildart commented Feb 9, 2022

For a minimal implementation, you would need to:

  1. Define a type for a regression model, derived from StatsAPI.RegressionModel, e.g. OLS, that will hold the coefficients of the OLS regression model.
  2. Define a fit function that accepts three parameters: the OLS type, the independent variable x, and the dependent variable y. This function will execute llsq, which calculates the regression model parameters, and return an OLS object.
  3. Define a predict function; see its description here: https://github.com/JuliaStats/StatsAPI.jl/blob/00ce15f034e7ffdf16ec988766246755fcab47c4/src/regressionmodel.jl#L74-L81

The data parameters should be of AbstractMatrix or AbstractVector types. You may want to include a generic placeholder for keyword arguments to pass parameters through to the llsq call.

The rest of the methods for RegressionModel are optional at this point. Feel free to implement any of them. See any method implementation in this repo: MDS, PCA, etc.
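The three steps above could be sketched roughly as follows. The field names, the decision to store the bias flag on the model, and the reshape-based handling of vector input are assumptions for illustration only; observations are assumed to be in rows, and any extra keyword arguments are simply forwarded to llsq.

```julia
import StatsAPI: RegressionModel, fit, predict
using MultivariateStats: llsq

# Step 1: hypothetical OLS model type (field layout is an assumption).
struct OLS{T<:Real} <: RegressionModel
    coefs::Vector{T}   # as returned by llsq: [slopes...; intercept] when bias=true
    bias::Bool         # whether the last entry of coefs is an intercept
end

# Treat a vector of observations as a one-column matrix.
asmatrix(x::AbstractVecOrMat) = reshape(x, size(x, 1), :)

# Step 2: fit takes the model type, x and y, and passes kwargs through to llsq.
function fit(::Type{OLS}, x::AbstractVecOrMat{T}, y::AbstractVector{T};
             bias::Bool=true, kwargs...) where {T<:Real}
    return OLS(llsq(asmatrix(x), y; bias=bias, kwargs...), bias)
end

# Step 3: predict the response for new observations.
function predict(m::OLS, x::AbstractVecOrMat)
    X = asmatrix(x)
    return m.bias ? X * m.coefs[1:end-1] .+ m.coefs[end] : X * m.coefs
end
```

For example, `m = fit(OLS, x, y)` followed by `predict(m, xnew)` fits with an intercept, while `fit(OLS, x, y; bias=false)` fits through the origin.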
