Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update of the GLM models, OLS+Binary #31

Merged
merged 11 commits into from
Aug 26, 2019
Merged

Update of the GLM models, OLS+Binary #31

merged 11 commits into from
Aug 26, 2019

Conversation

tlienart
Copy link
Collaborator

@tlienart tlienart commented Aug 13, 2019

  • Prob OLS regression
  • Prob Binary classification (probit, logit, ...)

Future

(will be another pr)

Notes

  • decode is now used properly, thanks for the feedback
  • MLJBase 0.4 seems to require different metadata, (scitype_union -> scitype and no input_multivariate etc) so I added a few fixes for those; given that 0.4 has not been merged but will soon, I've added a statement that checks the version and acts accordingly; it'll be easy to remove later on.
    • this incurred a few missed commits where things were passing locally but not on Travis, I'll bring this up in the next call, it mostly seems to be a versioning issue.

@tlienart tlienart marked this pull request as ready for review August 19, 2019 09:44
@tlienart tlienart requested a review from ablaom August 19, 2019 09:44
@tlienart
Copy link
Collaborator Author

I'll wait until you've pushed your changes corresponding to 0.4; then will merge master here, cleanup and re-submit :)

@tlienart tlienart removed the request for review from ablaom August 19, 2019 20:52
@tlienart
Copy link
Collaborator Author

tlienart commented Aug 22, 2019

Ok it passes on Julia 1 and 1.1 on Travis. It fails on linux-nightly apparently due to an issue with conda installation as far as I can tell but works on mac-nightly so I reckon it can be ignored.

The PR should be ok for reviewing @ablaom

@tlienart tlienart requested a review from ablaom August 22, 2019 20:43
src/GLM.jl Outdated Show resolved Hide resolved
src/GLM.jl Show resolved Hide resolved
Copy link
Member

@ablaom ablaom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really great. My annotations are just editorial.

I see that the old OLS, which was deterministic is gone. I guess this is because you can just use NormalRegressor with predict_mean?

Since we are changing names anyway, I should prefer to insert "Linear" in all the names. I mean, we have "DecisionTreeRegressor", and so forth. In the context of 50 odd models "NormalRegressor" is a bit mysterious. So LinearNormalRegressor, LinearBinaryClassfier, and so forth. ??

Can I suggest adding "Linear" to the names to distinguish Regressors and Classifiers from other Regressors and Classifiers (eg, DecisionTreeRegressor/Classifier). Also, I suggest the name reflect the scitype rather than the distribution being fit, as one model might fit to multiple distributions but we are generally having a separate model for different target scitpyes). Here are my suggestions:

LinearRegressor (for what is now NormalRegressor)
LinearBinaryClassifier as now
LinearCountRegressor (instead of PoissonRegressor)
LinearMulticlassRegressor (for targets with possibly more than one level, unordered)

Please don't use "Multivariate" for multi-level targets as I expect this would cause a lot of confusion. I expect "Multivariate" would normally refer to more than one feature (in case of inputs - the usual case) or more than one target column, in the case of targets. At, least that is how the term is used in documentation.

These are my suggestions. Go ahead and do what you feel best. This is ready to pull. Before, registering and tagging a new release, we might want to wait for the Count and Multiclass, yes?

@tlienart
Copy link
Collaborator Author

tlienart commented Aug 26, 2019

Ok, I've implemented your suggestions; will wait for tests to pass and merge.

Re Count, I'll do that in another PR;

Re Multiclass, is that supported now? Could you point me to the equivalent of MultivariateFinite? thanks!
Edit: oh ok there's coerce(Multiclass, y), I'll try using that

@tlienart tlienart merged commit a79056a into master Aug 26, 2019
@ablaom
Copy link
Member

ablaom commented Aug 26, 2019

Nevermind, about MutlivariateFinite; I had forgot our earlier discussion.

I'm puzzled why you have moved LIBSVM from optional dependency to actual dependency.

@tlienart tlienart deleted the glm_2 branch September 10, 2019 13:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants