-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update of the GLM models, OLS+Binary #31
Conversation
* proper use of decode git commit -am
I'll wait until you've pushed your changes corresponding to 0.4; then will merge master here, cleanup and re-submit :) |
Ok it passes on Julia 1 and 1.1 on Travis. It fails on linux-nightly apparently due to an issue with conda installation as far as I can tell but works on mac-nightly so I reckon it can be ignored. The PR should be ok for reviewing @ablaom |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really great. My annotations are just editorial.
I see that the old OLS, which was deterministic is gone. I guess this is because you can just use NormalRegressor with predict_mean?
Since we are changing names anyway, I should prefer to insert "Linear" in all the names. I mean, we have "DecisionTreeRegressor", and so forth. In the context of 50 odd models "NormalRegressor" is a bit mysterious. So LinearNormalRegressor, LinearBinaryClassfier, and so forth. ??
Can I suggest adding "Linear" to the names to distinguish Regressors and Classifiers from other Regressors and Classifiers (eg, DecisionTreeRegressor/Classifier). Also, I suggest the name reflect the scitype rather than the distribution being fit, as one model might fit to multiple distributions but we are generally having a separate model for different target scitpyes). Here are my suggestions:
LinearRegressor (for what is now NormalRegressor)
LinearBinaryClassifier as now
LinearCountRegressor (instead of PoissonRegressor)
LinearMulticlassRegressor (for targets with possibly more than one level, unordered)
Please don't use "Multivariate" for multi-level targets as I expect this would cause a lot of confusion. I expect "Multivariate" would normally refer to more than one feature (in case of inputs - the usual case) or more than one target column, in the case of targets. At, least that is how the term is used in documentation.
These are my suggestions. Go ahead and do what you feel best. This is ready to pull. Before, registering and tagging a new release, we might want to wait for the Count and Multiclass, yes?
Ok, I've implemented your suggestions; will wait for tests to pass and merge. Re Count, I'll do that in another PR; Re Multiclass, is that supported now? Could you point me to the equivalent of |
Nevermind, about MutlivariateFinite; I had forgot our earlier discussion. I'm puzzled why you have moved LIBSVM from optional dependency to actual dependency. |
Future
(will be another pr)
Notes
scitype_union
->scitype
and noinput_multivariate
etc) so I added a few fixes for those; given that 0.4 has not been merged but will soon, I've added a statement that checks the version and acts accordingly; it'll be easy to remove later on.