model: scikit: Data type of feature used for predict not respected #651

Closed
pdxjohnny opened this issue May 19, 2020 · 7 comments · Fixed by #980
Labels
good first issue Good for newcomers kind/ml Issues partaining to machine learning tXS Esitmated Time To Complete: Extra Short

Comments

@pdxjohnny
Member

Predicted values are coming out as floats from regression models. We should
typecast them to the feature's dtype.

@pdxjohnny
Member Author

Specifically, if we have a float and the dtype is int, we should round to an int.
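
A minimal sketch of the rounding cast described above, in plain Python. The helper name `cast_prediction` is hypothetical and is not part of DFFML; it only illustrates why rounding should happen before the int conversion.

```python
def cast_prediction(value, dtype):
    """Cast a raw model output to the target feature's dtype.

    Hypothetical helper for illustration; not part of DFFML.
    """
    if dtype is int:
        # int(2.7) truncates to 2, so round first.
        # Note: Python's round() rounds halves to the nearest even integer,
        # so round(2.5) is 2 and round(3.5) is 4.
        return int(round(value))
    # Any other dtype is applied as a plain constructor call.
    return dtype(value)


print(cast_prediction(2.7, int))    # 3
print(cast_prediction(2.7, float))  # 2.7
```

Whether half-to-even rounding is acceptable depends on the model and the feature; the issue only asks that an int feature receive an int.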

@pdxjohnny pdxjohnny added this to the 0.5.0 Beta Release milestone May 20, 2020
@pdxjohnny pdxjohnny added good first issue Good for newcomers kind/ml Issues partaining to machine learning labels Jul 29, 2020
@pdxjohnny
Member Author

Look at the predict method in model/scikit/dffml_model_scikit/scikit_base.py. We should cast the predicted value to self.config.predict.dtype before calling Record.predicted.

@pdxjohnny pdxjohnny added the tXS Esitmated Time To Complete: Extra Short label Jul 29, 2020
aadarshsingh191198 added a commit to aadarshsingh191198/dffml that referenced this issue Oct 7, 2020
@aadarshsingh191198
Contributor

Hi @pdxjohnny, can you please help me with the failing tests: (model/scikit, 3.7), (model/scikit, 3.8) and (model/xgboost, 3.8)? The error shows:

AssertionError: '0' not found in [-1, 0, 1, 2, 3, 4, 5, 6, 7]

Now, why would someone check for a string in an array of ints? Also, why does the error occur only in these 3 models?

@pdxjohnny
Member Author

It should not be checking for a string. That is the issue here. Don't worry about model/xgboost; that is an intermittent failure.

It is currently checking for a string because of the change you made, which means the change was either not entirely correct or the tests also need updating somehow.
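
The assertion message quoted earlier follows from Python's membership test, which compares by equality: the string '0' is never equal to the int 0. A small illustration with made-up values (the variable names and the raw prediction are assumptions, not taken from the DFFML test suite):

```python
# Expected values, as listed in the assertion message.
expected = [-1, 0, 1, 2, 3, 4, 5, 6, 7]

# Hypothetical raw output of a regression model.
prediction = 0.0

# Casting to str yields "0", which does not equal any int in the list.
as_str = str(int(prediction))
# Casting to int yields 0, which is in the list.
as_int = int(round(prediction))

print(as_str in expected)  # False
print(as_int in expected)  # True
```

So a prediction cast to str fails a check against a list of ints even when the numeric value is correct.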

@aadarshsingh191198
Contributor

Okay, I will look into it again, keeping the model/scikit tests in mind. Thanks for the guidance.

@B-Anupam

Hey!
I am interested in machine learning projects and am looking for a way to contribute.
Please let me know how I can contribute to this issue.

@pdxjohnny
Member Author

@B-Anupam This issue was just fixed (thank you @sidhu1012!). Issue #867 seems like a good target for a first contribution. Please read the contributing documentation first: https://intel.github.io/dffml/master/contributing/index.html

3 participants