
Test accuracy of predict methods #26

Closed
aazuspan opened this issue May 25, 2023 · 5 comments · Fixed by #32
Assignees
Labels
estimator (Related to one or more estimators), testing (Related to test code or data)

Comments

@aazuspan
Contributor

Currently, we test the accuracy of fit and kneighbors against yaImpute in test_port.py, but we don't yet test the accuracy of predict. That makes it easy for small mistakes that affect results but not functionality (like #23) to go unnoticed.

As discussed in that issue, we should add some tests that compare prediction results of our estimators to those generated with yaImpute.

@aazuspan aazuspan added testing Related to test code or data estimator Related to one or more estimators labels May 25, 2023
@grovduck
Member

grovduck commented May 27, 2023

@aazuspan, a quick note: I figured out my issue with the predictions; it was in how I was calling yaImpute. I've got tests for both equal-weight and distance-weighted predictions, and they are all passing. Woot! test_port.py and conftest.py are a complete shambles right now (with lots of duplication), so I'm putting them into a gist so you can see the work. It will probably inform how we tackle #2 and #3.

Note that when you call kneighbors without an argument, you get the k closest neighbors excluding the sample itself. predict, however, requires an X argument, so each training sample becomes its own first neighbor and factors into the calculation. yaImpute doesn't use the sample itself as a neighbor; it returns a prediction based on the k independent neighbors. For that reason, I'm not testing clf.predict(X_train) at present.
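[Editor's note: the self-neighbor behavior described above matches scikit-learn's convention. A minimal sketch using plain scikit-learn (not this package's estimators) to illustrate the difference:]

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

X = np.array([[0.0], [1.0], [2.0], [10.0]])
y = np.array([0.0, 1.0, 2.0, 10.0])

clf = KNeighborsRegressor(n_neighbors=2).fit(X, y)

# Called without an argument, kneighbors() finds neighbors of the
# training samples while excluding each sample itself.
dist_excl, _ = clf.kneighbors()

# Called with the training data, each sample is its own first
# neighbor at distance 0, which pulls the prediction toward its own y.
dist_incl, _ = clf.kneighbors(X)

print(dist_excl[0])       # neighbors of [0.]: distances [1. 2.]
print(dist_incl[0])       # [0. 1.] -- the point itself comes first
print(clf.predict(X[:1])) # mean of y[0] and y[1] -> [0.5]
```

This is why comparing `clf.predict(X_train)` directly against yaImpute (which always uses k independent neighbors) would mismatch.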

@aazuspan
Contributor Author

That's great news @grovduck!

Agreed that test_port.py could definitely use some cleanup... If we could use parametrize like we do in test_estimators.py, that would be great, but I'm not sure how to account for the fact that GNN and MSN are fit with the spp arg while the rest aren't. I suppose we may need two sets of parametrized tests based on whether or not they use ancillary fitting data. But given the connection you mention with other issues, it might not be worth spending too much effort cleaning these up yet.
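[Editor's note: one way to split the parametrization along the lines suggested above. This is a hypothetical sketch with stand-in classes, not the project's actual estimators or test code:]

```python
import pytest

# Hypothetical stand-ins for illustration only: GNN and MSN require a
# species matrix (spp) at fit time, while the other estimators do not.
class PlainEstimator:
    def fit(self, X, y):
        self.fitted_ = True
        return self

class SppEstimator:
    def fit(self, X, y, spp):
        self.fitted_ = True
        return self

# Two parametrized test sets, split on whether ancillary
# fitting data (spp) is required.
@pytest.mark.parametrize("est_cls", [PlainEstimator])
def test_fit_without_spp(est_cls):
    assert est_cls().fit([[0.0]], [0.0]).fitted_

@pytest.mark.parametrize("est_cls", [SppEstimator])
def test_fit_with_spp(est_cls):
    assert est_cls().fit([[0.0]], [0.0], spp=[[1]]).fitted_
```

Keeping the estimator lists in the two `parametrize` calls is one way to avoid branching on the estimator type inside a single test body.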

Thanks for the code reference and the explanation of the issue!

@grovduck grovduck self-assigned this May 31, 2023
@grovduck
Member

@aazuspan - is it worth pushing this more or less as is, knowing that we will be cleaning up the testing data and functions as part of #3, or should we focus on getting #3 in first?

@aazuspan
Contributor Author

I'd say go ahead as is!

grovduck added a commit that referenced this issue May 31, 2023
We exported data from yaImpute for the five core estimators at k=5
for both unweighted (equal weighting for all neighbors) and weighted
(neighbors weighted by their inverse distance) predictions of the
y attributes.  The tests assert that the predict method of each
estimator returns the same values.  Note that the constructor of
each estimator passes a custom function (yaimpute_weights) to the
weights parameter in order to match the weighting scheme of yaImpute.
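[Editor's note: the exact definition of yaimpute_weights isn't shown in this thread. A hypothetical reconstruction, assuming yaImpute's 1 / (1 + d) inverse-distance form (which also avoids division by zero when a query point coincides with a training point); the project's actual function may differ:]

```python
import numpy as np

def yaimpute_weights(dist):
    # Hypothetical sketch: weight each neighbor by 1 / (1 + distance),
    # in the shape scikit-learn's `weights` callback expects
    # (an (n_queries, n_neighbors) distance array in, same shape out).
    return 1.0 / (1.0 + np.asarray(dist, dtype=float))
```

Such a function can be passed directly, e.g. `KNeighborsRegressor(weights=yaimpute_weights)`, so predictions use the same weighting scheme as yaImpute's distance-weighted imputation.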
This commit fixes #26.
@grovduck
Member

Resolved by #32

@grovduck grovduck linked a pull request Jun 7, 2023 that will close this issue
@grovduck grovduck added this to the Core Estimators milestone Jun 7, 2023