New phase assessModel #296
Comments
I have added new methods on the client: getMatchedMarkedRecordsStat(Dataset markedRecords), getUnmatchedMarkedRecordsStat(Dataset markedRecords), getUnsureMarkedRecordsStat and getMarkedRecords(). You can use them to build the logic.
Generated Config File from Arguments object
Statistics for model 100
We need to look at the right model internally for this: should we expose the label model, or should we expose the actual model?
Write a Python script which will expose the model stats: the confusion matrix and the number of records that are marked, unmarked, matches, non-matches, and not sure.
We will use the Labeller class. The Python script takes the conf and passes it to the Client. The Client will invoke the Labeller. Refer to the Python API example at https://github.com/zinggAI/zingg/blob/main/api/scala/FebrlExample.py.
The script calls getMarkedRecords, getMarkedRecordsStat, getUnmarkedRecords on the Client and provides the stats. You can convert the df returned by the Client to a pandas DataFrame. To build the confusion matrix, the following can be used:
import pandas as pd
import seaborn as sn
import matplotlib.pyplot as plt
confusion_matrix = pd.crosstab(markedRecords['z_isMatch'], markedRecords['z_prediction'], rownames=['Actual'], colnames=['Predicted'])
sn.heatmap(confusion_matrix, annot=True)
plt.show()
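The record counts asked for above can also be checked without pandas or plotting. The following is a minimal sketch, not the Client's implementation: it assumes the marked records have already been collected into a list of dicts, and it assumes a label encoding of 1 for match, 0 for non-match, 2 for not sure and -1 for unmarked, which this issue does not confirm. The helper names `summarize` and `confusion_counts` and the sample data are hypothetical.

```python
from collections import Counter

# Assumed label encoding for z_isMatch (hypothetical, not confirmed by this issue).
MATCH, NON_MATCH, NOT_SURE, UNMARKED = 1, 0, 2, -1


def summarize(records):
    """Count records per label from a list of dicts with a 'z_isMatch' key."""
    counts = Counter(r["z_isMatch"] for r in records)
    marked = sum(n for label, n in counts.items() if label != UNMARKED)
    return {
        "marked": marked,
        "unmarked": counts[UNMARKED],
        "matches": counts[MATCH],
        "non_matches": counts[NON_MATCH],
        "not_sure": counts[NOT_SURE],
    }


def confusion_counts(records):
    """Count (actual, predicted) pairs over records marked match or non-match."""
    return Counter(
        (r["z_isMatch"], r["z_prediction"])
        for r in records
        if r["z_isMatch"] in (MATCH, NON_MATCH)
    )


# Made-up sample rows, only to exercise the helpers.
sample = [
    {"z_isMatch": 1, "z_prediction": 1},
    {"z_isMatch": 1, "z_prediction": 0},
    {"z_isMatch": 0, "z_prediction": 0},
    {"z_isMatch": 0, "z_prediction": 0},
    {"z_isMatch": 0, "z_prediction": 1},
    {"z_isMatch": 2, "z_prediction": 1},
    {"z_isMatch": -1, "z_prediction": 0},
]

stats = summarize(sample)
matrix = confusion_counts(sample)
print(stats)
print(dict(matrix))
```

Records labelled not sure or unmarked are left out of the confusion counts here, since they have no definite actual value to compare against the prediction; whether the script should treat them that way is a design choice for the issue to settle.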