Implementation for conditional feature distributions #193
Comments
I implemented this one for my Outreachy application: |
I think I understand most of the specification, however, I have a question regarding the input. For |
Sorry for the delay in getting you feedback on this!

The goal is, given a column of data, to create a separate histogram for the subset of rows that falls into each cell of the confusion matrix. For example, with two classes there would be 4 histograms. To determine this grouping, you can use the column of true labels (eg.

Re your question about the input: this "column of data" could in principle be any feature in the original dataset or any computed column, e.g. the average of two features or the scores predicted by the model. The code should be the same either way, since it only requires that the input is a pandas Series or a list of the same length as the dataset. So you wouldn't need to do anything special to support these different cases; they should "just work" when testing in a notebook.

Re the output: it would always be a histogram, with bins over the range of column values on the x-axis and (relative) frequency counts on the y-axis. (For categorical data it will be a bar plot instead, but it follows the same idea.)

I think the outstanding work that needs to be done for this issue is:

Don't worry about tweaking the code that controls the graphics at this point, as that may change when we integrate into a common report. This is mainly about making the functionality available.

Hope this helps! Let me know if you have further questions!
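For anyone picking this up, here is a minimal sketch of the grouping described above. It is not the project's implementation: the function name `conditional_distributions` and the parameters `y_true`, `y_pred`, and `bins` are hypothetical, and it assumes the confusion-matrix cell of each row is determined by its (true label, predicted label) pair. It returns relative frequencies only and leaves plotting to the caller.

```python
import numpy as np
import pandas as pd


def conditional_distributions(values, y_true, y_pred, bins=10):
    """Hypothetical helper: one relative-frequency distribution per
    (true label, predicted label) cell of the confusion matrix."""
    df = pd.DataFrame({
        "value": pd.Series(values).to_numpy(),
        "true": pd.Series(y_true).to_numpy(),
        "pred": pd.Series(y_pred).to_numpy(),
    })
    numeric = pd.api.types.is_numeric_dtype(df["value"])
    if numeric:
        # Shared bin edges so histograms are comparable across cells.
        edges = np.histogram_bin_edges(df["value"].dropna(), bins=bins)

    result = {}
    # groupby only yields the cells that actually occur in the data.
    for (t, p), group in df.groupby(["true", "pred"]):
        col = group["value"].dropna()
        if numeric:
            counts, _ = np.histogram(col, bins=edges)
            total = counts.sum()
            freqs = counts / total if total else counts.astype(float)
            # np.histogram bins are left-closed (the last bin also
            # includes its right edge).
            index = pd.IntervalIndex.from_breaks(edges, closed="left")
            result[(t, p)] = pd.Series(freqs, index=index)
        else:
            # Categorical data: relative frequency per category (bar plot).
            result[(t, p)] = col.value_counts(normalize=True).sort_index()
    return result


# Usage with a two-class toy example (made-up data).
dists = conditional_distributions(
    values=[1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    y_true=[0, 0, 0, 1, 1, 1],
    y_pred=[0, 1, 0, 1, 0, 1],
    bins=5,
)
for cell, dist in dists.items():
    print(cell, dist.to_numpy())
```

Because the input is wrapped in `pd.Series`, a plain list, a feature column, or a computed column such as model scores can all be passed in the same way. Each returned Series can then be drawn with whatever plotting code the report ends up using.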
Implement the computation for conditional feature distributions.