After running/training a classifier on some observations for a binary decision problem, I often like to quickly extract the most "easy" and "difficult" observations from each target class. In other words, I would like a method (or methods) that will quickly provide me with:
(1) The 'n' observations with the largest decision statistic from the positive class
(2) The 'n' observations with the lowest decision statistic from the positive class
(3) The 'n' observations with the largest decision statistic from the negative class
(4) The 'n' observations with the lowest decision statistic from the negative class
Alternatively, it would be nice to have a single method that independently sorts the observations under each target class according to their decision statistics.
As specified, I don't think something that returns (1)-(4) should be a method of prtDataSetClass, and it probably shouldn't be a method of prtClass or prtAction either.
It shouldn't be a method of prtDataSetClass because it makes some assumptions, e.g., that you have only one feature and that you have a "positive" and a "negative" class.
If those conditions hold, there's at least one quick way to do this:
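% Example: split into H0 and H1, each sorted by yOut confidence
yOut = classifier.run(ds);
[sorted, inds] = sort(yOut.X);         % ascending decision statistic
dsSort = ds.retainObservations(inds);  % reorder the data set to match
dsSort0 = dsSort.retainClasses(0);     % H0 observations, still in sorted order
dsSort1 = dsSort.retainClasses(1);     % H1 observations, still in sorted order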
Now the first N observations of dsSort0 are the easy H0 cases and the last N are the hard ones; for dsSort1 it is the reverse.
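If you only want the indices of the extremes for one class, the same idea can be sketched on plain vectors. This is a minimal sketch, not part of PRT: `extremeObservations` is a hypothetical helper name, and it assumes the decision statistics and the targets are available as equal-length numeric vectors (in the usage comment, `ds.targets` is an assumption about how the labels are stored).

```matlab
% extremeObservations.m  (hypothetical helper, not a PRT function)
% Returns the indices of the n lowest and n highest decision statistics
% among the observations whose target equals classLabel.
%
% Possible usage, assuming yOut.X holds one statistic per observation:
%   yOut = classifier.run(ds);
%   [lowInds, highInds] = extremeObservations(yOut.X, ds.targets, 1, 5);
%   dsLowH1 = ds.retainObservations(lowInds);
function [lowInds, highInds] = extremeObservations(stats, targets, classLabel, n)
    stats = stats(:);
    classInds = find(targets(:) == classLabel);  % observations in this class
    [~, order] = sort(stats(classInds));         % ascending within the class
    n = min(n, numel(classInds));                % guard against small classes
    lowInds  = classInds(order(1:n));            % n lowest statistics
    highInds = classInds(order(end-n+1:end));    % n highest statistics
end
```

For the positive class, `highInds` are the easy observations and `lowInds` the difficult ones; for the negative class the roles swap. Both outputs are in ascending order of the decision statistic.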
One way to put some of these together might be a "sortBy" method, e.g.:
ds = ds.sortBy(sortVector,'withinClass',true);
So you could do:
yOut = classifier.run(ds);
ds = ds.sortBy(yOut.X,'withinClass',true);
But that doesn't actually save much code.
For now I don't see a strong reason to add a method that just wraps the code in the example above.
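For reference, the within-class ordering that the proposed `sortBy(...,'withinClass',true)` call would produce can be sketched as a standalone function. This is only an illustration under assumptions: `sortWithinClass` is a hypothetical name, no such function exists in PRT, and the targets are assumed to be numeric class labels.

```matlab
% sortWithinClass.m  (hypothetical helper, not a PRT function)
% Returns a permutation that groups observations by class label and,
% within each class, orders them by ascending sortVector.
%
% Possible usage:
%   yOut = classifier.run(ds);
%   inds = sortWithinClass(yOut.X, ds.targets);  % ds.targets is an assumption
%   ds = ds.retainObservations(inds);
function inds = sortWithinClass(sortVector, targets)
    % sortrows orders by the first column (class label), then breaks ties
    % with the second column (the sort vector); the second output is the
    % row permutation that achieves this ordering.
    [~, inds] = sortrows([targets(:), sortVector(:)]);
end
```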