Feature Request: Method for obtaining the highest/lowest confidence observations from each target class #19

jmmalo03 · 2013-08-02T17:58:53Z

After running/training a classifier on some observations for a binary decision problem, I often like to quickly extract the most "easy" and "difficult" observations from each target class. In other words, I would like a method (or methods) that will quickly provide me with:
(1) The 'n' observations with the largest decision statistic from the positive class
(2) The 'n' observations with the lowest decision statistic from the positive class
(3) The 'n' observations with the largest decision statistic from the negative class
(4) The 'n' observations with the lowest decision statistic from the negative class

Alternatively, it would be nice to have a single method that independently sorts the observations under each target class according to their decision statistics.

peterTorrione · 2013-08-05T13:31:08Z

As spec'd out, something to get (1)-(4), I don't think this should be a method of prtDataSetClass, and it probably shouldn't be a method of prtClass or prtAction.

It shouldn't be a method of prtDataSetClass because it makes some assumptions, e.g., that you have only one feature, that you have a "positive" and a "negative" class, etc.

If you have those circumstances, there's at least one quick way to do this:

% Example: sort by yOut confidence, then split into H0 and H1
yOut = classifier.run(ds);
[~, inds] = sort(yOut.X);               % ascending decision statistic
dsSort = ds.retainObservations(inds);   % reorder the data set by confidence
dsSort0 = dsSort.retainClasses(0);      % H0 observations, still sorted
dsSort1 = dsSort.retainClasses(1);      % H1 observations, still sorted

Now the first N observations of dsSort0 are the easy H0 and the last N are the hard ones, and vice versa for dsSort1.
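
To make the "first N / last N" step concrete, here is a rough sketch using plain arrays instead of PRT objects. The variables scores, labels, and n are made-up stand-ins (scores plays the role of yOut.X, labels the role of the 0/1 targets), and the group names are just for illustration:

```matlab
% Hypothetical stand-in data: one decision statistic and one 0/1 label per observation.
scores = [0.9; 0.1; 0.4; 0.8; 0.3; 0.7];
labels = [1; 0; 1; 1; 0; 0];
n = 1;  % hypothetical number of observations wanted per group

[~, inds] = sort(scores);          % ascending decision statistic
sortedLabels = labels(inds);
inds0 = inds(sortedLabels == 0);   % H0 observations, lowest score first
inds1 = inds(sortedLabels == 1);   % H1 observations, lowest score first

% Guard against classes with fewer than n observations.
n0 = min(n, numel(inds0));
n1 = min(n, numel(inds1));

easyH0 = inds0(1:n0);              % lowest statistics among H0
hardH0 = inds0(end-n0+1:end);      % highest statistics among H0
hardH1 = inds1(1:n1);              % lowest statistics among H1
easyH1 = inds1(end-n1+1:end);      % highest statistics among H1
```

Each result is a vector of indices into the original observations, so it could presumably be handed to ds.retainObservations in the same way as inds above.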

One way to put some of these together might be: "sortBy":
e.g.
ds = ds.sortBy(sortVector,'withinClass',true);

So, you could do:
yOut = classifier.run(ds);
ds = ds.sortBy(yOut.X,'withinClass',true);

But that doesn't actually save a whole ton of code...?
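
For what it's worth, the 'withinClass' behavior could look roughly like the following in base MATLAB. This is only a sketch of the idea, not an existing PRT method; scores and labels are again made-up stand-ins, and it does not assume exactly two classes:

```matlab
% Hypothetical stand-in data: one sort value and one class label per observation.
scores = [0.9; 0.1; 0.4; 0.8; 0.3; 0.7];
labels = [1; 0; 2; 1; 0; 2];

classes = unique(labels);                     % works for any number of classes
sortedInds = cell(numel(classes), 1);
for k = 1:numel(classes)
    members = find(labels == classes(k));     % observations in this class
    [~, order] = sort(scores(members));       % ascending within the class
    sortedInds{k} = members(order);
end

% Permutation that groups observations by class, sorted within each class.
perm = vertcat(sortedInds{:});
```

Here perm would be the index vector to reorder the observations with, and sortedInds{k} gives the per-class ordering on its own.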

For now I don't see a super good reason to make a method that does the code in the example above...

patrickkwang added the feature request label Apr 26, 2016
