Description
Consider this method:
machinelearning/src/Microsoft.ML.Core/Data/IDataView.cs
Lines 114 to 115 in 9067a1b
    RowCursor[] GetRowCursorSet(out IRowCursorConsolidator consolidator,
        Func<int, bool> needCol, int n, Random rand = null);
This returns an instance of the mysterious IRowCursorConsolidator interface. Why an interface? In retrospect I'm not quite sure.
Having an interface allows for different implementations, but we have never actually exploited that capability. Nor is it clear how we could, even if we were of a mind to. What would different implementations even do differently? The semantics around Batch and the like are clear and simple enough that only one implementation is obvious. And even if we did have different implementations, they couldn't do anything radically different anyway: the cursors in a set most often come from transformers (that is, components downstream from the set creation), and doing anything implementation-specific on any cursor would break the composability at the core of what makes IDataView work at all.
So: get rid of this interface, and replace all usages of it with a simple utility method that can be called to do the reconciliation. The utility method needn't even be public, though there may be reasons to make it so.
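To make the proposal concrete, here is a minimal sketch of what "one obvious implementation as a static utility" could look like. It is not the ML.NET code: `BatchedRow<T>`, `CursorSetUtils`, and `Consolidate` are hypothetical names, and each cursor is modeled as a plain `IEnumerable` rather than a `RowCursor`. It assumes each batch id is served by exactly one input and that each input yields its batches in ascending order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a row produced by one cursor of a parallel set.
// Batch is the only piece of information reconciliation needs in this sketch.
public readonly struct BatchedRow<T>
{
    public long Batch { get; }
    public T Value { get; }
    public BatchedRow(long batch, T value) { Batch = batch; Value = value; }
}

// Hypothetical utility class; no interface, no per-source implementation.
public static class CursorSetUtils
{
    // Merges a set of row streams into a single stream in ascending Batch order.
    // Assumption: a given batch id appears in exactly one input, and every
    // input yields its own batches in ascending order.
    public static IEnumerable<T> Consolidate<T>(
        IReadOnlyList<IEnumerable<BatchedRow<T>>> inputs)
    {
        if (inputs == null) throw new ArgumentNullException(nameof(inputs));

        var active = new List<IEnumerator<BatchedRow<T>>>();
        try
        {
            // Prime every input; drop the ones that are empty.
            foreach (var input in inputs)
            {
                var e = input.GetEnumerator();
                if (e.MoveNext()) active.Add(e); else e.Dispose();
            }

            while (active.Count > 0)
            {
                // Find the input currently positioned on the smallest batch.
                int best = 0;
                for (int i = 1; i < active.Count; i++)
                    if (active[i].Current.Batch < active[best].Current.Batch)
                        best = i;

                // Drain that whole batch before looking at other inputs.
                var cur = active[best];
                long batch = cur.Current.Batch;
                bool more = true;
                while (more && cur.Current.Batch == batch)
                {
                    yield return cur.Current.Value;
                    more = cur.MoveNext();
                }

                if (!more)
                {
                    active.RemoveAt(best);
                    cur.Dispose();
                }
            }
        }
        finally
        {
            // Release anything still open if the caller stops early.
            foreach (var e in active) e.Dispose();
        }
    }
}
```

Under these assumptions a caller would hand the whole set to `CursorSetUtils.Consolidate<T>(...)` instead of receiving a consolidator object through an `out` parameter. Nothing in the merge depends on where the inputs came from, which is the point being argued above.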