Specialize row_group_slots() and findrow() on column types to improve performance #79

… performance

Looping over columns is very slow when their type is unknown at compile time. Specialize the method on the types of the key (grouping) columns by passing a tuple of columns rather than a DataTable. This forces compilation of a specific method for each combination of key types, but the number of combinations should remain relatively low and the one-time cost is worth it. This dramatically improves the performance of groupby(), but does not have a large effect on join(), which is very inefficient in other areas.

Also add a return type assertion for rowhash(). The fact that the column types are not known at compile time appears to confuse inference, which cannot detect that this function always returns UInt. This greatly reduces the number of allocations when calling join(), but does not really change its performance.
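The two techniques described above (specializing on a tuple of columns, and asserting the return type of a hash function) can be illustrated with a minimal sketch. This is not the code from this pull request: `UntypedTable`, `rowhash_untyped`, `rowhash_typed`, `group_hashes`, and `_group_hashes` are hypothetical names chosen for illustration, and only `hash` and `Base.tail` are real Julia functions.

```julia
# Hypothetical stand-in for a table whose column types are hidden from the compiler.
struct UntypedTable
    columns::Vector{Any}
end

# Pattern 1: loop over columns of unknown type. Every col[i] is dispatched
# dynamically; the ::UInt return type assertion at least tells callers what
# comes back, even though inference cannot work it out from the loop body.
function rowhash_untyped(t::UntypedTable, i::Int)::UInt
    h = zero(UInt)
    for col in t.columns
        h = hash(col[i], h)
    end
    return h
end

# Pattern 2: take a tuple of columns. The tuple type records the concrete type
# of each column, so Julia compiles one method per combination of key types.
rowhash_typed(cols::Tuple{}, i::Int, h::UInt = zero(UInt)) = h
function rowhash_typed(cols::Tuple, i::Int, h::UInt = zero(UInt))::UInt
    return rowhash_typed(Base.tail(cols), i, hash(cols[1][i], h))
end

# Function barrier: the dynamic dispatch happens once, at the call to
# _group_hashes, instead of once per row and per column.
group_hashes(t::UntypedTable) = _group_hashes((t.columns...,))

function _group_hashes(cols::Tuple)
    n = length(cols[1])
    return UInt[rowhash_typed(cols, i) for i in 1:n]
end

# Usage: rows 1 and 3 hold the same key values, so they hash identically.
t = UntypedTable(Any[[1, 2, 1], ["a", "b", "a"]])
hashes = group_hashes(t)
```

The trade-off in this sketch matches the one the commit message describes: each new combination of key column types triggers a one-time compilation of `_group_hashes`, in exchange for a type-stable inner loop.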


Commits on Aug 7, 2017