Add check in panelsum() when factors are sorted #4

sergiocorreia · 2016-12-10T08:29:11Z

This would speed up F.sort() calls when the dataset is already sorted by the factor variables, which is particularly useful when the method runs many times (e.g. in reghdfe).

First, create .is_sorted

Then, intercept this loop and replace (not tested):

p[index[level] = index[level] + 1] = obs

with

p[idx = index[level] = index[level] + 1] = obs
if (is_sorted) {
    if (idx < last_idx) is_sorted = 0 // set is_sorted = 1 before the loop
    last_idx = idx // initially set last_idx = 0
}
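
For context, here is a minimal, untested sketch of how the check could sit inside a complete counting-sort pass. The function name order_by_level(), the counts vector, and the way the offsets are built are assumptions for illustration, not the actual ftools code. The nested assignment is split into separate statements for readability.

```mata
mata:
// Hypothetical helper (not part of ftools): returns the permutation p that
// orders observations by level, and sets is_sorted to 1 if the observations
// were already in that order, 0 otherwise.
real colvector order_by_level(real colvector levels,
                              real scalar num_levels,
                              real scalar is_sorted)
{
    real colvector counts, index, p
    real scalar obs, num_obs, level, idx, last_idx

    num_obs = rows(levels)

    // counts[k] = number of observations with level k
    counts = J(num_levels, 1, 0)
    for (obs = 1; obs <= num_obs; obs++) {
        level = levels[obs]
        counts[level] = counts[level] + 1
    }

    // index[k] = number of observations in levels 1..k-1,
    // i.e. the last slot of p used before level k starts
    index = runningsum(counts) - counts

    p = J(num_obs, 1, .)
    is_sorted = 1
    last_idx = 0
    for (obs = 1; obs <= num_obs; obs++) {
        level = levels[obs]
        idx = index[level] + 1
        index[level] = idx
        p[idx] = obs
        // p is a permutation, so slots that never decrease mean p is the identity
        if (is_sorted) {
            if (idx < last_idx) is_sorted = 0
            last_idx = idx
        }
    }
    return(p)
}
end
```

Mata passes arguments by reference, so the caller's third argument holds the flag after the call. With levels (1\1\2\3) and 3 levels, the flag comes back as 1 and p is (1\2\3\4); with levels (2\1\1) and 2 levels, the flag comes back as 0.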

Also benchmark it to see whether the extra check slows the loop noticeably (if it does, we make the sort check optional and unroll the loop).

Finally, sort() and _sort() should add a line like if (is_sorted) return(data)
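
A minimal sketch of that early return, written as a standalone function rather than a class method. The name sort_by_perm() and its arguments are hypothetical; p stands for a permutation vector that orders the rows of data by factor level.

```mata
mata:
// Hypothetical helper (not the ftools method): returns data with rows
// reordered by the permutation p, skipping the reordering when the rows
// are already known to be in order.
transmorphic matrix sort_by_perm(transmorphic matrix data,
                                 real colvector p,
                                 real scalar is_sorted)
{
    if (is_sorted) return(data)   // already sorted: nothing to do
    return(data[p, .])            // otherwise apply the permutation to the rows
}
end
```

An in-place variant along the lines of _sort() would presumably return nothing and simply exit early when is_sorted is 1.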

sergiocorreia added the enhancement label Dec 10, 2016

sergiocorreia added a commit that referenced this issue Jan 4, 2017

Implement #4

fc3f6a7

Note that instead of testing if the data is sorted, we ask Stata for the "sortedby" value. This should work with most of the relevant cases

sergiocorreia closed this as completed Jan 4, 2017
