Add check in panelsum() when factors are sorted #4

Closed
sergiocorreia opened this issue Dec 10, 2016 · 0 comments

Comments

@sergiocorreia
Owner

sergiocorreia commented Dec 10, 2016

This would speed up F.sort() calls when the factors are already sorted in the dataset, which is particularly useful if we run this method a lot (e.g. in reghdfe).

First, create an .is_sorted property.

Then, intercept this loop and replace (not tested):

p[index[level] = index[level] + 1] = obs

with

p[idx = index[level] = index[level] + 1] = obs
if (is_sorted) {
    if (idx < last_idx) is_sorted = 0 // set is_sorted = 1 before the loop
    last_idx = idx // initially set last_idx = 0
}
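
To make the idea concrete, here is a minimal stand-alone Mata sketch of the counting-sort loop with the sortedness check added. It is an illustration only, not the ftools code: the function name sketch_permutation, its arguments, and the local variable names are all hypothetical, and the chained in-line assignment is written out step by step for readability. It assumes levels holds integers from 1 to num_levels.

```mata
mata:
// Hypothetical helper, not part of ftools.
// Returns the permutation vector p of a counting sort and stores in
// is_sorted (passed by reference) whether levels was already in order.
real colvector sketch_permutation(real colvector levels,
                                  real scalar num_levels,
                                  real scalar is_sorted)
{
    real scalar obs, level, idx, last_idx, num_obs
    real colvector counts, index, p

    num_obs = rows(levels)

    // Count how many observations fall in each level
    counts = J(num_levels, 1, 0)
    for (obs = 1; obs <= num_obs; obs++) {
        level = levels[obs]
        counts[level] = counts[level] + 1
    }

    // index[level] = last slot already filled for that level
    // (initially the slot just before the level's block)
    index = runningsum(counts) - counts

    p = J(num_obs, 1, .)
    is_sorted = 1   // assume sorted until proven otherwise
    last_idx = 0
    for (obs = 1; obs <= num_obs; obs++) {
        level = levels[obs]
        idx = index[level] + 1
        index[level] = idx
        p[idx] = obs
        if (is_sorted) {
            // A slot lower than the previous one means an earlier level
            // appeared after a later one, so the input is not sorted
            if (idx < last_idx) is_sorted = 0
            last_idx = idx
        }
    }
    return(p)
}
end
```

Because Mata passes arguments by reference, the caller supplies a variable for is_sorted (for example `ok = .` followed by `p = sketch_permutation(levels, 3, ok)`) and reads the flag afterwards. Once the flag drops to 0 the inner comparison is skipped, so the remaining cost per observation is a single test of is_sorted.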

Also benchmark it to see whether the check slows the loop down noticeably (in which case we make the sort check optional and unroll the loop).

Finally, sort() and _sort() should add a line like if (is_sorted) return(data)
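
The early return could look roughly like the sketch below. This is a simplified stand-alone function, not the actual sort() or _sort() methods: the name sketch_sort is hypothetical, and the permutation p and the is_sorted flag are passed as plain arguments here, whereas in the proposal they would presumably be stored on the factor object.

```mata
mata:
// Hypothetical stand-alone version of the proposed early return.
transmorphic matrix sketch_sort(transmorphic matrix data,
                                real colvector p,
                                real scalar is_sorted)
{
    // Data already in factor order: skip the permutation entirely
    if (is_sorted) return(data)

    // Otherwise reorder the rows of data according to p
    return(data[p, .])
}
end
```

The saving comes from not building a permuted copy of data when the flag is set.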

sergiocorreia added a commit that referenced this issue Jan 4, 2017
Note that instead of testing if the data is sorted, we ask Stata for the
"sortedby" value. This should work with most of the relevant cases