Rolling regression #4075

Closed
waynelapierre opened this issue Nov 24, 2019 · 15 comments
Comments

@waynelapierre

Just curious: when will rolling regression functionality be added to data.table? I use data.table for all my data wrangling tasks and really hope I can do rolling regression in data.table.

@jangorecki
Member

Yes, it is going to be added, someday, as mentioned in #2778
There is a long list of rolling functions to implement. Efficient implementation of those is usually tricky, and even more tricky in plain C. Rolling regression will be a little bit different because the current rolling functions take an atomic vector as input. You are welcome to propose an API for a rolling regression.

@waynelapierre
Author

I am currently using the package rollRegres for rolling regressions, maybe you could borrow some ideas from it?

@jangorecki
Member

I just found it and it looks very neat, pretty lightweight. It also seems to be feature rich.

@MichaelChirico
Member

have you tried frollapply?

as for an optimized rolling regression, if I'm not mistaken there is a formula from linear algebra for updating a linear regression when adding/subtracting one observation, but I'm not sure what linear algebra facilities are available in standard C libraries
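For the single-predictor case, the add-one/drop-one idea can be sketched without any linear algebra library by keeping running sums over the window. The block below is a minimal illustration in Python, not data.table or rollRegres code; the function name `rolling_simple_ols` is hypothetical, and plain running sums can lose precision on long or badly scaled series, so a production implementation would likely use a more stable update.

```python
def rolling_simple_ols(x, y, window):
    """Rolling OLS of y on x (intercept plus one slope) via running sums.

    Returns a list with one entry per observation: None while the window
    is incomplete (or when x is constant within the window), otherwise an
    (intercept, slope) tuple for the window ending at that observation.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if window < 2:
        raise ValueError("window must be at least 2")

    out = [None] * len(x)
    sx = sy = sxx = sxy = 0.0
    for i in range(len(x)):
        # Add the newest observation to the running sums.
        sx += x[i]
        sy += y[i]
        sxx += x[i] * x[i]
        sxy += x[i] * y[i]
        if i >= window:
            # Drop the observation that just left the window.
            j = i - window
            sx -= x[j]
            sy -= y[j]
            sxx -= x[j] * x[j]
            sxy -= x[j] * y[j]
        if i >= window - 1:
            denom = window * sxx - sx * sx
            if denom != 0:  # exact test only; near-zero denom is still unstable
                slope = (window * sxy - sx * sy) / denom
                intercept = (sy - slope * sx) / window
                out[i] = (intercept, slope)
    return out


# Usage: y = 2 * x + 1, so every complete window recovers the same line.
fits = rolling_simple_ols([1, 2, 3, 4, 5], [3, 5, 7, 9, 11], window=3)
```

Each step costs a constant amount of work regardless of window size, which is the gain over refitting every window from scratch. With several predictors the same idea applies to the cross-product matrices instead of scalar sums.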

@jangorecki
Member

jangorecki commented Nov 24, 2019

@MichaelChirico frollapply won't yet work for regression, see #2778 (comment), still waiting for feedback. When supported in frollapply, it will be much slower than rollRegres, but at least it will be very flexible and a little bit more lightweight.

@smingerson

smingerson commented Nov 24, 2019

The rollRegres package has been removed from CRAN for misrepresentation of authorship.

@jangorecki
Member

The GitHub version already fixes that, so it will probably come back to CRAN sooner or later.

@smingerson

Great! I swear I looked for a GitHub page before mentioning it, but I simply must be blind. Found it now.

@waynelapierre
Author

I believe the rolling regression function is highly demanded by finance guys like me. I hope to see this functionality added soon. Fingers crossed...

@waynelapierre
Author

By the way, the R package roll also has a roll_lm function. Maybe you could borrow some ideas from it.

@jangorecki
Member

I briefly went through https://cran.r-project.org/web/packages/rollRegres/vignettes/Comparisons.html and I am now not sure if we really want to have it in data.table. I agree it is a highly demanded feature, but I am not sure it is really necessary to have it in DT. @mattdowle what is your opinion on that?

@mattdowle
Member

mattdowle commented Mar 3, 2020

I haven't clicked any links or looked at any details, but yes, if I understand correctly, if rollRegres can be used with data.table, then that's ideal and there's no need to build it into data.table.
@waynelapierre You stated you're already using rollRegres. Is the problem merely its removal from CRAN, or were there any other problems/inefficiencies using it together with data.table? If the latter, please provide more detail by way of example code and benchmarks.

@MichaelChirico
Member

Seems rollRegres is back on CRAN. My understanding is it's using some pretty smart stuff already to do the rolling regression efficiently.

Unless there's something specific & substantial to be gained by including this level of sophistication in data.table directly, I would push to add FRs & bugfixes to rollRegres instead.

@jangorecki
Member

Closing this FR. So far we are not really convinced that it belongs in scope. I remember I had to convince Matt to have even rolling mean :) We can always re-open in the future, so feedback on this FR is still welcome here in this closed issue.

@jangorecki
Member

This can be achieved using by.column=FALSE #4887 but it obviously will not be super fast.
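Conceptually, the flexible-but-slow route means handing each window of rows (all columns at once) to an arbitrary function and refitting from scratch. The sketch below shows that idea in plain Python; it is not data.table's API, and the names `rolling_apply_rows` and `fit_line` are hypothetical helpers made up for illustration.

```python
def fit_line(rows):
    """OLS of y on x for a list of (x, y) pairs; returns (intercept, slope)."""
    n = len(rows)
    mean_x = sum(r[0] for r in rows) / n
    mean_y = sum(r[1] for r in rows) / n
    sxx = sum((r[0] - mean_x) ** 2 for r in rows)
    sxy = sum((r[0] - mean_x) * (r[1] - mean_y) for r in rows)
    slope = sxy / sxx  # raises ZeroDivisionError if x is constant in the window
    return (mean_y - slope * mean_x, slope)


def rolling_apply_rows(rows, window, fun):
    """Apply fun to every complete window of rows.

    The result has one entry per row: None until the first complete
    window, then fun(window_rows) for the window ending at that row.
    """
    out = []
    for i in range(len(rows)):
        if i + 1 < window:
            out.append(None)  # window not yet full
        else:
            out.append(fun(rows[i + 1 - window : i + 1]))
    return out


# Usage: rows hold (x, y) pairs with y = 2 * x + 1.
rows = list(zip([1, 2, 3, 4, 5], [3, 5, 7, 9, 11]))
fits = rolling_apply_rows(rows, window=3, fun=fit_line)
```

Because every window is refitted independently, the cost grows with the window size, which matches the "not super fast" caveat. The upside is that `fun` can be any model fit, not just a straight line.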
