Update pivot_ functions for performance #117

Open
etiennebacher opened this issue Oct 10, 2022 · 1 comment
Labels: feature request (New feature or request)

Comments

etiennebacher (Contributor) commented Oct 10, 2022

Hi @nathaneastwood, I rewrote the pivot_ functions in {datawizard} to use stack() and unstack() instead of reshape(), as suggested by @grantmcdermott in #48. This yields substantial performance gains, especially on large datasets (a few million rows).

All code and benchmarks are in this PR: easystats/datawizard#285

I will probably open a PR here to implement this, but I'm opening this issue first in case I forget and someone else wants to take it on.


Edit: the original implementation in the PR I linked to needed several fixes. It's better to rely on the functions in the main branch of datawizard than on the code in that PR.
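
For readers unfamiliar with these base R functions, here is a minimal sketch of the general idea of reshaping with stack() and unstack(). It is not the datawizard implementation; the data frame `wide` and the column names `name` and `value` are made up for illustration.

```r
# Hypothetical toy data: one id column and two measure columns.
wide <- data.frame(id = 1:3, x = c(1, 2, 3), y = c(4, 5, 6))

# Wide -> long: stack() concatenates the selected columns into a
# `values` column and records the source column name in `ind`.
stacked <- stack(wide[c("x", "y")])
long <- data.frame(
  id    = rep(wide$id, times = 2),   # repeat ids once per stacked column
  name  = as.character(stacked$ind),
  value = stacked$values
)

# Long -> wide: unstack() splits `value` by the groups in `name`.
wide_again <- unstack(long, value ~ name)
```

A real pivot function also has to handle ids, mixed column types, and missing combinations, which this sketch ignores.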

etiennebacher added the "feature request" label Oct 10, 2022
nathaneastwood (Owner) commented

The performance improvements look really great @etiennebacher. I'd definitely appreciate a PR.
