Profiling, memory usage and performance review #16
I ran some benchmarks using some of the base methods that can be used for operating on DataFrames. The dataset used was nycflights13, which consists of 336776 rows and 20 columns. The benchmark code was the following:
The two most obvious improvements suggested by the CPU profile were that a lot of small memory allocations were taking place, which impacted the GC very heavily, and that a lot of reflection was being performed when new […]. With these two changes, already present on ffcfe13, we could observe a significant speedup with the same dataset:
Further optimization is still on the cards, by improving the algorithms, reducing memory allocations, or using concurrency for some processes. For the latter, I've done some experiments on the […]
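The concurrency experiments mentioned above aren't shown in this copy of the thread. As a hypothetical sketch of the general idea, not Gota's implementation: many DataFrame operations are column-independent, so they can be fanned out across goroutines with a `sync.WaitGroup`, one worker per column.

```go
package main

import (
	"fmt"
	"sync"
)

// sumColumns computes per-column sums concurrently, one goroutine per
// column. Each goroutine writes only its own index of the result
// slice, so no synchronisation beyond the WaitGroup is needed.
func sumColumns(cols [][]float64) []float64 {
	sums := make([]float64, len(cols))
	var wg sync.WaitGroup
	for i, col := range cols {
		wg.Add(1)
		go func(i int, col []float64) {
			defer wg.Done()
			var s float64
			for _, v := range col {
				s += v
			}
			sums[i] = s
		}(i, col)
	}
	wg.Wait()
	return sums
}

func main() {
	cols := [][]float64{{1, 2, 3}, {10, 20}, {5}}
	fmt.Println(sumColumns(cols)) // [6 30 5]
}
```

Whether this pays off depends on column sizes: for small columns the goroutine overhead can outweigh the parallel speedup.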
As of c9b8f46 I've optimized the […]
Additionally, […]
I've been exploring the performance of Gota on the same dataset, comparing it with Pandas and R/dplyr. The results are quite promising! For the majority of the cases Gota is slower than both other systems. The bottleneck seems to be that, due to the data structures used, the large number of small memory allocations hits the garbage collector pretty heavily, especially when copying the data (which, due to the nature of the library, happens quite often). When modifying the […]
After studying the problem I've settled on a solution that is almost as performant as the experimental one but doesn't change the overall structure of the code significantly, so it should be as easy to maintain as it was previously. Essentially, the main sticking point for performance was that storing the elements of a […]. Changing the main data structure of […], it now instead stores the data inside the structure, plus an additional flag describing whether the element is […]
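The data-structure change described above can be illustrated with a simplified sketch (the type names here are hypothetical, not Gota's actual ones): instead of holding each element behind a pointer, where nil encodes a missing value and every element is a separate heap allocation, the value is stored inline together with a flag marking it as missing, so a whole column fits in one contiguous slice.

```go
package main

import "fmt"

// Before (sketch): each element behind a pointer, nil meaning missing.
// Every element is its own heap allocation, which stresses the GC.
type ptrElement struct {
	e *float64
}

// After (sketch): the value lives inline in the struct together with
// a flag describing whether the element is missing. A column is then
// a single contiguous []inlineElement, allocated once.
type inlineElement struct {
	e   float64
	nan bool
}

func (el inlineElement) IsNA() bool { return el.nan }

func main() {
	col := []inlineElement{{e: 1.5}, {nan: true}, {e: 3.0}}
	for i, el := range col {
		if el.IsNA() {
			fmt.Printf("col[%d] = NA\n", i)
		} else {
			fmt.Printf("col[%d] = %g\n", i, el.e)
		}
	}
}
```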
With these changes we observe a significant speedup with respect to the unoptimised version, and we start to get closer to the performance achieved by Pandas/R:
This has been merged on […]
So far there has not been a study of the performance of this library in terms of speed and memory consumption. I'm currently prioritising those features that impact users directly, since the API design is still in flux, but this should be addressed in the near future.