Inefficient Gini Coefficient calculation? #855
When I run:
This is not the case when I run it using my own code, which does not use list comprehensions and instead calculates Gini like this.
Anyway, I thought I would mention that it might be worth considering using numpy to calculate the differences.
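To make the numpy suggestion concrete, here is a minimal sketch of a Gini calculation based on pairwise absolute differences computed with broadcasting. The function name `gini_pairwise` is hypothetical, and this is neither the library's implementation nor the code referred to above. It assumes non-negative values with a positive sum.

```python
import numpy as np

def gini_pairwise(values):
    """Gini coefficient from the mean absolute difference (illustrative sketch)."""
    x = np.asarray(values, dtype=float)
    n = x.size
    # Broadcasting builds the full n-by-n matrix of |x_i - x_j| without Python loops.
    diffs = np.abs(x[:, None] - x[None, :])
    # G = sum_ij |x_i - x_j| / (2 * n * sum(x))
    return diffs.sum() / (2.0 * n * x.sum())

print(gini_pairwise([0, 0, 0, 1]))  # 0.75
```

Note that this still allocates an n-by-n array, so it avoids the list comprehension but not the quadratic memory use.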
Thanks for raising this issue. I am just about to leave for a week of travel and will definitely look into this when I return. At first glance, the efficient approach is based on a ranking of the values, which assumes no ties. I originally ruled that out because we had to handle ties (in an upstream application) and also needed to keep track of the geographical position of each observation. So the memory-inefficient implementation is what we used.
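For reference, a ranking-based calculation of the kind discussed here could look roughly like the sketch below. The name `gini_sorted` is hypothetical, and this is not the library's code. It sorts the values and weights each by its 1-based position, so it needs only O(n) extra memory. It assumes non-negative values with a positive sum.

```python
import numpy as np

def gini_sorted(values):
    """Gini coefficient from sorted values and their positions (illustrative sketch)."""
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    # 1-based position of each value in ascending order.
    positions = np.arange(1, n + 1)
    # G = 2 * sum_i(i * x_(i)) / (n * sum(x)) - (n + 1) / n
    return 2.0 * np.dot(positions, x) / (n * x.sum()) - (n + 1.0) / n

print(gini_sorted([0, 0, 0, 1]))  # 0.75
```

In this sketch, tied values simply occupy consecutive positions. Because tied values are equal, their order among themselves does not change the weighted sum. Sorting does discard the original order of the observations, so any link to geographical position would have to be kept separately (for example with `np.argsort`). Whether that satisfies the upstream requirements is a separate question.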
That said, once I have a little time after vacation, I think it might be possible to refactor things with an eye towards a more efficient approach (like the one you point to) that also addresses our upstream needs.