Fix memory consumption in AuthorTopicModel #2122
This fix reduced memory consumption on our project (≈400,000 docs) from 32 GB to 2 GB for the entire duration of training.
I can't imagine there was a reason for that nested loop; it must have just slipped my mind. I only tested scalability w.r.t. running time. If I'd tested memory consumption as well, I would have caught this.
There is some stuff in my thesis about asymptotic complexity of memory consumption (pdf, section 220.127.116.11). The algorithm doesn't scale terribly well w.r.t. memory consumption. The empirical results showed that running time scaled as expected, compared to the theoretical scalability, but as I said I didn't test memory consumption.
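For anyone who wants to check memory scaling alongside running time, here is a minimal sketch using Python's standard `tracemalloc` module. The helper `peak_memory_mib` and the two toy builders are hypothetical stand-ins, not code from this PR or from gensim; they only illustrate how an unnecessary nested loop shows up as a much higher peak.

```python
import tracemalloc


def peak_memory_mib(func, *args, **kwargs):
    """Run func and return (result, peak traced allocation in MiB)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak / (1024 * 1024)


def build_nested(n_docs):
    # Hypothetical wasteful version: allocates one row per document,
    # each row as long as the whole corpus (quadratic memory).
    return [[0] * n_docs for _ in range(n_docs)]


def build_flat(n_docs):
    # Hypothetical fixed version: one entry per document (linear memory).
    return [0] * n_docs


_, nested_peak = peak_memory_mib(build_nested, 500)
_, flat_peak = peak_memory_mib(build_flat, 500)
print(f"nested: {nested_peak:.3f} MiB, flat: {flat_peak:.3f} MiB")
```

Running the same measurement at a few corpus sizes and comparing the growth against the expected asymptotic behaviour would be the memory counterpart of the running-time test. To try it on the real model, the training call would be passed in place of the toy builders.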
I'm glad this problem was caught; hopefully it fixes the issues people are having. Thanks @philipphager.