While running linear discriminant training, I noticed very high memory usage. In my case the training set consists of 500k samples from 20k classes, and the feature dimension is under 400, yet memory usage exceeded 100 GB.
After debugging, I found the problem in the implementation: the per-class covariance matrices are all stored in a list, which is unnecessary since we only need their weighted average. np.average is then called on that list, which makes another copy when converting the list to a numpy array. Both factors dramatically increase memory usage. I think we can simply accumulate the sum inside the loop, if precision is not a concern.
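A minimal sketch of the suggested change, assuming the same structure as scikit-learn's _class_cov (the function name, the signature, and the use of np.cov in place of the library's private _cov helper are assumptions for illustration, not the upstream code):

```python
import numpy as np

def class_cov_streaming(X, y, priors):
    """Weighted average of the per-class covariance matrices, computed
    as a running sum instead of materializing one matrix per class.

    Sketch only: np.cov stands in for scikit-learn's private _cov
    helper, and the signature is an assumption, not upstream code.
    """
    classes = np.unique(y)
    n_features = X.shape[1]
    cov = np.zeros((n_features, n_features))
    for prior, group in zip(priors, classes):
        Xg = X[y == group, :]
        # Accumulate the prior-weighted class covariance; only a single
        # (n_features, n_features) buffer is alive at any time.
        cov += prior * np.atleast_2d(np.cov(Xg, rowvar=False))
    return cov
```

For scale: with 20k classes and ~400 features in float64, the list alone holds roughly 20000 × 400 × 400 × 8 bytes ≈ 25.6 GB, and the array conversion inside np.average temporarily doubles that, whereas the running sum keeps a single 400 × 400 matrix.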
I think you're right, and I don't think precision was the intention here. _class_means uses a similar idiom in a way that wastes memory (though much less so). PR welcome.
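For reference, an analogous fix for the class means could preallocate the output array and fill it row by row, rather than appending to a Python list and converting at the end. Again a sketch under assumed names, not the upstream code:

```python
import numpy as np

def class_means_preallocated(X, y):
    """Per-class feature means written directly into a preallocated
    array. Sketch only; the name and signature are assumptions."""
    classes = np.unique(y)
    means = np.zeros((classes.shape[0], X.shape[1]))
    for idx, group in enumerate(classes):
        # Fill each row in place; no intermediate list of arrays is kept.
        means[idx] = X[y == group].mean(axis=0)
    return means
```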