The function getCoresetFromManager() of class BucketManager is responsible for retrieving the coreset summarized in the buckets.
Why does the function return only the last bucket when that bucket is full? And what about new objects? The last bucket holds the oldest objects of the stream, and new objects can take a long time to reach it.
Note that once the last bucket is full, the next (2^(L-1))*m objects make no difference to the clustering, since only the last bucket is returned.
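For readers who have not looked at the code, a simplified sketch of the branch being questioned might look like this (the types, fields and method names below are stand-ins chosen for illustration, not the exact MOA source):

```java
// Illustrative sketch only: stand-in types and fields, not the exact MOA BucketManager source.
class Point {
    double[] coordinates;
}

class Bucket {
    Point[] points = new Point[0];
    int capacity;
    boolean isFull() { return points.length >= capacity; }
}

class BucketManagerSketch {
    Bucket[] buckets; // buckets[L-1] summarizes the OLDEST part of the stream
    int L;            // number of buckets

    Point[] getCoresetFromManager(int d) {
        if (buckets[L - 1].isFull()) {
            // Questioned behaviour: only the last (oldest) bucket is returned, so
            // points still sitting in buckets[0..L-2] do not influence the clustering
            // until they have cascaded all the way down to the last bucket.
            return buckets[L - 1].points;
        }
        // ...otherwise the non-empty buckets are merged into one coreset (omitted here).
        return new Point[0];
    }
}
```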
This part of the getCoresetFromManager() function does not make sense to me either. Having looked a bit deeper, I am not sure that this behaviour is consistent with the original paper. It seems to me that the coreset should be computed from all of the non-empty buckets whenever it is needed so that the clustering produced makes use of the most recent instances.
This is one of the modifications I have made to MOA's StreamKM++ algorithm, which I have made available on GitHub here.
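For illustration, a minimal sketch of that alternative, reducing every non-empty bucket into a single coreset whenever a clustering is requested, could look like the following (computeCoresetOfSize and the other names are placeholders, not the exact MOA code or the code in the repository linked above):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of reducing ALL non-empty buckets into one coreset on every
// request. Types and the reduction step are placeholders, not the exact MOA or
// GitHub code.
class MergeAllBucketsSketch {
    static class Point { double[] coordinates; double weight; }

    static class Bucket {
        List<Point> points = new ArrayList<>();
        boolean isEmpty() { return points.isEmpty(); }
    }

    /** Placeholder for a proper coreset reduction (e.g. the coreset tree used by
        StreamKM++); here it just truncates so the sketch stays self-contained. */
    static List<Point> computeCoresetOfSize(List<Point> input, int m) {
        return input.size() <= m ? input : new ArrayList<>(input.subList(0, m));
    }

    static List<Point> getCoresetFromAllBuckets(Bucket[] buckets, int m) {
        List<Point> coreset = new ArrayList<>();
        for (Bucket b : buckets) {
            if (b.isEmpty()) continue;
            // Union the partial coreset with this bucket, then reduce back to m
            // points, so the newest buckets are always represented in the result.
            List<Point> union = new ArrayList<>(coreset);
            union.addAll(b.points);
            coreset = computeCoresetOfSize(union, m);
        }
        return coreset;
    }
}
```

With an approach along these lines, a clustering request always reflects points that have only reached the earlier buckets, at the cost of an extra reduction pass per request.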