Numerosity-based input? #23
Comments
Well, it's not impossible but it definitely needs a bit of work. Right now all the functions in Alternatively, if you worry only about the space consumed on the disk but don't mind if the run-length encoded representation from the disk is "expanded" into a full vector in memory, then it's enough to re-write the first part of
Yes, the latter is what we are doing now, only on a graph with 12 billion nodes, so it's a bit slow and memory-consuming 😂.
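For readers following along, the expand-in-memory approach discussed above can be sketched roughly as follows. This is only an illustration of the idea, not plfit code: `parse_counts` and `expand` are hypothetical helper names, and the input is assumed to be whitespace-separated "value count" lines.

```python
from itertools import chain, repeat


def parse_counts(lines):
    """Parse 'value count' lines into (value, count) pairs, skipping blank lines.

    Hypothetical helper: the input format is an assumption based on the
    example in this issue.
    """
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        value, count = line.split()
        pairs.append((float(value), int(count)))
    return pairs


def expand(pairs):
    """Lazily yield each value `count` times (the run-length expansion)."""
    return chain.from_iterable(repeat(value, count) for value, count in pairs)


# Tiny illustrative input; real inputs would have far larger counts.
pairs = parse_counts(["1 3", "2 2"])
sample = list(expand(pairs))
```

Materialising `sample` as a list costs memory proportional to the sum of all counts, which is exactly the problem at the scale mentioned above.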
Yeah, I can imagine. Sorry about that :( Unfortunately I'm not doing research any more, so it's hard to find spare time for these side-projects; that's why I cannot promise that I'll implement this in the near future. It would definitely be a nice addition.
It is very common, when analysing distributions of, say, a web graph, to have the distribution in the form
1 143245
2 395599
etc., that is, for each possible value, the associated numerosity. The way we're using plfit now is simply to generate a sample by printing "1" 143245 times, "2" 395599 times, and so on, but this is slow, memory-hungry and definitely suboptimal. How difficult would it be to have plfit use a sample specified as above directly?
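To illustrate what using the counts directly could mean, here is a minimal sketch under strong assumptions. It uses the standard continuous-data maximum-likelihood estimate of the exponent for a fixed lower cutoff, alpha = 1 + n / sum(ln(x_i / xmin)), and weights each term by its count instead of expanding the sample. It does not reproduce plfit's discrete fitting or its xmin selection, and `alpha_from_counts` and `alpha_from_sample` are hypothetical names, not plfit functions.

```python
import math


def alpha_from_counts(pairs, xmin):
    """Continuous MLE of the exponent from (value, count) pairs, fixed xmin.

    Each value contributes count * ln(value / xmin), so memory use depends on
    the number of distinct values rather than on the total sample size.
    """
    n = 0
    log_sum = 0.0
    for value, count in pairs:
        if value < xmin:
            continue  # observations below the cutoff are ignored
        n += count
        log_sum += count * math.log(value / xmin)
    if n == 0 or log_sum == 0.0:
        raise ValueError("no usable observations above xmin")
    return 1.0 + n / log_sum


def alpha_from_sample(sample, xmin):
    """Same estimator on a fully expanded sample, for comparison only."""
    tail = [x for x in sample if x >= xmin]
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)


# Small made-up counts, used only to check that both routes agree.
pairs = [(1, 4), (2, 2), (4, 1)]
expanded = [1] * 4 + [2] * 2 + [4]
alpha_weighted = alpha_from_counts(pairs, xmin=1)
alpha_expanded = alpha_from_sample(expanded, xmin=1)
```

Both routes give the same estimate up to floating-point rounding, which suggests that at least this part of a fit does not need the expanded sample.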