SpaceTally logic should handle large images #5188

Closed
VincentBlondeau opened this issue Nov 20, 2019 · 6 comments · Fixed by #6177
Labels
Project: Large Images to make sure that pharo IDE can react when used with large images

Comments

@VincentBlondeau
Contributor

Our image has 200,000 classes, and a GC takes about 5 seconds. At that rate, we would need about 11 days to run the following code: SpaceTally printSpaceAnalysis

The main question is: why do we need the GC in the middle?
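The 11-day figure can be checked with quick arithmetic. The sketch below assumes one GC per class, which is inferred from the numbers quoted above rather than confirmed against the SpaceTally source; the constant names are illustrative only.

```python
# Back-of-the-envelope check of the estimate above.
# Assumption: SpaceTally triggers one GC per class it analyses.
CLASS_COUNT = 200_000
GC_SECONDS = 5
SECONDS_PER_DAY = 24 * 60 * 60

total_seconds = CLASS_COUNT * GC_SECONDS
total_days = total_seconds / SECONDS_PER_DAY

print(f"{total_seconds} s is about {total_days:.1f} days")
```

Under that assumption the GC calls alone account for 1,000,000 seconds, so nearly all of the runtime is collection rather than counting.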

@MarcusDenker
Member

I think the idea is that the data would be less accurate without it. Even with a GC right before, the instances created for the report itself do change the result, but not by much.

I think that a correct version would need to analyse the image "from the outside" to do a report... e.g. just on a dead image or via some VM level code that stops everything while counting.

@bencoman
Contributor

bencoman commented Nov 29, 2019

> some VM level code that stops everything while counting

IIUC, on Linux the image is stored as mmap'd segments, so maybe clone [1] with:

  • CLONE_VM unset - so the parent image can continue while the child has a copy-on-write static image to process
  • CLONE_FILES set - so that the child can communicate results back to the parent

Half an example here [2]

[1] https://linux.die.net/man/2/clone
[2] https://eli.thegreenplace.net/2018/launching-linux-threads-and-processes-with-clone/

clone() creates a new process, in a manner similar to fork(2). It is actually a library function layered on top of the underlying clone() system call, ...
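The snapshot idea above can be tried without touching the VM. What follows is only a rough Python analogue, not Pharo or VM code: `os.fork` (POSIX only, so it does not help on Windows) gives the child a copy-on-write view of memory, much like `clone` without `CLONE_VM`. Note that `fork` copies the file descriptor table instead of sharing it as `CLONE_FILES` would, so an inherited pipe stands in for the result channel here. The function names `start_snapshot_tally` and `collect_tally` are hypothetical.

```python
import json
import os
from collections import Counter


def start_snapshot_tally(objects):
    """Fork a child that tallies `objects` by type name; return (pid, read_fd)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: sees memory as it was at fork time (copy-on-write snapshot).
        status = 1
        try:
            os.close(read_fd)
            counts = Counter(type(obj).__name__ for obj in objects)
            with os.fdopen(write_fd, "w") as out:
                json.dump(counts, out)
            status = 0
        finally:
            os._exit(status)  # leave without running the parent's cleanup
    # Parent: keep only the read end and carry on with normal work.
    os.close(write_fd)
    return pid, read_fd


def collect_tally(pid, read_fd):
    """Read the child's JSON report from the pipe, then reap the child."""
    with os.fdopen(read_fd) as result_pipe:
        counts = json.load(result_pipe)
    os.waitpid(pid, 0)
    return counts


if __name__ == "__main__":
    live = [1, 2.0, "a", "b"]
    handle = start_snapshot_tally(live)
    live.append("c")  # the parent keeps mutating; the child does not see this
    print(collect_tally(*handle))
```

The parent's `append` happens after the fork, so the report reflects the four objects that existed when the snapshot was taken. A real report would be much larger than this one, and the parent would then need to drain the pipe while the child is still writing.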

@VincentBlondeau
Contributor Author

PS: I want to run it on Windows ;)

@Ducasse Ducasse added the Project: Large Images to make sure that pharo IDE can react when used with large images label Jan 22, 2020
@eliotmiranda

The Spur VM provides an allObjects primitive, which means one can collect all the data in a single primitive call and then do the analysis on that one collection, with no GC necessary. I’m sure a full space tally of a 1 GB image implemented on top of the allObjects primitive would take much less than 10 minutes.
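The single-pass approach can be sketched as follows. This is a Python analogue, not the Pharo implementation: `gc.get_objects()` stands in for the allObjects primitive (it only returns objects tracked by Python's collector), `sys.getsizeof` stands in for a per-object size query, and `space_tally` is a hypothetical name.

```python
import gc
import sys
from collections import defaultdict


def space_tally(objects):
    """Return {type name: (instance count, total shallow bytes)} in one pass."""
    totals = defaultdict(lambda: [0, 0])
    for obj in objects:
        entry = totals[type(obj).__name__]
        entry[0] += 1                      # instance count
        entry[1] += sys.getsizeof(obj, 0)  # shallow size, 0 if unknown
    return {name: (count, size) for name, (count, size) in totals.items()}


if __name__ == "__main__":
    snapshot = gc.get_objects()  # one call up front, no collection per class
    report = space_tally(snapshot)
    biggest = sorted(report.items(), key=lambda item: -item[1][1])[:5]
    for name, (count, size) in biggest:
        print(f"{name:20} {count:8} {size:12}")
```

The cost is one walk over the snapshot, so it grows with the number of objects and not with the number of classes multiplied by the GC time.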

@Ducasse
Member

Ducasse commented Jan 22, 2020

Thanks for the pointer.

@VincentBlondeau
Contributor Author

Fixed in #6177, implemented following @eliotmiranda's advice. It works like a charm!

@MarcusDenker MarcusDenker moved this from To do to Done in Large Images Sep 22, 2020
5 participants