Enhance handling of large eventlogs #29
Original reporter: jan.stolarek@
When I enable detailed spark logging via the -lf flag, I end up with huge eventlog files (130MB). Attempting to load these into ThreadScope practically kills my OS: memory runs out, swapping begins, and I am forced to kill TS (which takes some time before the OS actually responds and kills the process). This makes the -lf flag useless for my program, and I think this might not be an uncommon situation. It would be good if TS supported some sort of lazy loading of big eventlogs, so users could at least view parts of the log.
This is still a problem. Loading a 1G eventlog file is impossible even with 32G RAM. I think we need two things:
One idea that comes to mind is to use something like SQLite, which makes these operations almost trivial.
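To make the SQLite idea a bit more concrete, here is a rough sketch using Python's standard sqlite3 module, purely for illustration. The table layout, column names, and function names are all hypothetical; this is not ThreadScope's or ghc-events' actual data model. Events are written once into an indexed table, and a view then asks only for the time window it needs instead of holding the whole log in memory.

```python
import sqlite3


def create_event_db(path=":memory:"):
    """Open (or create) an event database; the schema is a made-up example."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " id INTEGER PRIMARY KEY,"
        " time_ns INTEGER NOT NULL,"
        " cap INTEGER,"
        " description TEXT NOT NULL)"
    )
    # The index lets time-window queries avoid scanning the whole table.
    conn.execute("CREATE INDEX IF NOT EXISTS events_time ON events (time_ns)")
    return conn


def insert_events(conn, events):
    """Insert (time_ns, cap, description) tuples from any iterable.

    A generator works here, so the parsed eventlog never has to be
    fully materialized in memory.
    """
    with conn:  # one transaction for the whole batch
        conn.executemany(
            "INSERT INTO events (time_ns, cap, description) VALUES (?, ?, ?)",
            events,
        )


def events_in_window(conn, start_ns, end_ns):
    """Return events with start_ns <= time_ns < end_ns, ordered by time."""
    cur = conn.execute(
        "SELECT time_ns, cap, description FROM events"
        " WHERE time_ns >= ? AND time_ns < ? ORDER BY time_ns",
        (start_ns, end_ns),
    )
    return cur.fetchall()
```

With something like this, zooming or panning the timeline would translate into a call such as `events_in_window(conn, start_ns, end_ns)` for the visible range only.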
One potential problem is scrolling the "Raw events" tab, since every scroll would query a filesystem-backed event database (SQLite or not). We may therefore have to implement lazy rendering of "Raw events" (as far as I can see, it doesn't support this currently).
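A minimal sketch of what lazy rendering could look like on the query side, again in Python with sqlite3 and a hypothetical `events` table (none of these names come from ThreadScope). It assumes row ids are dense and start at 1, so the rows for a viewport can be fetched by id range rather than with `OFFSET`, which would have to step over all the skipped rows.

```python
import sqlite3


def visible_rows(conn, first_row, row_count):
    """Fetch only the rows a viewport shows; first_row is 0-based.

    Assumes ids are dense and start at 1 (true for an INTEGER PRIMARY KEY
    table that is only ever appended to), so row i has id i + 1.
    """
    cur = conn.execute(
        "SELECT id, time_ns, description FROM events"
        " WHERE id > ? AND id <= ? ORDER BY id",
        (first_row, first_row + row_count),
    )
    return cur.fetchall()


def total_rows(conn):
    """Row count, e.g. for sizing a scrollbar without loading any rows."""
    return conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]


# Small demonstration with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    " id INTEGER PRIMARY KEY, time_ns INTEGER, description TEXT)"
)
with conn:
    conn.executemany(
        "INSERT INTO events (time_ns, description) VALUES (?, ?)",
        ((i * 5, "event %d" % i) for i in range(1000)),
    )

# A viewport showing 3 rows, scrolled to row 200.
page = visible_rows(conn, 200, 3)
```

The list widget would then only need `total_rows` to size its scrollbar and one `visible_rows` call per scroll position.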
Any other ideas?