Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the profile data to JSON conversion #2

Closed
jiffyclub opened this issue Aug 19, 2012 · 6 comments
Closed

Improve the profile data to JSON conversion #2

jiffyclub opened this issue Aug 19, 2012 · 6 comments

Comments

@jiffyclub
Copy link
Owner

The function stats_to_tree_dict in upload.py currently fails for some profiles for unknown reasons. The current known profile which causes it to fail is large, deep, and contains complex recursion.

@jiffyclub jiffyclub reopened this Aug 20, 2012
@jiffyclub
Copy link
Owner Author

Referred to the wrong ticket in a commit and accidentally closed this.

jiffyclub added a commit that referenced this issue Aug 21, 2012
…unctions instead of methods. Made the stats_to_tree_dict function much more efficient by stashing the dictionaries representing visited nodes and reusing those dicts instead of making new ones. This should help with #2, but doesn't fix it since the json.dumps function still chokes on especially large profiles.
@jiffyclub
Copy link
Owner Author

dba503e improves the stats_to_tree_dict function but large profiles still break json.dumps and would probably not work well with the d3 visualization anyway. To truly scale WSV we'll want to explore options like proactive pruning of the result JSON and allowing the user to select a sub-node of the program tree to be the root of the visualization (or pick a sensible sub-node automatically).

jiffyclub added a commit that referenced this issue Aug 28, 2012
is not a good idea as implemented in dba503e. The size/value of a node will
change depending on its parent so it is necessary to recalculate that
every time a node is visited from a new parent (which should be every time
it is visited). This commit undoes the change from dba503e, which means
we are back to square one on #2.
@jiffyclub
Copy link
Owner Author

I had to undo the changes from dba503e because under that code a node's size was calculated only once the first time the node was visited, when really the node size needs to be calculated separately each time the node is visited from a different parent, meaning basically that it is not possible to re-use dictionaries to represent nodes at different places in the call tree.

We could cut down on memory usage by tracking fewer properties of every node, by somehow reusing all properties but the size, or using something more memory efficient than dictionaries, but this may only get us a little way (and cause problems with the JSON module). We were finding that the JSON module couldn't handle the larger call trees anyway, so for the time being it may be most practical to just focus on providing elegant failures when it's not possible to process a profile. That is described in issues #6 and #11.

@jiffyclub
Copy link
Owner Author

The problem we are having with this particularly large profile is that the memory necessary to represent it as a dictionary tree and/or JSON string is extremely large. There also seem to be practical limits on the size/complexity of a sunburst that can be elegantly displayed and still have a responsive visualization.

I am going to close this issue and open a new one more properly focused on large profile handling. See new issue #12 for continued conversation.

This was referenced Aug 28, 2012
@embray
Copy link
Collaborator

embray commented Aug 28, 2012

I'm not so sure about this. Why not use the dictionaries like you were before, but use a separate data structure just for tracking the node sizes?

@embray
Copy link
Collaborator

embray commented Aug 28, 2012

Or for each visited node you could do a structure like:

{'size': ..., 'invariants': {...}}

Where invariants is the part of the node dict that's the same each time it's visited, and can be copied.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants