Don't perform layout on connected components containing just one node #1

fedarko · 2017-08-09T21:38:23Z

From @fedarko on August 2, 2017 3:15

The repeated cost of running sfdp (in SPQR mode) and dot (in standard mode) on connected components (cc's) containing a single node isn't that bad when there are fewer than around 1000 such components, but it quickly gets out of hand when there are tens of thousands of them (the graph I'm working with now has over 100k "single-node" components).

Solution: for a component containing a single node of width w inches and height h inches, with no edges and no node groups, just set the bounding box of the resulting connected component (in SPQR and standard mode) to (w + some padding, h + some padding). Look at how Graphviz currently lays out such components for guidance.
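A minimal sketch of the "faked" layout described above, not the actual collate.py code. The function name, the `FakedLayout` class, and the default padding value are all hypothetical; the only Graphviz fact relied on is that node widths/heights are given in inches while positions and bounding boxes are in points (72 per inch). It also skips any rounding Graphviz itself would apply.

```python
from dataclasses import dataclass

POINTS_PER_INCH = 72.0  # Graphviz positions and bounding boxes are in points


@dataclass
class FakedLayout:
    """Hypothetical container for the layout of a one-node component."""
    bb_width: float   # bounding box width, in points
    bb_height: float  # bounding box height, in points
    node_x: float     # node center x, in points
    node_y: float     # node center y, in points


def fake_single_node_layout(w_in, h_in, pad_in=0.1):
    """Compute a bounding box of (w + padding, h + padding) without Graphviz.

    w_in, h_in: node width and height in inches.
    pad_in: total padding added to each dimension, in inches (assumed value).
    """
    if w_in <= 0 or h_in <= 0 or pad_in < 0:
        raise ValueError("dimensions must be positive and padding non-negative")
    bb_w = (w_in + pad_in) * POINTS_PER_INCH
    bb_h = (h_in + pad_in) * POINTS_PER_INCH
    # Center the single node inside its bounding box.
    return FakedLayout(bb_w, bb_h, bb_w / 2, bb_h / 2)
```

For example, `fake_single_node_layout(2.0, 1.0, pad_in=0.5)` gives a 180 x 108 point bounding box with the node centered at (90, 54).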

This has the potential to save a lot of time (by preventing lots of repeated calls to pygraphviz).
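To illustrate where the savings would come from, here is a rough sketch that separates components whose layout can be faked from those that still need a Graphviz call. The dict-based component representation and both function names are made up for this example; the real preprocessing script may store components quite differently.

```python
def can_fake_layout(component):
    """True only for a component with exactly one node, no edges, and no node groups.

    `component` is a hypothetical dict with "nodes", "edges", and
    "node_groups" lists.
    """
    return (
        len(component["nodes"]) == 1
        and not component["edges"]
        and not component["node_groups"]
    )


def split_components(components):
    """Return (fakeable, needs_graphviz) lists, preserving input order."""
    fakeable, needs_graphviz = [], []
    for component in components:
        if can_fake_layout(component):
            fakeable.append(component)
        else:
            needs_graphviz.append(component)
    return fakeable, needs_graphviz
```

Only the components in the second list would be handed to pygraphviz, so with 100k+ single-node components nearly all layout calls are avoided.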

Copied from original issue: fedarko/MetagenomeScope#252


fedarko · 2017-08-09T21:38:24Z

got algorithm mostly figured out by this point; will implement soon

This should make the code a bit easier to follow, also (in retrospect, assigning heights/widths after layout was a bit silly). Next up is #1. #78 was the main thing I needed to do before that, so this shouldn't be too difficult to do now that I understand Graphviz' rounding algorithms better.

fedarko · 2017-08-14T01:45:38Z

One small thing -- I guess this will mean that .gv/.xdot export isn't (easily) doable on 1-node components, then?

(We could "fake" this info, I guess, but I don't think that'd be super valuable? I guess we can do it anyway.)

Still need to implement it for the SPQR modes, although that shouldn't be that bad to do. Also I'm kinda waffling as to whether or not to preserve my fixes for #78 -- the fact that Graphviz' edge control points and component bounding boxes are both based on the rounded coordinates might mean that it'd be easier (plus look nicer, plus be more consistent with before) if I just performed the rounding method manually in the faking process, and reverted the changes done in #78. I dunno. I'll look into it.

Still want to do some more testing tomorrow to make sure this handles node/edge/etc. count information 100% appropriately. Also want to implement gv/xdot export with the faked data, maybe. (Shouldn't be too hard to do .gv -- but for .xdot, I'm thinking that'd be kind of difficult to do without hardcoding stuff -- will look further into that tomorrow. It'd probably be ok to just say that 1-node component layouts can only have .gv files generated.)
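For reference, faking the .gv side could look roughly like the sketch below. This is only an illustration under assumptions: the function name, the graph name, and the padding default are hypothetical, and the output just uses standard DOT attributes (`bb`, `width`, `height`, `pos`) rather than reproducing exactly what Graphviz would emit.

```python
def fake_gv(node_id, w_in, h_in, pad_in=0.1):
    """Build a minimal DOT (.gv) string for a one-node component.

    Widths/heights are in inches; bb and pos are in points (72 per inch).
    pad_in is an assumed total padding per dimension, in inches.
    """
    bb_w = (w_in + pad_in) * 72.0
    bb_h = (h_in + pad_in) * 72.0
    # Quote the node ID and escape embedded double quotes for DOT.
    quoted = '"' + str(node_id).replace('"', '\\"') + '"'
    lines = [
        "digraph single_node_component {",
        f'\tgraph [bb="0,0,{bb_w:g},{bb_h:g}"];',
        f'\t{quoted} [width={w_in:g}, height={h_in:g}, '
        f'pos="{bb_w / 2:g},{bb_h / 2:g}"];',
        "}",
    ]
    return "\n".join(lines) + "\n"
```

Calling `fake_gv("contig_1", 2.0, 1.0, pad_in=0.5)` yields a graph with `bb="0,0,180,108"` and the node positioned at `90,54`.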

fedarko · 2017-08-14T08:22:03Z

Remaining TODOs related to this:

- do some more testing to verify the SPQR layout "faking" is entirely ok
- ~~Add .gv export with the "faked" info if -pg is passed~~ -- this actually seems kind of useless, and IMO it'd be a better use of time to just take a few seconds and mention the lack of gv/xdot export for single-node components in the README.

fedarko · 2017-08-14T21:36:41Z

After doing some testing -- looks ok.

Due to changes in the preprocessing script, these files have become sort of irrelevant. I suppose the _with_optimizations log file is useful in that it highlights the effects of #1 being resolved, but IMO that's not really pertinent enough to justify including it. These files are still backed up on my system -- I guess I'll also upload them to the CBCB computing infrastructure so that there are multiple backups of them, in addition to the ones in the repository history.

fedarko self-assigned this Aug 9, 2017

fedarko added the labels codeissue ("Not quite bugs, but still issues with the code (e.g. style)") and collateoptimization ("Optimizations for the preprocessing script") Aug 9, 2017

fedarko mentioned this issue Aug 9, 2017

Don't perform layout on connected components containing just one node fedarko/MetagenomeScope#252

Closed

fedarko added a commit that referenced this issue Aug 9, 2017

Add comments re: #1 to collate.py

c345254

fedarko added a commit that referenced this issue Aug 14, 2017

Add optimization runtime log for #1

116d0b5

fedarko closed this as completed in 4072535 Aug 14, 2017

fedarko added a commit that referenced this issue Aug 14, 2017

Fix verbiage in note re: #1 in README a bit

01da2e8

fedarko referenced this issue Apr 1, 2018

State that "nontrivial" cc's have no node groups

850cf8d

In the -pg help text. I guess it's actually possible for a component to have 1 nodegroup and no edges (due to user-defined patterns), and the layout of that wouldn't be faked.
