Partition percentages incorrectly composed across filters #637

Closed
suthers opened this Issue Jun 3, 2012 · 2 comments

@suthers
suthers commented Jun 3, 2012

First, this is a critical bug: Gephi is giving incorrect results without warning, undermining confidence in any of its computations with filters.

This is a little hard to explain, but the gist is this: if you (1) turn on a filter, (2) compute some percentages, and (3) either change the filter in place or turn it off, change it, and turn it on again, then recomputing the percentages of step (2) under the new filter gives drastically incorrect (too small) results. The pattern suggests that the percentage under the second filter is being taken from the results of the first filter, not the global total.

That is, certain computations behave as if the filters are composed rather than changed.

Specifically I see this with the following sequence (there may be others):

I have an Attributes/Partition/Modularity Class filter defined.

A1. I select the first modularity class as the filter (display only that class).
A2. Under Partition/Nodes (upper left) I select one of my node attributes for the partition (a label indicating whether the node is an actor, a discussion, a chat or a file).
A3. I read off the percentages displayed into a spreadsheet, which I have set up to verify that they sum to 100% (this is how I detected the bug).

I then repeat from step 1:
B1. I select the next modularity class as the filter. (The first time, I was just selecting and unselecting checkboxes without turning the filter off. When I discovered the bug I turned the filter off, changed its definition, and turned it on again, but the bug remains.)
B2. Select the node attribute under Partition/Nodes.
B3. Read off the percentages displayed.

Here of course I expect the percentages in B3 to be of the class chosen in B1, i.e., they add up to 100%. But they do not: they add up to 15.45%. Gephi seems to be taking a percentage of an already reduced set.
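To make the symptom concrete, here is a minimal sketch of how percentages can sum to far less than 100% when the counts come from one node set and the denominator from another. This is not Gephi's code and makes no claim about the actual cause; the function name `partition_percentages` and the toy numbers (17 nodes in the current class, a leftover denominator of 110) are hypothetical, chosen only so that the faulty sum comes out near the 15.45% observed above.

```python
from collections import Counter

def partition_percentages(values, denominator=None):
    """Return the percentage of nodes per attribute value.

    By default the denominator is the number of values passed in, so the
    percentages sum to 100. Passing a different denominator (hypothetical:
    a count left over from an earlier filter) reproduces the symptom.
    """
    total = len(values) if denominator is None else denominator
    counts = Counter(values)
    return {value: 100.0 * n / total for value, n in counts.items()}

# Hypothetical node types visible under the filter chosen in B1 (17 nodes).
class_b = ["actor"] * 10 + ["chat"] * 5 + ["file"] * 2

# Expected behaviour: percentages are relative to the currently visible nodes.
correct = partition_percentages(class_b)

# Faulty behaviour (assumed mechanism): counts from the current filter,
# denominator from some other node set of 110 nodes.
stale = partition_percentages(class_b, denominator=110)

print(round(sum(correct.values()), 2))
print(round(sum(stale.values()), 2))
```

The same sum check is what the spreadsheet in A3 performs: any total that is not 100% means the denominator does not match the set of nodes being counted.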

I see this as a CRITICAL BUG that should be fixed immediately, because it means Gephi is giving incorrect results without warning, greatly UNDERMINING CONFIDENCE in anything it does.

ONLY the first percentages computed are correct. The only way I can avoid this bug is to CLOSE THE PROJECT, read it in, and start over for EACH of my over 200 partitions. This is a large graph that takes about a minute to read in and display.

As an additional note, I wish I did not have to do steps 1-3 manually for over 200 partitions. Isn't there a way to output how node attributes are distributed across a class? I was able to Group the partitions and run various stats on this meta-graph, which is nice, but I can't figure out how to get it to run descriptive distributional stats on attributes I have defined (i.e., node type).

@sheymann
Member

You can export the Node Table in CSV to compute stats in other tools like R.

We'll investigate this bug.
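As a rough sketch of the CSV route, the per-class distribution of a node attribute can be computed outside Gephi in a few lines of Python (R would work equally well). The column names `modularity_class` and `node_type` are assumptions and should be replaced with whatever headers appear in the exported node table; the inline sample data is made up for illustration.

```python
import csv
import io
from collections import Counter, defaultdict

def type_distribution_by_class(csv_file,
                               class_col="modularity_class",
                               type_col="node_type"):
    """Return {class: {node type: percentage}} for a node table CSV.

    csv_file is any open text file or file-like object. The default
    column names are assumptions about the export, not guaranteed headers.
    """
    counts = defaultdict(Counter)
    for row in csv.DictReader(csv_file):
        counts[row[class_col]][row[type_col]] += 1

    distribution = {}
    for cls, counter in counts.items():
        total = sum(counter.values())  # nodes in this class only
        distribution[cls] = {t: 100.0 * n / total for t, n in counter.items()}
    return distribution

# Made-up sample standing in for an exported node table.
sample = io.StringIO(
    "Id,node_type,modularity_class\n"
    "1,actor,0\n"
    "2,chat,0\n"
    "3,actor,1\n"
    "4,actor,1\n"
    "5,file,1\n"
    "6,discussion,1\n"
)

dist = type_distribution_by_class(sample)
for cls in sorted(dist):
    print(cls, dist[cls])
```

With a real export, the call would look like `with open("nodes.csv", newline="", encoding="utf-8") as f: dist = type_distribution_by_class(f)`, where `nodes.csv` is a placeholder file name. Because each class is divided by its own node count, every class sums to 100% regardless of how many classes there are.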

@mbastian mbastian was assigned Jun 11, 2012
@mbastian
Member
mbastian commented Jul 8, 2012

Thanks for the report. Problem fixed. The partition module is scheduled to be completely rewritten in the next major version.

Note also that we try to reserve the critical tag for unconditional crashes.

@mbastian mbastian added a commit to mbastian/gephi that referenced this issue Jul 8, 2012
@mbastian mbastian Fix issue #637 1b0cda8
@mbastian mbastian closed this Jul 8, 2012
@mbastian mbastian added Fix Released and removed Fix Committed labels Nov 21, 2015