feat: igraph_is_complete #2510

aagon · 2024-02-24T03:07:36Z

Follow-up of #2490.

codecov · 2024-02-24T03:12:15Z

Codecov Report

Attention: Patch coverage is 89.18919% with 4 lines in your changes are missing coverage. Please review.

Project coverage is 84.37%. Comparing base (8def9e3) to head (13c78d1).
Report is 96 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #2510      +/-   ##
==========================================
- Coverage   84.37%   84.37%   -0.01%     
==========================================
  Files         374      375       +1     
  Lines       61881    61608     -273     
  Branches        0    12055   +12055     
==========================================
- Hits        52212    51980     -232     
+ Misses       9669     9628      -41

Files	Coverage Δ
src/properties/complete.c	`89.18% <89.18%> (ø)`

... and 111 files with indirect coverage changes

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5eabcd9...13c78d1. Read the comment docs.

szhorvat · 2024-02-24T21:34:58Z

Let me know when this is ready for review. To fix the CodeQL CI failure, you can rebase on the latest master branch (but this is not critical).

interfaces/functions.yaml

aagon · 2024-02-25T22:18:16Z

It can be reviewed now, or I can add the support for directed graphs, or even igraph_is_clique, whatever you prefer.

szhorvat · 2024-02-26T19:46:18Z

I had a look at the code (not the test though). Generally, it is very well written.

Regarding the ordering of neighbour lists: it is indeed guaranteed that these are sorted. This is documented for igraph_neighbors() and some igraph code relies on it, including the code that "simplifies" the neighbour lists.

We are missing support for directed graphs. Can you please add this?

Regarding running igraph_is_simple(): On a simple graph, this function needs to check each edge. It basically performs the same traversal as the following code does, iterating through all neighbour lists. Thus if we call this function, and the graph is non-simple, we end up doing twice the amount of work we'd need to. However, on simple graphs it has no disadvantage, and it is a cheap and safe way to make good use of the cache.

Thus I agree with you that this is a good way to implement is_complete().

One more thing that is missing is an overflow check when computing the largest possible number of edges. What if vcount*vcount overflows? A quick way would be to use the macros from math/safe_intop.h, which even has a IGRAPH_SAFE_N_CHOOSE_2. However, these macros all throw an error on overflow, and this is not what we want. If the vertex count is so high that vcount*(vcount-1)/2 is not representable, then the graph obviously can't be complete. It would be better to compare vcount against the largest vertex count with which a complete graph is still representable with igraph. But I'm not sure how to compute this value at compile time ...

Maybe instead of computing it, we can just assume that we're working with either standard 32-bit or 64-bit signed integers, and hard-code the limit for these two cases, similar to how it's done here:

igraph/src/math/safe_intop.c

Lines 74 to 82 in 07d794d

    
           #if IGRAPH_INTEGER_SIZE == 32 
        
               /* Nothing else to do. */ 
        
           #elif IGRAPH_INTEGER_SIZE == 64 
        
               k |= k >> 32; 
        
           #else 
        
               /* If values other than 32 or 64 become allowed, 
        
                * this code will need to be updated. */ 
        
           #  error "Unexpected IGRAPH_INTEGER_SIZE value." 
        
           #endif

szhorvat · 2024-02-26T19:51:40Z

So, in the undirected case, the max possible vcount is $2^{32} = 4294967296$ with 64-bit and $2^{16} = 65536$ with 32-bit integers.

In the directed case, this will be 3037000500 and 46341, respectively. I hope this helps.

A graph with a vcount strictly larger than these cannot be complete.

szhorvat · 2024-02-26T19:55:56Z

Finally, we still need to be smart about computing (vcount * vcount) - vcount) / 2 in case vcount is too close to the above mentioned values. We could do something like:

complete_ecount = vcount % 2 == 0 ? (vcount/2) * (vcount - 1) : vcount * ((vcount-1)/2)

This won't overflow, assuming that the result is representable, which we will be checking earlier.

ntamas · 2024-02-26T20:43:58Z

Maybe instead of computing it, we can just assume that we're working with either standard 32-bit or 64-bit signed integers, and hard-code the limit for these two cases

I agree. It's unlikely that we'll ever want to support architectures that are not 32-bit or 64-bit anyway.

src/properties/complete.c

aagon · 2024-02-27T05:42:17Z

Thank you for the review.

Regarding the ordering of neighbour lists: it is indeed guaranteed that these are sorted.

I did not mean the neighbour list, but the list of all edges in the graph, iterated on according to edge ID. Those are not always sorted according to the "lowest adjacent vertex -- highest adjacent vertex" scheme I was mentioning, unless I'm mistaken. I asked because it would have allowed to count the edges directly on this list, rather than rely on igraph_i_neighbours. This would have avoided to check edges twice on undirected graphs. But this was too complicated anyway.

We are missing support for directed graphs. Can you please add this?

Done.

In the directed case, this will be 3037000500 and 46341, respectively. I hope this helps.

This was most useful, I am glad not to have had to do this myself. I've added a comment on how those numbers are produced.

complete_ecount = vcount % 2 == 0 ? (vcount/2) * (vcount - 1) : vcount * ((vcount-1)/2)

This was very smart indeed, it is guaranteed to be very cheap.

I think that the result of this function should be cached in a cache entry "is_complete". If you agree, I will add a placeholder comment to remind myself to add caching once you're finished refactoring it.

szhorvat

I added a few comments, but haven't had time to examine the test in detail yet.

Can you please include this function in the docs? I suggest adding it in docs/cliques.xxml. Feel free to propose a different location if you find one.

I'll get back to this later, including answering your questions.

src/properties/complete.c

tests/unit/igraph_is_complete.c

src/properties/complete.c

tests/unit/igraph_is_complete.c

szhorvat · 2024-02-28T15:02:00Z

I think that the result of this function should be cached in a cache entry "is_complete". If you agree, I will add a placeholder comment to remind myself to add caching once you're finished refactoring it.

I thought about this a bit, and it seems to me that this property would not be needed very often. If the graph is simple, caching that already achieved O(1). So caching whether the graph is complete is relevant only for multigraphs.

The original motivation for caching was to avoid running the most common checks multiple times. There are several functions which need to check if the graph is simple, or connected, before running their algorithm. Thus these checks were frequently run redundantly.

Do you have a use case in mind where it is important to not repeat complete graph checks on a multigraph?

aagon · 2024-02-28T23:26:34Z

Can you please include this function in the docs? I suggest adding it in docs/cliques.xxml.

Done. I'm fine with adding it there, but then maybe I should change the header file in which it is declared ? I'll let you decide.

Can you please add a very brief comment about what this function does, especially starting at _ /* If directed, we must add the mutuals manually */_? I'm a bit confused about this part.

I refactored the unit test, added comments, and used the occasion to remove all the unneeded memory management and calls to IGRAPH_CHECK.

If the graph is simple, caching that already achieved O(1).

Agreed. But in that case, we should try and cache the simplicity of a graph wherever we can. For instance, a graph generated with igraph_grg_game is necessarily simple, but this is not cached (I thought of this because this is how I generate my graphs, but there may be a lot of other places).

Do you have a use case in mind where it is important to not repeat complete graph checks on a multigraph?

No. I do not use multigraphs. I agree with your decision to not cache the completeness. But then, see my other point about caching simplicity much more aggressively (I can help track the places where this can be done, if you want).

szhorvat · 2024-03-01T11:55:46Z

@ntamas What do you think about adding an igraph_bool_t directed parameter here, and making it possible to treat directed graphs as undirected? Clique functions ignore edge directions (although they do issue a warning). Perhaps there is value in making this possible?

szhorvat · 2024-03-01T12:01:18Z

But then, see my other point about caching simplicity much more aggressively (I can help track the places where this can be done, if you want).

I'd be lying if I said I wasn't very tempted to do this, but experience has taught us that there is a rather high risk of introducing bugs. Caching bugs can have very bad consequences and could be very difficult to debug.

Examples:

There was a recent bug in igraph_barabasi_game() where it incorrectly generated multigraphs. I was surprised by this, I assumed it couldn't happen. But with complex functions there's always a chance of bugs.
Some functions have edge cases that are hard to miss. Does igraph_lattice() produce simple graphs? Usually yes, but if the lattice is periodic, and of size 1 or 2 in one direction, then it will have self-loops of multi-edges. The same goes for make_ring (cycle graphs).

I do wonder if igraph_add_edges() could verify that the graph stays simple without a performance penalty. Checking for self-loops is trivial, and definitely doable. Checking for multi-edges should be doable since this function does sort the edge list internally. If we can localize these checks to a single function (add_eges()) then all graph generators gain the benefit, and the risk is significantly reduced.

IMO keeping track of whether graphs are simple would be a huge benefit. Lots of checks become easier for simple graphs. EDIT: Opened an issue, #2522

ntamas · 2024-03-02T16:49:04Z

What do you think about adding an igraph_bool_t directed parameter here, and making it possible to treat directed graphs as undirected?

Seems OK to me, we do this in many other places. But I th9nk we can merge the current PR as is and then we can add this later, right?

szhorvat · 2024-03-02T16:57:02Z

Yes, it is marked as experimental.

ntamas

Okay, this looks good to me then.

chore: update changelog feat: igraph_is_complete() checks if a graph is complete (igraph/igraph#2510) Merge pull request igraph/igraph#2525 from igraph/fix/linegraph-keep-loops do not generate full scanner tables with flex chore: improve readability of some code refactor: igraph_degree() now uses the cache when checking for self-loops refactor: minor readability cleanup for minimum_size_separators() chore: add note about empty minimal st separator fix tests: minor cleanup in regression tests chore: update changelog fix: correct `is_(minimal_)separator()` for disconnected graphs, fixes igraph/igraph#1480 tests: add more `is_separator()` tests tests: fix premature memory deallocation in is_separator example fix: igraph_all_minimal_st_separators() does not return empty separators any more, fixes igraph/igraph#2517 fix: GraphML reader does not accept duplicate attribute names any more, fixes igraph/igraph#2506 fuzzing: assertion readability in vertex separator fuzzer fuzzing: improve vertex separator fuzzers docs: doc improvements for ECC and Voronoi refactor: minor prettification in trie code fix: assert that keys in a trie are not NULL to make Coverity happy refactor: improved error messages from C attribute handler (igraph/igraph#2513) chore: updated changelog fix: prevent insertion of empty keys in a trie fix: prevent empty ID attributes for <key> tags in GraphML docs: change mentions to pointer vectors to vector lists where appropriate & other improvements fix: -CF causes buffer overflows with Flex, try -Cf instead fix: bug_2506 test case should not be run if igraph is compiled without GraphML support ci: run apt-get update in codeql-analysis pipeline fix: fix attribute value comparison in igraph_read_graph_graphml() fix: fix duplicate attribute detection in igraph_read_graph_graphml(), fixes igraph/igraph#2506 refactor: make use of IGRAPH_CHECK_OOM() in a few places style: fix wording for a few warnings refactor: more detailed error messages in C attribute handler refactor: experimental `-Cf` option for Flex for better performance at the cost of larger code size. Hopefully this avoids fuzzer timeouts. fix: `igraph_disjoint_union()` and `igraph_disjoint_union_many()` now check for overflow chore: Update CONTRIBUTING.md fuzzing: expand file format round-tripping fuzzers fix: disallow writing invalid graph attributes to GML files, fixes igraph/igraph#2505 fuzzing: write_all fuzzer fixes refactor: remove non-exposed igraph_maximum_matching() function whihc was already noted as removed in the 0.10 changelog fuzzing: activate write_all fuzzer fuzzing: add cohesive_blocks() Merge pull request igraph/igraph#2504 from igraph/fix/writing-invalid-lgl-ncol-files docs: document that `write_graph_graphml()` assumes that strings are UTF-8 encoded, fixes igraph/igraph#2503 docs: mention that the variation of information is in natural units

szhorvat · 2024-03-31T15:36:42Z

@aagon Would you be interested in implementing igraph_is_clique() as well? The issue is #2380

aagon mentioned this pull request Feb 24, 2024

Wishlist: igraph_is_complete() decides whether a graph is complete #2490

Closed

szhorvat added this to the 0.10.11 milestone Feb 24, 2024

szhorvat reviewed Feb 25, 2024

View reviewed changes

interfaces/functions.yaml Outdated Show resolved Hide resolved

aagon force-pushed the feat/igraph_is_complete branch from f87c0f8 to 6fb6749 Compare February 25, 2024 22:03

szhorvat reviewed Feb 26, 2024

View reviewed changes

src/properties/complete.c Outdated Show resolved Hide resolved

aagon force-pushed the feat/igraph_is_complete branch from 6fb6749 to fa1f36b Compare February 27, 2024 05:22

aagon force-pushed the feat/igraph_is_complete branch 2 times, most recently from 1babfbc to 83b87c5 Compare February 27, 2024 06:13

szhorvat requested changes Feb 28, 2024

View reviewed changes

aagon force-pushed the feat/igraph_is_complete branch 2 times, most recently from 1d512b3 to 836f278 Compare February 28, 2024 23:06

Minimal working implementation of igraph_is_complete

a10f4fd

aagon force-pushed the feat/igraph_is_complete branch from 836f278 to a10f4fd Compare February 28, 2024 23:16

address some nitpicks

fabe381

Merge branch 'master' into feat/igraph_is_complete

13c78d1

ntamas approved these changes Mar 2, 2024

View reviewed changes

szhorvat marked this pull request as ready for review March 2, 2024 17:32

szhorvat merged commit 43f1468 into igraph:master Mar 2, 2024
25 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: igraph_is_complete #2510

feat: igraph_is_complete #2510

aagon commented Feb 24, 2024

codecov bot commented Feb 24, 2024 •

edited

szhorvat commented Feb 24, 2024

aagon commented Feb 25, 2024

szhorvat commented Feb 26, 2024

szhorvat commented Feb 26, 2024 •

edited

szhorvat commented Feb 26, 2024 •

edited

ntamas commented Feb 26, 2024

aagon commented Feb 27, 2024

szhorvat left a comment

szhorvat commented Feb 28, 2024

aagon commented Feb 28, 2024

szhorvat commented Mar 1, 2024

szhorvat commented Mar 1, 2024 •

edited

ntamas commented Mar 2, 2024

szhorvat commented Mar 2, 2024

ntamas left a comment

szhorvat commented Mar 31, 2024 •

edited

feat: igraph_is_complete #2510

feat: igraph_is_complete #2510

Conversation

aagon commented Feb 24, 2024

codecov bot commented Feb 24, 2024 • edited

Codecov Report

szhorvat commented Feb 24, 2024

aagon commented Feb 25, 2024

szhorvat commented Feb 26, 2024

szhorvat commented Feb 26, 2024 • edited

szhorvat commented Feb 26, 2024 • edited

ntamas commented Feb 26, 2024

aagon commented Feb 27, 2024

szhorvat left a comment

Choose a reason for hiding this comment

szhorvat commented Feb 28, 2024

aagon commented Feb 28, 2024

szhorvat commented Mar 1, 2024

szhorvat commented Mar 1, 2024 • edited

ntamas commented Mar 2, 2024

szhorvat commented Mar 2, 2024

ntamas left a comment

Choose a reason for hiding this comment

szhorvat commented Mar 31, 2024 • edited

codecov bot commented Feb 24, 2024 •

edited

szhorvat commented Feb 26, 2024 •

edited

szhorvat commented Feb 26, 2024 •

edited

szhorvat commented Mar 1, 2024 •

edited

szhorvat commented Mar 31, 2024 •

edited