Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cycle basis calculations #1957

Merged
merged 24 commits into from
Mar 26, 2022
Merged

Conversation

szhorvat
Copy link
Member

@szhorvat szhorvat commented Feb 10, 2022

Continuing from #1786

This PR implements some cycle basis calculations.

  • Fundamental cycle basis
  • Unweighed minimum cycle basis

It has a special feature where it can limit the length of cycles found, which drastically reduces computation time. This will be a unique feature of igraph.

Within the minimum cycle basis computation, I represent cycles as sorted vectors of edge IDs. This makes certain operations, such as "adding" cycles, efficient. However, in the end we obtain the cycles with an edge ID ordering that does not match the cycle structure. Therefore, in a final step we order the cycles. This operation was much easier with a map data structure, so I did it in C++. I could do it in pure C without writing a map from scratch, but it would be much more complicated, and I don't have the time now. There is an option that controls whether this reordering should be done. Setting it to false will speed the function up, though I did not yet time to benchmark by how much.

This PR also adds a igraph_vector_list_remove_consecutive_duplicates function.

@szhorvat szhorvat changed the base branch from master to develop February 10, 2022 11:55
@codecov
Copy link

codecov bot commented Feb 10, 2022

Codecov Report

Merging #1957 (6508af9) into develop (1d2ab9e) will increase coverage by 0.32%.
The diff coverage is 97.94%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #1957      +/-   ##
===========================================
+ Coverage    75.78%   76.11%   +0.32%     
===========================================
  Files          350      352       +2     
  Lines        57486    57749     +263     
===========================================
+ Hits         43567    43956     +389     
+ Misses       13919    13793     -126     
Impacted Files Coverage Δ
src/misc/order_cycle.cpp 97.22% <97.22%> (ø)
src/misc/cycle_bases.c 97.95% <97.95%> (ø)
src/core/typed_list.pmt 96.38% <100.00%> (+0.16%) ⬆️
src/constructors/lcf.c 93.61% <0.00%> (-4.06%) ⬇️
src/core/progress.c 66.66% <0.00%> (-3.93%) ⬇️
src/io/gml-tree.c 68.03% <0.00%> (-0.23%) ⬇️
src/linalg/arpack.c 73.94% <0.00%> (-0.07%) ⬇️
src/constructors/basic_constructors.c 100.00% <0.00%> (ø)
src/core/vector.pmt 88.00% <0.00%> (+0.01%) ⬆️
src/io/gml.c 79.60% <0.00%> (+0.04%) ⬆️
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1d2ab9e...6508af9. Read the comment docs.

@szhorvat szhorvat force-pushed the feature/cycle-bases-develop branch 2 times, most recently from 30db2f1 to fbbce79 Compare March 19, 2022 20:17
@szhorvat szhorvat force-pushed the feature/cycle-bases-develop branch from d710e9f to 4f4764c Compare March 20, 2022 07:32
@szhorvat szhorvat marked this pull request as ready for review March 20, 2022 09:51
@szhorvat szhorvat requested review from ntamas and vtraag March 20, 2022 09:51
@szhorvat
Copy link
Member Author

szhorvat commented Mar 20, 2022

Please DO NOT MERGE, just give feedback. I will merge myself later. At first, I expect technical comments, as understanding the algorithm would take time. I discussed the algorithm with @vtraag last year, but I am not sure if he remembers.

include/igraph.h Show resolved Hide resolved
include/igraph_cycles.h Show resolved Hide resolved
include/igraph_cycles.h Outdated Show resolved Hide resolved
src/core/typed_list.pmt Outdated Show resolved Hide resolved
src/misc/cycle_bases.c Outdated Show resolved Hide resolved
src/misc/cycle_bases.c Outdated Show resolved Hide resolved
src/misc/cycle_bases.c Outdated Show resolved Hide resolved
src/misc/cycle_bases.c Outdated Show resolved Hide resolved
src/misc/cycle_bases.c Outdated Show resolved Hide resolved
src/misc/cycle_bases.c Outdated Show resolved Hide resolved
@szhorvat
Copy link
Member Author

Where feedback might be useful is an efficient algorithm to put edges in cycle order.

Currently, the minimum_cycle_basis() function produces cycles as sorted lists of edge IDs. This is actually sufficient for a surprisingly large number of applications, but in many cases users will expect the edge IDs to be ordered along the cycle instead. To do this, I used std::map from C++, hence the separate C++ source file. I'm not very happy with this, but it was the fastest way to handle this issue.

How the algorithm works is that it produces BFS-based fundamental cycles starting from all vertices (of degree d >= 3). This produces a list of candidate cycles of length (E - V + 1)*V, where E and V are edge and vertex counts. I.e., it produces a pretty huge list. These cycles are originally in the correct order, but they are sorted by edge ID for further processing. What we would do is to duplicate the original list in memory, and keep handles to the unsorted versions of cycles from the sorted one. However, this would complicate the code quite a bit (couldn't use vector_list anymore), and would more than double the already significant memory requirements. So I don't like it. This is why I opted to restore the cycle order on the final result instead.

@szhorvat
Copy link
Member Author

Yet another thing where I need feedback is a refactoring for IGRAPH_HANDLE_EXCEPTIONS(...). Now the argument is a large block of code, but this causes problems. Notice the separate typedef std::map<igraph_integer_t, eid_pair_t> inclist_t; I have in that file. Why do I need that? Because writing map<igraph_integer_t, eid_pair_t> within IGRAPH_HANDLE_EXCEPTIONS(...) confuses the C preprocessor with that comma, and just won't work.

We probably want a begin/end style solution for IGRAPH_HANDLE_EXCEPTIONS, but I want to see concrete suggestions please.

@ntamas
Copy link
Member

ntamas commented Mar 23, 2022

Added a begin/end style solution as the first iteration. I think it's good enough. Another alternative would have been to move the function body to a separate function and just wrap a call to that function within IGRAPH_HANDLE_EXCEPTIONS().

catch (const std::bad_alloc &e) { IGRAPH_ERROR(e.what(), IGRAPH_ENOMEM); /* LCOV_EXCL_LINE */ } \
catch (const std::exception &e) { IGRAPH_ERROR(e.what(), IGRAPH_FAILURE); } \
catch (...) { IGRAPH_ERROR("Unknown exception caught.", IGRAPH_FAILURE); }

#define IGRAPH_HANDLE_EXCEPTIONS(code) \
IGRAPH_HANDLE_EXCEPTIONS_BEGIN; \
Copy link
Member Author

@szhorvat szhorvat Mar 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ; is a bit strange here as it expands to try {;.

This is in fact why I was reluctant to do this. Maybe we can use the following pattern?

IGRAPH_HANDLE_EXCEPTIONS_BEGIN {

} IGRAPH_HANDLE_EXCEPTIONS_END;

??

A bit like

do {

} while (cond);

Not sure what's best.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feel free to remove the ;, it has no effect. I just wanted to test that one is allowed to use IGRAPH_HANDLE_EXCEPTIONS_BEGIN or _END with or without a semicolon at the end without breaking thigs.

@ntamas
Copy link
Member

ntamas commented Mar 23, 2022

Where feedback might be useful is an efficient algorithm to put edges in cycle order.

Do you mean that the input of the algorithm is a graph and a list of edge IDs in arbitrary order that are guaranteed to form a cycle, and the result should be a list where the edge IDs are in the order they are traversed along the cycle? I don't see any particular problem with the current implementation.

@szhorvat
Copy link
Member Author

szhorvat commented Mar 23, 2022

Let me clarify.

Here's an example of a graph with edges labelled by ID:

image

3, 4, 8, 5 form a cycle, in this order. However, the (internal) computation returns the edge ID in sorted order, 3, 4, 5, 8. I use a separate function, igraph_i_order_cycle(), to put them back in cycle order, i.e. 3, 4, 8, 5. This function is written in C++, not C, because it uses std::map.

I found this a bit ugly.

@szhorvat szhorvat force-pushed the feature/cycle-bases-develop branch from cdb3b24 to 6508af9 Compare March 25, 2022 15:22
@szhorvat szhorvat requested a review from ntamas March 25, 2022 15:23
@szhorvat szhorvat merged commit 4055643 into igraph:develop Mar 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants