Implement TreeSet-based acyclic decompression#55
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #55 +/- ##
==========================================
+ Coverage 99.81% 99.83% +0.02%
==========================================
Files 11 11
Lines 537 605 +68
==========================================
+ Hits 536 604 +68
Misses 1 1 ☔ View full report in Codecov by Sentry. |
|
Thanks! I merge |
TreeSet-based acyclic decompression
|
Random tests strike again 😎 |
|
Fixed 😎 |
cda2d9d to
0fadeeb
Compare
|
I added a few more tests to cover all possible bases, let me just merge those and then I think this PR will be ready to roll |
gdalle
left a comment
There was a problem hiding this comment.
Great work! I just asked for some clarifications to make the code easier to read, and to ensure that we don't do something that has accidentally high complexity
|
Another thing to make sure of is that decompression does not modify the |
|
|
What matters is that we can call decompression twice in a row and get the correct result, which I think is the case. Just to make sure, you could duplicate the decompression test lines in test/utils.jl so that they run twice. |
09ca6ed to
ba96934
Compare
|
@gdalle We can still optimize the DFS order for the trees, but we have already made significant improvements to the decompression process. |
using LinearAlgebra, SparseArrays
using SparseMatrixColorings
using MatrixMarket, SuiteSparseMatrixCollection
ssmc = ssmc_db(verbose=false)
dataset = ssmc[(ssmc.numerical_symmetry .== 1) .& (200000 .≤ ssmc.nrows .≤ 300000), :]
paths = fetch_ssmc(dataset, format="MM")
nb_pb = length(paths)
for (i, path) in enumerate(paths)
pb = split(path, '/')[end]
println("Problem $i/$nb_pb: $pb")
# println("Lecture des données: ")
A = MatrixMarket.mmread(path * "/$pb.mtx")
println("size(A) = ", size(A), " | nnz(A) = ", nnz(A))
# Star coloring
result = coloring(
A,
ColoringProblem(;
structure=:symmetric, partition=:column, decompression=:direct
),
GreedyColoringAlgorithm(),
)
timer = @elapsed coloring(
A,
ColoringProblem(;
structure=:symmetric, partition=:column, decompression=:direct
),
GreedyColoringAlgorithm(),
)
colors = column_colors(result)
ncolors = unique(colors) |> length
println("Star coloring: ", ncolors)
println("timer: ", timer)
# Acyclic coloring
result = coloring(
A,
ColoringProblem(;
structure=:symmetric, partition=:column, decompression=:substitution
),
GreedyColoringAlgorithm(),
)
timer = @elapsed coloring(
A,
ColoringProblem(;
structure=:symmetric, partition=:column, decompression=:substitution
),
GreedyColoringAlgorithm(),
)
colors = column_colors(result)
ncolors = unique(colors) |> length
println("Acyclic coloring: ", ncolors)
println("timer: ", timer)
println()
endProblem 1/38: pwtk
size(A) = (217918, 217918) | nnz(A) = 11524432
Star coloring: 108
timer: 13.275666601
Acyclic coloring: 53
timer: 10.017835665
Problem 2/38: Lin
size(A) = (256000, 256000) | nnz(A) = 1766400
Star coloring: 11
timer: 0.570040176
Acyclic coloring: 7
timer: 0.678558693
Problem 3/38: bmw3_2
size(A) = (227362, 227362) | nnz(A) = 11288630
Star coloring: 135
timer: 12.257366888
Acyclic coloring: 71
timer: 7.83447513
Problem 4/38: fcondp2
size(A) = (201822, 201822) | nnz(A) = 11294316
Star coloring: 111
timer: 12.289692995
Acyclic coloring: 66
timer: 9.947373563
Problem 5/38: halfb
size(A) = (224617, 224617) | nnz(A) = 12387821
Star coloring: 117
timer: 14.194495256
Acyclic coloring: 60
timer: 9.290101493
Problem 6/38: hood
size(A) = (220542, 220542) | nnz(A) = 9895422
Star coloring: 89
timer: 9.413525359
Acyclic coloring: 55
timer: 6.929516684
Problem 7/38: troll
size(A) = (213453, 213453) | nnz(A) = 11985111
Star coloring: 108
timer: 15.561040888
Acyclic coloring: 57
timer: 11.82566261
Problem 8/38: CO
size(A) = (221119, 221119) | nnz(A) = 7666057
Star coloring: 264
timer: 10.860656442
Acyclic coloring: 152
timer: 9.852414572
Problem 9/38: Ga41As41H72
size(A) = (268096, 268096) | nnz(A) = 18488476
Star coloring: 565
timer: 182.098963275
Acyclic coloring: 391
timer: 42.221200902
Problem 10/38: Si87H76
size(A) = (240369, 240369) | nnz(A) = 10661631
Star coloring: 283
timer: 31.545367728
Acyclic coloring: 176
timer: 17.024328905
Problem 11/38: BenElechi1
size(A) = (245874, 245874) | nnz(A) = 13150496
Star coloring: 71
timer: 15.361302623
Acyclic coloring: 42
timer: 11.290947979
Problem 12/38: 3Dspectralwave2
size(A) = (292008, 292008) | nnz(A) = 12935272
Star coloring: 102
timer: 30.127591632
Acyclic coloring: 39
timer: 23.510229553
Problem 13/38: thermomech_dM
size(A) = (204316, 204316) | nnz(A) = 1423116
Star coloring: 11
timer: 0.59547018
Acyclic coloring: 7
timer: 0.526237929
Problem 14/38: offshore
size(A) = (259789, 259789) | nnz(A) = 4242673
Star coloring: 29
timer: 2.930412158
Acyclic coloring: 14
timer: 2.581180775
Problem 15/38: citationCiteseer
size(A) = (268495, 268495) | nnz(A) = 2313294
Star coloring: 213
timer: 4.579315949
Acyclic coloring: 65
timer: 5.125990285
Problem 16/38: coAuthorsCiteseer
size(A) = (227320, 227320) | nnz(A) = 1628268
Star coloring: 106
timer: 1.37077632
Acyclic coloring: 87
timer: 1.615801889
Problem 17/38: coAuthorsDBLP
size(A) = (299067, 299067) | nnz(A) = 1955352
Star coloring: 128
timer: 1.737375587
Acyclic coloring: 115
timer: 2.237246244
Problem 18/38: delaunay_n18
size(A) = (262144, 262144) | nnz(A) = 1572792
Star coloring: 17
timer: 0.779370072
Acyclic coloring: 7
timer: 0.592075091 |
|
Very nice! I had no idea the number of colors would improve so much! |
@gdalle The optimized version of the acyclic coloring is finally working! 💪
I was able to verify it on this example: