Decompression of acyclic coloring with SparseMatrixCSC #130

Merged
gdalle merged 6 commits into main from am/acyclic_decompression_csc
Oct 9, 2024

@amontoison
Collaborator

@amontoison amontoison commented Sep 28, 2024

close #93

  • Add the following method:
function decompress!(A::SparseMatrixCSC{R}, B::AbstractMatrix{R}, result::TreeSetColoringResult, uplo::Symbol=:F)
   ...
end
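
A minimal usage sketch of how the new method might be called, assuming the public SparseMatrixColorings.jl API (`ColoringProblem`, `GreedyColoringAlgorithm`, `coloring`, `compress`, `decompress!`); keyword names may differ between package versions, and the small arrowhead matrix is a made-up example, not taken from this PR:

```julia
using SparseArrays
using SparseMatrixColorings

# Hypothetical symmetric test matrix (arrowhead pattern), stored as SparseMatrixCSC
A = sparse([
    4.0 1.0 2.0 3.0
    1.0 5.0 0.0 0.0
    2.0 0.0 6.0 0.0
    3.0 0.0 0.0 7.0
])

# Symmetric structure with substitution-based decompression requests an acyclic coloring
problem = ColoringProblem(; structure=:symmetric, partition=:column)
algo = GreedyColoringAlgorithm(; decompression=:substitution)
result = coloring(A, problem, algo)

B = compress(A, result)      # compressed matrix, one column per color
A2 = similar(A)              # same sparsity pattern as A, values uninitialized
decompress!(A2, B, result)   # in-place recovery; uplo is left at its default :F
@show A2 == A
```

With `uplo` left at `:F`, the destination is assumed to share the full sparsity pattern of the matrix that was colored, which is why `similar(A)` is used here.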

@amontoison amontoison added the benchmark Run benchmarks on PR label Sep 28, 2024
@amontoison amontoison force-pushed the am/acyclic_decompression_csc branch from e8c4cd0 to ec19388 Compare September 28, 2024 05:38
@JuliaDiff JuliaDiff deleted a comment from github-actions Bot Sep 28, 2024
@amontoison amontoison removed the benchmark Run benchmarks on PR label Sep 28, 2024
@JuliaDiff JuliaDiff deleted a comment from github-actions Bot Sep 28, 2024
@amontoison amontoison added the benchmark Run benchmarks on PR label Sep 28, 2024
@codecov

codecov Bot commented Sep 28, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (7146a7d) to head (9c35c8a).
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #130   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           12        12           
  Lines          884       960   +76     
=========================================
+ Hits           884       960   +76     

@amontoison amontoison force-pushed the am/acyclic_decompression_csc branch 2 times, most recently from c79873e to 8635b3d Compare September 29, 2024 00:07
@JuliaDiff JuliaDiff deleted a comment from github-actions Bot Sep 29, 2024
@amontoison amontoison removed the benchmark Run benchmarks on PR label Sep 29, 2024
@amontoison amontoison requested a review from gdalle September 29, 2024 00:12
@JuliaDiff JuliaDiff deleted a comment from github-actions Bot Sep 29, 2024
@amontoison amontoison added the benchmark Run benchmarks on PR label Sep 29, 2024
@github-actions
Contributor

github-actions Bot commented Sep 29, 2024

Benchmark Results

Time benchmarks. Columns: benchmark, main, 9c35c8a..., ratio main/9c35c8aa19b9c7...
coloring/nonsymmetric/column/direct/n=1000/p=0.002 0.0882 ± 0.0038 ms 0.0882 ± 0.0037 ms 1
coloring/nonsymmetric/column/direct/n=1000/p=0.005 0.188 ± 0.0068 ms 0.19 ± 0.0064 ms 0.994
coloring/nonsymmetric/column/direct/n=1000/p=0.01 0.387 ± 0.013 ms 0.391 ± 0.012 ms 0.989
coloring/nonsymmetric/column/direct/n=100000/p=0.0001 0.069 ± 0.0034 s 0.0675 ± 0.0025 s 1.02
coloring/nonsymmetric/column/direct/n=100000/p=2.0e-5 12.7 ± 0.47 ms 12.9 ± 0.54 ms 0.986
coloring/nonsymmetric/column/direct/n=100000/p=5.0e-5 31.6 ± 1.2 ms 31.2 ± 0.94 ms 1.01
coloring/nonsymmetric/row/direct/n=1000/p=0.002 0.0864 ± 0.0036 ms 0.0865 ± 0.0035 ms 0.999
coloring/nonsymmetric/row/direct/n=1000/p=0.005 0.187 ± 0.0074 ms 0.187 ± 0.0064 ms 1
coloring/nonsymmetric/row/direct/n=1000/p=0.01 0.388 ± 0.013 ms 0.386 ± 0.013 ms 1.01
coloring/nonsymmetric/row/direct/n=100000/p=0.0001 0.0702 ± 0.0033 s 0.0704 ± 0.003 s 0.997
coloring/nonsymmetric/row/direct/n=100000/p=2.0e-5 12.7 ± 0.64 ms 12.7 ± 0.55 ms 1
coloring/nonsymmetric/row/direct/n=100000/p=5.0e-5 31.4 ± 1.4 ms 31.1 ± 1.2 ms 1.01
coloring/symmetric/column/direct/n=1000/p=0.002 0.25 ± 0.012 ms 0.252 ± 0.012 ms 0.994
coloring/symmetric/column/direct/n=1000/p=0.005 0.677 ± 0.028 ms 0.679 ± 0.028 ms 0.996
coloring/symmetric/column/direct/n=1000/p=0.01 1.63 ± 0.066 ms 1.64 ± 0.07 ms 0.993
coloring/symmetric/column/direct/n=100000/p=0.0001 0.404 ± 0.03 s 0.369 ± 0.057 s 1.1
coloring/symmetric/column/direct/n=100000/p=2.0e-5 0.0342 ± 0.0017 s 0.0341 ± 0.0018 s 1
coloring/symmetric/column/direct/n=100000/p=5.0e-5 0.121 ± 0.015 s 0.118 ± 0.0076 s 1.03
coloring/symmetric/column/substitution/n=1000/p=0.002 0.576 ± 0.031 ms 0.628 ± 0.034 ms 0.918
coloring/symmetric/column/substitution/n=1000/p=0.005 1.56 ± 0.078 ms 1.68 ± 0.083 ms 0.925
coloring/symmetric/column/substitution/n=1000/p=0.01 3.31 ± 0.14 ms 3.57 ± 0.12 ms 0.927
coloring/symmetric/column/substitution/n=100000/p=0.0001 0.808 ± 0.045 s 0.855 ± 0.035 s 0.945
coloring/symmetric/column/substitution/n=100000/p=2.0e-5 0.0966 ± 0.0099 s 0.0998 ± 0.01 s 0.968
coloring/symmetric/column/substitution/n=100000/p=5.0e-5 0.304 ± 0.02 s 0.315 ± 0.025 s 0.965
decompress/nonsymmetric/column/direct/n=1000/p=0.002 4.72 ± 0.57 μs 4.28 ± 0.49 μs 1.1
decompress/nonsymmetric/column/direct/n=1000/p=0.005 8.09 ± 1.1 μs 7.66 ± 0.84 μs 1.06
decompress/nonsymmetric/column/direct/n=1000/p=0.01 24.6 ± 6.9 μs 22.8 ± 6.8 μs 1.08
decompress/nonsymmetric/column/direct/n=100000/p=0.0001 5.12 ± 0.4 ms 5.32 ± 0.36 ms 0.963
decompress/nonsymmetric/column/direct/n=100000/p=2.0e-5 0.992 ± 0.18 ms 1.04 ± 0.11 ms 0.956
decompress/nonsymmetric/column/direct/n=100000/p=5.0e-5 2.52 ± 0.7 ms 2.63 ± 0.55 ms 0.958
decompress/nonsymmetric/row/direct/n=1000/p=0.002 5.23 ± 0.56 μs 4.46 ± 0.5 μs 1.17
decompress/nonsymmetric/row/direct/n=1000/p=0.005 8.22 ± 0.79 μs 7.24 ± 0.59 μs 1.14
decompress/nonsymmetric/row/direct/n=1000/p=0.01 17 ± 3.5 μs 16.1 ± 2.6 μs 1.06
decompress/nonsymmetric/row/direct/n=100000/p=0.0001 2.02 ± 0.3 ms 2.03 ± 0.41 ms 0.998
decompress/nonsymmetric/row/direct/n=100000/p=2.0e-5 0.445 ± 0.28 ms 0.402 ± 0.069 ms 1.11
decompress/nonsymmetric/row/direct/n=100000/p=5.0e-5 1.04 ± 0.79 ms 0.959 ± 0.54 ms 1.08
decompress/symmetric/column/direct/n=1000/p=0.002 5.4 ± 0.68 μs 4.73 ± 0.61 μs 1.14
decompress/symmetric/column/direct/n=1000/p=0.005 8.07 ± 0.84 μs 7.69 ± 0.77 μs 1.05
decompress/symmetric/column/direct/n=1000/p=0.01 22.3 ± 4.6 μs 21.7 ± 5.1 μs 1.03
decompress/symmetric/column/direct/n=100000/p=0.0001 4.9 ± 0.19 ms 4.87 ± 0.19 ms 1.01
decompress/symmetric/column/direct/n=100000/p=2.0e-5 0.972 ± 0.13 ms 0.93 ± 0.079 ms 1.04
decompress/symmetric/column/direct/n=100000/p=5.0e-5 2.42 ± 0.087 ms 2.37 ± 0.2 ms 1.02
decompress/symmetric/column/substitution/n=1000/p=0.002 0.07 ± 0.0035 ms 21.8 ± 2.4 μs 3.21
decompress/symmetric/column/substitution/n=1000/p=0.005 0.171 ± 0.0074 ms 0.0393 ± 0.002 ms 4.35
decompress/symmetric/column/substitution/n=1000/p=0.01 0.353 ± 0.011 ms 0.0749 ± 0.0034 ms 4.71
decompress/symmetric/column/substitution/n=100000/p=0.0001 0.074 ± 0.01 s 20.3 ± 2.6 ms 3.64
decompress/symmetric/column/substitution/n=100000/p=2.0e-5 13 ± 1.1 ms 5.63 ± 0.4 ms 2.31
decompress/symmetric/column/substitution/n=100000/p=5.0e-5 0.0327 ± 0.0023 s 10.9 ± 1.8 ms 3
time_to_load 0.269 ± 0.0012 s 0.268 ± 0.0011 s 1

Memory benchmarks. Columns: benchmark, main, 9c35c8a..., ratio main/9c35c8aa19b9c7...
coloring/nonsymmetric/column/direct/n=1000/p=0.002 24 allocs: 0.0588 MB 24 allocs: 0.0587 MB 1
coloring/nonsymmetric/column/direct/n=1000/p=0.005 24 allocs: 0.102 MB 24 allocs: 0.103 MB 0.997
coloring/nonsymmetric/column/direct/n=1000/p=0.01 24 allocs: 0.176 MB 24 allocs: 0.178 MB 0.994
coloring/nonsymmetric/column/direct/n=100000/p=0.0001 24 allocs: 18.3 MB 24 allocs: 18.3 MB 0.999
coloring/nonsymmetric/column/direct/n=100000/p=2.0e-5 24 allocs: 6.07 MB 24 allocs: 6.08 MB 0.998
coloring/nonsymmetric/column/direct/n=100000/p=5.0e-5 24 allocs: 10.6 MB 24 allocs: 10.6 MB 1
coloring/nonsymmetric/row/direct/n=1000/p=0.002 24 allocs: 0.0586 MB 24 allocs: 0.0582 MB 1.01
coloring/nonsymmetric/row/direct/n=1000/p=0.005 24 allocs: 0.103 MB 24 allocs: 0.103 MB 1
coloring/nonsymmetric/row/direct/n=1000/p=0.01 24 allocs: 0.178 MB 24 allocs: 0.177 MB 1
coloring/nonsymmetric/row/direct/n=100000/p=0.0001 24 allocs: 18.3 MB 24 allocs: 18.3 MB 1
coloring/nonsymmetric/row/direct/n=100000/p=2.0e-5 24 allocs: 6.07 MB 24 allocs: 6.08 MB 0.999
coloring/nonsymmetric/row/direct/n=100000/p=5.0e-5 24 allocs: 10.6 MB 24 allocs: 10.6 MB 1
coloring/symmetric/column/direct/n=1000/p=0.002 1.08 k allocs: 0.265 MB 1.09 k allocs: 0.266 MB 0.998
coloring/symmetric/column/direct/n=1000/p=0.005 2.65 k allocs: 0.403 MB 2.71 k allocs: 0.407 MB 0.99
coloring/symmetric/column/direct/n=1000/p=0.01 5.74 k allocs: 1.06 MB 5.75 k allocs: 1.06 MB 1
coloring/symmetric/column/direct/n=100000/p=0.0001 0.604 M allocs: 0.103 GB 0.602 M allocs: 0.103 GB 1
coloring/symmetric/column/direct/n=100000/p=2.0e-5 0.117 M allocs: 22.8 MB 0.117 M allocs: 22.8 MB 1
coloring/symmetric/column/direct/n=100000/p=5.0e-5 0.287 M allocs: 0.0506 GB 0.287 M allocs: 0.0506 GB 1
coloring/symmetric/column/substitution/n=1000/p=0.002 4.96 k allocs: 0.645 MB 4.89 k allocs: 0.66 MB 0.976
coloring/symmetric/column/substitution/n=1000/p=0.005 9.78 k allocs: 1.17 MB 9.71 k allocs: 1.21 MB 0.967
coloring/symmetric/column/substitution/n=1000/p=0.01 19 k allocs: 2.63 MB 18.9 k allocs: 2.69 MB 0.976
coloring/symmetric/column/substitution/n=100000/p=0.0001 1.91 M allocs: 0.241 GB 1.91 M allocs: 0.25 GB 0.964
coloring/symmetric/column/substitution/n=100000/p=2.0e-5 0.549 M allocs: 0.0607 GB 0.55 M allocs: 0.0623 GB 0.975
coloring/symmetric/column/substitution/n=100000/p=5.0e-5 1.03 M allocs: 0.125 GB 1.03 M allocs: 0.129 GB 0.967
decompress/nonsymmetric/column/direct/n=1000/p=0.002 9 allocs: 0.035 MB 9 allocs: 0.0342 MB 1.02
decompress/nonsymmetric/column/direct/n=1000/p=0.005 9 allocs: 0.0786 MB 9 allocs: 0.0786 MB 1
decompress/nonsymmetric/column/direct/n=1000/p=0.01 9 allocs: 0.153 MB 9 allocs: 0.153 MB 1
decompress/nonsymmetric/column/direct/n=100000/p=0.0001 9 allocs: 16 MB 9 allocs: 16 MB 1
decompress/nonsymmetric/column/direct/n=100000/p=2.0e-5 9 allocs: 3.78 MB 9 allocs: 3.79 MB 0.996
decompress/nonsymmetric/column/direct/n=100000/p=5.0e-5 9 allocs: 8.35 MB 9 allocs: 8.36 MB 0.999
decompress/nonsymmetric/row/direct/n=1000/p=0.002 9 allocs: 0.0347 MB 9 allocs: 0.0352 MB 0.986
decompress/nonsymmetric/row/direct/n=1000/p=0.005 9 allocs: 0.0792 MB 9 allocs: 0.0794 MB 0.997
decompress/nonsymmetric/row/direct/n=1000/p=0.01 9 allocs: 0.154 MB 9 allocs: 0.155 MB 0.991
decompress/nonsymmetric/row/direct/n=100000/p=0.0001 9 allocs: 16 MB 9 allocs: 16 MB 1
decompress/nonsymmetric/row/direct/n=100000/p=2.0e-5 9 allocs: 3.79 MB 9 allocs: 3.79 MB 1
decompress/nonsymmetric/row/direct/n=100000/p=5.0e-5 9 allocs: 8.36 MB 9 allocs: 8.34 MB 1
decompress/symmetric/column/direct/n=1000/p=0.002 9 allocs: 0.0347 MB 9 allocs: 0.0347 MB 1
decompress/symmetric/column/direct/n=1000/p=0.005 9 allocs: 0.0788 MB 9 allocs: 0.0784 MB 1
decompress/symmetric/column/direct/n=1000/p=0.01 9 allocs: 0.152 MB 9 allocs: 0.153 MB 0.994
decompress/symmetric/column/direct/n=100000/p=0.0001 9 allocs: 16 MB 9 allocs: 16 MB 0.999
decompress/symmetric/column/direct/n=100000/p=2.0e-5 9 allocs: 3.79 MB 9 allocs: 3.79 MB 0.999
decompress/symmetric/column/direct/n=100000/p=5.0e-5 9 allocs: 8.37 MB 9 allocs: 8.37 MB 1
decompress/symmetric/column/substitution/n=1000/p=0.002 9 allocs: 0.035 MB 9 allocs: 0.0351 MB 0.997
decompress/symmetric/column/substitution/n=1000/p=0.005 9 allocs: 0.0787 MB 9 allocs: 0.0789 MB 0.997
decompress/symmetric/column/substitution/n=1000/p=0.01 9 allocs: 0.153 MB 9 allocs: 0.154 MB 0.994
decompress/symmetric/column/substitution/n=100000/p=0.0001 9 allocs: 16 MB 9 allocs: 16 MB 1
decompress/symmetric/column/substitution/n=100000/p=2.0e-5 9 allocs: 3.79 MB 9 allocs: 3.8 MB 0.997
decompress/symmetric/column/substitution/n=100000/p=5.0e-5 9 allocs: 8.39 MB 9 allocs: 8.36 MB 1
time_to_load 0.157 k allocs: 11.1 kB 0.157 k allocs: 11.1 kB 1

@amontoison
Collaborator Author

@gdalle I spent a few hours implementing an optimized version and it's 2-4x slower 💀

@gdalle
Member

gdalle commented Sep 29, 2024

I think you're reading the benchmarks wrong? The last column is main over your branch, so your method is actually faster?

@amontoison
Collaborator Author

amontoison commented Sep 29, 2024

Yep, you are right. I don't know how to read the benchmarks. 🤪
In ADNLPModels.jl, we have an improvement if the ratio is less than 1. The ratio is inverted here.
The results are quite good!
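
To make the ratio convention concrete, here is a small sketch that recomputes the last column for one row of the time table (`decompress/symmetric/column/substitution/n=1000/p=0.002`). The variable names are illustrative, and the inputs are the rounded values printed in the report:

```julia
# Times from the benchmark table, both expressed in microseconds
t_main = 70.0     # 0.07 ms on main
t_branch = 21.8   # 21.8 μs on the PR branch (9c35c8a)

# Convention in this report: ratio = main / branch
ratio = t_main / t_branch

# With this convention, a ratio above 1 means the branch is faster than main
branch_is_faster = ratio > 1
```

Under the opposite convention (branch over main), the same measurements would give a ratio below 1 for an improvement.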

@amontoison amontoison force-pushed the am/acyclic_decompression_csc branch from ab963e0 to a9158ff Compare October 7, 2024 16:49
@amontoison
Collaborator Author

@gdalle Can you review this PR if you find time?

@gdalle
Member

gdalle commented Oct 7, 2024

Sure, I'll do it tomorrow

Comment thread src/decompression.jl
Comment thread src/decompression.jl
Comment thread src/decompression.jl
Comment thread src/result.jl
Comment thread src/result.jl
@gdalle gdalle merged commit 3968cfe into main Oct 9, 2024
@gdalle gdalle deleted the am/acyclic_decompression_csc branch October 9, 2024 08:26

Labels

benchmark Run benchmarks on PR

Development

Successfully merging this pull request may close these issues.

Decompression of acyclic coloring with SparseMatrixCSC

2 participants