@zommiommy
Collaborator

@zommiommy zommiommy commented Oct 11, 2025

We had a BatchIterator that would both store the batch on disk and load it.

In this PR, I replaced it with a generic trait BatchCodec that explicitly defines the encoding and decoding of batches, allowing for different implementations. The compression format of BatchIterator is now done by GapsCodec.
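To illustrate the decoupling described above, here is a minimal sketch. `ToyBatchCodec`, `InMemory`, and `sort_and_store` are hypothetical names invented for this example; the PR's actual `BatchCodec` trait may have a different shape.

```rust
/// Hypothetical stand-in for the PR's trait; the real `BatchCodec` may differ.
pub trait ToyBatchCodec {
    /// Whatever the codec hands back after storing a batch (a buffer, a path, ...).
    type Stored;

    /// Stores a sorted batch of (src, dst) pairs.
    fn encode(&self, sorted_batch: &[(u64, u64)]) -> Self::Stored;

    /// Reads back the pairs of a stored batch.
    fn decode(&self, stored: &Self::Stored) -> Vec<(u64, u64)>;
}

/// Trivial codec that keeps the batch in memory, uncompressed.
pub struct InMemory;

impl ToyBatchCodec for InMemory {
    type Stored = Vec<(u64, u64)>;

    fn encode(&self, sorted_batch: &[(u64, u64)]) -> Self::Stored {
        sorted_batch.to_vec()
    }

    fn decode(&self, stored: &Self::Stored) -> Vec<(u64, u64)> {
        stored.clone()
    }
}

/// The sorting code depends only on the trait, not on how batches are
/// serialized or where they live.
pub fn sort_and_store<C: ToyBatchCodec>(codec: &C, mut batch: Vec<(u64, u64)>) -> C::Stored {
    batch.sort_unstable();
    codec.encode(&batch)
}
```

Swapping `InMemory` for a compressing codec would not require touching `sort_and_store`, which is the property the PR description attributes to the new design.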

This had the nice consequence that now SortPair, ParSortPair, transpose, and simplify don't have to be aware of BitSerializer, BitDeserializer, and memory-mapping.

Moreover, we can now select at compile time the codes to use (it used to be always Gamma), and there's a new implementation called GroupedGapsCodec that, instead of encoding each arc as <src-gap><dst-gap>, encodes groups of arcs with the same source as <src-gap><outdegree><dst-gap1><dst-gap2>....
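A rough sketch of the two layouts, producing the sequence of integers that would then be written with the selected codes. The function names are invented here, and the exact gap conventions (destination gaps restart at each new source, arcs sorted by source and then destination) are assumptions rather than the PR's confirmed behavior.

```rust
/// Per-arc layout: a (src-gap, dst-gap) pair for every arc.
/// Assumes `arcs` is sorted by (src, dst).
pub fn per_arc_gaps(arcs: &[(u64, u64)]) -> Vec<u64> {
    let mut out = Vec::with_capacity(arcs.len() * 2);
    let (mut prev_src, mut prev_dst) = (0u64, 0u64);
    for &(src, dst) in arcs {
        out.push(src - prev_src);
        // Assumption: the dst gap is relative only within the same source.
        if src == prev_src {
            out.push(dst - prev_dst);
        } else {
            out.push(dst);
        }
        prev_src = src;
        prev_dst = dst;
    }
    out
}

/// Grouped layout: src-gap, outdegree, then the dst gaps of that source.
/// Assumes `arcs` is sorted by (src, dst).
pub fn grouped_gaps(arcs: &[(u64, u64)]) -> Vec<u64> {
    let mut out = Vec::new();
    let mut prev_src = 0u64;
    let mut i = 0;
    while i < arcs.len() {
        let src = arcs[i].0;
        // Find the end of the run of arcs sharing this source.
        let mut j = i;
        while j < arcs.len() && arcs[j].0 == src {
            j += 1;
        }
        out.push(src - prev_src);
        out.push((j - i) as u64);
        let mut prev_dst = 0u64;
        for &(_, dst) in &arcs[i..j] {
            out.push(dst - prev_dst);
            prev_dst = dst;
        }
        prev_src = src;
        i = j;
    }
    out
}
```

For the arcs `(0,1), (0,4), (0,5), (2,3), (2,7)`, the per-arc layout emits ten integers, five of which are source gaps, while the grouped layout emits nine: one source gap and one outdegree per distinct source.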

Here's the effect of these changes on a couple of graphs, running RUSTFLAGS="-Ctarget-cpu=native" cargo run --release -- transform transpose $GRAPH ${GRAPH}-t.
The codec type parameters are GapsCodec<SRC_CODE, DST_CODE> and GroupedGaps<OUTDEGREE_CODE, SRC_CODE, DST_CODE>.

| Graph | Codec | bits / arc | time (s) |
|---|---|---|---|
| enwiki-2024 | GapsCodec<Gamma, Gamma> | 16.97 | 7.921 |
| enwiki-2024 | GapsCodec<Gamma, Delta> | 14.23 | 8.273 |
| enwiki-2024 | GapsCodec<Unary, Pi(2)> | 12.99 | 8.253 |
| enwiki-2024 | GapsCodec<Gamma, ExpGolomb(1)> | 16.13 | 8.588 |
| enwiki-2024 | GroupedGaps<Gamma, Gamma, Gamma> | 16.24 | 7.500 |
| enwiki-2024 | GroupedGaps<Delta, Gamma, Delta> | 13.51 | 8.055 |
| enwiki-2024 | GroupedGaps<ExpGolomb(3), Gamma, Delta> | 13.46 | 8.033 |
| enwiki-2024 | GroupedGaps<ExpGolomb(3), Golomb(2), Pi(2)> | 12.25 | 7.99 |
| enwiki-2024 | GroupedGaps<ExpGolomb(2), ExpGolomb(1), ExpGolomb(1)> | 15.33 | 8.052 |
| twitter-2010 | GapsCodec<Gamma, Gamma> | 19.13 | 80.598 |
| twitter-2010 | GapsCodec<Gamma, Delta> | 15.57 | 84.414 |
| twitter-2010 | GapsCodec<Unary, Pi(2)> | 14.33 | 84.634 |
| twitter-2010 | GapsCodec<Gamma, ExpGolomb(1)> | 18.28 | 85.141 |
| twitter-2010 | GroupedGaps<Gamma, Gamma, Gamma> | 18.32 | 79.156 |
| twitter-2010 | GroupedGaps<Delta, Gamma, Delta> | 14.77 | 83.364 |
| twitter-2010 | GroupedGaps<ExpGolomb(3), Gamma, Delta> | 14.73 | 81.955 |
| twitter-2010 | GroupedGaps<ExpGolomb(3), Golomb(2), Pi(2)> | 13.49 | 81.050 |
| twitter-2010 | GroupedGaps<ExpGolomb(2), ExpGolomb(1), ExpGolomb(1)> | 17.43 | 81.351 |
| eu-2015 | GapsCodec<Gamma, Gamma> | 5.55 | 2'949.72 |
| eu-2015 | GapsCodec<Gamma, Delta> | 6.05 | 2'861.18 |
| eu-2015 | GapsCodec<Unary, Pi(2)> | 6.27 | 2'842.616 |
| eu-2015 | GapsCodec<Gamma, ExpGolomb(1)> | 4.64 | 2'846.274 |
| eu-2015 | GroupedGaps<Gamma, Gamma, Gamma> | 4.65 | 2'735.416 |
| eu-2015 | GroupedGaps<Delta, Gamma, Delta> | 5.15 | 2'933.321 |
| eu-2015 | GroupedGaps<ExpGolomb(3), Gamma, Delta> | 5.14 | 3'428.260 |
| eu-2015 | GroupedGaps<ExpGolomb(3), Golomb(2), Pi(2)> | 5.20 | 2'886.386 |
| eu-2015 | GroupedGaps<ExpGolomb(2), ExpGolomb(1), ExpGolomb(1)> | 3.72 | 2'898.69 |

The time measurements are the average of two runs, except for eu-2015, where there was only one run. In the first two cases, all arcs fit in a single batch, which might explain why Unary is the best code for src; however, on larger graphs, I doubt it will remain optimal. By contrast, eu-2015 is split into 30 batches of 3'136'103'168 arcs each.

Given the results above, I believe that GroupedGaps<ExpGolomb(3), Gamma, Delta> could be a suitable default that utilises universal codes.

@vigna
Owner

vigna commented Oct 11, 2025

That's amazing, but those two graphs have a very specific structure, and they're small. I'd try a couple of billion-node web graphs such as eu or gsh and a small SWH dataset before reaching any conclusion.

@zommiommy
Collaborator Author

OK, I'm running the benchmarks on eu-2015. Another "BatchCodec" I wanted to experiment with simply writes the src and dst diffs as little-endian 64-bit integers into a Zstd stream.
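A minimal sketch of the byte layout such a codec could feed to a compressor. The function names are made up for illustration, the Zstd step is deliberately left out, and the use of wrapping differences (so that a destination smaller than the previous one still round-trips) is an assumption of this sketch.

```rust
/// Serializes each arc as two little-endian u64 diffs against the previous arc.
/// The resulting buffer is what would be handed to the compressor.
pub fn encode_diffs_le(arcs: &[(u64, u64)]) -> Vec<u8> {
    let mut out = Vec::with_capacity(arcs.len() * 16);
    let (mut prev_src, mut prev_dst) = (0u64, 0u64);
    for &(src, dst) in arcs {
        // Wrapping subtraction keeps "negative" diffs representable.
        out.extend_from_slice(&src.wrapping_sub(prev_src).to_le_bytes());
        out.extend_from_slice(&dst.wrapping_sub(prev_dst).to_le_bytes());
        prev_src = src;
        prev_dst = dst;
    }
    out
}

/// Inverse of `encode_diffs_le`; trailing bytes that do not form a full
/// 16-byte record are ignored.
pub fn decode_diffs_le(bytes: &[u8]) -> Vec<(u64, u64)> {
    let mut out = Vec::with_capacity(bytes.len() / 16);
    let (mut src, mut dst) = (0u64, 0u64);
    let mut word = [0u8; 8];
    for record in bytes.chunks_exact(16) {
        word.copy_from_slice(&record[..8]);
        src = src.wrapping_add(u64::from_le_bytes(word));
        word.copy_from_slice(&record[8..]);
        dst = dst.wrapping_add(u64::from_le_bytes(word));
        out.push((src, dst));
    }
    out
}
```

On sorted batches most diffs are small, so the high-order bytes of each word are mostly zero, which is the kind of redundancy a general-purpose compressor can exploit.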

@vigna
Owner

vigna commented Oct 11, 2025

Good idea. Also, at least in theory, it would be worth trying with wildly different numbers of partitions/processors, as that affects the distribution, too...

impl<L> RadixKey for Triple<L> {
const LEVELS: usize = 16;

fn get_level(&self, level: usize) -> u8 {
Owner

This can be arithmetized if pairs are [usize; 2] instead of (usize, usize), and this is how things were working in the previous code. Is there any particular reason to change it?

Collaborator Author

@zommiommy zommiommy Oct 12, 2025

The main reason is to make Triple #[repr(transparent)] so that the transmute is safe. Before, we were transmuting between (usize, usize, L) and

pub struct Triple<L> {
    pair: [usize; 2],
    label: L
}

We are not guaranteed that they have the same memory layout if the labels have an alignment bigger than 16 bytes. Example: https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=14c83315faf1a8830cb53aec0af4c803
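As a sketch of why `#[repr(transparent)]` helps: a transparent wrapper is guaranteed to share the layout of its single non-zero-sized field, so reinterpreting a slice of tuples is sound. The tuple-wrapping shape of `Triple` below, `WideLabel`, and the helper names are assumptions made for this example, not necessarily what the PR contains.

```rust
use std::mem::{align_of, size_of};

/// A label with 32-byte alignment, the problematic case mentioned above.
#[repr(align(32))]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct WideLabel(pub [u8; 32]);

/// Assumed shape: a transparent wrapper around the tuple itself, so both
/// types are guaranteed to have identical size and alignment.
#[repr(transparent)]
pub struct Triple<L>(pub (usize, usize, L));

/// Returns (size, alignment) of a type, handy for comparing layouts.
pub fn layout_of<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}

/// Reinterprets a slice of tuples as a slice of `Triple`s without copying.
pub fn as_triples<L>(tuples: &[(usize, usize, L)]) -> &[Triple<L>] {
    // SAFETY: `Triple<L>` is `repr(transparent)` over `(usize, usize, L)`,
    // so the element layout is identical and the pointer cast is valid.
    unsafe { std::slice::from_raw_parts(tuples.as_ptr().cast::<Triple<L>>(), tuples.len()) }
}
```

The same cast between the tuple and a separately declared two-field struct would rely on the compiler happening to pick the same field order and padding, which the default representation does not promise.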

Collaborator Author

And this version avoids the bounds check: https://godbolt.org/z/vKjh1aqj5

Collaborator Author

But by adding an unreachable_unchecked, I can remove the bounds check, and now the arithmetized version is slightly better: https://godbolt.org/z/Tfrsb69dj
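The idea in this thread can be sketched as follows. This is not the PR's code: `Key` is a hypothetical stand-in that uses `u64` instead of `usize` to stay platform-independent, and the byte ordering (destination bytes first, then source bytes) is an assumption.

```rust
use std::hint::unreachable_unchecked;

/// Hypothetical stand-in for the key type; only the pair is modeled.
pub struct Key {
    pub pair: [u64; 2],
}

impl Key {
    /// Eight bytes per word, two words.
    pub const LEVELS: usize = 16;

    /// Arithmetized variant: levels 0..8 read the bytes of `pair[1]`,
    /// levels 8..16 those of `pair[0]`. The array index is bounds-checked
    /// unless the compiler can prove `level < 16`.
    pub fn get_level(&self, level: usize) -> u8 {
        let word = self.pair[1 - level / 8];
        (word >> ((level % 8) * 8)) as u8
    }

    /// Same computation, with a hint that lets the optimizer drop the check.
    ///
    /// # Safety
    /// The caller must guarantee `level < Self::LEVELS`; otherwise the
    /// behavior is undefined.
    pub unsafe fn get_level_unchecked(&self, level: usize) -> u8 {
        if level >= Self::LEVELS {
            // SAFETY: ruled out by the caller's contract.
            unsafe { unreachable_unchecked() }
        }
        let word = self.pair[1 - level / 8];
        (word >> ((level % 8) * 8)) as u8
    }
}
```

Whether the hint actually pays off depends on the compiler version and on how the caller is inlined, so it is worth checking the generated assembly as done in the links above.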

@vigna
Owner

vigna commented Oct 12, 2025

> Given the results above, I believe that GroupedGaps<ExpGolomb(3), Gamma, Delta> could be a suitable default that utilises universal codes.

Frankly, it is difficult to read any hard evidence from this data, except that with very specifically selected codes grouped gaps perform better. But we don't have that luxury. I mean, at some point we're trying again to compress a graph.

GroupedGaps<Delta, Gamma, Delta> and GroupedGaps<Gamma, Gamma, Gamma> seem to perform pretty well in a range of cases.

These two lines are inexplicable to me:

eu-2015 | GroupedGaps<ExpGolomb(3), Gamma, Delta> | 5.14 | 3'428.260
eu-2015 | GroupedGaps<ExpGolomb(3), Golomb(2), Pi(2)> | 5.20 | 2'886.386

Golomb is really slow, although the dispatch system might have remapped it to Rice(1).

@vigna
Owner

vigna commented Oct 13, 2025

Or we can place a bet on the graph being web-like and use π₂ for the successors. That has a significant impact on space and it's fast. That's a bit like the ζ₃ default assumption we make for compression (even if technically the distribution is similar to ζ₄).

Maybe we should try this on a small SWH dataset to see what happens, at least for some combination of ɣ, δ and π₂ for successors. It would be really weird if we performed worse on our one and only source of financing. 😂

@vigna
Owner

vigna commented Oct 13, 2025

The docs are amazing BTW. Albeit here and there AI went a bit too far 😂.

@zommiommy
Collaborator Author

> These two lines are inexplicable to me:
>
> eu-2015 | GroupedGaps<ExpGolomb(3), Gamma, Delta> | 5.14 | 3'428.260
> eu-2015 | GroupedGaps<ExpGolomb(3), Golomb(2), Pi(2)> | 5.20 | 2'886.386
>
> Golomb is really slow. Albeit the dispatch system might remapped to Rice(1).

Yeah, indeed it's not Golomb(2) but Rice(1); here's why: https://github.com/vigna/dsi-bitstream-rs/blob/34ef68344607414875d9d9ac81f081e7b68fe8df/src/dispatch/static.rs#L104C6-L104C20

And yeah, I don't think the change of code makes it that much slower. I think it's an outlier, and I need to re-run the experiment. Perhaps there was an issue with my system.

@zommiommy
Collaborator Author

zommiommy commented Oct 15, 2025

While transposing the 2025-05-18 swh graph, using batches of 4'217'583'744 arcs (~48GB), I get:

| Codec | batch 1 bits/arc | batch 2 bits/arc | batch 3 bits/arc |
|---|---|---|---|
| Gaps<Gamma, Gamma> | 8.46 | 8.43 | 7.85 |
| Gaps<Gamma, Delta> | 7.92 | 7.90 | 7.47 |
| GroupedGaps<Gamma, Gamma, Gamma> | 7.77 | 7.69 | 7.12 |
| GroupedGaps<Delta, Gamma, Delta> | 7.29 | 7.21 | 6.77 |
| GroupedGaps<ExpGolomb(3), Gamma, Delta> | 7.27 | 7.20 | 6.76 |
| GroupedGaps<ExpGolomb(1), ExpGolomb(1), Omega> | 6.62 | 6.56 | 6.00 |

Another detail is that all these experiments are conducted on LLP-ed graphs, which definitely affects the distribution of gaps and thus the optimal codes.

As you suggested, I think GroupedGaps<Delta, Gamma, Delta> is good enough as a default.

@vigna
Owner

vigna commented Oct 15, 2025

OK, now I'm confused. Are these the only combinations you tried, or the best ones?

@zommiommy
Collaborator Author

zommiommy commented Oct 15, 2025

The ones I tried, on the first three batches. I haven't computed the best ones yet.

@zommiommy
Collaborator Author

The best codes for the grouped gaps on the first few batches of the latest swh graphs are:

Batch 1

Outdegree stats
  Code: ExpGolomb(1) Size:   826822146
  Code: ExpGolomb(2) Size:   983652894
  Code: Zeta(2)      Size:  1008979093
  Code: Gamma        Size:  1046122192
  Code: Zeta(1)      Size:  1046122192
  Code: ExpGolomb(0) Size:  1046122192
  Code: Omega        Size:  1078801532
  Code: ExpGolomb(3) Size:  1201024790
  Code: Zeta(3)      Size:  1220366868
  Code: Pi(2)        Size:  1233353382
Src stats
  Code: ExpGolomb(1) Size:   687969746
  Code: Zeta(2)      Size:   922633714
  Code: ExpGolomb(2) Size:   926210806
  Code: Omega        Size:   940796998
  Code: Gamma        Size:   946742420
  Code: Zeta(1)      Size:   946742420
  Code: ExpGolomb(0) Size:   946742420
  Code: Zeta(3)      Size:  1178647606
  Code: Pi(2)        Size:  1179326251
  Code: ExpGolomb(3) Size:  1184344760
Dst stats
  Code: Omega        Size: 26389279419
  Code: Zeta(2)      Size: 26456991952
  Code: ExpGolomb(1) Size: 27001073964
  Code: Pi(2)        Size: 28012185117
  Code: Delta        Size: 28529143231
  Code: Zeta(3)      Size: 28577104495
  Code: ExpGolomb(2) Size: 29485168268
  Code: Gamma        Size: 30777499162
  Code: Zeta(1)      Size: 30777499162
  Code: ExpGolomb(0) Size: 30777499162

Batch 2

Outdegree stats
  Code: ExpGolomb(1) Size:   652113812
  Code: ExpGolomb(2) Size:   819015090
  Code: Zeta(2)      Size:   834866657
  Code: Gamma        Size:   857479784
  Code: Zeta(1)      Size:   857479784
  Code: ExpGolomb(0) Size:   857479784
  Code: Omega        Size:   877467199
  Code: ExpGolomb(3) Size:  1024090268
  Code: Zeta(3)      Size:  1036255400
  Code: Pi(2)        Size:  1044289759
Src stats
  Code: ExpGolomb(1) Size:   591930392
  Code: Zeta(2)      Size:   800522374
  Code: ExpGolomb(2) Size:   802174478
  Code: Omega        Size:   817631278
  Code: Gamma        Size:   820035360
  Code: Zeta(1)      Size:   820035360
  Code: ExpGolomb(0) Size:   820035360
  Code: Zeta(3)      Size:  1024576369
  Code: Pi(2)        Size:  1025245511
  Code: ExpGolomb(3) Size:  1027365132
Dst stats
  Code: Omega        Size: 26424759521
  Code: Zeta(2)      Size: 26534666897
  Code: ExpGolomb(1) Size: 26920562118
  Code: Pi(2)        Size: 28139074459
  Code: Delta        Size: 28539926175
  Code: Zeta(3)      Size: 28619985395
  Code: ExpGolomb(2) Size: 29398177596
  Code: Gamma        Size: 30762480430
  Code: Zeta(1)      Size: 30762480430
  Code: ExpGolomb(0) Size: 30762480430

Batch 3

Outdegree stats
  Code: ExpGolomb(1) Size:   701958214
  Code: ExpGolomb(2) Size:   823630262
  Code: Zeta(2)      Size:   848584593
  Code: Gamma        Size:   881486766
  Code: Zeta(1)      Size:   881486766
  Code: ExpGolomb(0) Size:   881486766
  Code: Omega        Size:   912592540
  Code: ExpGolomb(3) Size:   998793882
  Code: Zeta(3)      Size:  1017715602
  Code: Pi(2)        Size:  1029976088
Src stats
  Code: ExpGolomb(1) Size:   546081978
  Code: Zeta(2)      Size:   748727998
  Code: ExpGolomb(2) Size:   752494884
  Code: Omega        Size:   758899516
  Code: Gamma        Size:   764892138
  Code: Zeta(1)      Size:   764892138
  Code: ExpGolomb(0) Size:   764892138
  Code: Zeta(3)      Size:   965445015
  Code: Pi(2)        Size:   965453262
  Code: ExpGolomb(3) Size:   970791896
Dst stats
  Code: Omega        Size: 24037769582
  Code: ExpGolomb(1) Size: 24387505744
  Code: Zeta(2)      Size: 24550143998
  Code: Pi(2)        Size: 26644930965
  Code: Delta        Size: 26736720380
  Code: Zeta(3)      Size: 26909967899
  Code: ExpGolomb(2) Size: 27455899062
  Code: Gamma        Size: 28367620096
  Code: Zeta(1)      Size: 28367620096
  Code: ExpGolomb(0) Size: 28367620096
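The sizes above appear to be total bits: for batch 1, the best outdegree, src, and dst entries sum to about 27.9 billion, which divided by the 4'217'583'744 arcs of a batch gives roughly 6.62 bits/arc, matching the GroupedGaps<ExpGolomb(1), ExpGolomb(1), Omega> row of the earlier table. A rough sketch of how such totals can be computed from code lengths follows; the function names are invented here, and values are assumed to be non-negative integers coded starting from zero (so the code for n is the classical code for n + 1).

```rust
/// floor(log2(x)) for x > 0.
fn floor_log2(x: u64) -> u64 {
    63 - u64::from(x.leading_zeros())
}

/// Length in bits of the Elias gamma code of n (n < u64::MAX).
pub fn gamma_len(n: u64) -> u64 {
    2 * floor_log2(n + 1) + 1
}

/// Length in bits of the Elias delta code of n (n < u64::MAX):
/// the bit length is gamma-coded, followed by the remaining bits.
pub fn delta_len(n: u64) -> u64 {
    let l = floor_log2(n + 1);
    l + gamma_len(l)
}

/// Length in bits of the exponential Golomb code with parameter k:
/// the upper part is gamma-coded, the k lowest bits are written verbatim.
pub fn exp_golomb_len(n: u64, k: u64) -> u64 {
    gamma_len(n >> k) + k
}

/// Total size in bits of a sequence under a given code-length function.
pub fn total_bits(values: &[u64], len: impl Fn(u64) -> u64) -> u64 {
    values.iter().map(|&v| len(v)).sum()
}
```

With k = 0 the exponential Golomb length reduces to the gamma length, consistent with the identical Gamma and ExpGolomb(0) sizes in the listings. Picking the best code for a component then amounts to evaluating `total_bits` for each candidate and keeping the minimum.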

@vigna vigna merged commit 6f920f6 into main Nov 14, 2025
7 checks passed