
hongtaicao/data-sets-surf-repository

 
 


LDBC benchmark data sets

The LDBC benchmark data sets are stored under SURF's CWI repositories.

💡 The LDBC SNB Business Intelligence (BI) workload's data sets are stored in Cloudflare R2. See the links to the BI data sets.

💡 The LDBC SNB Interactive v2 workload's data sets are stored in Cloudflare R2. See the links to the Interactive v2 data sets and update streams.

Usage

The data sets are stored on tape, so you may have to stage them before they can be downloaded. To do so, visit the repository of the data set and click "Request" for offline files. Staging a 20 GB file takes approximately 3-5 minutes, while staging a 200 GB file takes approximately 10-15 minutes.

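To download and decompress a data set in one step, pipe curl into tar with zstd decompression (this requires curl, tar, and zstd). Replace set_url_here with the URL of the data set.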

curl --silent --fail set_url_here | tar -xv --use-compress-program=unzstd
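The one-liner above can be wrapped so that each archive is unpacked into its own directory. The following is a minimal sketch, not part of this repository: the function names `data_set_name` and `fetch_and_extract` are hypothetical, and it assumes a tar that accepts `--use-compress-program` and `-C` (GNU tar and bsdtar both do).

```shell
# Derive a local name from a data set URL by dropping the path
# and the .tar.zst suffix (hypothetical helper).
data_set_name() {
  url="$1"
  file="${url##*/}"
  printf '%s\n' "${file%.tar.zst}"
}

# Stream the archive from the URL and unpack it into a target directory
# (defaults to a directory named after the data set).
# The exit status of the pipeline is that of tar, the last command.
fetch_and_extract() {
  url="$1"
  target_dir="${2:-$(data_set_name "${url}")}"
  mkdir -p "${target_dir}"
  curl --silent --fail "${url}" \
    | tar -xv --use-compress-program=unzstd -C "${target_dir}"
}
```

For the example URL used later in this README, `data_set_name` prints `social_network-csv_basic-longdateformatter-sf0.1`. Streaming avoids storing the compressed archive on disk, at the cost of having to download again if the extraction is interrupted.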

We provide the download-data-set.sh script, which attempts to download the data set and stages it to disk if necessary. Replace data_set_url with one of the URLs linked below in this README (right-click the link and select Copy Link Address).

./download-data-set.sh data_set_url

Example:

./download-data-set.sh https://repository.surfsara.nl/datasets/cwi/snb/files/social_network-csv_basic-longdateformatter/social_network-csv_basic-longdateformatter-sf0.1.tar.zst
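To fetch several data sets in one go, the script can be called once per URL. The sketch below assumes a hypothetical text file with one URL per line (for example `urls.txt`); the function name `download_all` is also an illustration and not part of this repository.

```shell
# Run a downloader command once per URL listed in a file.
# Blank lines and lines starting with # are skipped.
# The downloader defaults to ./download-data-set.sh.
download_all() {
  list_file="$1"
  downloader="${2:-./download-data-set.sh}"
  while IFS= read -r url; do
    case "${url}" in
      ''|'#'*) continue ;;
    esac
    # Redirect stdin so the downloader cannot consume the rest of the list.
    "${downloader}" "${url}" < /dev/null || {
      echo "failed: ${url}" >&2
      return 1
    }
  done < "${list_file}"
}
```

Calling `download_all urls.txt` stops at the first URL that fails, so a rerun after fixing the problem only needs the remaining URLs in the list.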

LDBC Graphalytics

📥 Repository

Graph and validation data sets

| data set | number of vertices | number of edges |
| --- | --- | --- |
| cit-Patents.tar.zst | 3774768 | 16518947 |
| com-friendster.tar.zst | 65608366 | 1806067135 |
| datagen-7_5-fb.tar.zst | 633432 | 34185747 |
| datagen-7_6-fb.tar.zst | 754147 | 42162988 |
| datagen-7_7-zf.tar.zst | 13180508 | 32791267 |
| datagen-7_8-zf.tar.zst | 16521886 | 41025255 |
| datagen-7_9-fb.tar.zst | 1387587 | 85670523 |
| datagen-8_0-fb.tar.zst | 1706561 | 107507376 |
| datagen-8_1-fb.tar.zst | 2072117 | 134267822 |
| datagen-8_2-zf.tar.zst | 43734497 | 106440188 |
| datagen-8_3-zf.tar.zst | 53525014 | 130579909 |
| datagen-8_4-fb.tar.zst | 3809084 | 269479177 |
| datagen-8_5-fb.tar.zst | 4599739 | 332026902 |
| datagen-8_6-fb.tar.zst | 5667674 | 421988619 |
| datagen-8_7-zf.tar.zst | 145050709 | 340157363 |
| datagen-8_8-zf.tar.zst | 168308893 | 413354288 |
| datagen-8_9-fb.tar.zst | 10572901 | 848681908 |
| datagen-9_0-fb.tar.zst | 12857671 | 1049527225 |
| datagen-9_1-fb.tar.zst | 16087483 | 1342158397 |
| datagen-9_2-zf.tar.zst | 434943376 | 1042340732 |
| datagen-9_3-zf.tar.zst | 555270053 | 1309998551 |
| datagen-9_4-fb.tar.zst | 29310565 | 2588948669 |
| datagen-sf10k-fb.tar.zst | 33484375 | 2912009743 |
| datagen-sf3k-fb.tar.zst | 100218750 | 9404822538 |
| dota-league.tar.zst | 61170 | 50870313 |
| example-directed.tar.zst | 10 | 17 |
| example-undirected.tar.zst | 9 | 12 |
| graph500-22.tar.zst | 2396657 | 64155735 |
| graph500-23.tar.zst | 4610222 | 129333677 |
| graph500-24.tar.zst | 8870942 | 260379520 |
| graph500-25.tar.zst | 17062472 | 523602831 |
| graph500-26.tar.zst | 32804978 | 1051922853 |
| graph500-27.tar.zst | 63081040 | 2111642032 |
| graph500-28.tar.zst | 121242388 | 4236163958 |
| graph500-29.tar.zst | 232999630 | 8493569115 |
| graph500-30.tar.zst | 447797986 | 17022117362 |
| kgs.tar.zst | 832247 | 17891698 |
| twitter_mpi.tar.zst | 52579678 | 1963263508 |
| wiki-Talk.tar.zst | 2394385 | 5021410 |
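The edge counts span several orders of magnitude, so it can help to shortlist the data sets that fit a given budget before downloading anything. The following is a small sketch, assuming the rows have been copied into plain text as name, vertices, and edges separated by whitespace; the function name `data_sets_within_edges` is hypothetical.

```shell
# Print the names of data sets whose edge count is at most the given budget.
# Input on stdin: whitespace-separated rows of "name vertices edges".
# Rows whose third field is not a plain integer (e.g. a header) are ignored.
data_sets_within_edges() {
  max_edges="$1"
  awk -v max="${max_edges}" \
    '$3 ~ /^[0-9]+$/ && $3 + 0 <= max + 0 { print $1 }'
}

# Usage with four rows taken from the table above and a 20 million edge budget.
data_sets_within_edges 20000000 <<'EOF'
cit-Patents.tar.zst 3774768 16518947
kgs.tar.zst 832247 17891698
graph500-22.tar.zst 2396657 64155735
wiki-Talk.tar.zst 2394385 5021410
EOF
```

With these four rows, the call prints cit-Patents.tar.zst, kgs.tar.zst, and wiki-Talk.tar.zst, one per line; graph500-22.tar.zst exceeds the budget and is left out.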

Graphs as sparse matrices in Matrix Market format

| data set | number of vertices | number of edges |
| --- | --- | --- |
| matrix-market/cit-Patents.tar.zst | 3774768 | 16518947 |
| matrix-market/com-friendster.tar.zst | 65608366 | 1806067135 |
| matrix-market/datagen-7_5-fb-bool.tar.zst | 633432 | 34185747 |
| matrix-market/datagen-7_5-fb-fp64.tar.zst | 633432 | 34185747 |
| matrix-market/datagen-7_6-fb-bool.tar.zst | 754147 | 42162988 |
| matrix-market/datagen-7_6-fb-fp64.tar.zst | 754147 | 42162988 |
| matrix-market/datagen-7_7-zf-bool.tar.zst | 13180508 | 32791267 |
| matrix-market/datagen-7_7-zf-fp64.tar.zst | 13180508 | 32791267 |
| matrix-market/datagen-7_8-zf-bool.tar.zst | 16521886 | 41025255 |
| matrix-market/datagen-7_8-zf-fp64.tar.zst | 16521886 | 41025255 |
| matrix-market/datagen-7_9-fb-bool.tar.zst | 1387587 | 85670523 |
| matrix-market/datagen-7_9-fb-fp64.tar.zst | 1387587 | 85670523 |
| matrix-market/datagen-8_0-fb-bool.tar.zst | 1706561 | 107507376 |
| matrix-market/datagen-8_0-fb-fp64.tar.zst | 1706561 | 107507376 |
| matrix-market/datagen-8_1-fb-bool.tar.zst | 2072117 | 134267822 |
| matrix-market/datagen-8_1-fb-fp64.tar.zst | 2072117 | 134267822 |
| matrix-market/datagen-8_2-zf-bool.tar.zst | 43734497 | 106440188 |
| matrix-market/datagen-8_2-zf-fp64.tar.zst | 43734497 | 106440188 |
| matrix-market/datagen-8_3-zf-bool.tar.zst | 53525014 | 130579909 |
| matrix-market/datagen-8_3-zf-fp64.tar.zst | 53525014 | 130579909 |
| matrix-market/datagen-8_4-fb-bool.tar.zst | 3809084 | 269479177 |
| matrix-market/datagen-8_4-fb-fp64.tar.zst | 3809084 | 269479177 |
| matrix-market/datagen-8_5-fb-bool.tar.zst | 4599739 | 332026902 |
| matrix-market/datagen-8_5-fb-fp64.tar.zst | 4599739 | 332026902 |
| matrix-market/datagen-8_6-fb-bool.tar.zst | 5667674 | 421988619 |
| matrix-market/datagen-8_6-fb-fp64.tar.zst | 5667674 | 421988619 |
| matrix-market/datagen-8_7-zf-bool.tar.zst | 145050709 | 340157363 |
| matrix-market/datagen-8_7-zf-fp64.tar.zst | 145050709 | 340157363 |
| matrix-market/datagen-8_8-zf-bool.tar.zst | 168308893 | 413354288 |
| matrix-market/datagen-8_8-zf-fp64.tar.zst | 168308893 | 413354288 |
| matrix-market/datagen-8_9-fb-bool.tar.zst | 10572901 | 848681908 |
| matrix-market/datagen-8_9-fb-fp64.tar.zst | 10572901 | 848681908 |
| matrix-market/datagen-9_0-fb-bool.tar.zst | 12857671 | 1049527225 |
| matrix-market/datagen-9_0-fb-fp64.tar.zst | 12857671 | 1049527225 |
| matrix-market/datagen-9_1-fb-bool.tar.zst | 16087483 | 1342158397 |
| matrix-market/datagen-9_1-fb-fp64.tar.zst | 16087483 | 1342158397 |
| matrix-market/datagen-9_2-zf-bool.tar.zst | 434943376 | 1042340732 |
| matrix-market/datagen-9_2-zf-fp64.tar.zst | 434943376 | 1042340732 |
| matrix-market/datagen-9_3-zf-bool.tar.zst | 555270053 | 1309998551 |
| matrix-market/datagen-9_3-zf-fp64.tar.zst | 555270053 | 1309998551 |
| matrix-market/datagen-9_4-fb-bool.tar.zst | 29310565 | 2588948669 |
| matrix-market/datagen-9_4-fb-fp64.tar.zst | 29310565 | 2588948669 |
| matrix-market/datagen-sf10k-fb-bool.tar.zst | 33484375 | 2912009743 |
| matrix-market/datagen-sf10k-fb-fp64.tar.zst | 33484375 | 2912009743 |
| matrix-market/datagen-sf3k-fb-bool.tar.zst | 100218750 | 9404822538 |
| matrix-market/datagen-sf3k-fb-fp64.tar.zst | 100218750 | 9404822538 |
| matrix-market/dota-league-bool.tar.zst | 61170 | 50870313 |
| matrix-market/dota-league-fp64.tar.zst | 61170 | 50870313 |
| matrix-market/example-directed-bool.tar.zst | 10 | 17 |
| matrix-market/example-directed-fp64.tar.zst | 10 | 17 |
| matrix-market/example-undirected-bool.tar.zst | 9 | 12 |
| matrix-market/example-undirected-fp64.tar.zst | 9 | 12 |
| matrix-market/graph500-22.tar.zst | 2396657 | 64155735 |
| matrix-market/graph500-23.tar.zst | 4610222 | 129333677 |
| matrix-market/graph500-24.tar.zst | 8870942 | 260379520 |
| matrix-market/graph500-25.tar.zst | 17062472 | 523602831 |
| matrix-market/graph500-26.tar.zst | 32804978 | 1051922853 |
| matrix-market/graph500-27.tar.zst | 63081040 | 2111642032 |
| matrix-market/graph500-28.tar.zst | 121242388 | 4236163958 |
| matrix-market/graph500-29.tar.zst | 232999630 | 8493569115 |
| matrix-market/graph500-30.tar.zst | 447797986 | 17022117362 |
| matrix-market/kgs-bool.tar.zst | 832247 | 17891698 |
| matrix-market/kgs-fp64.tar.zst | 832247 | 17891698 |
| matrix-market/twitter_mpi.tar.zst | 52579678 | 1963263508 |
| matrix-market/wiki-Talk.tar.zst | 2394385 | 5021410 |
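After extracting one of these archives, a quick sanity check is to read the size line of the Matrix Market file. In the Matrix Market coordinate format, lines starting with `%` form the banner and comments, and the first line after them holds the number of rows, columns, and stored entries. The sketch below relies only on that layout; the function name `mm_size` and the file name `graph.mtx` are hypothetical, and the file names inside the archives are not specified here. The entry count may not equal the edge count in the table, for instance if an undirected graph is stored in symmetric form.

```shell
# Print "rows columns entries" from the size line of a Matrix Market file.
# Lines starting with % are the banner and comments; the first other
# non-empty line is the size line.
mm_size() {
  awk '!/^%/ && NF > 0 { print $1, $2, $3; exit }' "$1"
}

# Usage (graph.mtx is a placeholder for an extracted file):
#   mm_size graph.mtx
```

Because awk exits after the first matching line, the check takes about the same time on the largest graphs as on the example ones.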

Social Network Benchmark (SNB) Interactive v1

📥 Repository

SNB Interactive v1: CsvBasic serializer using LongDateFormatter

These data sets were incorrectly generated (see the related issue), so we removed their links. The correctly generated data sets will be deployed in the autumn of 2022.

SNB Interactive v1: CsvBasic serializer using StringDateFormatter

SNB Interactive v1: CsvComposite serializer using LongDateFormatter

Unlike the other data sets that use the LongDateFormatter, these data sets were correctly generated. Feel free to use them.

SNB Interactive v1: CsvComposite serializer using StringDateFormatter

SNB Interactive v1: CsvCompositeMergeForeign serializer using LongDateFormatter

These data sets were incorrectly generated (see the related issue), so we removed their links. The correctly generated data sets will be deployed in the autumn of 2022.

SNB Interactive v1: CsvCompositeMergeForeign serializer using StringDateFormatter

SNB Interactive v1: CsvMergeForeign serializer using LongDateFormatter

These data sets were incorrectly generated (see the related issue), so we removed their links. The correctly generated data sets will be deployed in the autumn of 2022.

SNB Interactive v1: CsvMergeForeign serializer using StringDateFormatter

SNB Interactive v1: TTL serializer

Substitution parameters

All: substitution_parameters.tar.zst

Update streams

SF0.1

SF0.3

SF1

SF3

SF10

SF30

SF100

SF300

SF1000


Labelled Subgraph Query Benchmark (LSQB)

📥 Repository

Merged FK

Projected FK


SIGMOD 2014 Programming Contest

📥 Repository

Data sets used in the original contest

New data sets


Social Network Benchmark (SNB) Business Intelligence (BI)

📥 Repository

TBA
