zskong/Multi-view-data

📚 Multi-View Datasets Repository

A curated collection of source code and datasets for Multi-view Learning & Clustering.


📖 Introduction

This repository collects source code and benchmark datasets for multi-view clustering research. Each dataset is listed with its sample count, number of views, number of clusters, and per-view feature dimensions.

⚖️ Rights: The author reserves the right to interpret and maintain this repository.


🗺️ Dataset Navigation

Datasets are grouped by sample size into the three sections below.


🟒 Small-scale Datasets

Sample size < 1,000

Dataset Samples Views Clusters Dimensions Source Note

CESC (Cervical squamous cell carcinoma) 124 4 3 2000/2000/311/219 Link Multi-omics (mDNA, RNA, miRNA, RPPA)
Yale 165 3 15 4096/3304/6750 Link
3-Sources ⭐ 169 3 6 3560/3631/3068 Link
TwoMoon 200 2 2 2/2 Synthetic dataset
webkb 203 3 4 1703/230/230 Link
Sonar 208 3 2 20/20/20
MSRC 210 5 7 24/576/512/256/254 Link
MSRCV1 210 6 7 1302/48/512/100/256/210 Link
GBM (Glioblastoma multiforme) 248 3 4 534/5000/12042 Link Multi-omics (Gene expression, miRNA, DNA)
LGG (Lower grade glioma) 267 4 3 2000/2000/333/209 Link Multi-omics
ThreeRing 300 2 3 2/2 Synthetic dataset
Dermatology 366 2 6 11/22 Link
BRCA (Breast adenocarcinoma) 398 4 4 2000/2000/278/212 Link Multi-omics
ORL 400 3 40 4096/3304/6750 Link
ORL 400 4 40 512/59/864/254 Link
NGs ⭐ 500 3 5 2000/2000/2000 Link
20newsgroups 500 3 5 2000/2000/2000 Link
Caltech101_3view 512 3 11 254/512/36 Link
Forest 523 2 4 9/18 Link
BBCSport ⭐ 544 2 5 3183/3203 Link
Notting-Hill ⭐ 550 3 5 2000/3304/6750 Link
Prokaryotic 551 3 4 438/3/393 Link
Reuters 600 5 6 (needs preprocessing) Link
synthetic3d 600 3 3 3/3/3 Synthetic dataset
CUB 600 2 10 1024/300 Link
Movie 617 2 17 1878/1398 Link
YaleB 650 3 10 2500/3304/6750 Link
BBC4view_685 685 4 5 4659/4633/4665/4684 Link
WikipediaArticles 693 2 10 128/10 Link
ProteinFold 694 12 27 27/.../27 Link
Oxford 800 3 4 1764/10/128 Link
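
To make the table columns concrete, here is a minimal Python sketch of how one row could be read. It assumes each view is stored as a samples × features matrix, so the Dimensions column lists one feature size per view; the actual files may use a different orientation or layout. The `DatasetEntry` class and the helper names are hypothetical and not part of this repository.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DatasetEntry:
    """One table row. Class and field names are illustrative only."""
    name: str
    samples: int
    views: int
    clusters: int
    dims: List[int]


def parse_dims(cell: str) -> List[int]:
    """Split a Dimensions cell such as '4096/3304/6750' into per-view feature sizes."""
    return [int(part) for part in cell.split("/")]


def expected_shapes(entry: DatasetEntry) -> List[Tuple[int, int]]:
    """Return the (samples, features) shape assumed for each view."""
    if len(entry.dims) != entry.views:
        raise ValueError(
            f"{entry.name}: {entry.views} views but {len(entry.dims)} dimensions"
        )
    return [(entry.samples, d) for d in entry.dims]


# Values copied from the Yale row of the table above.
yale = DatasetEntry("Yale", 165, 3, 15, parse_dims("4096/3304/6750"))
print(expected_shapes(yale))
```

Comparing these expected shapes with the arrays actually loaded from a dataset file is a quick sanity check before running a clustering method. Cells that elide values (such as `27/.../27`) or hold a placeholder (`-`) cannot be parsed this way and need manual handling.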

🟑 Medium-scale Datasets

Sample size 1,000 - 10,000

Dataset Samples Views Clusters Dimensions Source Note
WebKB2 1051 2 2 2949/334 Link
Reuters 1200 5 6 2000/2000/2000/2000/2000 Link
Flower17 ⭐ 1360 7 17 1360/.../1360 Link
COIL20_pca 1440 3 20 30/19/30 Link
COIL20 ⭐ 1440 3 20 4096/3304/6750 Link
FuCOIL20 1440 3 20 1024/1024/324 Link
RGB-D ⭐ 1449 2 13 2048/300 Link
Caltech101-7 ⭐ 1474 6 7 48/40/254/1984/512/928 Link
GRAZ02 1476 6 4 512/32/256/500/500/680 Link
Reuters_21578 1500 5 6 21531/24892/34251/15506/11547 Link
Youtube 1592 2 11 - Link
100Leaves 1600 3 100 64/64/64 Link
UCI-Digits ⭐ 2000 3 10 64/76/216 Link
HW2sources 2000 2 10 786/256 Link
Handwritten 2000 6 10 64/76/216/6/240/47 Link
Mfeat 2000 6 10 64/76/216/6/240/47 Link
NUS_WIDE 2000 5 31 65/226/145/74/129 Link
MNIST 2000 3 10 30/9/30 Link
LandUse-21 ⭐ 2100 3 21 20/59/40 Link
Caltech101-20 ⭐ 2386 6 20 48/40/254/1984/512/928 Link
YaleB_Extend (visualization) 2424 5 38 1024/1024/1024/1024/1024 Link
NUS 2400 6 12 64/144/73/128/225/500 Link
2V_BDGP 2500 2 5 1750/79 Link
BDGP_fea 2500 3 5 1000/500/250 Link
Toydata_5 (visualization) 2500 2 5 2/2 Synthetic dataset
Scene 2688 4 8 512/432/256/48 Link
Cora 2708 2 7 1433/2708 Link
Wiki_fea 2866 2 10 128/10 Link
Toydata_3 (visualization) 3000 2 3 2/2 Synthetic dataset
CiteSeer 3312 2 6 3312/3703 Link
ImageNet 4000 3 4 1764/10/128 Link
Scene15 ⭐ 4485 3 15 20/59/40 Link
NH_p4660 4660 3 5 2000/3304/6750 Link
2V_MNIST_USPS (visualization) 5000 2 10 784/784 Link
MITIndoor 5360 4 67 1770/3600/1240/4096 Link
VOC ⭐ (PASCAL VOC 2007) 5649 2 20 512/399 Link
CCV ⭐ 6773 3 20 20/20/20 Link
Caltech101-all 8677 4 101 3540/4800/1240/2048 Link
Caltech101-all_fea ⭐ 9144 5 102 48/40/254/512/928 Link
Fashion (visualization) 10000 3 10 784/784/784 Link
MNIST (small dimension) 10000 3 10 30/9/30 Link
Hdigit 10000 2 10 784/256 -
Mfeat 10000 2 10 784/256 Link
CIFAR10_deep 10000 4 10 1000/1000/1000/2048 -

🔴 Large-scale Datasets

Sample size > 10,000

Dataset Samples Views Clusters Dimensions Source Note
SUNRGBD 10335 2 45 4096/4096 -
ALOI100 ⭐ 10800 4 100 77/13/64/125 Link
Animal 11673 4 20 2689/2000/2001/2000 Link
STL-10 ⭐ 13000 3 10 1024/512/2048 Link
Reuters 18758 5 6 21531/24892/34251/15506/11547 Link
Cifar10-4 20000 3 4 324/10/128 Link
NUSWIDEOBJ ⭐ 30000 5 31 65/226/145/74/129 Link
MoisyMNIST ⭐ 30000 2 10 784/784 Link
AwA_fea ⭐ 30475 4 20 2688/2000/252/2000/2000/2000 Link
Caltech256_fea 30607 3 257 1024/512/2048 Link
VGGFace2-50 ⭐ 34027 4 50 944/576/512/640 Link
Cifar10-8 40000 3 8 324/10/128 Link
CIFAR10 50000 3 10 1024/512/2048 Link
CIFAR100 50000 3 100 1024/512/2048 Link
Noisy Mnist 50000 2 10 784/784/784 Link
fmnist ⭐ 60000 3 10 1280/512/512 Link
MNIST 60000 3 10 342/1024/64 | 784/784 Link
VGGFace4-100 72283 4 200 944/576/512/640 Link
tinyimage 100000 3 200 1280/512/512 Link
YoutubeFace ⭐ 101499 5 31 64/512/64/647/838 Link
YouTubeFace50 126054 4 50 944/576/512/640 Link
YouTube 152549 3 65 1024/768/1152 Link
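
The three size groups used in this README can be written as a small helper, sketched below in Python. The function name `size_group` is hypothetical. The boundary handling follows the tables: a dataset with exactly 10,000 samples (for example Fashion) is listed as medium-scale.

```python
def size_group(n_samples: int) -> str:
    """Map a sample count to the size group used in this README."""
    if n_samples < 1_000:
        return "small"    # Sample size < 1,000
    if n_samples <= 10_000:
        return "medium"   # Sample size 1,000 - 10,000
    return "large"        # Sample size > 10,000


# Sample counts taken from the tables above.
for name, n in [("Oxford", 800), ("Fashion", 10_000), ("SUNRGBD", 10_335)]:
    print(name, size_group(n))
```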

🔗 Acknowledgements

Special thanks to the following sources for their contributions to the multi-view community:
