# Datasets Guide

- The `DatasetLoader` class provides a simple interface to access example multi-omics datasets included in this package.

    - Each dataset is loaded as a collection of **pandas DataFrames**, with table names as keys and the corresponding data as values.

    - Users can explore the structure of any dataset via the `.shape` property, which returns a mapping from table name to `(rows, columns)`. 

In BioNeuralNet, `rows` represent subjects/patients and `columns` represent omics features.
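
The structure described above can be pictured with a small stand-in built from plain pandas. This is only a sketch: the `tables` dictionary and its random values are made up for illustration, and the `shapes` mapping merely imitates what the `.shape` property is described as returning.

```python
import numpy as np
import pandas as pd

# Toy stand-in for a loaded dataset: table name -> DataFrame.
# Table and column names echo the `example` dataset; the values are random.
rng = np.random.default_rng(0)
samples = [f"Samp_{i}" for i in range(1, 5)]
tables = {
    "X1": pd.DataFrame(
        rng.normal(size=(4, 3)),
        index=samples,
        columns=["Gene_1", "Gene_2", "Gene_3"],
    ),
    "Y": pd.DataFrame({"phenotype": rng.normal(size=4)}, index=samples),
}

# Mapping from table name to (rows, columns), i.e. (subjects, features).
shapes = {name: df.shape for name, df in tables.items()}
for name, (rows, cols) in shapes.items():
    print(f"{name}: {rows} subjects x {cols} features")
```

Each table keeps subjects on the row axis, so tables from the same dataset can be matched through their row index.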

**Five datasets are available out-of-the-box:**

1. **`brca`**:

    - Breast Cancer cohort dataset from The Cancer Genome Atlas (TCGA).
    - Provides comprehensive omics DataFrames: `rna`, `mirna`, `meth`, `pam50`, `clinical`.
    - Full dataset description available at: [https://bioneuralnet.readthedocs.io/TCGA-BRCA](https://bioneuralnet.readthedocs.io/en/latest/notebooks/TCGA-BRCA.html)

2. **`lgg`**:

    - Brain Cancer (GBM + LGG) cohort dataset from The Cancer Genome Atlas (TCGA).
    - Provides the processed, feature-selected omics DataFrames: `rna`, `mirna`, `meth`, `clinical`, and `target` (histological subtype).
    - Full dataset description available at: [https://bioneuralnet.readthedocs.io/TCGA-GBMLGG](https://bioneuralnet.readthedocs.io/en/latest/notebooks/TCGA-GBMLGG.html)

3. **`kipan`**:

    - Kidney Cancer (KIRC + KIRP + KICH) cohort dataset from The Cancer Genome Atlas (TCGA).
    - Provides the processed, feature-selected omics DataFrames: `rna`, `mirna`, `meth`, `clinical`, and `target` (histological subtype).
    - Full dataset description available at: [https://bioneuralnet.readthedocs.io/TCGA-KIPAN](https://bioneuralnet.readthedocs.io/en/latest/notebooks/TCGA-KIPAN.html)

4. **`example`**:

    - Synthetic dataset designed for testing and demonstration.
    - Contains small DataFrames: `X1`, `X2`, `Y`, `clinical_data`.
    - Useful for quick checks of package functionality.

5. **`monet`**:

    - Multi-omics benchmark dataset from the **Multi-Omics NETwork Analysis Workshop (MONET)**.
    - Includes multiple DataFrames: `gene_data`, `mirna_data`, `phenotype`, `rppa_data`, `clinical_data`.
    - Workshop details: [https://coloradosph.cuanschutz.edu/research-and-practice/centers-programs/cida/learning/multi-omics-network-analysis-workshop](https://coloradosph.cuanschutz.edu/research-and-practice/centers-programs/cida/learning/multi-omics-network-analysis-workshop)

In [1]:
from bioneuralnet.datasets import DatasetLoader

# Print the (rows, columns) shape of every table in each bundled dataset.
for name in ["brca", "lgg", "kipan", "example", "monet"]:
    ds = DatasetLoader(name)
    print(f"{name} shapes:\n")
    # ds.shape maps each table name to its (rows, columns) tuple.
    for tbl, (rows, cols) in ds.shape.items():
        print(f"{tbl}: {rows} x {cols}")
    print("\n")

brca shapes:

mirna: 769 x 503
pam50: 769 x 1
clinical: 769 x 118
rna: 769 x 1687
meth: 769 x 1605


lgg shapes:

mirna: 511 x 548
target: 511 x 1
clinical: 511 x 13
rna: 511 x 2127
meth: 511 x 1823


kipan shapes:

mirna: 658 x 472
target: 658 x 1
clinical: 658 x 19
rna: 658 x 2284
meth: 658 x 2102


example shapes:

X1: 358 x 500
X2: 358 x 100
Y: 358 x 1
clinical_data: 358 x 6


monet shapes:

gene_data: 107 x 5039
mirna_data: 107 x 789
phenotype: 106 x 1
rppa_data: 107 x 175
clinical_data: 107 x 5
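
In the shapes printed above, the `monet` `phenotype` table has 106 rows while the other `monet` tables have 107, so tables may need to be aligned before they are used together. The following is a minimal sketch of one way to do that, using hypothetical toy tables and a hypothetical helper named `align_on_index`. It assumes the row index identifies the sample, which should be verified for each dataset.

```python
import pandas as pd

# Hypothetical toy tables; sample "S3" is missing from the phenotype table.
omics = pd.DataFrame(
    {"g1": [0.1, 0.2, 0.3], "g2": [1.0, 1.1, 1.2]},
    index=["S1", "S2", "S3"],
)
phenotype = pd.DataFrame({"label": [1, 0]}, index=["S1", "S2"])


def align_on_index(tables):
    """Keep only the rows whose index label appears in every table."""
    shared = None
    for df in tables.values():
        shared = df.index if shared is None else shared.intersection(df.index)
    # Selecting with the shared index gives every table the same row order.
    return {name: df.loc[shared] for name, df in tables.items()}


aligned = align_on_index({"omics": omics, "phenotype": phenotype})
print({name: df.shape for name, df in aligned.items()})
```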




## TCGA-GBMLGG

In [2]:
from bioneuralnet.datasets import DatasetLoader

lgg_data = DatasetLoader("lgg")

dna_meth = lgg_data.data["meth"]
rna = lgg_data.data["rna"]
mirna = lgg_data.data["mirna"]
target = lgg_data.data["target"]
clinical = lgg_data.data["clinical"]

display(dna_meth.iloc[:, :5])
display(rna.iloc[:, :5])
display(mirna.iloc[:, :5])
display(target.head())
display(clinical.iloc[:, :5])

patient,LASS4,IQGAP1,LOC150776,PDK1,EDEM3
TCGA-CS-4938,-1.387171,-0.677831,-2.963635,-1.499770,-1.783686
TCGA-CS-4941,-2.111112,-1.346418,-3.884122,-2.028158,-1.747969
TCGA-CS-4942,-1.547153,-0.571504,-2.869444,-1.859408,-1.804540
TCGA-CS-4943,-1.035440,-0.785394,-3.078968,-2.000622,-1.622944
TCGA-CS-4944,-1.596503,-0.578981,-3.306971,-1.621030,-1.671006
...,...,...,...,...,...
TCGA-WY-A85A,-1.474161,-0.777362,-3.467333,-1.753156,-1.824650
TCGA-WY-A85B,-1.123543,0.188325,-3.135227,-1.523323,-1.711966
TCGA-WY-A85C,-1.568864,-0.590647,-3.188085,-1.693936,-1.768378
TCGA-WY-A85D,-1.304971,-0.469292,-3.233014,-1.663317,-1.945929


patient,ZFYVE20_64145,CSRP2_1466,SUPT5H_6829,USP6NL_9712,LTC4S_4056
TCGA-CS-4938,11.434513,11.807614,11.688333,8.788577,4.847476
TCGA-CS-4941,10.801449,11.011596,11.644230,8.906849,6.828902
TCGA-CS-4942,11.370552,11.326134,12.340740,8.806553,6.602022
TCGA-CS-4943,11.504806,9.938341,12.315672,8.676744,6.520776
TCGA-CS-4944,10.856026,10.205406,12.345653,8.641854,5.917823
...,...,...,...,...,...
TCGA-WY-A85A,11.329665,9.863635,12.208914,9.443412,5.618738
TCGA-WY-A85B,11.343484,11.053101,12.161549,9.094003,7.382373
TCGA-WY-A85C,11.325764,10.274564,12.412241,9.064337,4.811014
TCGA-WY-A85D,10.988021,12.870207,12.136973,8.574267,6.876940


patient,hsa_let_7a_1,hsa_let_7a_2,hsa_let_7a_3,hsa_let_7b,hsa_let_7c
TCGA-CS-4938,12.622353,13.632728,12.651613,14.208930,14.376942
TCGA-CS-4941,11.809808,12.815815,11.820061,13.047853,11.955006
TCGA-CS-4942,11.113995,12.128618,11.165523,12.481790,11.858545
TCGA-CS-4943,10.887723,11.894748,10.928146,12.111898,11.737758
TCGA-CS-4944,11.794522,12.785454,11.816688,13.308937,13.326909
...,...,...,...,...,...
TCGA-WY-A85A,12.742532,13.735795,12.759925,13.285474,13.177777
TCGA-WY-A85B,12.890789,13.895735,12.899740,13.601504,13.596046
TCGA-WY-A85C,12.900637,13.903053,12.908947,13.607849,12.954439
TCGA-WY-A85D,12.663025,13.666785,12.685216,13.643331,13.520782


patient,target
TCGA-CS-4938,0
TCGA-CS-4941,0
TCGA-CS-4942,0
TCGA-CS-4943,0
TCGA-CS-4944,0


Patient_ID,years_to_birth,vital_status,days_to_death,days_to_last_followup,tumor_tissue_site
TCGA-CS-4938,31.0,0,,3574.0,central nervous system
TCGA-CS-4941,67.0,1,234.0,,central nervous system
TCGA-CS-4942,44.0,1,1335.0,,central nervous system
TCGA-CS-4943,37.0,1,1106.0,,central nervous system
TCGA-CS-4944,50.0,0,,1828.0,central nervous system
...,...,...,...,...,...
TCGA-WY-A85A,20.0,0,,1320.0,central nervous system
TCGA-WY-A85B,24.0,0,,1393.0,central nervous system
TCGA-WY-A85C,36.0,0,,1426.0,central nervous system
TCGA-WY-A85D,60.0,0,,1147.0,central nervous system
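
The tables above are all indexed by patient barcode, although the index is named `patient` in the omics tables and `Patient_ID` in the clinical table. As a rough sketch of how such tables could be combined into one feature matrix, the toy example below uses made-up patients and columns. Prefixing each column with its table name is one possible convention, not something the package is stated to do.

```python
import pandas as pd

patients = ["P1", "P2", "P3"]
rna = pd.DataFrame(
    {"GENE_A": [1.0, 2.0, 3.0]}, index=pd.Index(patients, name="patient")
)
mirna = pd.DataFrame(
    {"mir_a": [0.5, 0.6, 0.7]}, index=pd.Index(patients, name="patient")
)
clinical = pd.DataFrame(
    {"age": [31, 67, 44]}, index=pd.Index(patients, name="Patient_ID")
)

# keys= labels each block, so the origin of every feature stays visible.
features = pd.concat([rna, mirna], axis=1, keys=["rna", "mirna"])
features.columns = [f"{block}_{col}" for block, col in features.columns]

# join() matches on index labels, so the differing index names do not matter.
merged = features.join(clinical)
print(merged)
```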


## TCGA-KIPAN

In [None]:
from bioneuralnet.datasets import DatasetLoader

kipan_data = DatasetLoader("kipan")

dna_meth = kipan_data.data["meth"]
rna = kipan_data.data["rna"]
mirna = kipan_data.data["mirna"]
target = kipan_data.data["target"]
clinical = kipan_data.data["clinical"]

display(dna_meth.iloc[:, :5])
display(rna.iloc[:, :5])
display(mirna.iloc[:, :5])
display(target.head())
display(clinical.iloc[:, :5])

patient,C19orf22,C15orf41,APOL5,ADCY10,MIR762
TCGA-2K-A9WE,0.362406,-0.228636,1.823698,1.240816,-2.290156
TCGA-2Z-A9J1,0.107703,-0.631482,-0.145229,0.943370,-3.210824
TCGA-2Z-A9J2,0.416326,-0.142891,1.035523,1.129828,-1.963174
TCGA-2Z-A9J3,0.565581,-0.481517,0.623115,1.191905,-1.830991
TCGA-2Z-A9J5,0.404795,-0.411796,1.240255,1.217618,-2.055250
...,...,...,...,...,...
TCGA-Y8-A898,0.438930,-0.301226,1.359156,1.288669,-1.917179
TCGA-Y8-A8RY,0.439501,-0.478384,1.086509,1.580170,-2.041550
TCGA-Y8-A8RZ,0.590624,-0.597997,0.743282,1.490705,-1.910828
TCGA-Y8-A8S0,0.622166,-0.609677,1.017005,1.429859,-1.775636


patient,ARAP3_64411,GPRC5C_55890,RASGEF1C_255426,PLAC9_219348,C9orf169_375791
TCGA-2K-A9WE,7.044557,11.846287,5.431249,3.846283,4.402456
TCGA-2Z-A9J1,5.894144,13.086847,3.631104,3.446680,5.747207
TCGA-2Z-A9J2,6.141980,13.203909,1.309118,5.396553,3.494953
TCGA-2Z-A9J3,5.057325,12.857734,5.976707,2.234195,5.348059
TCGA-2Z-A9J5,5.858956,12.255214,5.507484,4.693899,4.703483
...,...,...,...,...,...
TCGA-Y8-A898,5.924912,12.791161,5.314860,4.339950,4.861171
TCGA-Y8-A8RY,7.617959,12.314035,2.499116,4.402812,4.720237
TCGA-Y8-A8RZ,5.817999,13.075635,3.072569,0.985136,3.359451
TCGA-Y8-A8S0,4.691071,12.561155,4.712766,2.734135,4.308310


patient,hsa_let_7a_1,hsa_let_7a_2,hsa_let_7a_3,hsa_let_7b,hsa_let_7c
TCGA-2K-A9WE,12.933499,13.933025,12.938528,12.861969,11.474055
TCGA-2Z-A9J1,12.535658,13.536437,12.531655,12.710724,10.355773
TCGA-2Z-A9J2,11.832278,12.838388,11.840725,11.038718,8.360210
TCGA-2Z-A9J3,12.557410,13.548913,12.564003,12.089286,11.283549
TCGA-2Z-A9J5,12.812767,13.813166,12.820922,12.838727,11.498841
...,...,...,...,...,...
TCGA-Y8-A898,11.899975,12.872997,11.899770,12.518892,10.930260
TCGA-Y8-A8RY,12.147298,13.149681,12.152644,11.780008,10.966539
TCGA-Y8-A8RZ,11.481351,12.460949,11.497565,10.073405,8.586229
TCGA-Y8-A8S0,12.866868,13.863945,12.875689,12.807527,12.631500


patient,target
TCGA-2K-A9WE,1
TCGA-2Z-A9J1,1
TCGA-2Z-A9J2,1
TCGA-2Z-A9J3,1
TCGA-2Z-A9J5,1


Patient_ID,years_to_birth,vital_status,days_to_death,days_to_last_followup,tumor_tissue_site
TCGA-2K-A9WE,53.0,0,,214.0,kidney
TCGA-2Z-A9J1,71.0,0,,2298.0,kidney
TCGA-2Z-A9J2,71.0,0,,1795.0,kidney
TCGA-2Z-A9J3,67.0,1,1771.0,,kidney
TCGA-2Z-A9J5,80.0,0,,3050.0,kidney
...,...,...,...,...,...
TCGA-Y8-A898,69.0,0,,475.0,kidney
TCGA-Y8-A8RY,63.0,0,,769.0,kidney
TCGA-Y8-A8RZ,55.0,0,,205.0,kidney
TCGA-Y8-A8S0,58.0,0,,183.0,kidney


## TCGA-BRCA

In [4]:
from bioneuralnet.datasets import DatasetLoader

brca = DatasetLoader("brca")
rna = brca.data["rna"]
mirna = brca.data["mirna"]
meth = brca.data["meth"]
pam50 = brca.data["pam50"]
clinical = brca.data["clinical"]

display(rna.iloc[:, :5])
display(mirna.iloc[:, :5])
display(meth.iloc[:, :5])
display(pam50.iloc[:, :5])
display(clinical.iloc[:, :5])

patient,OSR1_130497,ARRB1_408,RGS22_26166,SPNS2_124976,ZNF680_340252
TCGA-3C-AAAU,3.417434,9.593811,9.441951,7.337854,8.413068
TCGA-3C-AALI,3.706044,7.944446,1.928427,8.723833,7.950801
TCGA-3C-AALJ,3.765460,8.193957,2.180498,7.573980,8.224889
TCGA-3C-AALK,7.458101,8.453001,3.534136,8.581324,8.922770
TCGA-4H-AAAK,6.967012,8.706918,4.015266,7.324267,8.314234
...,...,...,...,...,...
TCGA-WT-AB44,7.775446,7.179717,2.873912,9.924990,8.199838
TCGA-XX-A899,7.597686,11.792323,5.513153,10.070808,8.398323
TCGA-XX-A89A,7.543688,9.484033,5.812796,8.864885,9.226318
TCGA-Z7-A8R5,8.102061,9.648889,3.163467,10.413356,8.210767


patient,hsa_let_7a_1,hsa_let_7a_2,hsa_let_7a_3,hsa_let_7b,hsa_let_7c
TCGA-3C-AAAU,13.129765,14.117933,13.147714,14.595135,8.414890
TCGA-3C-AALI,12.918069,13.922300,12.913194,14.512657,9.646536
TCGA-3C-AALJ,13.012033,14.010002,13.028483,13.419612,9.312455
TCGA-3C-AALK,13.144697,14.141721,13.151281,14.667196,11.511431
TCGA-4H-AAAK,13.411684,14.413518,13.420481,14.438548,11.693927
...,...,...,...,...,...
TCGA-WT-AB44,13.375715,14.366671,13.369827,14.514024,11.926315
TCGA-XX-A899,14.036155,15.036341,14.043313,14.339503,12.361761
TCGA-XX-A89A,13.679569,14.684855,13.691463,14.198207,12.684212
TCGA-Z7-A8R5,12.962088,13.966350,12.984897,14.320660,11.980246


patient,OR4N5,KIAA0196,RTF1,C19orf59,ZNF655
TCGA-3C-AAAU,-3.017241,-2.467556,-1.592611,0.441655,-2.574662
TCGA-3C-AALI,-2.474721,-2.396231,-1.366665,-0.325130,-2.190856
TCGA-3C-AALJ,-0.962452,-2.455018,-1.658474,0.677827,-2.224305
TCGA-3C-AALK,0.328606,-2.314768,-1.730472,-0.494891,-2.318108
TCGA-4H-AAAK,-2.592710,-2.444838,-1.546156,-0.459881,-2.313006
...,...,...,...,...,...
TCGA-WT-AB44,0.807513,-2.489575,-1.408843,-0.803121,-2.185531
TCGA-XX-A899,-1.632685,-2.256371,-1.707611,-0.587766,-2.238850
TCGA-XX-A89A,-1.559722,-2.290714,-1.717666,-0.772229,-2.366888
TCGA-Z7-A8R5,-0.490567,-2.065821,-1.456703,-0.927756,-2.037512


patient,pam50
TCGA-3C-AAAU,3
TCGA-3C-AALI,2
TCGA-3C-AALJ,4
TCGA-3C-AALK,3
TCGA-4H-AAAK,3
...,...
TCGA-WT-AB44,3
TCGA-XX-A899,3
TCGA-XX-A89A,3
TCGA-Z7-A8R5,3


patient,synchronous_malignancy,ajcc_pathologic_stage,days_to_diagnosis,laterality,created_datetime
TCGA-3C-AAAU,No,Stage X,0.0,Left,
TCGA-3C-AALI,No,Stage IIB,0.0,Right,
TCGA-3C-AALJ,No,Stage IIB,0.0,Right,
TCGA-3C-AALK,No,Stage IA,0.0,Right,
TCGA-4H-AAAK,No,Stage IIIA,0.0,Left,
...,...,...,...,...,...
TCGA-WT-AB44,No,Stage IA,0.0,Left,
TCGA-XX-A899,No,Stage IIIA,0.0,Right,
TCGA-XX-A89A,No,Stage IIB,0.0,Left,
TCGA-Z7-A8R5,No,Stage IIIA,0.0,Left,


## Example: Synthetic dataset

In [None]:
from bioneuralnet.datasets import DatasetLoader

example = DatasetLoader("example")
omics1 = example.data["X1"]
omics2 = example.data["X2"]
phenotype = example.data["Y"]
clinical = example.data["clinical_data"]

display(omics1.iloc[:, :5])
display(omics2.iloc[:, :5])
display(phenotype.iloc[:, :5])
display(clinical.iloc[:, :5])

,Gene_1,Gene_2,Gene_3,Gene_4,Gene_5
Samp_1,22.485701,40.353720,31.025745,20.847206,26.697293
Samp_2,37.058850,34.052233,33.487020,23.531461,26.754628
Samp_3,20.530767,31.669623,35.189567,20.952544,25.018826
Samp_4,33.186888,38.480880,18.897097,31.823300,34.049383
Samp_5,28.961981,41.060494,28.494956,18.374495,30.815238
...,...,...,...,...,...
Samp_354,24.520652,28.595409,31.299666,32.095379,33.659730
Samp_355,31.252789,28.988087,29.574195,31.189288,32.098841
Samp_356,24.894826,25.944887,30.852641,26.705158,30.102546
Samp_357,17.034337,38.574705,25.095201,37.062442,35.417758


,Mir_1,Mir_2,Mir_3,Mir_4,Mir_5
Samp_1,15.223913,17.545826,15.784719,14.891983,10.348205
Samp_2,16.306965,16.672830,13.361529,14.488549,12.660905
Samp_3,16.545119,16.735005,14.617472,17.845267,13.822790
Samp_4,13.986899,16.207432,16.293078,17.725286,12.300565
Samp_5,16.338332,17.393869,16.397925,15.853725,13.387675
...,...,...,...,...,...
Samp_354,15.065065,16.079830,14.635616,17.013845,11.612843
Samp_355,15.997576,15.448951,15.355566,16.501752,11.701778
Samp_356,15.206862,14.395378,16.218001,16.044955,13.650741
Samp_357,14.474129,15.482863,15.512549,15.136613,14.531277


,phenotype
Samp_1,235.067423
Samp_2,253.544991
Samp_3,234.204994
Samp_4,281.035429
Samp_5,245.447781
...,...
Samp_354,236.120451
Samp_355,222.572359
Samp_356,268.472285
Samp_357,235.808167


PatientID,Age,Gender,BMI,Chronic_Bronchitis,Emphysema
Samp_1,78,0,31.2,1,1
Samp_2,68,1,19.2,1,0
Samp_3,54,1,19.3,0,1
Samp_4,47,1,36.2,0,0
Samp_5,60,1,26.2,0,1
...,...,...,...,...,...
Samp_354,71,0,23.0,1,0
Samp_355,62,1,25.5,0,1
Samp_356,61,0,21.1,1,0
Samp_357,64,0,37.6,0,0


## MONET: Dataset from the **Multi-Omics NETwork Analysis Workshop (MONET)**, University of Colorado Anschutz

In [None]:
from bioneuralnet.datasets import DatasetLoader

monet = DatasetLoader("monet")
gene = monet.data["gene_data"]
mirna = monet.data["mirna_data"]
phenotype = monet.data["phenotype"]
rppa = monet.data["rppa_data"]
clinical = monet.data["clinical_data"]

display(gene.iloc[:, :5])
display(mirna.iloc[:, :5])
display(phenotype.iloc[:, :5])
display(rppa.iloc[:, :5])
display(clinical.iloc[:, :5])

,A2ML1,AACSL,AADAC,AADAT,AATK
0,0.466671,0.074845,0.990309,-0.410873,1.897562
1,-0.524465,-0.146727,-0.735206,-0.628456,0.170962
2,-0.029879,-0.626509,-0.735206,-0.677892,0.020060
3,0.674895,-0.626509,-0.330409,-0.662162,-0.911966
4,-0.110607,-0.626509,-0.735206,-0.848542,0.042645
...,...,...,...,...,...
102,0.999600,1.343979,-0.735206,1.742674,0.253103
103,-0.919337,-0.626509,-0.735206,-1.380461,-0.899019
104,-0.606702,-0.626509,0.497658,-0.717505,-0.625122
105,1.911346,0.021380,-0.166624,0.785542,0.344953


,hsa-let-7a-1,hsa-let-7a-2,hsa-let-7a-3,hsa-let-7b,hsa-let-7c
0,-0.832527,-0.851616,-0.837155,-1.079659,-0.181270
1,0.229155,0.249696,0.234295,0.859289,-0.057729
2,0.414268,0.417023,0.408913,0.635059,1.195203
3,-0.855214,-0.869152,-0.862713,-1.955447,-0.572552
4,1.365310,1.356252,1.351750,1.259095,-0.316760
...,...,...,...,...,...
102,-1.402001,-1.401125,-1.349961,-1.534386,1.231456
103,2.551277,2.547046,2.563191,1.054769,0.981436
104,0.182138,0.188730,0.191094,-1.060615,0.345907
105,0.289489,0.292470,0.297778,-0.197850,1.040270


Unnamed: 0,0
0,1
1,0
2,0
3,0
4,0
...,...
101,0
102,1
103,0
104,0


,YWHAE,YWHAZ,EIF4EBP1,TP53BP1,ARAF
0,-0.357998,0.099812,-1.067285,-0.412211,-0.357998
1,-0.055031,-0.517445,0.032633,-0.743096,-0.055031
2,-0.137863,-0.559690,0.302764,-0.968388,-0.137863
3,-0.170726,-0.028206,-0.341461,0.282581,-0.170726
4,-1.430765,-0.138087,-0.545894,-0.616864,-1.430765
...,...,...,...,...,...
102,-0.708685,-0.778813,1.623365,-0.090612,-0.708685
103,0.261442,-0.407563,-0.567735,-0.186919,0.261442
104,1.350866,1.461061,-1.159541,-1.674874,1.350866
105,0.179510,-0.300029,-1.048938,-0.621680,0.179510


,overall_survival,status,years_to_birth,race,radiation_therapy
0,3015,0,37,blackorafricanamerican,yes
1,2348,1,73,white,yes
2,3011,0,41,asian,yes
3,3283,0,67,white,no
4,1873,0,42,white,no
...,...,...,...,...,...
102,2329,0,63,white,yes
103,1004,1,74,white,yes
104,984,0,46,white,yes
105,867,0,44,white,no


## Generating an Omics Network

BioNeuralNet uses Graph Neural Networks (GNNs) to learn embeddings that power downstream applications. To build the graphs those models operate on, it supports a range of standard construction techniques for omics features and other biological entities:

- **Cosine similarity / RBF kernel graphs** (`gen_similarity_graph`)  
- **Pearson / Spearman correlation graphs** (`gen_correlation_graph`)  
- **Soft-threshold (WGCNA-style) graphs** (`gen_threshold_graph`)  
- **Gaussian k-NN graphs** (`gen_gaussian_knn_graph`)  
- **Mutual information graphs** (`gen_mutual_info_graph`)  
- **Graphical Lasso (sparse inverse covariance) graphs** (`gen_lasso_graph`)  
- **Minimum Spanning Tree (MST) graphs** (`gen_mst_graph`)  
- **Shared Nearest Neighbor (SNN) graphs** (`gen_snn_graph`)

For more details on all of these utilities, see the [utils documentation](https://bioneuralnet.readthedocs.io/en/latest/utils.html).  
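
To give a feel for what a correlation-based graph looks like, here is a minimal sketch written with plain NumPy and pandas. It is not the implementation behind `gen_correlation_graph` or `gen_threshold_graph`: the function name `correlation_adjacency`, its `threshold` and `power` parameters, and the toy data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def correlation_adjacency(X, threshold=0.5, power=1):
    """Feature-by-feature adjacency from absolute Pearson correlation.

    `power` > 1 applies a WGCNA-style soft threshold (|r| ** power);
    weights below `threshold` are then set to zero.
    """
    weights = np.abs(X.corr(method="pearson").to_numpy()) ** power
    weights[weights < threshold] = 0.0
    np.fill_diagonal(weights, 0.0)  # no self-loops
    return pd.DataFrame(weights, index=X.columns, columns=X.columns)


# Toy data: rows are subjects, columns are features.
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X = pd.DataFrame(
    {
        "gene_a": a,
        "gene_b": 2.0 * a + 1.0,  # linear in gene_a, so |r| is 1
        "gene_c": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],  # weakly related
    }
)

adj = correlation_adjacency(X, threshold=0.5)
print(adj.round(2))
```

Nodes are features (columns) and edge weights are the retained correlations, so only the `gene_a`/`gene_b` pair stays connected in this toy case.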
