Must input data be batch-effect corrected, or raw? #34

@Simon-chevolleau

I am using NetRep with batch-effect-corrected datasets (corrected with fastMNN); the data are corrected, centered, and scaled during the fastMNN process. For the adjacency, I simply use the function from WGCNA, and for the network I use the following:

# SAVE NETWORK AS EDGE/VERTICES FILE
edge_list <- as.data.frame(TOM)
edge_list$gene1 <- rownames(edge_list)

# RESHAPE THE DATA FROM WIDE TO LONG FORMAT
edge_list <- reshape(edge_list,
                     varying = names(edge_list)[-ncol(edge_list)],
                     v.names = "correlation",
                     timevar = "gene2",
                     times = names(edge_list)[-ncol(edge_list)],
                     idvar = "gene1",
                     direction = "long")

edge_list <- edge_list[edge_list$gene1 != edge_list$gene2, ]

edge_list$module1 <- gene_modules[edge_list$gene1, "Module"]
edge_list$module2 <- gene_modules[edge_list$gene2, "Module"]

# only keep correlations > 0.1
selection <- edge_list$correlation > 0.1
table(selection)
edge_list <- edge_list[selection, ]

edge_list <- unique(edge_list)
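
A side note on the reshape above: assuming `TOM` is symmetric (as a WGCNA topological overlap matrix normally is), the wide-to-long step lists every pair twice, once as gene1/gene2 and once as gene2/gene1, and `unique()` does not collapse those rows. The following is a minimal base-R sketch, not the method used in this issue, that takes only the strict upper triangle so each pair appears once, and then counts how many edges survive a few cutoffs. The matrix `toy_tom`, its values, and the names `toy_edges`, `toy_edges_kept`, and `edges_per_cutoff` are all hypothetical and chosen only for illustration.

```r
# Toy symmetric matrix standing in for TOM (values are made up for illustration).
genes <- c("g1", "g2", "g3", "g4")
toy_tom <- matrix(c(1.00, 0.30, 0.05, 0.20,
                    0.30, 1.00, 0.15, 0.02,
                    0.05, 0.15, 1.00, 0.40,
                    0.20, 0.02, 0.40, 1.00),
                  nrow = 4, dimnames = list(genes, genes))

# Row/column indices of the strict upper triangle: each unordered pair
# appears exactly once and the diagonal (self-pairs) is excluded.
idx <- which(upper.tri(toy_tom), arr.ind = TRUE)

toy_edges <- data.frame(
  gene1  = rownames(toy_tom)[idx[, "row"]],
  gene2  = colnames(toy_tom)[idx[, "col"]],
  weight = toy_tom[idx],          # matrix indexing by (row, col) pairs
  stringsAsFactors = FALSE
)

# Same cutoff as in the code above (0.1), applied to the deduplicated list.
toy_edges_kept <- toy_edges[toy_edges$weight > 0.1, ]

# Number of edges that survive each candidate cutoff.
cutoffs <- c(0.05, 0.1, 0.2, 0.3)
edges_per_cutoff <- sapply(cutoffs, function(k) sum(toy_edges$weight > k))
names(edges_per_cutoff) <- cutoffs
```

Tabulating edge counts over several cutoffs in this way only shows how sensitive the network size is to the threshold; it does not by itself say which threshold is best.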

The results from these data look quite strange; see my plots (the first was calculated on the reference dataset, the second on the test dataset).
Do you know what I did wrong? Should I keep only higher correlations for the network? How can I determine the best correlation threshold? Should I use the raw data, with the batch effect still in it?

Thanks for the package! It looks very promising.
Image
