essHIC.dist
stefanofranzini edited this page Sep 27, 2020
essHIC.dist(filename,metafile=None)
The dist class provides methods to analyze distance matrices. You can apply clustering and dimensional reduction techniques to the distance matrix, and assess the quality of the classifier by comparing the distances between HiC experiments with their known cell-type labels.
- filename: string
- path of the file containing the distance matrix.
- metafile: string, default=None
- path of the metadata file containing cell type information and a list of the outliers to remove.
- metafile: string
- the metadata file containing the cell types of the HiC matrices and a list of outliers.
- pseudo: numpy ndarray
- which experiments are pseudo-replicates according to the metadata file.
- col2lab: dictionary
- dictionary which turns integer "color" numbers into the corresponding cell type label.
- lab2col: dictionary
- dictionary which turns cell type labels into the corresponding integer "color" number.
- colors: numpy ndarray
- list of the integer "color" numbers of the experiments, which encode their cell type.
- mask: numpy ndarray
- which experiments are being removed from the distance matrix according to the metadata file.
- dist: numpy masked ndarray
- masked array which contains the distances between all pairs of experiments in the dataset. It masks the outliers according to the metadata file.
- mdist: numpy ndarray
- array which contains the distances between all pairs of experiments except the outliers given by the metadata file.
- mcol: numpy ndarray
- list of the integer "color" numbers of the experiments, except the outliers.
- mpsd: numpy ndarray
- list of the pseudo-replicate experiments after removing outliers.
- dlist: numpy ndarray
- list of the distances at which the ROC curve has been computed.
- roc: numpy ndarray
- ROC curve values.
- sim_map: numpy ndarray
- affinity map computed from the distance matrix.
- MDSrep: numpy ndarray
- n-dimensional positions of the experiments according to multidimensional scaling embedding.
- clusters: numpy ndarray
- list of the cluster labels.
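
The relationship between the label dictionaries, the outlier mask, and the masked and compact arrays can be illustrated with plain NumPy. This is only a sketch under stated assumptions: the toy cell types, the outlier flags, and the way the mask is expanded to two dimensions are hypothetical and are not taken from the essHIC source. The variable names merely mirror the attributes listed above.

```python
import numpy as np

# Hypothetical toy data: 5 experiments, their cell types, and one outlier.
cell_types = ["GM12878", "GM12878", "IMR90", "IMR90", "K562"]
outlier = np.array([False, False, False, True, False])

# Label <-> integer "color" lookups, analogous to lab2col / col2lab.
labels = sorted(set(cell_types))
lab2col = {lab: i for i, lab in enumerate(labels)}
col2lab = {i: lab for lab, i in lab2col.items()}
colors = np.array([lab2col[c] for c in cell_types])

# Symmetric toy distance matrix with a zero diagonal.
rng = np.random.default_rng(0)
a = rng.random((5, 5))
full = (a + a.T) / 2
np.fill_diagonal(full, 0.0)

# Masked view: every row and column belonging to an outlier is hidden
# (assumed layout; the library may build its mask differently).
mask2d = outlier[:, None] | outlier[None, :]
dist = np.ma.masked_array(full, mask=mask2d)

# Compact versions with the outliers dropped, analogous to mdist / mcol.
keep = ~outlier
mdist = full[np.ix_(keep, keep)]
mcol = colors[keep]
```

The masked array keeps the original indexing of the experiments, while the compact arrays are convenient for routines that do not understand masks.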
method | function |
---|---|
print_dist | prints the distance matrix to a file. |
order | orders experiments according to their cell type. |
get_cmap | builds a matrix which is one when two experiments have the same cell-type and zero otherwise. |
get_roc_area | returns the area under the ROC curve. |
get_roc | computes the ROC curve. |
get_gauss_sim | computes an affinity matrix from the distances using a Gaussian kernel. |
MDS | computes the multidimensional scaling embedding of the distance matrix. |
spec_clustering | computes clusters using spectral clustering. |
hier_clustering | computes hierarchical clustering. |
get_dunn_score | computes the Dunn score. |
get_quality_score | computes the quality score. |
get_purity_score | computes the purity score. |
plot | plots the distance matrix. |
plot_masked | plots the masked distance matrix. |
plot_squares | plots colored squares over the distance matrix to point out experiments with the same cell-type. |
plot_similarity | plots the affinity matrix. |
plot_roc | plots ROC curve. |
show_hist | plots and displays a histogram of the distribution of the distances. Same cell type, different cell type, and pseudo-replicate experiments are colored differently. |
show_MDS | plots and displays the MDS embedding of the matrix. |
show_clusters | plots and displays a cartoon of the clusters. |
show_dendrogram | plots and displays a dendrogram of the clusters. |
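
Several of the quantities in the table (the same-cell-type matrix, the Gaussian affinity, and the ROC curve with its area) can be sketched in a few lines of NumPy. The functions below are hypothetical stand-ins written for illustration. They are not the essHIC implementations, and the kernel width `sigma`, the threshold grid, and the toy data are assumptions. The ROC is built in one common way: a pair of experiments is called "same cell type" when its distance falls below a threshold, and the threshold is swept from zero to the largest distance.

```python
import numpy as np

def same_type_map(colors):
    """1 where two experiments share a cell type, 0 otherwise."""
    colors = np.asarray(colors)
    return (colors[:, None] == colors[None, :]).astype(int)

def gaussian_similarity(dist, sigma):
    """Affinity from distances with a Gaussian kernel of width sigma."""
    dist = np.asarray(dist, dtype=float)
    return np.exp(-dist**2 / (2.0 * sigma**2))

def roc_from_distances(dist, colors, n_thresholds=50):
    """Sweep a distance threshold over all distinct pairs of experiments.

    Assumes the data contain both same-type and different-type pairs.
    """
    dist = np.asarray(dist, dtype=float)
    same = same_type_map(colors).astype(bool)
    iu = np.triu_indices_from(dist, k=1)      # each pair counted once
    d, s = dist[iu], same[iu]
    thresholds = np.linspace(0.0, d.max(), n_thresholds)
    tpr = np.array([(d[s] <= t).mean() for t in thresholds])
    fpr = np.array([(d[~s] <= t).mean() for t in thresholds])
    return thresholds, fpr, tpr

def roc_area(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Toy example: two cell types, small distances within a type,
# large distances between types.
colors = [0, 0, 1, 1]
toy = np.array([[0.0, 0.1, 1.0, 1.0],
                [0.1, 0.0, 1.0, 1.0],
                [1.0, 1.0, 0.0, 0.1],
                [1.0, 1.0, 0.1, 0.0]])

cmap = same_type_map(colors)
sim = gaussian_similarity(toy, sigma=0.5)
dlist, fpr, tpr = roc_from_distances(toy, colors)
auc = roc_area(fpr, tpr)
```

An area close to one means that experiments of the same cell type are consistently closer to each other than to experiments of other cell types, while an area near 0.5 means the distance carries no cell-type information.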