essHIC.hic
essHIC.hic(datafile,from_pairs=True)
The hic class provides a wrapper for an HiC matrix which also contains its metadata. It can also perform some operations on the matrix, such as normalization and cleaning, and compute its spectrum.
It relies on the name format adopted by make_hic and the metadata.txt and chromosomes.txt files it generates to automatically extract metadata about the matrix.
- datafile: string
- Name of the binary file which contains the hic matrix.
- from_pairs: bool, default=True
- If *True*, it will read the matrix from a binary file containing the indexes of the bins and their value.
- matrix: numpy.ndarray
- the HiC matrix.
- indir: string
- the root directory of the data, which should contain all metadata files.
- refname: string
- reference name of the experiment of the HiC matrix.
- norm: string
- kind of normalization applied to the matrix.
- chromo: integer
- number of the chromosome of the matrix.
- res: string
- resolution of the matrix.
- length: integer
- size of the matrix at the chosen resolution.
- eig: numpy.ndarray
- array of floats containing the computed eigenvalues of the matrix
- eigv: numpy.ndarray
- array of floats containing the normalized eigenvectors of the matrix.
method | function |
---|---|
resize | computes a lower resolution version of the matrix. |
decay_norm | computes the decay normalization of the matrix. |
clean | removes empty rows and columns from the matrix. |
vc_norm | applies the vanilla coverage normalization to the matrix. |
vcsqrt_norm | applies the square root vanilla coverage normalization to the matrix. |
pearson | computes a matrix of the correlation coefficients between rows of the matrix. |
laplacian | computes the Laplacian of the matrix. |
reduce | computes the essential matrix. |
get_spectrum | computes the spectrum of the matrix. |
norm_spectrum | applies normalizations to the spectrum. |
print_matrix | prints the matrix to a binary file. |
plot | plots the matrix as a heatmap. |
plot_chromosome | plots a chromosome cartoon. |
__init__(datafile,from_pairs=True)
initialize self.
resize(self,res_factor)
computes a new matrix at a fraction of the original matrix resolution. Each bin of the new matrix contains the average of the res_factor x res_factor bins of the original matrix.
- res_factor: integer
- factor by which the resolution is decreased.
- none
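The block averaging described above can be sketched with plain NumPy. This is only an illustration, not the essHIC implementation: the helper name `block_average` is hypothetical, and dropping trailing bins that do not fill a complete block is an assumption.

```python
import numpy as np

def block_average(matrix, res_factor):
    """Average res_factor x res_factor blocks of a square matrix (hypothetical helper)."""
    n = matrix.shape[0] // res_factor
    # Assumption: trailing bins that do not fill a whole block are dropped.
    trimmed = matrix[:n * res_factor, :n * res_factor]
    # Split each axis into (block index, position inside block) and average inside blocks.
    return trimmed.reshape(n, res_factor, n, res_factor).mean(axis=(1, 3))

m = np.arange(16, dtype=float).reshape(4, 4)
coarse = block_average(m, 2)
print(coarse)
```

With `res_factor=2`, a 4 x 4 matrix becomes a 2 x 2 matrix whose entries are the means of the four 2 x 2 blocks.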
decay_norm(self)
computes the decay normalization (or observed vs. expected normalization) of the matrix.
- none
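A common way to define an observed vs. expected normalization is to divide each diagonal (all bin pairs at the same genomic distance) by its mean. The sketch below follows that definition under the assumption of a symmetric matrix; the helper name `observed_over_expected` is hypothetical and essHIC may compute the expected values differently.

```python
import numpy as np

def observed_over_expected(matrix):
    """Divide every diagonal of a symmetric square matrix by its own mean."""
    n = matrix.shape[0]
    result = np.zeros((n, n), dtype=float)
    for d in range(n):
        diag = np.diagonal(matrix, offset=d).astype(float)
        expected = diag.mean()
        if expected == 0:
            continue  # leave empty diagonals at zero instead of dividing by zero
        rows = np.arange(n - d)
        result[rows, rows + d] = diag / expected
        result[rows + d, rows] = diag / expected  # mirror into the lower triangle
    return result

# Toy matrix whose contacts depend only on the distance between bins.
distance = np.abs(np.subtract.outer(np.arange(5), np.arange(5)))
contacts = 1.0 / (1.0 + distance)
normalized = observed_over_expected(contacts)
```

Because the toy matrix depends only on distance, every entry of `normalized` equals 1: the distance decay has been removed entirely.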
clean(self)
removes empty rows and columns from the matrix.
- none
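As a rough illustration of the cleaning step, the following sketch drops bins whose row and column contain only zeros. It assumes a symmetric matrix, so that empty rows and empty columns coincide; `drop_empty_bins` is a hypothetical helper, not part of essHIC.

```python
import numpy as np

def drop_empty_bins(matrix):
    """Remove all-zero rows/columns of a symmetric matrix; also return the kept mask."""
    keep = np.any(matrix != 0, axis=0)
    return matrix[np.ix_(keep, keep)], keep

m = np.array([[1.0, 0.0, 2.0],
              [0.0, 0.0, 0.0],
              [2.0, 0.0, 3.0]])
cleaned, kept = drop_empty_bins(m)
```

Returning the boolean mask alongside the reduced matrix makes it possible to map the remaining bins back to their original genomic positions.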
vc_norm(self,res_factor)
computes the vanilla coverage normalization of the matrix.
- none
vcsqrt_norm(self,res_factor)
computes the square root vanilla coverage normalization of the matrix.
- none
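The two coverage normalizations above are commonly defined as dividing each entry by the product of the coverages (row sums) of its two bins, or by the square root of that product. The sketch below follows those textbook definitions; `coverage_norm` is a hypothetical helper, and any additional rescaling that essHIC may apply is not reproduced here.

```python
import numpy as np

def coverage_norm(matrix, sqrt=False):
    """Vanilla coverage (or square root vanilla coverage) normalization sketch."""
    coverage = matrix.sum(axis=1).astype(float)
    coverage[coverage == 0] = 1.0          # avoid dividing by zero for empty bins
    scale = np.outer(coverage, coverage)   # scale[i, j] = coverage[i] * coverage[j]
    if sqrt:
        scale = np.sqrt(scale)
    return matrix / scale

m = np.array([[2.0, 2.0],
              [2.0, 6.0]])
vc = coverage_norm(m)
vcsqrt = coverage_norm(m, sqrt=True)
```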
pearson(self)
computes the matrix of the correlations between the columns of the original HiC matrix, using the Pearson correlation coefficient.
- none
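For reference, a Pearson correlation matrix can be obtained with `numpy.corrcoef`. The short sketch below (not the essHIC code) also shows that for a symmetric matrix, correlating rows or columns gives the same result.

```python
import numpy as np

m = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.1],
              [3.0, 6.1, 1.0]])

# rowvar=False treats each column as a variable; the default treats each row as one.
corr_columns = np.corrcoef(m, rowvar=False)
corr_rows = np.corrcoef(m)
```

Note that a row or column with zero variance (for example an empty bin) yields NaN entries, which is one reason to clean the matrix first.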
laplacian(self)
computes the Laplacian matrix from the original HiC matrix.
- none
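The source does not say which Laplacian variant is used. As an illustration only, the sketch below builds the unnormalized graph Laplacian L = D - A, where D is the diagonal matrix of row sums; `graph_laplacian` is a hypothetical helper and essHIC may use a normalized form instead.

```python
import numpy as np

def graph_laplacian(matrix):
    """Unnormalized graph Laplacian: degree matrix minus the matrix itself."""
    degree = np.diag(matrix.sum(axis=1))
    return degree - matrix

m = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 3.0],
              [2.0, 3.0, 0.0]])
lap = graph_laplacian(m)
```

By construction every row of this Laplacian sums to zero.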
reduce(self,nvec=10,order='abs',norm=1.)
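computes the essential matrix from the first nvec eigenspaces of the original HiC matrix. The new matrix is given by

$$A^{ess}_{ij} = \sum_{n=1}^{nvec} \tilde{\lambda}_n \, v^{(n)}_i \, v^{(n)}_j$$

where $\tilde{\lambda}_n$ are the normalized eigenvalues of the original matrix and $v^{(n)}$ are the corresponding normalized eigenvectors.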
- nvec: integer
- number of eigenspaces to use; negative to compute whole spectrum.
- order:{'abs','sgn'}, default='abs'
- order of the eigenspaces. 'abs' orders them in decreasing order according to their eigenvalue absolute value, 'sgn' orders them in decreasing order according to their eigenvalue.
- norm: {'flat','norm','none', float}, default=1.0
- normalization to apply to the spectrum (see **norm_spectrum**).
- none
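The reconstruction from a few eigenspaces can be sketched with `numpy.linalg.eigh`. This is a minimal illustration for a symmetric matrix using the 'abs' ordering and the raw, unnormalized eigenvalues; `essential_matrix` is a hypothetical helper, not the essHIC implementation.

```python
import numpy as np

def essential_matrix(matrix, nvec):
    """Sum of eig[n] * outer(v_n, v_n) over the nvec largest-|eigenvalue| eigenspaces."""
    eig, eigv = np.linalg.eigh(matrix)           # eigenvectors are the columns of eigv
    order = np.argsort(-np.abs(eig))[:nvec]      # decreasing absolute eigenvalue
    eig, eigv = eig[order], eigv[:, order]
    return (eigv * eig) @ eigv.T                 # scales column n by eig[n], then sums

m = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
full = essential_matrix(m, 3)      # all eigenspaces: recovers the original matrix
reduced = essential_matrix(m, 2)   # two eigenspaces: a rank-2 approximation
```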
get_spectrum(self,nvec=-1,order='abs')
computes the first nvec eigenvalues and eigenvectors of the HiC matrix, according to the chosen ordering.
- nvec: integer
- number of eigenspaces to compute; negative to compute whole spectrum.
- order:{'abs','sgn'}, default='abs'
- order of the eigenspaces. 'abs' orders them in decreasing order according to their eigenvalue absolute value, 'sgn' orders them in decreasing order according to their eigenvalue.
- eig: numpy.ndarray
- array of the eigenvalues
- eigv: numpy.ndarray
- array of the eigenvectors
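The difference between the two orderings is easiest to see on a matrix with a negative eigenvalue. The sketch below mimics the documented behaviour for a symmetric matrix; `ordered_spectrum` is a hypothetical stand-in, and returning eigenvectors as columns is an assumption.

```python
import numpy as np

def ordered_spectrum(matrix, nvec=-1, order='abs'):
    """Eigenpairs sorted by decreasing |eigenvalue| ('abs') or eigenvalue ('sgn')."""
    eig, eigv = np.linalg.eigh(matrix)
    key = np.abs(eig) if order == 'abs' else eig
    idx = np.argsort(-key)
    if nvec >= 0:                 # a negative nvec keeps the whole spectrum
        idx = idx[:nvec]
    return eig[idx], eigv[:, idx]

m = np.diag([1.0, -3.0, 2.0])
eig_abs, _ = ordered_spectrum(m, order='abs')
eig_sgn, _ = ordered_spectrum(m, order='sgn')
eig_top2, eigv_top2 = ordered_spectrum(m, nvec=2)
```

Here the eigenvalue -3 comes first under 'abs' but last under 'sgn'.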
norm_spectrum(self,norm=1.0)
applies normalization to the computed eigenvalues. There are various normalization modes.
- flat: flattens the spectrum, the values of the eigenvalues become
- norm: normalizes the computed eigenvalues so that the sum of their squares is 1.
- none: does not normalize the spectrum and preserves the original eigenvalues.
- float: if the normalization mode is a number p, it normalizes the eigenvalues so that their p-norm is 1.
- norm: {'flat','norm','none',float}, default=1.0
- normalization mode.
- none
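A minimal sketch of three of the modes follows, assuming that a numeric mode p means rescaling to unit p-norm and that 'norm' corresponds to p = 2. The 'flat' mode is left out because its definition is not recoverable from the text above; `normalize_eigenvalues` is a hypothetical helper.

```python
import numpy as np

def normalize_eigenvalues(eig, norm=1.0):
    """Rescale eigenvalues: 'none' keeps them, 'norm' uses p=2, a number p uses the p-norm."""
    if norm == 'none':
        return eig.copy()
    if norm == 'flat':
        raise ValueError("'flat' is not covered by this sketch")
    p = 2.0 if norm == 'norm' else float(norm)
    p_norm = np.sum(np.abs(eig) ** p) ** (1.0 / p)
    return eig / p_norm

eig = np.array([3.0, -4.0])
unit_l2 = normalize_eigenvalues(eig, 'norm')
unit_l1 = normalize_eigenvalues(eig, 1.0)
unchanged = normalize_eigenvalues(eig, 'none')
```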
print_matrix(self,save)
prints the matrix as a binary file containing the indices of non-zero bins and their values.
- save: string
- output file.
- none
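The exact binary layout used by essHIC is not documented here. Purely as an illustration of storing only the non-zero bins, the sketch below writes (row, column, value) records with a NumPy structured dtype and reads them back. The record layout `PAIR_DTYPE` and the helpers `write_pairs` and `read_pairs` are assumptions, and files written this way are not claimed to be compatible with essHIC.

```python
import os
import tempfile
import numpy as np

# Hypothetical record layout: two 32-bit bin indices and one 64-bit float value.
PAIR_DTYPE = np.dtype([('i', '<i4'), ('j', '<i4'), ('value', '<f8')])

def write_pairs(matrix, path):
    """Write the non-zero entries of matrix as (i, j, value) records."""
    i, j = np.nonzero(matrix)
    records = np.empty(i.size, dtype=PAIR_DTYPE)
    records['i'] = i
    records['j'] = j
    records['value'] = matrix[i, j]
    records.tofile(path)

def read_pairs(path, size):
    """Rebuild a dense size x size matrix from the records in path."""
    records = np.fromfile(path, dtype=PAIR_DTYPE)
    matrix = np.zeros((size, size))
    matrix[records['i'], records['j']] = records['value']
    return matrix

m = np.array([[0.0, 1.5, 0.0],
              [1.5, 0.0, 2.0],
              [0.0, 2.0, 4.0]])
path = os.path.join(tempfile.mkdtemp(), 'example_pairs.bin')
write_pairs(m, path)
n_records = os.path.getsize(path) // PAIR_DTYPE.itemsize
restored = read_pairs(path, 3)
os.remove(path)
```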
plot(self, vmax=2.5, vmin=0.0, cmap='Reds', plotkind='flat', cbar=False, triangle=False)
plots a heatmap matrix according to the specifications. There are several plotting modes:
- flat: plots the heatmap of the matrix.
- log: plots the heatmap of the LogNorm of the matrix (see matplotlib documentation)
- bilog: plots the SymLogNorm of the matrix (see matplotlib documentation)
- vmax: float, default=2.5
- maximum of the heatmap color range.
- vmin: float, default=0.0
- minimum of the heatmap color range.
- cmap: string, default='Reds'
- color map to use.
- plotkind: string, default='flat'
- plotting mode to use.
- cbar: bool, default=False
- if True, plots a color bar next to the heatmap.
- triangle: bool, default=False
- if True, only plots the lower triangular matrix.
- none
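The three plotting modes correspond to color normalizations available in matplotlib. The following sketch shows how such a mapping could look; `make_norm` and its `linthresh` argument are hypothetical and this is not the essHIC plotting code. Note that `LogNorm` needs a strictly positive `vmin`, so the example does not use 0.0.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import LogNorm, Normalize, SymLogNorm

def make_norm(plotkind, vmin, vmax, linthresh=1.0):
    """Map a plotting mode name to a matplotlib normalization object."""
    if plotkind == 'flat':
        return Normalize(vmin=vmin, vmax=vmax)
    if plotkind == 'log':
        return LogNorm(vmin=vmin, vmax=vmax)
    if plotkind == 'bilog':
        return SymLogNorm(linthresh=linthresh, vmin=vmin, vmax=vmax)
    raise ValueError("unknown plotkind: %r" % (plotkind,))

# Toy symmetric matrix with strictly positive entries.
distance = np.abs(np.subtract.outer(np.arange(6), np.arange(6)))
m = 1.0 / (1.0 + distance)

# Mask the strict upper triangle so that only the lower triangle is drawn.
upper = np.triu(np.ones(m.shape, dtype=bool), k=1)
lower_only = np.ma.masked_array(m, mask=upper)

fig, ax = plt.subplots()
image = ax.imshow(lower_only, cmap='Reds', norm=make_norm('log', 0.1, 2.5))
fig.colorbar(image, ax=ax)  # what cbar=True would presumably add
plt.close(fig)
```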
plot_chromosome(self,centromere='none',regions='none',bins='none',orientation='horizontal',ticks="none")
plots a cartoon of the chromosome next to or below the heatmap. It can be used to display one dimensional information about the genome side by side with the heatmap.
- centromere: {'none','auto',float}, default='none'
- if it is not 'none', it draws the centromere position on the cartoon. If 'auto' it uses human centromeres.
- regions:{'none',list of dict}, default='none'
- if not 'none' colors regions according to the color indicated by the dictionary. Each region dictionary in the list must contain the key 'bounds' corresponding to a list of two integers (the boundaries of the region), and the key 'color' which contains a color in the hex format.
- bins:{'none',list of dict}, default='none'
- if not 'none' colors each bin indicated according to the chosen color. Each bin dictionary in the list must contain the key 'bins' corresponding to a list of integers (the bins to color), and the key 'color' which contains a color in the hex format.
- orientation:{'horizontal','vertical'}, default='horizontal'
- whether the cartoon should be drawn in the horizontal or the vertical orientation.
- ticks:{'none',integer}, default='none'
- set ticks on the cartoon every *ticks* bins.
- none
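The `regions` and `bins` arguments are lists of dictionaries with the keys described above. The sketch below builds one of each and checks their shape. The validators `check_regions` and `check_bins` are hypothetical helpers, and the `#rrggbb` pattern is an assumption about what "hex format" means.

```python
import re

HEX_COLOR = re.compile(r"^#[0-9a-fA-F]{6}$")  # assumed '#rrggbb' form

def check_regions(regions):
    """Each region needs 'bounds' (two integers) and 'color' (hex string)."""
    for region in regions:
        bounds = region["bounds"]
        if len(bounds) != 2 or not all(isinstance(b, int) for b in bounds):
            raise ValueError("'bounds' must be a list of two integers")
        if not HEX_COLOR.match(region["color"]):
            raise ValueError("'color' must be a hex color")
    return True

def check_bins(bins):
    """Each entry needs 'bins' (list of integers) and 'color' (hex string)."""
    for entry in bins:
        if not all(isinstance(b, int) for b in entry["bins"]):
            raise ValueError("'bins' must be a list of integers")
        if not HEX_COLOR.match(entry["color"]):
            raise ValueError("'color' must be a hex color")
    return True

regions = [
    {"bounds": [0, 40], "color": "#1f77b4"},
    {"bounds": [41, 95], "color": "#ff7f0e"},
]
bins = [{"bins": [10, 11, 12, 57], "color": "#2ca02c"}]

regions_ok = check_regions(regions)
bins_ok = check_bins(bins)
```

According to the signature above, lists built this way would be passed as the `regions` and `bins` keyword arguments of plot_chromosome.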