epigeneticstoocean/paper-gonad-meth

General DNA methylation patterns and environmentally-induced differential methylation in the eastern oyster (Crassostrea virginica)

Background

Epigenetic modification, specifically DNA methylation, is one possible mechanism for transgenerational plasticity. Before inheritance of methylation patterns can be characterized, we need a better understanding of how environmental change modifies the parental epigenome, particularly in reproductive tissue. We examined the effect of ocean acidification on eastern oyster (Crassostrea virginica) reproductive tissue.

This repository is associated with this manuscript on bioRxiv.

Results

  • Differentially methylated loci (DML)
  • Figures
    • Figure 1: Frequency distribution of methylation ratios for CpG loci in C. virginica gonad tissue DNA subjected to MBD enrichment. A total of 4,304,257 CpGs with at least 5x coverage summed across all ten samples were characterized. Loci were considered methylated if they were at least 50% methylated, sparsely methylated if 10-50% methylated, and unmethylated if 0-10% methylated (code here).
    • Figure 2: Proportion of CpG loci within genomic features. All CpGs refers to every CpG dinucleotide in the C. virginica genome. Methylated CpGs refers to dinucleotides with a methylation level of at least 50% (code here).
    • Figure 3: Heatmap of DML in C. virginica reproductive tissue. Samples in control pCO2 conditions are represented by a grey bar, and samples in elevated pCO2 conditions are represented by a black bar. Loci with higher percent methylation are represented by darker colors. A logistic regression identified 598 DML, defined as individual CpG dinucleotides with at least a 50% methylation change between treatment and control groups and a q-value < 0.01 based on correction for false discovery rate with the SLIM method. The density of DML at each percent methylation value is represented in the heatmap legend (code here).
    • Figure 4: Principal Components Analysis of a) all CpG loci with 5x coverage across samples and b) DML. Methylation status of individual CpG loci explained 29.2% of variation between samples when considering all CpG loci. Methylation status of DML explained 57.1% of sample variation (code here).
    • Figure 5: Distribution of DML among chromosomes and genes. (a) Number of DML normalized by number of CpG in each chromosome (bars) and number of genes (line) in each C. virginica chromosome. (b) Number of genes with various numbers of DML per gene (1-5). Most genes that contained DML only had 1 DML. (c) Proportion of hypermethylated and hypomethylated DML in genes with various numbers of DML per gene (1-5). Mixed refers to a classification of a gene that has both hypermethylated and hypomethylated DML (code here).
    • Figure 6: Proportion of CpG loci within putative promoters, untranslated regions (UTR), exons, introns, transposable elements, and intergenic regions for MBD-enriched CpGs and differentially methylated loci (DML). The distribution of DML in C. virginica gonad tissue in response to ocean acidification differed from the distribution of MBD-enriched loci with 5x coverage across control and treatment samples (Contingency test; χ² = 401.09, df = 6, P-value < 2.2e-16) (code here).
    • Figure 7: Distribution of hyper- and hypomethylated DML along a hypothetical gene. The scaled position of a DML within a gene was calculated by dividing the base pair position of the DML by gene length. Counts of hypermethylated DML are plotted above the x-axis, and hypomethylated DML counts are below the x-axis (code here).
    • Figure 8: Biological processes represented by all genes used in enrichment background (% Genes) and those with DML (% Genes with DML). Gene ontology categories with similar functions are represented by the same color. Genes may be involved in multiple biological processes. No gene ontologies were significantly enriched (code here).
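
The thresholds and definitions in the Figure 1, 3, and 7 captions can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the code used in this repository (the linked scripts are the authoritative versions). The function names are hypothetical, and several details are assumptions the captions do not specify: a locus at exactly 10% is binned as sparsely methylated, the 50% change is treated as an absolute difference in percent methylation, and the DML position is measured from the gene start before dividing by gene length.

```python
def classify_methylation(percent):
    """Bin a CpG locus by percent methylation (0-100), per the Figure 1 caption.

    Assumption: a locus at exactly 10% counts as sparsely methylated.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent methylation must be between 0 and 100")
    if percent >= 50:
        return "methylated"
    if percent >= 10:
        return "sparsely methylated"
    return "unmethylated"


def is_dml(meth_diff, q_value):
    """Apply the Figure 3 cutoffs for a differentially methylated locus.

    meth_diff is treatment minus control percent methylation (assumed to be
    judged by absolute value); q_value is the FDR-corrected value.
    """
    return abs(meth_diff) >= 50 and q_value < 0.01


def scaled_position(dml_pos, gene_start, gene_length):
    """Scale a DML position to 0-1 along its gene, per the Figure 7 caption.

    Assumption: the position is taken relative to the gene start; strand
    orientation is ignored in this sketch.
    """
    if gene_length <= 0:
        raise ValueError("gene_length must be positive")
    return (dml_pos - gene_start) / gene_length


if __name__ == "__main__":
    print(classify_methylation(72.5))
    print(is_dml(-62.0, 0.001))
    print(scaled_position(150, 100, 200))
```

Negative values of `meth_diff` correspond to hypomethylated DML and positive values to hypermethylated DML, matching how Figure 7 plots counts below and above the x-axis.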

Repository Structure

In-depth file descriptions can be found in the README.md for each subdirectory.

  • data: Raw data used for project analyses, as well as links to data files.
  • code: Bash scripts, R Markdown files, and Jupyter notebooks used to analyze data.
  • genome-feature-tracks: Genome feature tracks used for analyses.
  • analyses: Output from multiple analyses. Each analysis will be in its own subdirectory.

External links