This repository contains the data and code created during the writing assignment of my master's program (Molecular and cellular life sciences).

Suirotras/MCLS_writing_assignment_data

MCLS_writing_assignment_data

This repository contains the data and code created during the writing assignment of my master's program (Molecular and cellular life sciences).

Title of the writing assignment: DNA methylation signals in blood as proxies for environmental exposures

The repository contains the following elements:

  • Data folder: contains the data used for the writing assignment.

    • The main data file is Critical_appraisal_sheet.xlsx, as all data are derived from this Excel sheet. The Excel sheet contains multiple worksheet tabs:

      • articles: This tab contains the critical appraisal data for included studies that performed EWAS (and sometimes also predictions) on environmental exposures.

      • prediction: This tab displays the critical appraisal data for included studies that only performed DNAm-based predictions for environmental exposures.

      • articles+prediction for tsv: This tab simply combines the critical appraisal data from the articles and prediction worksheet tabs.

      • EWAS list: This tab displays all the conducted EWAS, and how many CpG-sites (DMPs) and DMRs were identified per EWAS.

      • prediction list: This tab shows data on all the DNAm-based predictions conducted by the included studies. The data include the number of CpG-sites used by the predictor (Prediction CpGs) and the AUC values (for discovery and replication sets). One study only reported R² values for its prediction model, so this value is also included.

    • A number of files derive from Critical_appraisal_sheet.xlsx:

      • Critical_appraisal_counts.tsv: tab-delimited file containing the summarized critical appraisal counts, generated by Critical_appraisal_analysis.R.

      • Critical_appraisal_EWAS_list.tsv: tab-delimited file. It represents the combination of the "articles+prediction for tsv" and "EWAS list" worksheet tabs.

      • Critical_appraisal_EWAS_list.xlsx: Identical to Critical_appraisal_EWAS_list.tsv, but as an Excel file.

      • Critical_appraisal_sheet.tsv: tab-delimited file, representing the "articles+prediction for tsv" worksheet tab.

      • EWAS_list.tsv: tab-delimited file, representing the "EWAS list" worksheet tab.

      • Prediction_list.tsv: tab-delimited file, representing the "prediction list" worksheet tab.

      • Summary_DMP.tsv: Summary statistics for the number of CpG-sites (DMPs) found per environmental exposure category. Generated by Data_visualization.R.

      • Summary_DMR.tsv: Summary statistics for the number of DMRs found per environmental exposure category. Generated by Data_visualization.R.

  • Figures folder: Contains all the figures made for the writing assignment.

    • Almost all figures were generated by Critical_appraisal_analysis.R or Data_visualization.R. The two exceptions are Critical_appraissal_schematic.png and Critical_appraissal_schematic.svg, which were made using Inkscape.

  • Rayyan_selection procedure folder: Contains information on the paper inclusion procedure, which was conducted in rayyan.ai. It contains two subfolders:

    • Selection_all_studies_found folder: Contains information about the included and excluded studies.

      • articles.csv: Comma-separated file that lists, for every study, whether it was included or excluded, together with the exclusion reason. This information can be found in the notes column.

      • customizations_log.csv: Log file, which recorded all changes and their timestamps.

    • Selection_included_studies folder: Contains information about only the included studies.

  • Critical_appraisal_analysis.html and Critical_appraisal_analysis.R: R script used to generate the figures and tables related to the critical appraisal.

  • Data_visualization.html and Data_visualization.R: R script used to generate the figures and tables related to EWAS environmental exposure categories, the number of CpG-sites (DMPs) and the number of DMRs.

  • Helper_script.R: An R script containing helper code for accessing specific data on individual EWAS and environmental exposures more easily. It was useful while writing the assignment.

  • DNA_double_helix_AI2.jpg: "Highly successful" AI-generated art that is supposed to resemble a DNA double helix with DNA methylation, but in the style of a Vincent van Gogh painting. Generated by lexica.art.

  • DNA_double_helix_template.jpg: Template image of DNA methylation, to force lexica.art to at least generate something that somewhat resembles a DNA double helix. Retrieved from www.spectrumnews.org.
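
The tab-delimited files in the Data folder can also be inspected outside of R. The following is a minimal Python sketch, not the code from Data_visualization.R, of how per-category summary statistics similar in spirit to Summary_DMP.tsv could be computed from a file such as EWAS_list.tsv. The column names `Exposure_category` and `DMPs`, the function name `summarize_by_category`, and the choice of statistics are assumptions made for illustration; the real headers and the statistics reported in the repository may differ.

```python
import csv
import io
import statistics
from collections import defaultdict


def summarize_by_category(tsv_file, category_col, count_col):
    """Return {category: {"n", "median", "min", "max"}} for one numeric column.

    tsv_file is any open text file (or file-like object) with a header row.
    category_col and count_col are hypothetical column names supplied by the caller.
    """
    groups = defaultdict(list)
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        value = (row.get(count_col) or "").strip()
        if not value:
            continue  # skip rows where no count was reported
        groups[row[category_col]].append(float(value))
    return {
        category: {
            "n": len(values),
            "median": statistics.median(values),
            "min": min(values),
            "max": max(values),
        }
        for category, values in groups.items()
    }


# Placeholder rows for illustration only; these are NOT values from the repository.
example = io.StringIO(
    "Exposure_category\tDMPs\n"
    "category_A\t10\n"
    "category_A\t30\n"
    "category_B\t5\n"
    "category_B\t\n"
)
summary = summarize_by_category(example, "Exposure_category", "DMPs")
print(summary)
```

To run the same function on a real file, open it with `open(path, newline="")` and pass the actual column names from its header row. The path depends on where the repository was cloned and is not fixed by this sketch.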
