GitHub - translational-audiology-lab/STOP_bloodscreen: R script for analysis of O'link data from STOP

This README describes the code in this repository, which was used to generate the results from the given data. It is written in Markdown (.md), so it reads best in a Markdown viewer such as Typora.

Project info

Title : Blood biomarkers in chronic tinnitus
Principal Investigator: Christopher Cederroth (christopher.cederroth@ki.se)
Organization : Karolinska Institutet
NBIS experts : Mun-Gwan Hong (mungwan.hong@nbis.se)

Aim of the project

Copied from Redmine

The overarching aim of this project is to identify novel blood biomarkers for tinnitus, which is one of the aims of the UNITI EU project and among the first research priorities of the BTA. Biomarkers for tinnitus are currently lacking, and they are essential in order to i) understand the pathophysiology of tinnitus; ii) stratify patients; iii) provide read-outs for clinical trials. Similarity in the neuropathophysiology between tinnitus and pain led us to hypothesize that inflammation is involved in tinnitus. A preliminary analysis of 548 cases of constant tinnitus and 548 age/sex matched controls showed that 10% of the proteins had a strong correlation with smoking, and that 50% of proteins were strongly correlated with age and with self-reported hearing ability (Bonferroni adjusted P < 0.05). We performed a linear regression adjusted for age, sex, BMI, smoking status, sample collection site, and hearing ability. This allowed us to identify 5 proteins with adjusted p-values close to significance, namely FGF-21, MCP-4, CXCL9, GDNF, and MCP-1 (Fig. 1). These proteins were found at higher levels in the tinnitus group and were not associated with stress, anxiety or depression, nor with temporomandibular joint disorder, headache or hyperacusis. The goal of this project is to expand the analysis to the neurology panel from Olink on the same samples.

General info

Header

Every script has a header at the top. It describes the script and gives basic information about it. All input and output file names are listed in the header.

Neighbor folders

Those input and output files are expected to be stored in the neighbor folders shown below. Every script is written to be executed from the current folder and accesses those folders through relative paths.

    ../data/raw_internal       # raw input data
           /                   # intermediate derived data from raw data
           /raw_external       # data from external sources, e.g. public database
             
    ../reports                 # all the reports
    ../results/                # main output folder
              /figures
              /tables
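
As a minimal sketch of how a script could address these folders through relative paths, the helper below builds the paths with `file.path()` and creates missing output folders. The function names `neighbor_paths` and `ensure_output_dirs` are hypothetical and not part of this repository. The folder for intermediate data is left out because its name is not shown in the layout above.

```r
# Hypothetical helper: build paths to the neighbor folders, relative to the
# folder that holds the scripts (assumed to be the working directory).
neighbor_paths <- function(base = "..") {
  list(
    raw_internal = file.path(base, "data", "raw_internal"),
    raw_external = file.path(base, "data", "raw_external"),
    reports      = file.path(base, "reports"),
    figures      = file.path(base, "results", "figures"),
    tables       = file.path(base, "results", "tables")
  )
}

# Create any output folder that does not exist yet.
# Input folders are deliberately left untouched.
ensure_output_dirs <- function(paths) {
  out <- unlist(paths[c("reports", "figures", "tables")])
  for (d in out) dir.create(d, recursive = TRUE, showWarnings = FALSE)
  invisible(dir.exists(out))
}
```

Calling `ensure_output_dirs(neighbor_paths())` from the script folder would then prepare `../reports` and `../results/` before any output is written.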

R and Rmd scripts

All R functions from installed packages, except those listed below, are called with double colons (`::`) to make the source package explicit.

- Packages in R-core (e.g. base, stats, utils, ...)
- Packages in tidyverse (e.g. dplyr, ggplot2, purrr, ...)
- kableExtra
- janitor
- Packages whose functions cannot be accessed using `::`
  - ggfortify # ggplot2::autoplot for PCA
  - lme4 # predict for class merMod

All R code followed the tidyverse R style guide.

Restore environment

A software environment that differs from the one used during development, e.g. different versions of R packages, can produce unexpected error messages or different analysis results. To make the results reproducible, the software environment around the code was saved. It can be restored before using the code by following the steps below. The process relies on the conda management system and the renv R package. The environment handling software conda is assumed to be pre-installed. Please refer to Conda for questions about the installation.

1. Open Terminal (Mac / Linux) or Command line (Windows).
2. Navigate to the directory where this README.md file is located using the `cd` command.
3. Run the commands below. They create the same Conda environment and activate it. (Note: the name `nbis5797` can be replaced with any project name you prefer.)
   ```
   conda env create -n nbis5797 -f environment.yml
   conda activate nbis5797
   ```
4. Start R and execute these commands in R to restore the same R environment.
   ```
   renv::init()
   renv::restore()
   ```

Please refer to "Introduction to renv" for details on renv, the R environment handling package.

The Conda environment was stored in this file.

environment.yml

The R environment was saved in the folder and file below. They are supposed to be managed only by the renv package under R.

renv/
renv.lock

Files

Master script files

Run the lines below in an R console to generate all intermediate data files for the analysis and the final reports. Please make sure the input files are in the expected folders; they are listed below the code lines.

    source("master_data_generation.R")   # data generation
    source("master_heavy_analyses.R")    # computationally heavy analyses
    rmarkdown::render("report-5797-Tinnitus.Rmd")               # main report
    rmarkdown::render("report-5797-Tinnitus-Inflammation.Rmd")  # Inflammation panel
    rmarkdown::render("report-5797-Tinnitus-Neurology.Rmd")     # Neurology panel

- ../data/raw_internal/20210504/All STOP questionnaire data 180118_v14_BloodAnalysis_v2.xlsx
- ../data/raw_internal/20210505/Variable Key STOP Frågeformulär LG 2018.xls
- ../data/raw_internal/20210505/Variablekey_ESITSQ.xlsx
- ../data/raw_internal/20210504/cederroth_tinnitus_protein_profiling_NPX_belowLOD.xlsx
- ../data/raw_internal/20210505/Tinnitus2_Cederroth_NPX_below_LOD.xlsx

The dates in the folder names above indicate when the files were transferred to NBIS. If multiple versions of one file were transferred, please use the one delivered on that day.
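
Because the master scripts assume that the input files are already in place, it can help to check for them first. The following is a minimal sketch of such a check; `missing_inputs` and `check_inputs` are hypothetical names and are not functions from this repository.

```r
# Return the expected input files that are not found on disk.
missing_inputs <- function(files) {
  files[!file.exists(files)]
}

# Stop with a readable message if any input file is missing.
check_inputs <- function(files) {
  missing <- missing_inputs(files)
  if (length(missing) > 0) {
    stop("Missing input file(s):\n",
         paste0("  ", missing, collapse = "\n"),
         call. = FALSE)
  }
  invisible(TRUE)
}

# Usage (two of the paths listed above, run from the script folder):
# check_inputs(c(
#   "../data/raw_internal/20210504/cederroth_tinnitus_protein_profiling_NPX_belowLOD.xlsx",
#   "../data/raw_internal/20210505/Tinnitus2_Cederroth_NPX_below_LOD.xlsx"
# ))
```

Failing early with the list of missing paths is usually easier to act on than an error raised from deep inside a data generation script.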

Data generation

`master_data_generation.R`

The master file for R data generation. It executes the following scripts in the proper order.

- gen-s1-clinical.v01.R : Read clinical info data
- gen-s1-olink_proteomic.v01.R : Read Olink proteomic data
- gen-s1-olink_proteomic.v02.R : QC and preprocess - details in report-QC-Olink_proteomic_data.Rmd
- gen-s1-clinical.v02.R : Limit samples to those with proteomic data
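
The content of the master file is not reproduced in this README. As a rough sketch of the idea only, a master script can run a list of scripts one after another and stop at the first problem. The function `run_in_order` is hypothetical, and the shared environment is an assumption made here so that later scripts can see objects created by earlier ones.

```r
# Hypothetical sketch: source scripts sequentially in one shared environment.
# An error in any script, or a missing script file, stops the whole run.
run_in_order <- function(scripts, envir = new.env(parent = globalenv())) {
  for (s in scripts) {
    if (!file.exists(s)) stop("Script not found: ", s, call. = FALSE)
    message("Running ", s)
    source(s, local = envir)
  }
  invisible(envir)
}
```

Passing the script names listed above as a character vector would reproduce the sequential execution described here.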

Other data generating scripts

Heavy analyses

`master_heavy_analyses.R`

The master script for computationally heavy analyses.

anal-lm_resample_t.R : T statistics from resampling
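
The script itself is not shown in this README. As one possible reading of "T statistics from resampling", the sketch below builds a permutation null distribution of the t statistic for a single numeric predictor in a linear model. The function `resample_t`, its arguments, and the simulated data are all hypothetical; the actual script may use a different resampling scheme.

```r
# Sketch only: shuffle one predictor, refit the model, and collect the
# t statistic of that predictor each time.
resample_t <- function(formula, data, var, n_resample = 1000) {
  t_stat <- function(d) {
    coef(summary(lm(formula, data = d)))[var, "t value"]
  }
  observed <- t_stat(data)
  null_t <- replicate(n_resample, {
    shuffled <- data
    shuffled[[var]] <- sample(shuffled[[var]])  # break the link to the outcome
    t_stat(shuffled)
  })
  # Two-sided permutation p-value with the usual +1 correction
  p_value <- (sum(abs(null_t) >= abs(observed)) + 1) / (n_resample + 1)
  list(observed = observed, null_t = null_t, p_value = p_value)
}

# Simulated data for illustration only (not project data)
set.seed(1)
sim <- data.frame(group = rep(0:1, each = 50), age = rnorm(100, 50, 10))
sim$npx <- 0.5 * sim$group + 0.02 * sim$age + rnorm(100)
res <- resample_t(npx ~ group + age, sim, var = "group", n_resample = 200)
```

Each resample refits the model, so the run time grows with the number of resamples and proteins, which is consistent with treating this step as a heavy analysis.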

Report writing

`report-5797-Tinnitus.Rmd`

The master R markdown file to create the final report. Individual chapters were written in separate R markdown files listed below. This master file runs all of them in the right order after adding the information about the project and NBIS support.

- report-sample_info.Rmd : About samples and clinical info
- report-QC-clinical_info.Rmd : Clinical info table QC
- report-QC-Olink_proteomic_data.Rmd : About QC and preprocessing of Olink proteomic data
- report-overview_after_QC.Rmd : Overview of the data after QC
- report-assoc-overall_samples.Rmd : Association of individual proteins - overall samples
- report-assoc-by_sex.Rmd : Association of individual proteins - females/males only

`report-5797-Tinnitus-Inflammation.Rmd`

A main R markdown file for the analysis of Olink inflammation panel data. It reuses some of the files above.

`report-5797-Tinnitus-Neurology.Rmd`

A main R markdown file for the analysis of Olink neurology panel data. It reuses some of the files above.

Auxiliary files

- styles.css : CSS style, used by Rmarkdown (.Rmd) files for reports
- utils.R : A collection of useful R functions
- anal-assoc_protein-utils.R : Functions used in the reports, mainly for association tests
- citations_in_report.bib : Bibliography for the reports
