# Result Replication: Post-mortem molecular profiling of three psychiatric disorders

UCSD DSC180A FA20

Brandon Tsui

Xuanyu Wu

## Introduction

Suicide is one of the leading causes of morbidity and mortality in the United States each year, and close to 90% of people who die by suicide have been clinically diagnosed with a psychiatric disorder. Three of the most prominent such disorders are Schizophrenia (SZ), Bipolar Disorder (BPD), and Major Depressive Disorder (MDD).

Schizophrenia is an illness that affects a person’s mood, feelings, and behavior and often causes psychosis, in which patients lose touch with reality. While Schizophrenia can affect younger children, patients are typically diagnosed between their late teens and early thirties. In addition to general negative effects on feelings and behavior, symptoms include hallucinations or delusions, loss of motivation or pleasure, and difficulty paying attention or making decisions. Treatments and medications exist, mainly to help suppress psychotic symptoms, but because the etiology of the disease is not fully understood, there is currently no cure. Genetic studies have shown that certain genes may be associated with increased risk of the disorder, but more work is needed before genes can be used to predict onset or to help develop a cure.

Bipolar Disorder is an illness that affects mood, energy, and behavior; patients often experience “mood episodes” of uncharacteristically intense emotion. There are three types of Bipolar Disorder, characterized by different severities of manic episodes, but in all three types patients exhibit patterns of manic and depressive episodes of varying duration. During manic episodes, patients can also exhibit psychotic symptoms such as hallucinations, which in some cases leads to a misdiagnosis of SZ. Like SZ, Bipolar Disorder is diagnosed between the teen years and early adulthood, and while symptoms vary over time, lifetime treatment is usually required. Genetic research on this disorder has also confirmed that certain genes affect risk.

Unlike SZ and BPD, which affect only a small subset of the population, Major Depressive Disorder is much more common, yet it still has a severe impact on mood, thinking, and behavior. Depression has a wide range of symptoms, and the combination of those symptoms and of circumstances gives rise to various forms of depression, such as persistent depressive disorder, postpartum depression, and seasonal affective disorder. While people with depression can experience extreme psychotic symptoms similar to those seen in SZ and BPD, depression is more typically associated with low mood, fatigue, and trouble sleeping. Again, research indicates a genetic component: a family history of the disease increases risk, and genetic studies support the link. However, unlike SZ and BPD, circumstances and life events are also seen as major causes of MDD.

While each of these three disorders is clinically distinct, their overlapping symptoms and similar effects on mood and behavior make it highly possible that they share genetic causes. This idea is backed by genome-wide association studies (GWAS), which are used to associate genetic variations with diseases. However, while a GWAS can point to associations with genes, other methods must be used to further characterize genetic etiology and to better understand the molecular mechanisms underlying these diseases, with the hope of creating better treatments for them.

One way to look more closely at gene expression is RNA sequencing (RNA-Seq), which enables transcriptome profiling and quantification of gene expression levels. Compared to other transcriptomic methods, RNA-Seq offers high-throughput, low-cost sequencing and requires a smaller amount of RNA. In addition, RNA-Seq does not rely on existing genetic sequences, so it can analyze complex transcriptomes and reveal variation within the transcriptome. Employing RNA-Seq in mental health research may therefore reveal commonalities across diseases and differences between patients and controls.

Given that previous studies have found distinct gene expression patterns in the superior temporal gyrus and hippocampus of SZ and BPD patients, this study aims to identify the most significant disease-related differences in gene expression among SZ, BPD, and MDD by analyzing RNA-Seq data. The goal is to lay a foundation for potential therapeutic directions that could benefit millions of patients and their families.

The data were collected from post-mortem tissue in three areas of the brain: the anterior cingulate cortex (AnCg), the dorsolateral prefrontal cortex (DLPFC), and the nucleus accumbens (nAcc), which were chosen for their associations with behaviors affected by these disorders. The data are hosted by the National Center for Biotechnology Information (NCBI) and comprise four groups of samples: SZ, BPD, MDD, and control (CTL). Each group has 24 individuals (96 in total). The results are also validated using data from the Stanley Neuropathology Consortium Integrative Database (SNCID) (27 SZ, 26 CTL, and 25 BPD).

RNA was extracted from the brain tissue and sequenced. We will use the SRA Toolkit to convert the files to fastq format. The converted files will then be fed into aRNApipe to quantify expression levels. In general, the fastq files will be trimmed with cutadapt to reduce noise introduced by adapter sequences, aligned with STAR, and evaluated with Picard QC. Further analyses, such as correlation and clustering, will build on the processed results.
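
The conversion, trimming, and alignment steps described above can be sketched as a list of shell commands assembled in Python. This is only an illustration under stated assumptions, not the configuration that aRNApipe itself generates: the run accession `SRR0000000`, the `star_index` directory, the `processed` output folder, single-end reads, and the adapter sequence are all placeholders. The Picard QC step is left out because the annotation inputs it needs are not specified here.

```python
from pathlib import Path
import shlex

# Assumption: a common Illumina adapter prefix; replace with the adapter
# actually used for these libraries.
ADAPTER = "AGATCGGAAGAGC"


def build_commands(run_id, genome_dir, out_dir="processed", threads=4):
    """Return the commands for one single-end run, in pipeline order.

    Nothing is executed here; the caller decides how to run the commands.
    """
    out = Path(out_dir)
    raw_fastq = out / f"{run_id}.fastq"          # written by fastq-dump
    trimmed = out / f"{run_id}.trimmed.fastq"    # written by cutadapt
    return [
        # 1. SRA Toolkit: convert the SRA run to fastq
        ["fastq-dump", "--outdir", str(out), run_id],
        # 2. cutadapt: remove 3' adapter sequence from the reads
        ["cutadapt", "-a", ADAPTER, "-o", str(trimmed), str(raw_fastq)],
        # 3. STAR: align trimmed reads and write a coordinate-sorted BAM
        ["STAR",
         "--runThreadN", str(threads),
         "--genomeDir", str(genome_dir),
         "--readFilesIn", str(trimmed),
         "--outFileNamePrefix", str(out / f"{run_id}."),
         "--outSAMtype", "BAM", "SortedByCoordinate"],
    ]


if __name__ == "__main__":
    # Hypothetical accession and index path, for illustration only.
    for cmd in build_commands("SRR0000000", "star_index"):
        print(shlex.join(cmd))
```

Keeping the commands as argument lists means each one can be passed directly to `subprocess.run(cmd, check=True)` once the tools are installed, without any shell quoting issues.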


## Methods

### Pre-processing


In [None]:
# code here

## Part 2: EDA
This part does EDA.

Here is the plot

In [None]:
# code, tables, and plots here
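
As one possible starting point for the EDA, sample-to-sample correlation can be computed on log-transformed counts. The sketch below is a minimal example that runs on a small synthetic count matrix, not on the study data; the matrix dimensions, the Poisson rate, and the helper names `log_cpm` and `sample_correlation` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a genes x samples count matrix (NOT study data).
n_genes, n_samples = 200, 6
counts = rng.poisson(lam=50, size=(n_genes, n_samples)).astype(float)


def log_cpm(counts):
    """Scale each sample to counts per million, then apply log2(x + 1)."""
    lib_sizes = counts.sum(axis=0, keepdims=True)
    return np.log2(counts / lib_sizes * 1e6 + 1.0)


def sample_correlation(expr):
    """Pearson correlation between samples (the columns of expr)."""
    return np.corrcoef(expr, rowvar=False)


corr = sample_correlation(log_cpm(counts))
print(np.round(corr, 2))
```

With real data, the resulting matrix could be passed to a hierarchical clustering routine or drawn as a heatmap to check whether samples group by brain region or by diagnosis.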

## Part 3: Training the model

This part demonstrates the model training process.


In [None]:
# code, tables, and plots (if needed) here

## Part 4: Results

This part interprets the results.

In [None]:
# code, tables, and plots here