# Generating Normalized Counts, rlog Counts, and the Default IHW With DESeq2

## Site(s) Used:

* The [DESeq2 vignette (devel version)](http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html), used as the demo

* The [DESeq2 installation page](https://bioconductor.org/packages/release/bioc/html/DESeq2.html)

DESeq2 is used throughout this script.

The outputs are written to two subdirectories of the parent directory:

1) 2___Normalized_Counts_DEseq2

2) 3___IHW_Using_Default_Version_From_DEseq2

## Output Directory(s)

In [None]:
# Define the parent directory

parent_directory <- "path/to/your/parent/directory"


## Make Output Directory(s)

In [None]:
# Define the subdirectory path
output_directory <- file.path(parent_directory, "2___Normalized_Counts_DEseq2")

# Create the folder if it doesn't exist
dir.create(output_directory, showWarnings = FALSE)

In [None]:
# Define the folder name
folder_name <- "3___IHW_Using_Default_Version_From_DEseq2"

# Define the output directory path
output_directory_DEseq_default_IHW <- file.path(parent_directory, folder_name)

# Create the folder if it doesn't exist
dir.create(output_directory_DEseq_default_IHW, showWarnings = FALSE)
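
With `showWarnings = FALSE`, `dir.create()` stays quiet even when it fails, for example when the parent directory does not exist. A minimal sketch of a stricter check is shown below. The helper name `ensure_dir` is hypothetical, and a temporary directory stands in for the real parent directory.

```r
# Hypothetical helper: create a directory (and any missing parents),
# then stop with an error if it still does not exist.
ensure_dir <- function(path) {
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE, showWarnings = FALSE)
  }
  if (!dir.exists(path)) {
    stop("Could not create directory: ", path)
  }
  invisible(path)
}

# Demo: a temporary folder stands in for parent_directory (assumption)
demo_parent <- file.path(tempdir(), "demo_parent")
demo_output <- ensure_dir(file.path(demo_parent, "2___Normalized_Counts_DEseq2"))
dir.exists(demo_output)
```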

## Loading Counts Matrix


* First, load the counts matrix. Make sure it is the version of the counts matrix that has the summary statistics removed.


In [None]:
# Define the path to your counts matrix file
counts_matrix_file <- "/path/to/your/counts/file.txt"

# Read the counts matrix from the TSV file
counts_matrix <- read.table(counts_matrix_file, header = TRUE, row.names = 1, sep = "\t")

head(counts_matrix)

In [None]:
print(colnames(counts_matrix))
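
One way to confirm that the summary statistics are gone is to search the row names for them. The sketch below assumes htseq-count style summary rows, whose names start with `__` (for example `__no_feature`). Other counting tools label these rows differently, so the pattern may need adjusting. The toy data frame and the gene names are made up for illustration.

```r
# Toy counts table with two summary rows left in (illustrative data only)
demo_counts <- data.frame(
  C.01__Control      = c(10, 0, 250, 1200),
  E.01__Experimental = c(12, 3, 240, 1100),
  row.names = c("gene_A", "gene_B", "__no_feature", "__ambiguous")
)

# Flag rows whose names start with "__" (assumed htseq-count convention)
summary_rows <- grepl("^__", rownames(demo_counts))
rownames(demo_counts)[summary_rows]

# Keep only the gene rows
demo_counts_clean <- demo_counts[!summary_rows, , drop = FALSE]
demo_counts_clean
```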

## Isolate The 2 Days Post Fertilization Samples:


In [None]:
# Specify the columns you want to isolate by name
columns_to_isolate <- c("C.01__Control",
                        "C.02__Control",
                        "C.03__Control",
                        "E.01__Experimental",
                        "E.02__Experimental",
                        "E.03__Experimental")

# Isolate the specified columns and keep the row names
isolated_columns <- counts_matrix[, c(columns_to_isolate), drop = FALSE]

head(isolated_columns)
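
Selecting data frame columns by name fails with an "undefined columns selected" error if any name is misspelled. A small sketch of checking the names first is shown below, using a toy data frame. The names `demo_counts` and `wanted` are illustrative only.

```r
# Toy counts table (illustrative data only)
demo_counts <- data.frame(
  C.01__Control      = c(5, 20),
  C.02__Control      = c(7, 18),
  E.01__Experimental = c(9, 2),
  row.names = c("gene_A", "gene_B")
)

wanted <- c("C.01__Control", "E.01__Experimental", "E.02__Experimental")

# Names that were requested but are absent from the table
missing_columns <- setdiff(wanted, colnames(demo_counts))
missing_columns

# Subset only the names that are actually present, in the requested order
available <- intersect(wanted, colnames(demo_counts))
demo_isolated <- demo_counts[, available, drop = FALSE]
demo_isolated
```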

## Sample MetaData

In [None]:


# Sample names should match the column names of isolated_columns, in the same order
sample_metadata <- data.frame(
  Sample = c("C.01__Control",
             "C.02__Control",
             "C.03__Control",
             "E.01__Experimental",
             "E.02__Experimental",
             "E.03__Experimental"),

  Treatment = c("Untreated", "Untreated", "Untreated",
                "Knockdown", "Knockdown", "Knockdown")
)

# View the table

sample_metadata
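
The DESeq2 vignette stresses that the rows of the sample metadata and the columns of the count matrix must be in the same order, because DESeq2 does not guess which row belongs to which column. A minimal sketch of an explicit check is shown below. The objects `demo_columns` and `demo_metadata` are illustrative stand-ins for the real count columns and metadata.

```r
# Stand-in for colnames(isolated_columns) (illustrative names)
demo_columns <- c("C.01__Control", "C.02__Control",
                  "E.01__Experimental", "E.02__Experimental")

demo_metadata <- data.frame(
  Sample    = demo_columns,
  Treatment = c("Untreated", "Untreated", "Knockdown", "Knockdown")
)

# Use the sample names as row names so the metadata is labelled by sample
rownames(demo_metadata) <- demo_metadata$Sample

# TRUE only if the names agree and are in the same order
samples_match <- identical(rownames(demo_metadata), demo_columns)
samples_match
```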


Below are the column names of the sample metadata. The tentative design formula I am going to use is built from the `Treatment` column.

In [None]:
colnames(sample_metadata)

## DESeq2 Data Set

When building the `DESeqDataSetFromMatrix` object, write the design formula directly in the `design` argument. Storing it in a variable first and passing that variable to `design` will most likely not work, particularly if it is stored as a character string.

### Library:

In [None]:
library(DESeq2)

In [None]:

dds <- DESeqDataSetFromMatrix(countData = isolated_columns,
                              colData = sample_metadata,
                              design = ~ Treatment)
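
If a design held in a variable fails, one possible cause is that it was stored as a character string and not as a formula object. The base R sketch below only shows the difference between the two. Whether a stored formula is accepted by your DESeq2 version is something to verify yourself. The variable names are illustrative.

```r
# A formula object stored in a variable
design_from_variable <- ~ Treatment
inherits(design_from_variable, "formula")

# A character string that merely looks like a formula
design_as_text <- "~ Treatment"
inherits(design_as_text, "formula")

# Converting the string yields a real formula object
design_converted <- as.formula(design_as_text)
all.vars(design_converted)
```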


## Manually Identify: Factor Levels

You may have to identify the reference level yourself. According to the DESeq2 vignette:

> By default, R will choose a reference level for factors based on alphabetical order. Then, if you never tell the DESeq2 functions which level you want to compare against (e.g. which level represents the control group), the comparisons will be based on the alphabetical order of the levels. There are two solutions: you can either explicitly tell results which comparison to make using the contrast argument (this will be shown later), or you can explicitly set the factors levels. In order to see the change of reference levels reflected in the results names, you need to either run DESeq or nbinomWaldTest/nbinomLRT after the re-leveling operation. Setting the factor levels can be done in two ways, either using factor

Therefore, below I set Untreated as the reference level, so Knockdown is treated as the experimental group. The two cells show the two ways of doing this, `factor` and `relevel`; either one is enough on its own.

In [None]:
dds$Treatment <- factor(dds$Treatment, levels = c("Untreated", "Knockdown"))

In [None]:
dds$Treatment <- relevel(dds$Treatment, ref = "Untreated")
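
The effect of the default alphabetical ordering can be seen with plain R factors, without a `dds` object. The short sketch below uses an illustrative `treatment` vector.

```r
treatment <- c("Untreated", "Untreated", "Knockdown", "Knockdown")

# Default: levels are sorted alphabetically, so Knockdown becomes the reference
default_factor <- factor(treatment)
levels(default_factor)

# Option 1: state the level order explicitly
explicit_factor <- factor(treatment, levels = c("Untreated", "Knockdown"))

# Option 2: move the chosen reference level to the front
releveled_factor <- relevel(default_factor, ref = "Untreated")

# Both options give the same level order
identical(levels(explicit_factor), levels(releveled_factor))
```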

## Pre-filtering Low Counts

In [None]:
keep <- rowSums(counts(dds)) >= 10
dds <- dds[keep,]
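
The filter above keeps genes with at least 10 reads summed over all samples. The sketch below applies the same rule to a toy matrix and, for comparison, shows the stricter rule from recent versions of the DESeq2 vignette, which keeps genes with at least 10 reads in at least as many samples as the smallest group. The matrix and the value of `smallest_group_size` are illustrative.

```r
# Toy count matrix: 3 genes by 6 samples (illustrative data only)
demo_counts <- matrix(
  c( 0,  1,  0, 2, 0, 1,   # gene_A: total 4
     3,  2,  1, 2, 1, 1,   # gene_B: total 10
    50, 60, 55, 5, 4, 6),  # gene_C: total 180
  nrow = 3, byrow = TRUE,
  dimnames = list(c("gene_A", "gene_B", "gene_C"), paste0("sample_", 1:6))
)

# Rule used in this script: total count across samples of at least 10
keep_total <- rowSums(demo_counts) >= 10
keep_total

# Alternative rule: at least 10 reads in at least 3 samples
smallest_group_size <- 3
keep_per_sample <- rowSums(demo_counts >= 10) >= smallest_group_size
keep_per_sample
```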

## Differential Expression Analysis Main

In [None]:
dds <- DESeq(dds)
dds

## Normalizing The Counts

In [None]:
# Extract normalized counts
normalized_counts <- counts(dds, normalized = TRUE)

In [None]:
head(normalized_counts)
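
`counts(dds, normalized = TRUE)` divides each sample's raw counts by that sample's size factor, which DESeq2 estimates by default with the median-of-ratios method. The base R sketch below reproduces the idea on a toy matrix. It is a simplified illustration under that assumption and not DESeq2's own code, and all object names are made up.

```r
# Toy matrix: sample_2 was sequenced exactly twice as deeply as sample_1
demo_counts <- matrix(
  c(  10,   20,
     100,  200,
    1000, 2000),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("gene_A", "gene_B", "gene_C"), c("sample_1", "sample_2"))
)

# Log of each gene's geometric mean across samples
log_geo_means <- rowMeans(log(demo_counts))

# Size factor per sample: median ratio of its counts to the geometric means
size_factors <- apply(demo_counts, 2, function(sample_counts) {
  usable <- is.finite(log_geo_means) & sample_counts > 0
  exp(median((log(sample_counts) - log_geo_means)[usable]))
})
size_factors

# Normalized counts: divide each column by its size factor
demo_normalized <- sweep(demo_counts, 2, size_factors, "/")
demo_normalized
```

After normalization the two toy samples have identical values, because the only difference between them was sequencing depth.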

### Make Row Names Into A Separate Column:

First, I need the `tibble` library:

In [None]:
library(tibble)

Now I need to make `normalized_counts` into a data frame:

In [None]:
# Convert the normalized counts matrix into a data frame
normalized_counts_df <- as.data.frame(normalized_counts)

head(normalized_counts_df)

Below I will make the normalized counts row names into an Ensembl_ID column:

In [None]:
# Use rownames_to_column to add row names as a new column
normalized_counts_df <- rownames_to_column(normalized_counts_df, var = "Ensembl_ID")
head(normalized_counts_df)
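
If `tibble` is not available, the same layout can be produced with base R. The sketch below is an alternative on a toy data frame and not a replacement for the cell above. The names `demo_df` and `demo_with_id` are illustrative.

```r
# Toy data frame with gene IDs as row names (illustrative data only)
demo_df <- data.frame(
  sample_1 = c(1.5, 2.5),
  sample_2 = c(3.5, 4.5),
  row.names = c("gene_A", "gene_B")
)

# Put the row names into a leading Ensembl_ID column
demo_with_id <- cbind(Ensembl_ID = rownames(demo_df), demo_df)

# Drop the old row names so the IDs are stored only once
rownames(demo_with_id) <- NULL
demo_with_id
```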

### Make A CSV/TSV Of The Normalized Counts

In [None]:
# Write to a CSV file in the specified directory
write.csv(normalized_counts_df, file.path(output_directory, "Normalized_Counts.csv"), row.names = FALSE)

# Write to a TSV file in the specified directory
write.table(normalized_counts_df, file.path(output_directory, "Normalized_Counts.tsv"), sep = "\t", row.names = FALSE)
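
Note that `write.table()` quotes character values and column names by default, so the TSV above will contain quoted IDs. If downstream tools expect plain text, `quote = FALSE` is one option. The sketch below writes a toy table to a temporary file and reads it back as a quick round-trip check. The file name and data are illustrative.

```r
# Toy table in the same layout as normalized_counts_df (illustrative data only)
demo_df <- data.frame(
  Ensembl_ID = c("gene_A", "gene_B"),
  sample_1   = c(14.1, 141.4)
)

# Write without quotes to a temporary file
demo_file <- file.path(tempdir(), "demo_Normalized_Counts.tsv")
write.table(demo_df, demo_file, sep = "\t", row.names = FALSE, quote = FALSE)

# Read it back and compare with the original
demo_read <- read.delim(demo_file, stringsAsFactors = FALSE)
all.equal(demo_read, demo_df)
```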


### Make A CSV/TSV Of The rlog Counts

In [None]:
# Apply the regularized log transformation
rld <- rlog(dds)

# Extract the transformed values as a matrix
rlogcounts <- assay(rld)

In [None]:
head(rlogcounts)

In [None]:
rlogcounts_df <- as.data.frame(rlogcounts)

In [None]:
# Write to a CSV file in the specified directory
write.csv(rlogcounts_df, file.path(output_directory, "rLOGcounts.csv"), row.names = TRUE)

# Write to a TSV file in the specified directory
write.table(rlogcounts_df, file.path(output_directory, "rLOGcounts.tsv"), sep = "\t", row.names = TRUE)
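
When `write.table()` writes both row names and column names, the header line has one field fewer than the data lines unless `col.names = NA` is used. Some tools misread such a file. The sketch below shows the `col.names = NA` variant on a toy table written to a temporary file. The data and file name are illustrative.

```r
# Toy rlog-style table with gene IDs as row names (illustrative data only)
demo_rlog <- data.frame(
  sample_1 = c(3.2, 7.9),
  sample_2 = c(3.4, 8.1),
  row.names = c("gene_A", "gene_B")
)

demo_file <- file.path(tempdir(), "demo_rLOGcounts.tsv")

# col.names = NA writes an empty header field above the row names column
write.table(demo_rlog, demo_file, sep = "\t",
            row.names = TRUE, col.names = NA, quote = FALSE)

# The header now has as many fields as each data line
header_fields <- strsplit(readLines(demo_file, n = 1), "\t")[[1]]
length(header_fields)
```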


## Session Information

In [None]:
sessionInfo()

In [None]:
print("Done")