<img src="materials/images/introduction-to-statistics-II-cover.png"/>


# 👋 Welcome, before you start
<br>

### 📚 Module overview

We will go through eleven lessons with you:
    
- [**Lesson 1: Z-score**](Lesson_1_Z-score.ipynb)

- [**Lesson 2: P-value**](Lesson_2_P-value.ipynb)

- [**Lesson 3: Welch's T-test**](Lesson_3_Welchs_T-test.ipynb)

- <font color=#E98300>**Lesson 4: Log2 Fold Change**</font>    `📍You are here.`

- [**Lesson 5: Pearson Correlation**](Lesson_5_Pearson_Correlation.ipynb)

- [**Lesson 6: Spearman Correlation**](Lesson_6_Spearman_Correlation.ipynb)

- [**Lesson 7: False Discovery Rate**](Lesson_7_False_Discovery_Rate.ipynb)

- [**Lesson 8: Benjamini Hochberg**](Lesson_8_Benjamini_Hochberg.ipynb)

- [**Lesson 9: Dimensionality Reduction Methods: Principal Component Analysis**](Lesson_9_Dimensionality_Reduction_Methods_Principal_Component_Analysis.ipynb)

- [**Lesson 10: Dimensionality Reduction Methods: t-SNE**](Lesson_10_Dimensionality_Reduction_Methods_t-SNE.ipynb)

- [**Lesson 11: UMAP**](Lesson_11_UMAP.ipynb)
</br>



<div class="alert alert-block alert-info">
<h3>⌨️ Keyboard shortcut</h3>

These common shortcuts can save you time as you go through this notebook:
- Run the current cell: **`Shift + Enter`**.
- Add a cell above the current cell: Press **`A`**.
- Add a cell below the current cell: Press **`B`**.
- Change a code cell to markdown cell: Select the cell, and then press **`M`**.
- Delete a cell: Press **`D`** twice.

Need more help with keyboard shortcuts? Press **`H`** to look them up.
</div>

---

# Lesson 4: Log2 Fold Change

A <mark>**fold change**</mark> is a measure describing how much a quantity changes between an initial and a subsequent measurement. It is calculated as the ratio of the subsequent value to the initial value, and it is often used when comparing measurements of a biological system taken at different times. For example, if a quantity changes from 50 to 100 over a given period of time, this is a two-fold increase (i.e., a fold change of **2**). Similarly, a change from 100 to 50 is a two-fold decrease (i.e., a fold change of **0.5**).

Commonly, fold change is used in the analysis of gene expression data from microarray experiments for measuring a change in the expression level of a gene.

<img src="materials/images/images_log2_fold_change/microarray.png"/>

`🕒 This module should take about 15 minutes to complete.`

`✍️ This notebook is written using Python.`

---

# Log<sub>2</sub>-Fold Change
The log<sub>2</sub>-fold change is an effect size estimate. It indicates, for example, how much a gene's or transcript's expression appears to have changed between the comparison and control groups. Suppose there are two gene expression values: A for the initial measurement and B for the treatment. If A = 50 and B = 75, then the **fold change** is B/A (i.e., 1.5). The **log<sub>2</sub>-fold change** is log<sub>2</sub>(1.5) ≈ 0.58.
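
The arithmetic above can be sketched in a few lines of Python using only the standard `math` module. The helper names `fold_change` and `log2_fold_change` are illustrative assumptions, not functions from any library.

```python
import math

def fold_change(initial, treatment):
    """Return the ratio treatment / initial (B / A)."""
    # Fold change is only meaningful for positive quantities.
    if initial <= 0 or treatment <= 0:
        raise ValueError("both values must be positive")
    return treatment / initial

def log2_fold_change(initial, treatment):
    """Return log2(B / A)."""
    return math.log2(fold_change(initial, treatment))

A, B = 50, 75  # the example values used in the text
print(f"fold change      = {fold_change(A, B):.2f}")       # 1.50
print(f"log2-fold change = {log2_fold_change(A, B):.2f}")  # 0.58
```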

### Why Log<sub>2</sub>-Fold Change:
When analyzing and visualizing fold changes, the value is usually reported on a logarithmic scale to base 2 (i.e., log<sub>2</sub>) because this scale is easy to interpret. For example, doubling the initial quantity equals a log<sub>2</sub> fold change of 1 (i.e., log<sub>2</sub>(2) = 1), and quadrupling it equals a log<sub>2</sub> fold change of 2 (i.e., log<sub>2</sub>(4) = 2). A further useful property of log<sub>2</sub> is that it is symmetric for reciprocals: halving the initial quantity gives a log<sub>2</sub> fold change of −1, quartering it gives −2, and so on. On a log<sub>2</sub> axis, multiplicative changes therefore appear as equal linear steps, so increases and decreases of the same magnitude sit at equal distances from zero and a wide range of fold changes fits on one plot. For example, on a plot axis showing log<sub>2</sub> fold changes, an 8-fold increase is displayed at an axis value of 3 since log<sub>2</sub>(8) = 3.
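
As a quick check of the symmetry described above, the following sketch converts a handful of fold changes to the log<sub>2</sub> scale. The list of fold changes is chosen purely for illustration.

```python
import math

# Fold changes from an 8-fold decrease to an 8-fold increase
fold_changes = [1/8, 1/4, 1/2, 1, 2, 4, 8]

# Each halving or doubling moves the log2 value by exactly one unit
log2_values = [math.log2(fc) for fc in fold_changes]

for fc, lfc in zip(fold_changes, log2_values):
    print(f"fold change {fc:>6.3f} -> log2 fold change {lfc:+.0f}")
```

A fold change and its reciprocal land at the same distance from zero, on opposite sides.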

<img src="materials/images/images_log2_fold_change/log2_plot.png"/>

In the volcano plot shown above, the red points indicate genes that display both large-magnitude fold changes (x-axis) and high statistical significance (-log<sub>10</sub> p-value, y-axis). The dashed green line shows the p-value cutoff of 0.01 (10<sup>-2</sup>), with points above the line having a p-value < 0.01 and points below the line having a p-value > 0.01. The vertical dashed blue lines indicate log<sub>2</sub>-fold changes of −2 and +2. **Therefore, all red dots exhibit log<sub>2</sub>-fold changes beyond ±2 (more than a four-fold change in either direction) and p-values less than 0.01.**
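
The two cutoffs described for the volcano plot can be expressed as a simple filter. This is a minimal sketch: the gene names and numbers below are invented for illustration and are not taken from the figure, and `is_highlighted` is a hypothetical helper.

```python
import math

LOG2FC_CUTOFF = 2.0    # vertical lines at -2 and +2
P_VALUE_CUTOFF = 0.01  # horizontal line at -log10(0.01) = 2

# Hypothetical (gene, log2 fold change, p-value) records, made up for this example
genes = [
    ("gene_a", 3.1, 0.0004),
    ("gene_b", -2.6, 0.002),
    ("gene_c", 2.4, 0.20),
    ("gene_d", 0.3, 0.0001),
    ("gene_e", -0.8, 0.65),
]

def is_highlighted(log2fc, p_value):
    """True when a point passes both the fold-change and the p-value cutoff."""
    return abs(log2fc) > LOG2FC_CUTOFF and p_value < P_VALUE_CUTOFF

highlighted = [name for name, lfc, p in genes if is_highlighted(lfc, p)]

for name, lfc, p in genes:
    y = -math.log10(p)  # y-axis value on the volcano plot
    flag = "red" if name in highlighted else "grey"
    print(f"{name}: x = {lfc:+.1f}, y = {y:.2f}, {flag}")
```

Only points that clear both cutoffs are flagged; a large fold change with a weak p-value, or a strong p-value with a small fold change, is not enough.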

<div class="alert alert-block alert-success">
    <b>Tip: </b> The formula for the <mark><b>log<sub>2</sub>-fold change</b> is: log<sub>2</sub>(B) - log<sub>2</sub>(A)</mark>  <br> The <mark>fold change = 2<sup><b>log<sub>2</sub>FC</b></sup></mark>
    
</div>
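
The two formulas in the tip can be checked numerically. The sketch below reuses the earlier example values (A = 50, B = 75); the variable names are illustrative.

```python
import math

a, b = 50.0, 75.0

# log2-fold change as a difference of logs ...
lfc_from_difference = math.log2(b) - math.log2(a)
# ... which matches the log of the ratio B / A
lfc_from_ratio = math.log2(b / a)

# Raising 2 to the log2-fold change recovers the fold change
recovered_fc = 2 ** lfc_from_difference

print(f"log2(B) - log2(A) = {lfc_from_difference:.4f}")
print(f"log2(B / A)       = {lfc_from_ratio:.4f}")
print(f"2 ** log2FC       = {recovered_fc:.4f}")
```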

<div class="alert alert-block alert-warning">
    <b>Note:</b> <b>log<sub>2</sub>(x)</b> = log<sub>10</sub>(x)/log<sub>10</sub>(2)
</div>

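The change-of-base identity in the note can be verified against Python's built-in `math.log2`. The helper `log2_via_log10` is a hypothetical name used only for this sketch.

```python
import math

def log2_via_log10(x):
    """Compute log2(x) using only base-10 logarithms."""
    return math.log10(x) / math.log10(2)

for x in (1.5, 2, 8):
    print(f"x = {x}: change of base = {log2_via_log10(x):.4f}, math.log2 = {math.log2(x):.4f}")
```
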
---

# 🌟 Ready for the next one?
<br>

- [**Lesson 5: Pearson Correlation**](Lesson_5_Pearson_Correlation.ipynb)

- [**Lesson 6: Spearman Correlation**](Lesson_6_Spearman_Correlation.ipynb)

- [**Lesson 7: False Discovery Rate**](Lesson_7_False_Discovery_Rate.ipynb)

- [**Lesson 8: Benjamini Hochberg**](Lesson_8_Benjamini_Hochberg.ipynb)

- [**Lesson 9: Dimensionality Reduction Methods: Principal Component Analysis**](Lesson_9_Dimensionality_Reduction_Methods_Principal_Component_Analysis.ipynb)

- [**Lesson 10: Dimensionality Reduction Methods: t-SNE**](Lesson_10_Dimensionality_Reduction_Methods_t-SNE.ipynb)

- [**Lesson 11: UMAP**](Lesson_11_UMAP.ipynb)
</br>

---

# Contributions & acknowledgment

Thanks to Antony Ross for contributing the content for this notebook.

---

Copyright (c) 2022 Stanford Data Ocean (SDO)

All rights reserved.