refine.bio Examples: Quantile normalize your own data

This directory contains an example workflow of how to quantile normalize your own dataset to make it more comparable to data you obtain from refine.bio.

Requirements and usage

This module requires you to install the following software to run the example yourself:

These requirements can be installed by following the instructions at the links above. The example R Notebooks are designed to check if additional required packages are installed and will install them if they are not.

RStudio

We have prepared a quick guide to RStudio as part of our training content that you may find helpful if you're getting started with RStudio for the first time.

Note that the first time you open RStudio, you should select a CRAN mirror. You can do so by clicking Tools > Global Options > Packages and selecting a CRAN mirror near you with the Change button.

You can install the additional R package requirements (e.g., tidyverse) through RStudio.
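As a rough illustration of checking for and installing missing packages from the R console, a minimal sketch might look like the following. The helper name `missing_packages` is hypothetical and not part of the example notebooks, and the `interactive()` guard is our own choice so that nothing is installed when the code is run non-interactively. `preprocessCore` is distributed through Bioconductor, so the sketch installs it with `BiocManager` rather than `install.packages()`.

```r
# Hypothetical helper: return the subset of `pkgs` that is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# CRAN packages (tidyverse is the example named above)
cran_missing <- missing_packages(c("tidyverse"))

# Bioconductor packages (preprocessCore is used for quantile normalization)
bioc_missing <- missing_packages(c("preprocessCore"))

# Only install when running interactively, e.g. from the RStudio console
if (interactive()) {
  if (length(cran_missing) > 0) {
    install.packages(cran_missing)
  }
  if (length(bioc_missing) > 0) {
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install(bioc_missing)
  }
}
```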

Obtaining the R Notebooks

To run the examples yourself, you will need to clone or download this repository: https://github.com/AlexsLemonade/refinebio-examples

Interacting with R Notebooks

You can open an R Notebook by opening the .Rmd file in RStudio. Working with R Notebooks requires certain R packages, and RStudio should prompt you to download them the first time you open a notebook. Once they are installed, you can modify and run the R code chunks. To run a chunk that is already included in an example, click the green play button in the top right corner of the chunk or press Ctrl + Shift + Enter (Cmd + Shift + Enter on a Mac). See this guide to using R Notebooks for more information about inserting and executing code chunks.

Background

Datasets downloaded from the refine.bio web interface are quantile normalized because we allow users to aggregate samples from multiple platforms (e.g., microarray chips) and even multiple technologies. Quantile normalization (QN) forces every sample to have the same underlying distribution of values, which makes samples from different sources more comparable.
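To make the idea concrete, here is a minimal base R sketch of quantile normalization on a toy matrix. It is an illustration only, not the implementation refine.bio uses: the function name `quantile_normalize` is hypothetical, the reference distribution is computed from the toy data itself (refine.bio instead uses a precomputed reference distribution), ties are broken by order of appearance for simplicity, and missing values are not handled.

```r
# Toy expression matrix: genes are rows, samples are columns
expr <- matrix(
  c(5, 2, 3, 4,
    4, 1, 4, 2,
    3, 4, 6, 8),
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3))
)

# Hypothetical helper illustrating the basic QN procedure
quantile_normalize <- function(x) {
  # Sort each sample independently, then average across samples at each rank
  # to obtain a reference distribution
  reference <- rowMeans(apply(x, 2, sort))
  # Replace each value with the reference value at its within-sample rank
  normalized <- apply(x, 2, function(col) {
    reference[rank(col, ties.method = "first")]
  })
  dimnames(normalized) <- dimnames(x)
  normalized
}

qn <- quantile_normalize(expr)

# After QN, every sample contains the same set of values,
# so sorting each column gives identical columns
apply(qn, 2, sort)
```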

If you are interested in comparing your own data to data you have downloaded from the refine.bio website, you may wish to quantile normalize it using the reference distribution we've used in refine.bio.

You can read more about the reference distribution and how it is generated in the quantile normalization section of our documentation.

In this example notebook, we demonstrate how to obtain the reference distribution from the refine.bio API and apply it to your own data in R.

Using your own data

The R package we use for quantile normalization (preprocessCore) expects a matrix where samples are columns and genes are rows. Any tabular format that can be read into R and follows this format (or can be easily transposed) is appropriate. In our example, we use a TSV file where samples are columns and genes are rows.
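The expected input format can be sketched as follows. This is an illustration under stated assumptions rather than the code from the example notebook: the TSV is a tiny made-up file written to a temporary location, the gene identifier column name `Gene` is hypothetical, and `target` is a placeholder vector standing in for the refine.bio reference distribution. The call to `normalize.quantiles.use.target()` only runs if `preprocessCore` is installed.

```r
# Write a tiny made-up TSV so the sketch is self-contained.
# In practice, tsv_path would point to your own file.
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c(
  "Gene\tsample1\tsample2\tsample3",
  "gene1\t5\t4\t3",
  "gene2\t2\t1\t4",
  "gene3\t3\t4\t6",
  "gene4\t4\t2\t8"
), tsv_path)

# Read the table: genes are rows, samples are columns
expr_df <- read.delim(tsv_path, stringsAsFactors = FALSE, check.names = FALSE)

# Move gene identifiers to row names and keep only the sample columns
expr_mat <- as.matrix(expr_df[, -1])
rownames(expr_mat) <- expr_df$Gene
storage.mode(expr_mat) <- "double"

# If your file has samples as rows instead, transpose it first:
# expr_mat <- t(expr_mat)

# Placeholder target; in practice this would be the reference
# distribution obtained from refine.bio
target <- c(1, 2, 3, 4)

if (requireNamespace("preprocessCore", quietly = TRUE)) {
  qn_mat <- preprocessCore::normalize.quantiles.use.target(expr_mat, target = target)
  # Reattach gene and sample names to the normalized matrix
  dimnames(qn_mat) <- dimnames(expr_mat)
}
```

Keeping the gene identifiers as row names rather than as a column ensures the matrix passed to `preprocessCore` is purely numeric.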
