refine.bio Examples: Ensembl ID conversion
This notebook contains example workflows for converting Ensembl gene IDs, the gene identifiers used by refine.bio, to other types of gene identifiers that may be required for downstream analyses.
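To make the idea of identifier conversion concrete, here is a minimal base R sketch that maps Ensembl gene IDs to gene symbols by matching against a lookup table. The lookup table, the helper name `ensembl_to_symbol`, and the unmapped placeholder ID are illustrative assumptions and are not taken from the notebook. In a real analysis the mapping would come from an annotation resource rather than a hand-written table.

```r
# Toy lookup table. The three ID/symbol pairs are real human genes, but the
# table is an illustration only, not the annotation source the notebook uses.
lookup <- data.frame(
  ensembl_id = c("ENSG00000141510", "ENSG00000012048", "ENSG00000146648"),
  symbol = c("TP53", "BRCA1", "EGFR"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the symbol for each Ensembl ID, in input order.
# IDs that are absent from the lookup table come back as NA.
ensembl_to_symbol <- function(ids, lookup) {
  lookup$symbol[match(ids, lookup$ensembl_id)]
}

# The last ID is a made-up placeholder used to show the unmapped case.
ids <- c("ENSG00000146648", "ENSG00000141510", "ENSG00000000000")
symbols <- ensembl_to_symbol(ids, lookup)
```

Unmapped IDs are kept as `NA` rather than dropped, so the result stays aligned with the input and can be added as a column to an expression table.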
Requirements and usage
This module requires you to install the following software to run examples yourself:
These requirements can be installed by following the instructions at the links above. The example R Notebooks are designed to check if additional required packages are installed and will install them if they are not.
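The check-then-install pattern described above can be sketched as follows. This is an assumption about the general approach, not the exact code in the notebooks, and the helper name `missing_packages` is hypothetical.

```r
# Hypothetical helper: return the requested packages that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# tidyverse is used here only as an example requirement.
required <- c("tidyverse")
to_install <- missing_packages(required)

# Install only in an interactive session, so that sourcing this script in a
# non-interactive context has no side effects.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```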
We have prepared a quick guide to RStudio as part of our training content that you may find helpful if you're getting started with RStudio for the first time.
Note that the first time you open RStudio, you should select a CRAN mirror.
You can do so by clicking Global Options > Packages and selecting a CRAN mirror near you.
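If you prefer to set the mirror from the R console instead of the menus, a minimal sketch is shown below. The choice of the CRAN cloud mirror URL is an assumption; any mirror near you works the same way.

```r
# Set the CRAN mirror for the current session while keeping any other
# repositories that are already configured.
local({
  repos <- getOption("repos")
  repos["CRAN"] <- "https://cloud.r-project.org"  # assumed mirror choice
  options(repos = repos)
})

# Confirm which mirror is now in use.
cran_mirror <- getOption("repos")[["CRAN"]]
```

This setting lasts only for the current R session, whereas the mirror chosen through the RStudio options is remembered.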
You can install the additional requirements (e.g., tidyverse) through RStudio.
Obtaining the R Notebooks
To run the examples yourself, you will need to clone or download this repository: https://github.com/AlexsLemonade/refinebio-examples
Interacting with R Notebooks
You can open the R Notebook by opening the .Rmd file in RStudio.
Note that working with R Notebooks requires certain R packages, but RStudio should prompt you to download them the first time you open one.
This will allow you to modify and run the R code chunks.
Chunks that have already been included in an example can be run by clicking the green play button in the top right corner of the chunk or by using Ctrl + Shift + Enter (Cmd + Shift + Enter on a Mac).
See this guide to using R Notebooks for more information about inserting and executing code chunks.
Using your own data
For the example in this module, the gene expression data and sample metadata are stored in a
If you'd like to adapt an example to include data you've obtained from refine.bio, we recommend placing the files in the data/ directory and changing the filenames and paths in the notebook to match these files.
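As a rough sketch of what changing the filenames and paths might look like, the block below builds paths relative to the data/ directory and reads the expression file if it is present. The file names `my_expression.tsv` and `my_metadata.tsv` are hypothetical placeholders; substitute the names of the files you downloaded.

```r
# Directory holding the input files, relative to the notebook.
data_dir <- "data"

# Hypothetical file names; replace them with your own refine.bio files.
expression_file <- file.path(data_dir, "my_expression.tsv")
metadata_file <- file.path(data_dir, "my_metadata.tsv")

# Read the tab-separated expression data only if the file exists, so the
# sketch does not fail when the placeholder names have not been updated.
if (file.exists(expression_file)) {
  expression_df <- read.delim(expression_file, stringsAsFactors = FALSE)
}
```

Defining the paths once near the top of the notebook means only these lines need to change when you swap in a new dataset.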
The output of the notebook is a TSV file that contains annotation with gene symbols; this filename should be updated as well.
We suggest saving the output in the results/ directory. The notebook creates this directory automatically if it does not exist, for example when the notebook has been moved outside of the GitHub repository structure.
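Creating the output directory when needed and writing a TSV file can be sketched in base R as below. The output file name and the one-row data frame are hypothetical stand-ins for the notebook's real annotated output.

```r
# Create the results directory if it does not already exist.
results_dir <- "results"
if (!dir.exists(results_dir)) {
  dir.create(results_dir, recursive = TRUE)
}

# Hypothetical output file name; update it to describe your own dataset.
output_file <- file.path(results_dir, "my_annotated_genes.tsv")

# Stand-in for the annotated table produced by the notebook.
annotated_df <- data.frame(
  ensembl_id = "ENSG00000141510",
  symbol = "TP53",
  stringsAsFactors = FALSE
)

# Write a tab-separated file without quotes or row names.
write.table(annotated_df, output_file,
            sep = "\t", quote = FALSE, row.names = FALSE)
```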