Skip to content

osmanbeyoglulab/ECCB-2024-Tutorial

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Computational Approaches for Identifying Context-Specific Transcription Factors using Single-Cell Multi-Omics Datasets (ECCB 2024)

Table of Contents

Overview

Development of specialized cell types and their functions are controlled by external signals that initiate and propagate cell-type specific transcriptional programs. Activation or repression of genes by key combinations of transcription factors (TFs) drive these transcriptional programs and control cellular identity and functional state. For example, ectopic expression of the TF factors Oct4, Sox2, Klf4 and c-Myc are sufficient to reprogram fibroblasts into induced pluripotent stem cells. Conversely, disruption of TF activity can cause a broad range of diseases including cancer. Hence, identifying context-specific TFs is particularly relevant to human health and disease.

Systematically identifying key TFs for each cell-type represents a formidable challenge. Determination of TF activity in bulk tissue is confounded by cell-type heterogeneity. Single-cell technologies now measure different modalities from individual cells such as RNA, protein, and chromatin states. For example, recent technological breakthroughs have coupled the relatively sparse single cell RNA sequencing (scRNA-seq) signal with robust detection of highly abundant and well-characterized surface proteins using index sorting and barcoded antibodies such as cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq). But these approaches are limited to surface proteins, whereas TFs are intracellular. Single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq) measures genome-wide chromatin accessibility and reveals cellular memory and response to stimuli or developmental decisions. Recently several computational methods have leveraged these omics datasets to systematically estimate TF activity influencing cell states. We will cover these TF activity inference methods using scRNA-seq, scATAC-seq, Multiome and CITE-seq data through hybrid lectures and hand-on-training sessions. We will cover the principles underlying these methods, their assumptions and trade-offs. We will apply multiple methods, interpret results and discuss strategies for further in silico validation. The audience will be equipped with practical knowledge, essential skills to conduct TF activity inference independently on their own datasets and interpret results.

Hands-on Tutorial

Environment Setup

Installation of Conda

Download the installer by choosing the proper installer for your machine. 2. Verify your installer hashes using SHA-256 checksums. 3. Install the installer: - Windows: Double-click the .exe file. Follow the instructions on the screen. For a detailed reference, please read this page. - macOS: double-click the .pkg file. Follow the instructions on the screen. For a detailed reference, please read this page. - Linux: In your terminal window, run: bash Anaconda-latest-Linux-x86_64.sh. Follow the prompts on the screen. For a detailed reference, please read this page.

Managing Environment

With Conda, you can create, export, list, and update environments that have different versions of Python and/or packages installed in them. The JupyterLab, which can run in Conda environment, is a web application for computational documents so that our code can produce rich, interactive output.

Below we will demonstrate how to create a Conda environment and install JupyterLab and packages for each tutorial session on macOS/Linux. You need to create a separate Conda environment for each session.

Use the terminal for the following steps. For a detailed reference, please read this page.

  1. Create an environment with python 3.

     conda create --name stan python=3.12 -y
    
  2. Activate the environment you just created:

     conda activate stan
    
  3. Install JupyterLab:

     pip install jupyterlab
    
  4. Install required packages

     git clone https://github.com/osmanbeyoglulab/ECCB-2024-Tutorial.git
    
     cd ECCB-2024-Tutorial
    
     pip install -r requirements.txt
    

After installing the required packages for the tutorial, launch JupyterLab and open the Jupyter notebook in each session subfolder to start the tutorial. To launch JupyterLab, enter the following command in your terminal:

  Jupyter lab

Intended Audience and Level

This tutorial is designed for individuals at the beginner to intermediate level, specifically targeting bioinformaticians or computational biologists with some prior experience in analyzing single-cell RNA sequencing (scRNA-seq), single-cell assay for transposase-accessible chromatin sequencing (scATAC-seq), Cellular Indexing of Transcriptomes and Epitopes by Sequencing (CITE-seq), and Multiome data, or those familiar with next-generation sequencing (NGS) methods. A foundational understanding of basic statistics is assumed.

While participants are expected to be beginners, a minimum level of experience in handling NGS datasets is required. The workshop will be conducted using Python and JupyterLab, necessitating prior proficiency in Python programming and familiarity with command-line tools.

To facilitate the learning process, participants will be provided with pre-processed count matrices derived from real datasets. All analyses, including JupyterLab notebooks and tutorial steps, will be available on GitHub for reference.

The tutorial will employ publicly accessible data, with examples showcased using datasets that will be made available through repositories such as the Gene Expression Omnibus or similar public platforms. This hands-on workshop aims to equip participants with practical skills and knowledge, enabling them to navigate and analyze complex datasets in the field of single-cell omics.

Schedule

Tuesday, September 17, 2024 9:00 AM – 12:00 PM

Time Tutorial
9:00~10:30 Welcome remarks and tutorial overview

Basic principles behind TF activity inference methods
      * Overview of the importance of context-specific TF regulation in biological systems.
      * Significance of TF dynamics in health and disease.
      * Single-cell multi-omics and spatial transcriptomics technologies for TF activity inference (scRNA-seq, scATAC-seq, Multiome and CITE-seq)

Overview of computational TF inference methods based on single cell omics
10:30~11:00 Coffee
11:00~12:00 Hands-on experience in applying tools and interpreting results using TF activity inference methods using public scRNA-seq and spatial transcriptomics
12:00~13:00 Lunch break

References

  • Linan Zhang, April Sagan, Bin Qin, Baoli Hu, Hatice Ulku Osmanbeyoglu. STAN, a computational framework for inferring spatially informed transcription factor activity across cellular contexts: bioRxiv 2024.06.26.600782; doi: https://doi.org/10.1101/2024.06.26.600782

  • Xiaojun Ma, Ashwin Somasundaram, Zengbiao Qi, Douglas J Hartman, Harinder Singh, Hatice Ulku Osmanbeyoglu, SPaRTAN, a computational framework for linking cell-surface receptors to transcriptional regulators, Nucleic Acids Research, Volume 49, Issue 17, 27 September 2021, Pages 9633–9647, https://doi.org/10.1093/nar/gkab745

  • Kim, D., Tran, A., Kim, H.J. et al. Gene regulatory network reconstruction: harnessing the power of single-cell multi-omic data. npj Syst Biol Appl 9, 51 (2023). https://doi.org/10.1038/s41540-023-00312-6

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published