
This repository contains scripts used to generate figures for the paper "A novel single-cell RNA-sequencing platform and its applicability connecting genotype to phenotype in ageing diseases"

UKHG-NIG/single-cell-cellenion-icell8


Utilising Cogent Analysis Pipeline (CogentAP) and Discovery Software (CogentDS) to analyse data from a novel single-cell approach

This repository contains R scripts used to analyse single-cell RNA-seq data generated by a novel approach. The approach combines the CellenONE® X1 instrument from CELLENION, which uses Image Based Single Cell Isolation (IBSCIT™) to isolate and sort cells, with the ICELL8® cx Single-Cell System from Takara, which processes the cells for sequencing.


Requirements

Genome assembly references

Software

R packages

Usage

1. Single-cell RNA-seq datasets

For both the ICELL8 and composite datasets, the gene matrix and metadata were generated from the FastQ files as follows:

  1. Demultiplex reads by the barcode assigned to each cell (the well lists are found in the wellLists directory):
cogent demux -i /path/to/scRNAseq.fastq.gz -p /path/to/scRNAseq.fastq.gz -t ICELL8_FLA -b /path/to/wellList.txt -o /path/to/demux_out --gz
  2. Build the genome reference from the ENSEMBL hg38 FASTA file and the v103 GTF file:
cogent add_genome -g hg38-v103 -f /path/to/Homo_sapiens.GRCh38.dna.toplevel.fa -a /path/to/Homo_sapiens.GRCh38.103.gtf
  3. Analyse the data against the human genome reference. In brief, the following steps are used:
  • Trim reads using cutadapt: N's are trimmed from the ends of reads, 3' bases with quality <20 are trimmed, and reads that consist of more than 70% N's and/or are shorter than 15 bases after trimming are removed.
  • Align reads to hg38 genome using STAR.
  • Quantify reads in exonic, genic (including introns) and mitochondrial regions for all hg38 v103 genes using featureCounts, where only primary alignments are counted.
  • Summarise data and re-organise into gene matrix, metadata and gene info.
  • Generate CogentDS report with default parameters.
cogent analyse -i /path/to/demux_out/demux_out_demuxed_R1.fastq.gz -t ICELL8_3DE -g hg38-v103 -o /path/to/analysis_out -d /path/to/demux_out/demux_out_counts_all.csv
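
The read-trimming rules listed in step 3 can be illustrated with a minimal Python sketch. This is not the cutadapt implementation that CogentAP runs: the function names are hypothetical, the quality trimming is a simplified "drop trailing bases below the cutoff" rule (cutadapt uses a running-sum algorithm), and the order of the two trimming operations is an assumption. Only the thresholds (quality 20, 70% N's, 15 bases) come from the description above.

```python
MIN_QUALITY = 20      # 3' bases below this Phred score are trimmed
MAX_N_FRACTION = 0.7  # reads with a larger share of N's are discarded
MIN_LENGTH = 15       # reads shorter than this after trimming are discarded


def trim_read(seq, quals):
    """Return (seq, quals) after 3' quality trimming and N trimming at both ends.

    Simplified illustration only; not cutadapt's actual algorithm.
    """
    # Simplified 3' quality trimming: drop trailing bases while quality < 20.
    end = len(seq)
    while end > 0 and quals[end - 1] < MIN_QUALITY:
        end -= 1
    # Trim N's from both ends of what remains.
    start = 0
    while start < end and seq[start] in "Nn":
        start += 1
    while end > start and seq[end - 1] in "Nn":
        end -= 1
    return seq[start:end], quals[start:end]


def keep_read(seq):
    """Return True if a trimmed read passes the length and N-content filters."""
    if len(seq) < MIN_LENGTH:
        return False
    n_fraction = seq.upper().count("N") / len(seq)
    return n_fraction <= MAX_N_FRACTION


# Toy read (invented for illustration): N's at both ends, uniform quality 30.
trimmed_seq, trimmed_quals = trim_read("NNACGTACGTACGTACGTACNN", [30] * 22)
print(trimmed_seq, keep_read(trimmed_seq))
```

In the toy example the flanking N's are removed and the remaining 18 bases pass both filters.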

2. Bulk RNA-seq data

To make sure the bulk and single-cell RNA-seq datasets are comparable, the first three sub-steps of step 3 of the single-cell workflow (read trimming, alignment to the genome and read quantification in exonic regions) should be applied to the bulk data in the same way.

3. Regulon analysis

This analysis associated gene expression in individual cells/samples with specific underlying phenotypes. In brief, the following steps were utilised:

  • Variants were called from alignment files for both bulk RNA-seq and scRNA-seq data using GATK.
  • For each sample, top variants from the scRNA-seq data were selected as those that were (a) sample-specific, (b) corroborated by variants called from bulk RNA-seq data, (c) non-synonymous mutations, (d) of the highest mutation rate (i.e. maximum proportion of mutant-containing depth out of total depth at that position), and (e) of the highest overall depth.
  • For each top variant, a multiple regression analysis was performed between average gene expression per sample and the mutation percentage in that sample (i.e. proportion of cells in that sample containing the mutation). The scripts used for this analysis can be found in the figure5 directory.
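
The selection criteria and the two ratios defined above can be sketched in Python as follows. This is an illustration only, not the R code in the figure5 directory: the `Variant` fields, the function names, the `n_top` parameter and the toy values are all hypothetical, and sorting by mutation rate first with depth as the tie-breaker is an assumption about how criteria (d) and (e) are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    variant_id: str
    sample_specific: bool   # criterion (a)
    in_bulk: bool           # criterion (b): also called from bulk RNA-seq
    non_synonymous: bool    # criterion (c)
    mutant_depth: int       # reads supporting the mutant allele
    total_depth: int        # all reads covering the position

    @property
    def mutation_rate(self):
        """Proportion of mutant-containing depth out of total depth."""
        return self.mutant_depth / self.total_depth if self.total_depth else 0.0


def select_top_variants(variants, n_top=1):
    """Keep variants meeting (a)-(c), then rank by (d) rate and (e) depth."""
    eligible = [
        v for v in variants
        if v.sample_specific and v.in_bulk and v.non_synonymous
    ]
    eligible.sort(key=lambda v: (v.mutation_rate, v.total_depth), reverse=True)
    return eligible[:n_top]


def mutation_percentage(cells_with_mutation, cells_total):
    """Percentage of cells in a sample that contain the mutation."""
    return 100.0 * cells_with_mutation / cells_total


# Toy variants, invented for illustration.
variants = [
    Variant("v1", True, True, True, 40, 50),
    Variant("v2", True, True, True, 90, 100),
    Variant("v3", True, False, True, 99, 100),  # not corroborated by bulk
    Variant("v4", True, True, True, 45, 50),
]
top = select_top_variants(variants, n_top=2)
print([v.variant_id for v in top], mutation_percentage(30, 120))
```

The resulting per-sample mutation percentage is the quantity that enters the regression against average gene expression.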

4. Differential expression of mutant vs. non-mutant cells

As an additional assessment of the association between the sample phenotypes (mutations) and gene expression, the top variants of each sample were searched for within individual cells from that sample, and a differential expression analysis was performed for all genes between mutated and non-mutated cells. The script used for this analysis can be found in the figure6 directory.
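
As a rough illustration of the comparison described above, the Python sketch below splits cells by mutation status and computes a per-gene log2 fold change. It is not the script in the figure6 directory and does not include a statistical test, since the test used is not stated here. The function name, the nested-dictionary input layout, the pseudocount and the toy values are assumptions made for the example.

```python
import math
from statistics import mean


def log2_fold_changes(expression, mutated_cells, pseudocount=1.0):
    """Per-gene log2 fold change of mean expression, mutated vs. non-mutated.

    expression: dict mapping gene -> dict mapping cell ID -> expression value
    mutated_cells: set of cell IDs in which the top variant was found
    """
    result = {}
    for gene, per_cell in expression.items():
        mutated = [v for cell, v in per_cell.items() if cell in mutated_cells]
        non_mutated = [v for cell, v in per_cell.items() if cell not in mutated_cells]
        if not mutated or not non_mutated:
            continue  # a fold change needs at least one cell in each group
        result[gene] = math.log2(
            (mean(mutated) + pseudocount) / (mean(non_mutated) + pseudocount)
        )
    return result


# Toy data, invented for illustration: four cells, two of them mutated.
expression = {
    "GENE_A": {"c1": 7, "c2": 7, "c3": 1, "c4": 1},
    "GENE_B": {"c1": 3, "c2": 3, "c3": 3, "c4": 3},
}
fold_changes = log2_fold_changes(expression, mutated_cells={"c1", "c2"})
print(fold_changes)
```

A positive value indicates higher mean expression in the mutated cells, and a value of zero indicates no difference between the groups.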

Citation

The scripts were used to generate the figures and results for the following paper: Shomroni, O., Sitte, M., Schmidt, J. et al. A novel single-cell RNA-sequencing approach and its applicability connecting genotype to phenotype in ageing disease. Sci Rep 12, 4091 (2022). https://doi.org/10.1038/s41598-022-07874-1
