Utilising Cogent Analysis Pipeline (CogentAP) and Discovery Software (CogentDS) to analyse data from a novel single-cell approach

This repository contains the R scripts used to analyse single-cell RNA-seq data generated by a novel approach that combines two instruments: the CellenONE® X1 from CELLENION, which uses Image Based Single Cell Isolation (IBSCIT™) to isolate and sort cells, and the ICELL8® cx Single-Cell System from Takara, which processes the cells for sequencing.

Requirements

Genome assembly references

Software

R packages

GEOquery and RCurl - download the single-cell and bulk data from GEO
Cogent NGS Discovery Software (CogentDS)
ggpubr
reshape2
VennDiagram
DESeq2
GGally
circlize
VariantAnnotation
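
Before running the scripts, it can help to check which of the packages listed above are already installed. The following is a minimal sketch in base R, not part of the repository. The split into CRAN and Bioconductor packages reflects where these packages are normally distributed, CogentDS is left out on the assumption that it is installed separately following Takara's instructions, and all variable names are illustrative.

```r
# Packages from the list above, grouped by their usual source (assumption:
# none of them has been installed from a custom location).
cran_pkgs <- c("RCurl", "ggpubr", "reshape2", "VennDiagram", "GGally", "circlize")
bioc_pkgs <- c("GEOquery", "DESeq2", "VariantAnnotation")

# Names of all packages currently installed in the active library paths.
installed <- rownames(installed.packages())

missing_cran <- cran_pkgs[!(cran_pkgs %in% installed)]
missing_bioc <- bioc_pkgs[!(bioc_pkgs %in% installed)]

# Only report what is missing; nothing is installed automatically.
if (length(missing_cran) > 0) {
  message("Missing CRAN packages, install with install.packages(): ",
          paste(missing_cran, collapse = ", "))
}
if (length(missing_bioc) > 0) {
  message("Missing Bioconductor packages, install with BiocManager::install(): ",
          paste(missing_bioc, collapse = ", "))
}
```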

Usage

1. Single-cell RNA-seq datasets

For both the ICELL8 and composite datasets, the gene matrix and metadata were generated from the FASTQ files as follows:

Demultiplex reads by the barcode assigned to each cell (the well lists are in the wellLists directory):

cogent demux -i /path/to/scRNAseq.fastq.gz -p /path/to/scRNAseq.fastq.gz -t ICELL8_FLA -b /path/to/wellList.txt -o /path/to/demux_out --gz

Build the genome reference from the ENSEMBL hg38 FASTA file and the v103 GTF file:

cogent add_genome -g hg38-v103 -f /path/to/Homo_sapiens.GRCh38.dna.toplevel.fa -a /path/to/Homo_sapiens.GRCh38.103.gtf

Analyse the data against the human genome reference. In brief, the following steps are performed:

Trim reads using cutadapt: Ns are trimmed from the ends of reads, 3' bases with quality <20 are trimmed, and reads in which Ns make up more than 70% of the length, or that are shorter than 15 bases after trimming, are removed.
Align reads to the hg38 genome using STAR.
Quantify reads in exonic, genic (including introns) and mitochondrial regions for all hg38 v103 genes using featureCounts, counting only primary alignments.
Summarise the data and re-organise it into a gene matrix, metadata and gene info.
Generate a CogentDS report with default parameters.

cogent analyse -i /path/to/demux_out/demux_out_demuxed_R1.fastq.gz -t ICELL8_3DE -g hg38-v103 -o /path/to/analysis_out -d /path/to/demux_out/demux_out_counts_all.csv
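
The read-removal rule in the trimming step above can be illustrated with a small base R function. This is only a sketch of the stated thresholds, not the cutadapt implementation: quality trimming is not modelled, the N fraction is computed on the read after its terminal Ns are removed (an assumption), and `trim_and_filter` is a hypothetical helper name.

```r
# Hypothetical helper: strip leading/trailing Ns, then drop reads that are
# too short or consist mostly of Ns. Thresholds follow the text above.
trim_and_filter <- function(reads, max_n_frac = 0.7, min_len = 15) {
  trimmed <- gsub("^N+|N+$", "", reads)            # trim Ns at both ends
  len <- nchar(trimmed)
  n_count <- len - nchar(gsub("N", "", trimmed, fixed = TRUE))
  n_frac <- ifelse(len > 0, n_count / len, 1)      # empty reads count as all-N
  keep <- len >= min_len & n_frac <= max_n_frac
  trimmed[keep]
}

# Toy reads (made up for illustration)
reads <- c(
  "NNACGTACGTACGTACGTACGTNN",          # 20 bases after trimming: kept
  "ACGTNN",                            # 4 bases after trimming: removed
  paste0("A", strrep("N", 18), "C")    # 90% N: removed
)
kept <- trim_and_filter(reads)
```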

2. Bulk RNA-seq data

To keep the bulk and single-cell RNA-seq datasets comparable, process the bulk data with the same read trimming, genome alignment and exonic read quantification steps used for the single-cell RNA-seq data.

3. Regulon analysis

This analysis associates gene expression in individual cells/samples with specific underlying phenotypes. In brief, the following steps were used:

Variants were called from alignment files for both bulk RNA-seq and scRNA-seq data using GATK.
For each sample, the top variants from the scRNA-seq data were selected as those that were (a) sample-specific, (b) corroborated by variants called from the bulk RNA-seq data, (c) non-synonymous mutations, (d) of the highest mutation rate (i.e. the maximum proportion of mutant-containing depth out of total depth at that position), and (e) of the highest overall depth.
For each top variant, a multiple regression analysis was performed between the average gene expression per sample and the mutation percentage in that sample (i.e. the proportion of cells in that sample containing the mutation). The scripts used for this analysis can be found in the figure5 directory.
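
As an illustration of the regression step above, the sketch below fits one linear model per gene with base R `lm`, relating average expression per sample to the mutation percentage of a single variant. It is a simplification under stated assumptions rather than the code in figure5: the data are made up, only one predictor is used, and the names `avg_expr`, `mut_pct` and `fit_gene` are hypothetical.

```r
# Hypothetical input: mutation percentage of one top variant in six samples
mut_pct <- c(0, 10, 25, 40, 60, 80)

# Hypothetical genes x samples matrix of average expression per sample
avg_expr <- rbind(
  geneA = 2 + 0.05 * mut_pct + c(0.1, -0.1, 0.05, -0.05, 0.02, -0.02),
  geneB = c(5.1, 4.8, 5.3, 4.9, 5.2, 5.0)
)
colnames(avg_expr) <- paste0("sample", seq_along(mut_pct))

# Fit expression ~ mutation percentage for one gene and return the slope
# and its p-value from the coefficient table.
fit_gene <- function(expr, pct) {
  coefs <- summary(lm(expr ~ pct))$coefficients
  c(slope = coefs["pct", "Estimate"], p_value = coefs["pct", "Pr(>|t|)"])
}

res <- as.data.frame(t(apply(avg_expr, 1, fit_gene, pct = mut_pct)))
res$p_adj <- p.adjust(res$p_value, method = "BH")  # correct across genes
res <- res[order(res$p_adj), ]
```

Genes with a small adjusted p-value and a non-zero slope would be candidates for association with the variant; with real data, further covariates could be added to the model formula.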

4. Differential expression of mutant vs. non-mutant cells

As an additional assessment of the association between the sample phenotypes (mutations) and gene expression, the top variants of each sample were searched for within individual cells from that sample, and a differential expression analysis was performed for all genes between mutated and non-mutated cells. The script used for this analysis can be found in the figure6 directory.
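
The comparison described above can be sketched per gene as a two-group test between mutated and non-mutated cells. The example uses a Wilcoxon rank-sum test from base R as a stand-in so that it runs without extra packages; the text does not state which test the figure6 script uses, and the data and the names `expr`, `is_mutant` and `de_test` are hypothetical.

```r
# Hypothetical genes x cells expression matrix for one sample
expr <- rbind(
  geneA = c(1, 2, 1, 2, 1, 9, 10, 11, 9, 10),
  geneB = c(5, 6, 5, 4, 6, 5, 6, 4, 5, 6)
)
colnames(expr) <- paste0("cell", 1:10)

# Hypothetical flag: TRUE if the sample's top variant was found in the cell
is_mutant <- rep(c(FALSE, TRUE), each = 5)

# Compare mutated vs. non-mutated cells for one gene. A pseudocount of 1
# keeps the log2 fold change finite when a group mean is zero.
de_test <- function(x, mutant) {
  wt <- suppressWarnings(wilcox.test(x[mutant], x[!mutant]))  # ties warn
  c(log2fc = log2((mean(x[mutant]) + 1) / (mean(x[!mutant]) + 1)),
    p_value = wt$p.value)
}

de <- as.data.frame(t(apply(expr, 1, de_test, mutant = is_mutant)))
de$p_adj <- p.adjust(de$p_value, method = "BH")  # correct across genes
de <- de[order(de$p_adj), ]
```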

Citation

The scripts were used to generate the figures and results for the following paper: Shomroni, O., Sitte, M., Schmidt, J. et al. A novel single-cell RNA-sequencing approach and its applicability connecting genotype to phenotype in ageing disease. Sci Rep 12, 4091 (2022). https://doi.org/10.1038/s41598-022-07874-1

UKHG-NIG/single-cell-cellenion-icell8
