GitHub - VirtualPatientEngine/sva-workshop: Demo data and code for the SVA workshop

Credit

The data and code below were taken from the DIY.transcriptomics course by Dr. Daniel Beiting.

Data

These FASTQ files come from 1,000 peripheral blood mononuclear cells (PBMCs) and are one of the sample datasets provided by 10X Genomics.

Storage space: ~5 GB

Download them here. Note: do not uncompress the files.

Get the reference sequences from Ensembl (the human cDNA FASTA file) here.

(Optional) Transcript-to-gene mapping file. This file is generated on the fly, but the link is included here just in case.
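Because the FASTQ files should stay compressed, it can help to confirm that the downloads are intact without writing an uncompressed copy to disk. The following is a minimal sketch using `gzip -t`; the helper name `check_gz` is hypothetical, and the file names are the ones used in the preprocessing step of this README.

```shell
# Test a gzip file for integrity without writing an uncompressed copy to disk.
check_gz() {
    if gzip -t "$1" 2>/dev/null; then
        echo "OK: $1"
    else
        echo "BAD: $1"
        return 1
    fi
}

# Check both read files if they are present in the current directory.
for f in pbmc_1k_v3_S1_mergedLanes_R1.fastq.gz pbmc_1k_v3_S1_mergedLanes_R2.fastq.gz; do
    if [ -f "$f" ]; then
        check_gz "$f" || echo "re-download $f"
    else
        echo "missing: $f"
    fi
done
```

A file reported as BAD is most likely a truncated download and should be fetched again.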

Initial setup and preprocessing (mostly in shell)

Create a conda environment (name it sva_demo) and activate it

conda create --name sva_demo

conda activate sva_demo

Install the Kallisto package (popular for single-cell analysis)

conda install -c conda-forge -c bioconda kallisto

Install the kb-python package, which bundles the bustools utilities needed to preprocess the dataset

pip install kb-python

More info about kb-python here
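Before moving on, it may be worth checking that both command-line tools are visible in the active environment. This is a small sketch, assuming the installs above put `kallisto` and `kb` (the command provided by kb-python) on your PATH; the helper name `require_tool` is hypothetical.

```shell
# Report whether a command is available on PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Check each tool; keep going so that every missing tool is reported.
for tool in kallisto kb; do
    require_tool "$tool" || echo "install $tool before continuing"
done
```

If a tool is reported as missing, make sure the sva_demo environment is activated in the current shell.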

Use Kallisto to build an index from the reference sequences

kallisto index -i Homo_sapiens.GRCh38.cdna.all.index Homo_sapiens.GRCh38.cdna.all.fa

General form: kallisto index -i output_index input_fasta
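In `kallisto index`, the `-i` flag names the index file to be written and the FASTA file is the positional argument. Since building the index can take a while, one option is to skip the step when an index file already exists. The sketch below illustrates this; the function name `build_index` is hypothetical, and the existence check is a convenience, not part of Kallisto.

```shell
# Build a kallisto index unless a non-empty index file is already present.
build_index() {
    fasta=$1
    index=$2
    if [ ! -f "$fasta" ]; then
        echo "FASTA not found: $fasta" >&2
        return 1
    fi
    if [ -s "$index" ]; then
        echo "index already exists: $index"
        return 0
    fi
    # -i is the output index; the FASTA file is the positional input.
    kallisto index -i "$index" "$fasta"
}

# Run only when the reference FASTA is in the current directory.
if [ -f Homo_sapiens.GRCh38.cdna.all.fa ]; then
    build_index Homo_sapiens.GRCh38.cdna.all.fa Homo_sapiens.GRCh38.cdna.all.index
fi
```

Delete the index file first if you want to force a rebuild, for example after downloading a newer reference.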

Preprocessing scRNA-seq data

 kb count \
 pbmc_1k_v3_S1_mergedLanes_R1.fastq.gz pbmc_1k_v3_S1_mergedLanes_R2.fastq.gz \
 -i Homo_sapiens.GRCh38.cdna.all.index \
 -x 10XV3 \
 -g t2g.txt \
 -t 8 \
 --cellranger
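A quick sanity check after preprocessing is to read the dimensions of the resulting count matrix. In the Matrix Market coordinate format, the first line that is not a `%` comment holds three integers: rows, columns, and the number of stored entries. The sketch below prints them; the helper name `mtx_dims` is hypothetical, and the path `counts_unfiltered/cellranger/matrix.mtx` is an assumption about where kb-python writes its output, which may differ between versions.

```shell
# Print "rows cols entries" from a Matrix Market file.
mtx_dims() {
    # Skip '%' comment lines; the first remaining line is the size line.
    awk '!/^%/ { print $1, $2, $3; exit }' "$1"
}

# Assumed output location; adjust if your kb-python version writes elsewhere.
mtx=counts_unfiltered/cellranger/matrix.mtx
if [ -f "$mtx" ]; then
    echo "rows cols entries: $(mtx_dims "$mtx")"
else
    echo "not found: $mtx"
fi
```

If the file is not found, list the output directory of `kb count` to see which files were actually produced.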

Great, now you are done with the initial setup and preprocessing!

QA and analysis in RStudio

You must have R and RStudio installed. If not .....
Now open the DIY_scRNAseq.R script in RStudio and follow the instructions in it.

Sample input for the ML workflow

ML_input.tsv.gz
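To get a feel for the sample input without unpacking it, you can stream the file through `gzip -dc` and count its lines and tab-separated columns. This is only a sketch: the helper name `tsv_shape` is hypothetical, and it assumes the file is a gzip-compressed, tab-separated table, as its extension suggests.

```shell
# Print "lines columns" for a gzip-compressed, tab-separated file.
tsv_shape() {
    # The column count is taken from the first line only.
    gzip -dc "$1" | awk -F '\t' 'NR == 1 { cols = NF } END { print NR, cols }'
}

if [ -f ML_input.tsv.gz ]; then
    echo "lines columns: $(tsv_shape ML_input.tsv.gz)"
    # Peek at the first few lines without writing an uncompressed copy.
    gzip -dc ML_input.tsv.gz | head -n 3
else
    echo "not found: ML_input.tsv.gz"
fi
```

The line count includes the header line, if the file has one.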
