svmATAC

Enhancement and imputation of peak signal enables accurate identification of cell type in scATAC-seq

svmATAC is fully open source and released under the MIT license.

Dependencies

R 3.6
Python 3.7
Cicero 3.11

Quick Start

The svmATAC pipeline consists of two steps:

1. Pre-process:
- If you want to reproduce the results in the svmATAC paper from raw sequencing data (found in the preprocess/XXX/input folder), you need this step.
- If you want to reproduce the results in the svmATAC paper using our provided data (found in the XXX-dataset/XXX/input folder), or your data are already merged into a labelled 0-1 matrix (peak-cell), you may skip this step.
2. Training-Classification:
- The training and classification scripts are stored in the bin/ folder of each experiment. You can run these scripts to reproduce the results in the svmATAC paper; the experiments are described in the following chapters:
  - intra-dataset experiment
    - Corces2016
    - Buenrostro2018
    - 10xPBMCsV1
    - 10xPBMCsNextGem
    - 10xPBMCsV1-labeled
    - 10xPBMCsNextGem-labeled
  - inter-dataset experiment
    - 10xPBMCs-labeled
    - 10xPBMCs-unlabeled
- For each single experiment:
  - All scripts are available in the bin/ folder
    - The scripts are numbered and should be executed one by one, in numerical order.
    - Intermediate temporary files are stored in the tmp/ folder; you can ignore them.
  - All required input data are available in the input/ folder
  - All generated output data are stored in the output/ folder
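
The "execute the numbered scripts one by one" convention above can be sketched in Python as follows. This is only an illustration, not part of svmATAC: the function names, the extension-to-interpreter mapping, and the choice to run each script from inside bin/ are all assumptions, so check them against the scripts actually present in an experiment before relying on it.

```python
import re
import subprocess
from pathlib import Path

# Hypothetical mapping from file extension to interpreter; adjust it to the
# scripts actually found in the experiment's bin/ folder.
INTERPRETERS = {".R": "Rscript", ".py": "python3", ".sh": "bash"}


def numeric_prefix(name):
    """Return the leading integer of a file name, or None if there is none."""
    match = re.match(r"(\d+)", name)
    return int(match.group(1)) if match else None


def ordered_scripts(names):
    """Sort script names by numeric prefix, dropping names without one."""
    numbered = [n for n in names if numeric_prefix(n) is not None]
    return sorted(numbered, key=lambda n: (numeric_prefix(n), n))


def run_experiment(bin_dir, dry_run=True):
    """Run (or, with dry_run=True, only list) the numbered scripts in bin_dir."""
    bin_path = Path(bin_dir)
    commands = []
    for name in ordered_scripts(p.name for p in bin_path.iterdir() if p.is_file()):
        interpreter = INTERPRETERS.get(Path(name).suffix)
        if interpreter is None:
            continue  # unknown script type: skip rather than guess
        cmd = [interpreter, name]
        commands.append(cmd)
        if not dry_run:
            # Assumption: scripts are run from inside bin/. Stop at the first
            # failing step, since later steps depend on earlier ones.
            subprocess.run(cmd, cwd=bin_path, check=True)
    return commands
```

Calling `run_experiment("intra-dataset/Corces2016/bin")` with the default `dry_run=True` only returns the commands it would run, which is a cheap way to confirm the order before executing anything.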

Content

Pre-process

This chapter stores the scripts for processing raw data.

The Corces2016 dataset (folder 'Corces2016')
- bin
- input
- output
The Buenrostro2018 dataset (folder 'Buenrostro2018')
- bin
- input
- output
The 10xPBMCsV1 dataset (folder '10xPBMCsV1')
- bin
- input
- output
The 10xPBMCsNextGem dataset (folder '10xPBMCsNextGem')
- bin
- input
- output
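
The training and classification steps expect a 0-1 peak-cell matrix, as noted in Quick Start. The sketch below shows one common way to obtain such a matrix from raw counts. It is an assumption rather than the svmATAC pre-processing code: the function name is hypothetical, rows are taken to be peaks and columns cells, and any positive count is treated as accessible.

```python
def binarize_peak_matrix(counts):
    """Convert a peak-by-cell count matrix into a 0-1 matrix.

    Assumption: rows are peaks, columns are cells, and a peak counts as
    accessible (1) in a cell whenever at least one read overlaps it.
    """
    return [[1 if value > 0 else 0 for value in row] for row in counts]


# Toy example: 2 peaks x 3 cells.
counts = [
    [0, 3, 1],
    [2, 0, 0],
]
binary = binarize_peak_matrix(counts)
```

For real datasets a sparse matrix library would be more appropriate than nested lists; the list version is used here only to keep the example dependency-free.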

This chapter (folder '10x_assign_label') stores the scripts for assigning labels to the 10x PBMCs v1 and NextGem scATAC-seq data from labeled scRNA-seq data using Seurat.

The scRNA-seq-5k-v3 dataset (folder 'scRNA-seq-5k-v3')
- bin
- data
- output
The scATAC-seq-5k-v1 dataset (folder 'scATAC-seq-5k-v1')
- bin
- input
- output
The scATAC-seq-5k-nextgem dataset (folder 'scATAC-seq-5k-nextgem')
- bin
- input
- output

Training-Classification

intra-dataset

This chapter stores the scripts for the intra-dataset experiments described in the manuscript.

The Corces2016 dataset (folder 'Corces2016')
- bin
- input
- output
The Buenrostro2018 dataset (folder 'Buenrostro2018')
- bin
- input
- output
The 10xPBMCsV1 dataset (folder '10xPBMCsV1')
- bin
- input
- output
The 10xPBMCsNextGem dataset (folder '10xPBMCsNextGem')
- bin
- input
- output
The 10xPBMCsV1-labeled dataset (folder '10xPBMCsV1_labeled')
- bin
- input
- output
The 10xPBMCsNextGem-labeled dataset (folder '10xPBMCsNextGem_labeled')
- bin
- input
- output

inter-dataset

This chapter stores the scripts for the inter-dataset experiments described in the manuscript.

The 10xPBMCs-labeled dataset (folder '10xPBMCs_labeled')
- bin
- input
- output
The 10xPBMCs-unlabeled dataset (folder '10xPBMCs_unlabeled')
- bin
- input
- output
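
Because each experiment folder listed above follows the same bin / input / output layout, a quick check before running anything can catch an incomplete checkout or a mistyped path. The following is a minimal sketch, not part of the repository; the function name and the default list of expected subfolders are assumptions drawn from the listings in this README.

```python
from pathlib import Path

# Subfolders each experiment is expected to contain, per the listings above.
EXPECTED_SUBFOLDERS = ("bin", "input", "output")


def missing_subfolders(experiment_dir, expected=EXPECTED_SUBFOLDERS):
    """Return the expected subfolder names that are absent from experiment_dir."""
    root = Path(experiment_dir)
    return [name for name in expected if not (root / name).is_dir()]
```

For example, `missing_subfolders("inter-dataset/10xPBMCs_labeled")` returns an empty list when the layout is complete. The scRNA-seq-5k-v3 folder uses data instead of input, so pass `expected=("bin", "data", "output")` for it.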
