Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.Sign up
General functions for analyzing combinatorial indexing sequence data
Fetching latest commit…
Cannot retrieve the latest commit at this time.
|Type||Name||Latest commit message||Commit time|
|Failed to load latest commit information.|
scitools Version: 0.1.3 Adey Lab (www.adeylab.org, www.github.com/adeylab/scitools) WARNING: scitools is still in a development stage - a number of commands are in active development. SETUP: To set up scitools, download the repository to the location you want the executable to be. Keep all index files and folders within the scitools directory. We recommend adding the scitools execulable to your path. We also recommend editing the scitools.cfg file to add in command-line callable executables used by scitools, defaults assume they are all in your path. Once it is set up, run 'scitools dependencies' which will check for external software that scitools relies on. DESCRIPTION: scitools is a set of scripts designed for working with single-cell combinatorial indexing data. It includes tools to go from fastq files (after bcl2fastq) to a processed dataset. It includes a number of external tools and R packages that are called by scitools. If scitools is used, be sure to cite those tools! scitools commands are in the form of [class]-[operation]. Most can be specified in reverse order, or if the operation is unique to the class of files or the class type can be determined by the files in the arguments then just the operation name can be specified. Run 'scitools list' for a list of commands, or 'scitools list [keyword]' to list commands that include the keyword, e.g. 'scitools list fastq' DEPENDENCIES: (command-line callable, can be specified in options) Run scitools dependencies to check software and R requirements. The default executables can be altered in the scitools.cfg file. A user may also have their own personal scitools.cfg file in their home directory to override the use of the config file that is present with the scitools executable. Executables: gzip For gzipped fastq files. Default: gzip & zcat (these are hardoded at the beginning of the scitools code and not available as options) samtools Bam-related commands. Default: samtools bedtools Bed-related commands. Default: bedtools Rscript Numerous R operations. Default: Rscript bwa For alignment only. Default: bwa macs2 For atac-callpeak only. Default: macs2 scitools Can call itself. Default: scitools (bismark) For sci-MET (bowtie2) For sci-MET / used by bismark R packages: ggplot2 For plotting commands svd For Latent Semantic Indexing (LSI) Rtsne For tSNE visualization methods For PCA dbscan For density-based clustering irlba For fast SVD (recommended) File Types used by scitools: fastq Standard fastq files, input can be gzipped or not, output fastq files will always be gzipped. bam Bam files. For scitools, it is expected the barcode sequence or cellID is in the read name ([barcode]:[read_number]) Sam files are not supported to encourage (force) space saving annot Annotation file: tab delimited, 2 columns Col 1 = cellID (barcode), could also be feature name Col 2 = annotation information (e.g. experimental condition) values file A special case of an annotaiton file where the annotation is a continuous variable and not discrete. matrix Matrix file, tab delimited Row 1 = CellIDs, has 1 less column than all other rows Rows = field 1 is name, then 1 field for each cellID dims Dimensions file, tab delimited Col 1 = cellID Col 2-N = dimensions for each cell range Range format is not a file type but a way to input a range of values (e.g. which dimensions to use from a dims file) This is a comma separated list and can include dashes for a range of values (e.g. 1,3-6,8-10,13)