## Minimal SeqSpec requirements for running single-cell Perturb-seq

Create a SeqSpec file for each of the 3 modalities (scRNA, hashing, guide).
- All the samples are 10xv3 5'.
- Read 1 holds the cell barcode (CB, 16 bp) followed by the UMI (12 bp).
- Read 2 holds the modality-specific sequence:
    - scRNA: the full read (92 bp)
    - Hashing: a 15 bp HTO after a 10 bp common sequence
    - Guide: a 20 bp guide sequence after a 63 bp common construct region
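The layout above can be sketched as plain string slicing. This is a minimal illustration, not part of seqspec or kallisto: the names `R1_LAYOUT`, `R2_LAYOUT`, and `split_read_pair` are hypothetical, the reads are synthetic, and the offsets are assumed to be counted from the start of each read.

```python
# Read layout taken from the list above; all names here are illustrative.
R1_LAYOUT = {"cell_barcode": (0, 16), "umi": (16, 28)}
R2_LAYOUT = {
    "rna": (0, 92),       # full read
    "hashing": (10, 25),  # 15 bp HTO after a 10 bp common sequence
    "guide": (63, 83),    # 20 bp guide after a 63 bp construct region
}


def split_read_pair(r1: str, r2: str, modality: str) -> dict:
    """Slice one read pair into cell barcode, UMI, and feature sequence."""
    start, stop = R2_LAYOUT[modality]
    if len(r1) < 28 or len(r2) < stop:
        raise ValueError("read shorter than the expected layout")
    cb_start, cb_stop = R1_LAYOUT["cell_barcode"]
    umi_start, umi_stop = R1_LAYOUT["umi"]
    return {
        "cell_barcode": r1[cb_start:cb_stop],
        "umi": r1[umi_start:umi_stop],
        "feature": r2[start:stop],
    }


# Synthetic reads (not real data): 16 bp CB + 12 bp UMI, 10 bp common + 15 bp HTO.
r1 = "A" * 16 + "C" * 12
r2 = "G" * 10 + "T" * 15
parts = split_read_pair(r1, r2, "hashing")
print(parts)
```

The only thing that changes between modalities is the slice taken from read 2, which is why each modality gets its own SeqSpec file while the read 1 description stays the same.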


In [61]:
# The guide trick: this is what allows kallisto to extract the sequence.
# Use the region type `cdna` so the region is extracted automatically (use the term `cdna` even if the region is a guide or an HTO).

## This is an example of a SeqSpec file for an HTO design

In [62]:
%%writefile hto.yaml
!Assay
seqspec_version: 0.2.0
assay_id: MULTISEQ_10XV3_Hashing
name: MULTISEQ_10XV3_Hashing
doi: 'XXX'
date: 15 July 2022
description: Hashing using 10x5'v3
modalities:
- hashing
lib_struct: https://teichlab.github.io/scg_lib_structs/methods_html/10xChromium3.html
sequence_protocol: Not-specified
sequence_kit: Not-specified
library_protocol: 10xv3 Hashing
library_kit: Not-specified
sequence_spec:
- !Read
  read_id: hash_R1.fq
  name: Read 1
  modality: hashing
  primer_id: r1_primer
  min_len: 28
  max_len: 28
  strand: pos
- !Read
  read_id: hash_R2.fq
  name: Read 2
  modality: hashing
  primer_id: r2_primer
  min_len: 25
  max_len: 25
  strand: neg
library_spec:
- !Region
  parent_id: null
  region_id: hashing
  region_type: null
  name: null
  sequence_type: null
  sequence: null
  min_len: 53
  max_len: 53
  onlist: null
  regions:
  - !Region
    parent_id: hashing
    region_id: r1_primer
    region_type: r1_primer
    name: r1_primer
    sequence_type: fixed
    sequence: null
    min_len: 0
    max_len: 0
    onlist: null
    regions: null
  - !Region
    parent_id: hashing
    region_id: barcode
    region_type: barcode
    name: barcode
    sequence_type: onlist
    sequence: NNNNNNNNNNNNNNNN
    min_len: 16
    max_len: 16
    onlist: !Onlist
      location: remote
      filename: 737K-august-2016.txt
    regions: null
  - !Region
    parent_id: hashing
    region_id: umi
    region_type: umi
    name: umi
    sequence_type: random
    sequence: NNNNNNNNNNNN
    min_len: 12
    max_len: 12
    onlist: null
    regions: null
  - !Region
    parent_id: hashing
    region_id: cdna
    region_type: cdna
    name: cdna
    sequence_type: random
    sequence: NNNNNNNNNNNNNNN
    min_len: 15
    max_len: 15
    onlist: !Onlist
      location: remote
      filename: HTO_medatada.txt
    regions: null
  - !Region
    parent_id: hashing
    region_id: common
    region_type: common
    name: common
    sequence_type: fixed
    sequence: NNNNNNNNNN
    min_len: 10
    max_len: 10
    onlist: null
    regions: null
  - !Region
    parent_id: hashing
    region_id: r2_primer
    region_type: r2_primer
    name: r2_primer
    sequence_type: fixed
    sequence: null
    min_len: 0
    max_len: 0
    onlist: null
    regions: null






Writing hto.yaml
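One quick sanity check on a spec like this is that the child region lengths add up to the length declared on the parent region. The sketch below copies the lengths by hand from `hto.yaml` rather than parsing the file, and the helper name `summed_lengths` is hypothetical; whether seqspec itself enforces this rule is not stated here.

```python
# (region_id, min_len, max_len) copied by hand from hto.yaml.
CHILD_REGIONS = [
    ("r1_primer", 0, 0),
    ("barcode", 16, 16),
    ("umi", 12, 12),
    ("cdna", 15, 15),
    ("common", 10, 10),
    ("r2_primer", 0, 0),
]
PARENT_MIN_LEN = 53
PARENT_MAX_LEN = 53


def summed_lengths(regions):
    """Return the summed (min_len, max_len) over all child regions."""
    total_min = sum(min_len for _, min_len, _ in regions)
    total_max = sum(max_len for _, _, max_len in regions)
    return total_min, total_max


total_min, total_max = summed_lengths(CHILD_REGIONS)
# Fails loudly if the parent length and the children disagree.
assert (total_min, total_max) == (PARENT_MIN_LEN, PARENT_MAX_LEN)
print(total_min, total_max)
```

The same check applies to the guide and RNA specs once their regions are filled in.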


## Visualizing the HTO/MULTI SeqSpec

In [63]:
!seqspec print hto.yaml
!seqspec index -t kb -m hashing -i hash_R1.fq,hash_R2.fq hto.yaml

                                ┌─'r1_primer:0'
                                ├─'barcode:16'
                                ├─'umi:12'
─────────────── ──hashing───────┤
                                ├─'cdna:15'
                                ├─'common:10'
                                └─'r2_primer:0'
0,0,16:0,16,28:1,10,25
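The last output line is a kallisto-style custom technology string: three colon-separated fields (barcode, UMI, feature), each made of `file,start,stop` triplets. Here `0` is the first FASTQ (read 1), `1` is the second (read 2), and the feature sits at positions 10 to 25 of read 2, after the 10 bp common sequence. A minimal parser sketch follows; `Segment` and `parse_kb_technology` are hypothetical names, not part of seqspec or kb-python.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    file_index: int  # 0 = first FASTQ passed to -i, 1 = second
    start: int       # 0-based start within the read
    stop: int        # end position (exclusive)


def parse_field(field: str) -> list:
    """Parse one colon-separated field into a list of Segment triplets."""
    values = [int(v) for v in field.split(",")]
    if len(values) % 3 != 0:
        raise ValueError(f"expected file,start,stop triplets, got {field!r}")
    return [Segment(*values[i:i + 3]) for i in range(0, len(values), 3)]


def parse_kb_technology(spec: str) -> dict:
    """Split a 'barcode:umi:feature' string into named segment lists."""
    barcode, umi, feature = spec.split(":")
    return {
        "barcode": parse_field(barcode),
        "umi": parse_field(umi),
        "feature": parse_field(feature),
    }


layout = parse_kb_technology("0,0,16:0,16,28:1,10,25")
for name, segments in layout.items():
    print(name, segments)
```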


## Let's create the seqspecs for the other two modalities

### guide.yaml
- CB (16) and UMI (12) are standard to 10xv3
- 63bp construct
- 20bp guide

### rna.yaml (scRNA)
- CB (16) and UMI (12) are standard to 10xv3
- Whole read2 (92bp)
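Before filling in the two specs, it can help to write down the index string you expect `seqspec index -t kb` to report for each modality, following the same pattern as the HTO output `0,0,16:0,16,28:1,10,25`. The sketch below only does that arithmetic. The helper `kb_string` is hypothetical, it assumes a 16 bp cell barcode and 12 bp UMI on read 1, and the string seqspec actually prints depends on how the regions and read strands are declared in your file.

```python
def kb_string(cb_len: int, umi_len: int, feature_offset: int, feature_len: int) -> str:
    """Build a 'barcode:umi:feature' string for a two-read layout.

    Read 1 (file 0) carries the barcode then the UMI; read 2 (file 1)
    carries the feature starting at feature_offset.
    """
    barcode = f"0,0,{cb_len}"
    umi = f"0,{cb_len},{cb_len + umi_len}"
    feature = f"1,{feature_offset},{feature_offset + feature_len}"
    return ":".join([barcode, umi, feature])


expected = {
    "hashing": kb_string(16, 12, 10, 15),  # 15 bp HTO after 10 bp common
    "guide": kb_string(16, 12, 63, 20),    # 20 bp guide after 63 bp construct
    "rna": kb_string(16, 12, 0, 92),       # whole 92 bp read 2
}
for modality, spec in expected.items():
    print(modality, spec)
```

If the string printed for your own `guide.yaml` or `rna.yaml` differs from these, the region order or lengths in the spec are a good first place to look.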



## Use the following cells to create your seqspecs and test them

### Guide

In [64]:
%%writefile  guide.yaml






Writing guide.yaml


In [None]:
!seqspec print guide.yaml
!seqspec index -t kb -m guide -i guide_R1.fq,guide_R2.fq guide.yaml

### RNA

In [66]:
%%writefile  rna.yaml





Writing rna.yaml


In [None]:
!seqspec print rna.yaml
!seqspec index -t kb -m rna -i scRNA_R1.fq,scRNA_R2.fq rna.yaml