# CellRanger

Take a look at the CellRanger [website](https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/what-is-cell-ranger); it has great documentation on how these tools work and how to run them. Here, we will run CellRanger on the first batch of data from the Haber et al. paper.

We downloaded data from the first batch of cells, which, according to the Methods section, contained approximately 1500 cells. To be conservative, we will run ```cellranger count``` with the expected number of cells set to 3000.

Since this pipeline takes a long time to run, we will submit it as a job rather than running it interactively as we did with dropseqtools. Take a look at the data download [notebook](URL_here) for more information on how the FASTQ files were downloaded. The raw data is located in the class shared folder here: ```/oasis/tscc/scratch/cshl_2018/raw_data_haber/batch1/Atlas1/```



**Organize folders for processing**

I like to keep track of my projects by making a separate folder for each one, with the scripts that I used to generate my results (stored in my home) and a folder for the results (which can be quite large and must be stored in scratch). The code below assumes you have made a softlink to your scratch directory and that the softlink exists in your home (described in detail in notebook [1_Dropseqtools](add_URL_later)).

```bash
mkdir -p ~/projects/haber_batch1/scripts/
mkdir -p ~/scratch/projects/haber_batch1/cellranger_results/
```

**Write a processing script template**

To submit a job to the queue (rather than running commands interactively), you write a script and designate the job submission parameters with flags. I like to make a template in my home that I can copy whenever I need to make a new script, so that I only have to update the amount of compute resources I request. Make a file called ```fake_script.sh``` in your home directory and add all the lines below that begin with a ```#```. In the block below, ```i```, ```esc```, and ```:wq``` are vi keystrokes (enter insert mode, leave insert mode, save and quit), not lines of the file. We will go over what each of these flags means together, and you can read more about them [here](http://www.sdsc.edu/support/user_guides/tscc-quick-start.html).

```bash
cd ~
vi fake_script.sh
i
#!/bin/bash
#PBS -q home-yeo
#PBS -N jobname
#PBS -l nodes=1:ppn=1
#PBS -l walltime=1:00:00
#PBS -o outputfile
#PBS -e errorfile

#write_command_here

esc
:wq
```
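If you would rather not type the template in vi, the same file can be created non-interactively with a here-document. This is only a sketch of an alternative, not a required step; the file contents are the same lines shown above.

```bash
# Write the template to the home directory in one step.
# Quoting 'EOF' keeps the shell from expanding anything inside the block.
cat > ~/fake_script.sh << 'EOF'
#!/bin/bash
#PBS -q home-yeo
#PBS -N jobname
#PBS -l nodes=1:ppn=1
#PBS -l walltime=1:00:00
#PBS -o outputfile
#PBS -e errorfile

#write_command_here
EOF

# Print the file to confirm it looks right.
cat ~/fake_script.sh
```

Note that this overwrites any existing ```~/fake_script.sh```.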


**Copy template and add CellRanger command**

Copy your template file into the scripts folder that you created for the haber_batch1 data and give it a meaningful name:

```bash
cp ~/fake_script.sh ~/projects/haber_batch1/scripts/cellranger_count.sh
```

Edit that script to update the values for all the flags. I will use: 
```bash
#!/bin/bash
#PBS -q home-yeo
#PBS -N cellranger_count
#PBS -l nodes=1:ppn=4
#PBS -l walltime=16:00:00
#PBS -o cellranger_count.out
#PBS -e cellranger_count.err
```

**Read about the CellRanger Count Command**

Notice the syntax of my command below. The backslash at the end of each line is there for readability. Normally, starting a new line means you are starting a new command. However, the ```\``` tells the shell that whatever comes on the next line is a continuation of the command on the current line. Edit your cellranger_count.sh script to include this command below the ```#PBS``` submission parameters.

```bash
cellranger count \
--id Atlas1_batch1 \
--fastqs ~/cshl_2018/raw_data_haber/batch1/Atlas1/ \
--sample bamtofastq \
--transcriptome ~/software/cellranger-2.1.1/refdata-cellranger-mm10-1.2.0 \
--expect-cells 3000
```
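The line-continuation behavior described above can be checked with a harmless command before you trust it in a long job. This is a small illustration only; the variable name ```joined``` is made up for the example.

```bash
# One command written across two lines: the trailing backslash joins them.
joined=$(echo "first part" \
  "second part")

# Both strings were passed to the same echo command.
echo "$joined"
```

This prints ```first part second part``` on a single line. The backslash must be the very last character on the line; a space after it breaks the continuation.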


**Submit the job with qsub**

By default, cellranger count writes its results to the directory from which it is run. So first, I want to move into the scratch folder where the output results should go. For this to work properly when the job runs on the cluster, I will add the ```cd``` command to the script, above ```cellranger count```.

```bash
cd ~/scratch/projects/haber_batch1/cellranger_results/
```
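For reference, here is a sketch of what the finished script might look like once the pieces from this notebook are put together. It writes the file in one step with a here-document, uses the paths and flags given in this notebook, and sets ```--expect-cells``` to the conservative 3000 stated at the top. Adjust the queue and resources to your own allocation.

```bash
# Assemble the complete job script (overwrites any existing copy).
mkdir -p ~/projects/haber_batch1/scripts/
cat > ~/projects/haber_batch1/scripts/cellranger_count.sh << 'EOF'
#!/bin/bash
#PBS -q home-yeo
#PBS -N cellranger_count
#PBS -l nodes=1:ppn=4
#PBS -l walltime=16:00:00
#PBS -o cellranger_count.out
#PBS -e cellranger_count.err

# Move to scratch so the output folder is created there.
cd ~/scratch/projects/haber_batch1/cellranger_results/

# 3000 is the conservative estimate from the text (about twice the reported 1500 cells).
cellranger count \
--id Atlas1_batch1 \
--fastqs ~/cshl_2018/raw_data_haber/batch1/Atlas1/ \
--sample bamtofastq \
--transcriptome ~/software/cellranger-2.1.1/refdata-cellranger-mm10-1.2.0 \
--expect-cells 3000
EOF

# Check the script for shell syntax errors without running it.
bash -n ~/projects/haber_batch1/scripts/cellranger_count.sh
```

The ```bash -n``` check only catches shell syntax mistakes such as a missing backslash; it does not verify that the paths exist or that the cellranger flags are valid.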

When you are happy with your script, you submit the file to the queue with:

```bash
qsub cellranger_count.sh
```

Once it has been accepted, you can check on the status with the command below (replace ```ucsd-trainXY``` with your own username):

```bash
qstat -u ucsd-trainXY
```

If you realize you made a mistake, you can delete your job with:

```bash
qdel JOBID##
```
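To avoid copying the job ID by hand, you can capture it when you submit. The sketch below assumes that ```qsub``` prints the job identifier on standard output when a job is accepted, as PBS/Torque schedulers do; ```submit_job``` is a hypothetical helper name, not part of the scheduler.

```bash
# Hypothetical helper: submit a script and print the job ID that qsub reports.
submit_job() {
    local script="$1"
    local job_id
    # Stop here if qsub rejects the job.
    job_id=$(qsub "$script") || return 1
    echo "$job_id"
}

# Usage on the cluster (shown as comments so nothing is submitted here):
#   JOBID=$(submit_job cellranger_count.sh)
#   qstat "$JOBID"    # check the status of just this job
#   qdel "$JOBID"     # cancel it if you spot a mistake
```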

For more practice with job submissions, take a look at the [TSCC_job_submission](URL_later) notebook in the tutorials folder.  