# Installs
**1) Create anaconda environment with python3**

    conda create --yes -n cmm262-py3 python=3 scipy numpy jupyter scikit-learn matplotlib seaborn pandas statsmodels ipykernel
    
This will install a bunch of packages into its own isolated environment (without interfering with anything you have installed so far). 

To activate this environment, type:

    source activate cmm262-py3
    
Notice the name of this environment is what we provided in the -n flag above. 

Now your command line prompt should begin with (cmm262-py3) before your TSCC information. 

**2) Install phenograph in your environment**

We have to use pip to install [phenograph](https://github.com/jacoblevine/PhenoGraph). 

    pip install git+https://github.com/jacoblevine/phenograph.git
    
**3) Add this environment as a kernel in jupyter**

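To access all these packages in jupyter, we need to make a new kernel from this environment. The `--name` flag gives the kernel its own identifier so it does not overwrite the default `python3` kernel.

    python -m ipykernel install --user --name cmm262-py3 --display-name "Python 3 (cmm262)"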
    
**4) Check if it worked!**

Open a new notebook and choose the "Python 3 (cmm262)" kernel.
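
One way to confirm the kernel really runs from the new environment is to run a cell like the one below in that notebook. This is only a minimal sketch and not part of the original instructions: `check_packages` is a hypothetical helper, and the import names (for example `sklearn` for scikit-learn and `phenograph` for PhenoGraph) are assumed to match the packages installed in steps 1 and 2.

```python
import importlib.util
import sys


def check_packages(names):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Import names assumed for the packages installed in steps 1 and 2.
PACKAGES = ["numpy", "scipy", "pandas", "matplotlib", "seaborn",
            "sklearn", "statsmodels", "phenograph"]

# The interpreter path should point inside the cmm262-py3 environment.
print("Python executable:", sys.executable)

for name, found in check_packages(PACKAGES).items():
    print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

If any package is reported as MISSING, the notebook is probably using a different kernel, or the install step for that package did not finish.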

# Organizing folders

**1) Make new folder for our analyses**

I recommend making a folder in your home directory to hold all the results from these analyses. We will also softlink the data folder into that directory for easy access to the data. Let's make a new directory in your projects folder. 

    cd ~/projects
    
    mkdir single_cell_intestine
    
Move into that directory and make a folder for raw data and softlink the csv files as shown below. 

    cd single_cell_intestine
    
    mkdir raw_data
    
    cd raw_data
    
    ln -s /oasis/tscc/scratch/biom200/cmm262/single_cell/raw_data/*.csv ./
    
We are also going to make a folder for the results. Move up one directory so you are back in the single_cell_intestine folder and make another folder in there called results. 

    cd ~/projects/single_cell_intestine
    
    mkdir results
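
To double-check that the softlinks point at real files, something like the following can be run from a notebook or a Python prompt. It is a sketch under assumptions: `find_csv_links` is a hypothetical helper, and the path assumes the `~/projects/single_cell_intestine/raw_data` layout created above.

```python
from pathlib import Path


def find_csv_links(raw_data_dir):
    """Split the .csv entries in raw_data_dir into (readable, broken) lists of names."""
    raw_data_dir = Path(raw_data_dir)
    readable, broken = [], []
    if not raw_data_dir.is_dir():
        return readable, broken
    for path in sorted(raw_data_dir.iterdir()):
        if path.suffix != ".csv":
            continue
        # exists() follows symlinks, so it is False for a link whose target is missing.
        (readable if path.exists() else broken).append(path.name)
    return readable, broken


# Assumed location, matching the folders created above.
raw_data = Path.home() / "projects" / "single_cell_intestine" / "raw_data"
readable, broken = find_csv_links(raw_data)
print(f"{len(readable)} readable csv files, {len(broken)} broken links in {raw_data}")
```

A non-zero count of broken links usually means the `ln -s` command was run with a mistyped source path.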
 

# Seurat - pre-cooked analysis tools

The Seurat [website](http://satijalab.org/seurat/) has a lot of great tutorials explaining how to use its tools.

The majority of our analyses in this class will not use Seurat because it has a lot of "pre-cooked" analyses, and our goal is to delve deeper into each step to understand what is happening behind the scenes. However, it is a very useful tool for getting an easy first-pass look at your data. If you have sequenced with 10X, it can easily read the output from the CellRanger pipeline. To install and use Seurat in an R notebook through jupyter, follow the commands below. We will be using this on Thursday, so please find time to perform these installations before class.

Log in to TSCC and run:

    /projects/ps-yeolab/software/yeolabmodules/install_yeolabmodules
    
    exit
    
You should now be logged out of TSCC. 

Log back in!

    ssh ucsd-train##@tscc-login.sdsc.edu
    
Once you are logged in again, we are going to load the seurat module. 

    pathreset
    
    module load seurat
    
Now we are going to make a jupyter kernel from this environment so we can use Seurat in jupyter. 

    R
    
    IRkernel::installspec(displayname="seurat")
    
Quit R and log out of TSCC:

    quit()
    
    exit 
    
You can then log back in to load jupyter, but it is important to log out first so that you leave the seurat module you have loaded. 

# Jupyter notebooks on an interactive node

Some of the functions in Seurat are pretty memory intensive, so we need to load our notebooks on an interactive node! There are a couple additional steps to loading your notebooks. Follow them as shown below:

1) Get an interactive node:

    qsub -I -l nodes=1:ppn=2 -l walltime=4:00:00 -q hotel
    
2) Start jupyter as you normally would

    jupyter notebook --no-browser --port #### &
    
3) Wait until your login token appears on the screen, then press Enter to get your command prompt back. Next, set up a tunnel FROM TSCC by entering the following command. Replace #### with your port number and # with the number of the TSCC login node you are logged in on.

    ssh -NR ####:localhost:#### tscc-login# &
    
4) For Mac users: Move to your local computer (in another terminal tab that is not logged in to TSCC) and complete the tunnel. Replace #### with your port number, ## with your train account number, and # with your login node.

    ssh -NL ####:localhost:#### ucsd-train##@tscc-login#.sdsc.edu
    
For Windows users: Go to your PuTTY session, load the session you used to log in previously, and rename it to something like TSCC_seurat.
Move to 'Connection->SSH->Tunnels'. Source port is: ####, Destination is: localhost:####. Then add the tunnel (these are the same steps you used when you first created a jupyter notebook).

5) Open jupyter in a web browser. Copy the login token URL from TSCC and paste it into a web browser to open jupyter. Check that your seurat kernel is present by opening a new notebook and choosing the seurat kernel. 

6) Shutting down the session. After 4 hours (or the length specified on your interactive job), everything will automatically shut down. To close before the 4 hours are up, type exit in the terminal where you started the notebooks. This will exit the interactive job and take you back to the login node.
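
Because the same port number has to appear in three different commands, it can help to generate them together. The helper below is a hypothetical sketch, not part of the course tooling: `tunnel_commands` and the example values (port 8888, account "42", login node 1) are assumptions for illustration only.

```python
def tunnel_commands(port, account, login_node):
    """Build the commands from steps 2 to 4 for one port, train account, and login node.

    account is passed as a string so it is inserted exactly as written.
    """
    return {
        "jupyter": f"jupyter notebook --no-browser --port {port} &",
        "from_tscc": f"ssh -NR {port}:localhost:{port} tscc-login{login_node} &",
        "from_laptop": (
            f"ssh -NL {port}:localhost:{port} "
            f"ucsd-train{account}@tscc-login{login_node}.sdsc.edu"
        ),
    }


# Example values only; substitute your own port, account number, and login node.
for step, command in tunnel_commands(port=8888, account="42", login_node=1).items():
    print(f"{step}: {command}")
```

Run the `jupyter` and `from_tscc` commands on the interactive node, and the `from_laptop` command on your local computer.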