# Coding with Kallisto

This notebook explains the Kallisto commands and shows how to use the tool from a programming language, primarily Python.

## Commands

There are several commands that can be used. The full manual can be found here: https://pachterlab.github.io/kallisto/manual

**kallisto**
produces a list of usage options, where the options are:

index

quant

quant-tcc

bus

h5dump

inspect

version

cite

**kallisto index** builds an index from a FASTA-formatted file of target sequences.

**kallisto quant** runs the quantification algorithm.

**kallisto quant-tcc** runs the EM algorithm to produce estimated counts from a transcript-compatibility-counts matrix file (which is in a MatrixMarket format where each column is an equivalence class and each row is a sample). 

**kallisto bus** works with raw FASTQ files for single-cell RNA-Seq datasets.

**kallisto h5dump** converts HDF5-formatted results to plaintext.

**kallisto inspect** can output the target de Bruijn graph in the index in two ways: as a file in GFA format, or as a mapping of the graph's contigs and equivalence classes in BED format, which can be visualized using IGV.

**kallisto version** displays the current version of the software.

**kallisto cite** displays the citation for the paper.
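
To make the command descriptions above concrete, here is a minimal sketch that assembles the argument lists for `kallisto index` and `kallisto quant` without running anything. The helper functions and the file names (`transcripts.idx`, `reads_1.fastq`, and so on) are hypothetical and only for illustration. The flags `-i`, `-o`, `-b`, and `-t` are the ones listed in the kallisto manual; check them against your installed version before relying on them.

```python
from typing import List, Sequence


def build_index_command(index_path: str, fasta_path: str) -> List[str]:
    """Return the argument list for `kallisto index` (hypothetical helper)."""
    return ["kallisto", "index", "-i", index_path, fasta_path]


def build_quant_command(
    index_path: str,
    output_dir: str,
    fastq_files: Sequence[str],
    bootstrap_samples: int = 0,
    threads: int = 1,
) -> List[str]:
    """Return the argument list for `kallisto quant` (hypothetical helper)."""
    if not fastq_files:
        raise ValueError("at least one FASTQ file is required")
    cmd = ["kallisto", "quant", "-i", index_path, "-o", output_dir]
    if bootstrap_samples > 0:
        # -b sets the number of bootstrap samples
        cmd += ["-b", str(bootstrap_samples)]
    if threads > 1:
        # -t sets the number of threads
        cmd += ["-t", str(threads)]
    # The FASTQ files always come last on the command line
    return cmd + list(fastq_files)


# Placeholder file names; nothing is executed here, the commands are only printed
print(" ".join(build_index_command("transcripts.idx", "transcripts.fasta")))
print(" ".join(build_quant_command(
    "transcripts.idx", "output", ["reads_1.fastq", "reads_2.fastq"],
    bootstrap_samples=100,
)))
```

Building the command as a list rather than a single string means it can later be passed directly to Python's `subprocess` module without any shell quoting.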

## Python

Now we will go step by step through running Kallisto from Python, specifically from a .ipynb Jupyter Notebook opened in Visual Studio Code. The steps assume a Windows machine with WSL (Ubuntu).

#### 1. Open the PowerShell/Windows Command Prompt

#### 2. In the PowerShell window, type 'wsl'
This will bring you into the WSL environment, where you have Ubuntu installed.

#### 3. Install Python, pip, and Jupyter
In the command prompt, type:

sudo apt update

sudo apt install python3 python3-pip

pip3 install jupyter

*Note: this is a one-time step, so if you have already done this in WSL, you can move on to the next step.

#### 4. Navigate to your notebook's directory
In WSL, the Windows file system is mounted under '/mnt/'. To navigate to your notebook's location, type 'cd /mnt/(file path of where notebook is located)'

Example: if your notebook is located in the folder C:\Users\OneDrive\Test_Folder, the command you would type is 'cd /mnt/c/Users/OneDrive/Test_Folder'.
*Make sure to use '/' and **not** '\' in the command prompt. Also, the path should point to the folder that contains your notebook, **not** to the notebook file itself.
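
The path translation described above can be sketched in Python. The function name `windows_to_wsl_path` is hypothetical, and the sketch assumes the default WSL layout in which each Windows drive is mounted under '/mnt/' using its lowercase drive letter.

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl_path(windows_path: str) -> str:
    """Convert a Windows folder path to its WSL equivalent (hypothetical helper).

    Assumes drives are mounted at /mnt/<lowercase drive letter>.
    """
    path = PureWindowsPath(windows_path)
    drive = path.drive.rstrip(":").lower()
    if not drive:
        raise ValueError(f"no drive letter found in {windows_path!r}")
    # path.parts[0] is the drive root (e.g. 'C:\\'); the rest are folder names
    return str(PurePosixPath("/mnt", drive, *path.parts[1:]))


# The example folder from the text above
print("cd " + windows_to_wsl_path(r"C:\Users\OneDrive\Test_Folder"))
```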

#### 5. Start the Jupyter Notebook Server
In the command prompt, type:

jupyter notebook --no-browser --allow-root

You will see some messages, then a URL that will look like:

http://localhost:8888/?token=YOUR_UNIQUE_TOKEN

**Keep this terminal open**
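
The URL has three parts that matter later: the host, the port, and the token that authorizes the connection. As a rough sketch, they can be pulled apart with Python's standard library. The function name `describe_server_url` is hypothetical, and the token below is the placeholder from the text, not a real one.

```python
from typing import Dict, Optional, Union
from urllib.parse import parse_qs, urlparse


def describe_server_url(url: str) -> Dict[str, Union[str, int, None]]:
    """Split a Jupyter server URL into host, port and token (hypothetical helper)."""
    parsed = urlparse(url)
    # parse_qs returns a list of values for each query parameter
    token_values = parse_qs(parsed.query).get("token", [])
    token: Optional[str] = token_values[0] if token_values else None
    return {"host": parsed.hostname, "port": parsed.port, "token": token}


info = describe_server_url("http://localhost:8888/?token=YOUR_UNIQUE_TOKEN")
print(info)
```

If the token is missing from the URL you copy, the server will not accept the connection without further authentication, so make sure to copy the whole line.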

#### 6. Open Visual Studio Code
Make sure you have the Jupyter extension installed. If not, install it from the Extensions view: click the blocks icon in the left panel (or press 'Ctrl+Shift+X') and search for 'Jupyter'.

#### 7. Open the desired Jupyter Notebook in Visual Studio Code

#### 8. Select Kernel
At the top right of the window, you will see a server logo that says 'Select Kernel'. Click on that.

#### 9. Click on 'Select Another Kernel'

#### 10. Select 'Existing Jupyter Server'

#### 11. Select 'Enter the URL of the running Jupyter Server'

#### 12. Enter the URL
Copy the URL from the command prompt in step 5. Paste it into the input box in Visual Studio Code and press Enter.
*Note: Visual Studio Code may ask you to change the display name; simply press Enter again to continue.

#### 13. Select 'Python 3 (ipykernel)'

#### 14. Verify Kallisto runs
In your Jupyter Notebook, run '!kallisto version' in a code cell to see if Kallisto is accessible. If you see the version number then you were successful!


In [1]:
!kallisto version

kallisto, version 0.46.1


To use any of the kallisto commands, simply put '!' before it. The '!' tells the Jupyter environment to execute the command in the system shell rather than trying to interpret it as Python code. It is only for shell commands; ordinary Python code in a cell does not need it.
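
Because '!' hands the command to the system shell, the kallisto executable has to be on the PATH of the environment the Jupyter server was started from. A small sketch of how to check this from Python, using the standard library's `shutil.which`; the helper name `find_tool` is hypothetical.

```python
import shutil
from typing import Optional


def find_tool(name: str) -> Optional[str]:
    """Return the full path of an executable on PATH, or None if it is not found."""
    return shutil.which(name)


path = find_tool("kallisto")
if path is None:
    print("kallisto was not found on PATH")
else:
    print(f"kallisto found at {path}")
```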

If you want to capture the output of the command within Python for further processing, you can use the 'subprocess' module. Here is some example code:

In [2]:
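import subprocess

# Build the command as a list of arguments, so no shell quoting is needed
cmd = ["kallisto", "index", "-i", "transcripts.idx", "transcripts.fasta"]

# capture_output=True collects stdout and stderr; text=True decodes them to str
result = subprocess.run(cmd, capture_output=True, text=True)

# Print the output
print(result.stdout)

# Print anything written to stderr (error messages, if any)
if result.stderr:
    print("Errors:")
    print(result.stderr)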


kallisto 0.46.1
Builds a kallisto index

Usage: kallisto index [arguments] FASTA-files

Required argument:
-i, --index=STRING          Filename for the kallisto index to be constructed 

Optional argument:
-k, --kmer-size=INT         k-mer (odd) length (default: 31, max value: 31)
    --make-unique           Replace repeated target names with unique names


Errors:

Error: FASTA file not found transcripts.fasta
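
Text on stderr is not always a sign of failure, since many command-line tools also write progress messages there. A more robust sketch checks the exit code instead. This assumes that kallisto, like most command-line tools, returns a non-zero exit code when it fails. The helper name `run_command` is hypothetical, and the demonstration runs the Python interpreter itself so that it works without kallisto installed.

```python
import subprocess
import sys
from typing import List, Tuple


def run_command(cmd: List[str]) -> Tuple[bool, str, str]:
    """Run a command and return (succeeded, stdout, stderr) (hypothetical helper)."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError:
        # Raised when the executable itself cannot be found
        return False, "", f"executable not found: {cmd[0]}"
    # By convention, an exit code of 0 means success
    return result.returncode == 0, result.stdout, result.stderr


# Stand-in command; replace the list with a kallisto command to use it for real
ok, out, err = run_command([sys.executable, "-c", "print('hello')"])
print("succeeded:", ok)
print("output:", out.strip())
```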



## R

Kallisto is not available as an R package. However, there is a closely related package called 'sleuth' that is designed to work with Kallisto output. 'sleuth' is used for differential expression analysis of the transcript quantifications produced by Kallisto. If you are working within the R ecosystem, 'sleuth' allows you to read in Kallisto's output and perform further statistical analyses.

Now we will install sleuth in R, step by step.

#### 1. Install R and RStudio

#### 2. Install Bioconductor Packages
In an R command window, type:

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install()

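#### 3. Install sleuth
sleuth is hosted on GitHub (pachterlab/sleuth) rather than in the Bioconductor repository, so it is installed by its GitHub name. In the same R command window, type:

BiocManager::install("devtools")

BiocManager::install("pachterlab/sleuth")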

#### 4. Load sleuth
In your R session, type:

library(sleuth)