# Running PANDA and LIONESS using the Graphics Processing Unit (GPU)

### Author: 
Daniel Morgan*.

*Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA.

## Introduction
PANDA [1] estimates gene regulatory networks using information from Transcription Factor (TF) Protein-Protein Interaction (PPI), gene coexpression, and TF DNA binding motifs. At its core, PANDA computes similarities between the three sources of information to infer the weight of regulatory edges, using matrix multiplication. To accelerate network inference, Graphics Processing Units (GPUs) can be used to perform the matrix multiplications efficiently. The complete case study of this tool can be found here: https://netzoo.github.io/zooanimals/gpuzoo/
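To make the idea of offloading matrix multiplication concrete, here is a minimal sketch that multiplies two small matrices with CuPy when a GPU is visible and falls back to NumPy otherwise. It is not PANDA's actual similarity computation; the helper name `matmul_on_best_device` and the toy matrices are assumptions made for illustration.

```python
import numpy as np

# CuPy is optional here: fall back to NumPy when it is missing or no GPU is visible.
try:
    import cupy as cp
    has_gpu = cp.cuda.runtime.getDeviceCount() > 0
except Exception:
    cp, has_gpu = None, False

xp = cp if has_gpu else np  # array module used for the computation


def matmul_on_best_device(a, b):
    """Multiply two matrices on the GPU if available, otherwise on the CPU."""
    a_dev = xp.asarray(a, dtype=xp.float32)
    b_dev = xp.asarray(b, dtype=xp.float32)
    product = a_dev @ b_dev
    # Copy the result back to host memory when it was computed on the GPU.
    return cp.asnumpy(product) if has_gpu else product


# Toy inputs (hypothetical values, not real network data).
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[0.0, 1.0], [1.0, 0.0]])
result = matmul_on_best_device(a, b)
print(result)
```

Because CuPy mirrors much of the NumPy array interface, the same function body runs on either device; only the array module and the final copy back to the host differ.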

## 1. Setup Environment: download, install & import latest packages

Please make sure to install the following packages in your Python environment, then import them as follows:

In [None]:
import os
import s3fs
import pandas as pd
import numpy as np
import psutil

Also make sure that nvcc, the NVIDIA CUDA compiler, is installed. You can verify the version as follows:

In [None]:
!nvcc --version

### 1.1 Configure CuPy for the installed CUDA version
gpuPANDA uses CUDA through the `CuPy` library.

1. Choose the CuPy package that matches the CUDA release reported above (*e.g.*, release 10.1 corresponds to `cuda101`).
2. Install CuPy version 7.4.0 or above.
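The mapping from the CUDA release to the package name can be sketched in a few lines. This is only an illustration of the naming pattern used in this tutorial (`cupy-cuda101` for release 10.1); the helper names and the sample `nvcc` output string are assumptions, and more recent CuPy releases name their packages differently, so check the CuPy installation guide for your CUDA version.

```python
import re


def cuda_release_from_nvcc(nvcc_output):
    """Extract the (major, minor) CUDA release from `nvcc --version` output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no CUDA release found in nvcc output")
    return match.group(1), match.group(2)


def cupy_package_for(nvcc_output):
    """Build the per-release package name, e.g. release 10.1 gives cupy-cuda101."""
    major, minor = cuda_release_from_nvcc(nvcc_output)
    return f"cupy-cuda{major}{minor}"


# Sample line in the style printed by nvcc (hypothetical example).
sample = "Cuda compilation tools, release 10.1, V10.1.243"
package = cupy_package_for(sample)
print(package)
```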

First, we need to install CuPy.

In [None]:
!pip install cupy-cuda101==7.4.0

Then, we need to import it.

In [None]:
import cupy as cp
# print(cp.__version__)
!pip freeze | grep cupy

### 1.2 GPU and CPU info

Next, we need to make sure that the computer detects the GPU card and that its drivers are installed. This can be done as follows:

In [None]:
!nvidia-smi

In [None]:
!cat /proc/cpuinfo
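A similar check can be scripted in Python, which is convenient when the notebook runs on machines that may or may not have a GPU. The sketch below only tests whether the command-line tools are on the `PATH` and counts the CPUs; the function name `gpu_environment_report` is an assumption for illustration, and finding `nvidia-smi` does not by itself prove that a working GPU is attached.

```python
import os
import shutil


def gpu_environment_report():
    """Collect basic facts about the tools this tutorial relies on."""
    return {
        "cpu_count": os.cpu_count(),
        "nvcc_on_path": shutil.which("nvcc") is not None,
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
    }


report = gpu_environment_report()
for name, value in report.items():
    print(f"{name}: {value}")
```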

### 1.3 Clone netZoo
Finally, we need to install gpuPANDA through the netZooPy package. First, we clone the package from GitHub and install it in editable mode.

In [None]:
!git clone --single-branch --branch devel https://github.com/netZoo/netZooPy.git
os.chdir('netZooPy')
!pip install -e .
os.chdir('..')

Then, we import netZooPy.

In [None]:
import netZooPy
from netZooPy.panda import Panda
from netZooPy.lioness import Lioness

## 2. Load test data

To test gpuPANDA, we first get the input data from the [GRAND database](https://grand.networkmedicine.org/download/). This can be done directly with the `pandas` package.

In [None]:
# Read the tab-separated files; header=None keeps every line as data.
LCL_ppi = pd.read_csv('https://granddb.s3.amazonaws.com/cells/ppi/LCL_ppi.txt', sep='\t', header=None)
LCL_expression = pd.read_csv('https://granddb.s3.amazonaws.com/optPANDA/expression/Hugo_exp1_lcl.txt', sep='\t', header=None)
LCL_motif = pd.read_csv('https://granddb.s3.amazonaws.com/gpuPANDA/Hugo_motifCellLine_reduced.txt', sep='\t', header=None)
# Write local copies; header=False avoids adding a row of column indices (0, 1, 2, ...).
LCL_ppi.to_csv('LCL_ppi.txt', sep='\t', index=False, header=False)
LCL_expression.to_csv('Hugo_exp1_lcl.txt', sep='\t', index=False, header=False)
LCL_motif.to_csv('Hugo_motifCellLine_reduced.txt', sep='\t', index=False, header=False)

Then, we specify the paths to the files we downloaded from the database.

In [None]:
expression_data='Hugo_exp1_lcl.txt'
motif_data='Hugo_motifCellLine_reduced.txt'
ppi_data='LCL_ppi.txt'
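Before launching a long run, it can help to look at the first rows of each input file. The sketch below is a small standard-library helper for that purpose; `peek_table` is a hypothetical name, and the demo file with three columns (source, target, weight) reflects our assumption about the motif and PPI layout rather than something stated in this tutorial.

```python
import csv
import os
import tempfile


def peek_table(path, n_rows=3):
    """Return the first rows of a tab-separated file as lists of strings."""
    if not os.path.exists(path):
        raise FileNotFoundError(f"input file not found: {path}")
    rows = []
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            rows.append(row)
            if len(rows) == n_rows:
                break
    return rows


# Demo on a tiny temporary file (made-up content, assumed three-column layout).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_motif.txt")
    with open(demo_path, "w") as handle:
        handle.write("TF1\tGENE1\t1.0\nTF1\tGENE2\t0.0\n")
    demo_rows = peek_table(demo_path)

print(demo_rows)
```

In the notebook, calling `peek_table(motif_data)`, `peek_table(ppi_data)`, and `peek_table(expression_data)` gives a quick view of what will be passed to PANDA.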

## 3. Run Panda with GPU & precision flags

Finally, we run PANDA by creating a `Panda` object. There are a few important parameters to consider. First, we need to set `computing` to `gpu` to run gpuPANDA.
Second, the `precision` flag allows the network to be computed in single precision (about 7 decimal digits) or double precision (about 15 decimal digits). Although double precision networks are more accurate, single precision uses less device memory and gives faster run times. The right tradeoff depends on the application.

In [None]:
panda_obj = Panda(
    expression_data,
    motif_data,
    ppi_data,
    computing='gpu',
    precision='single',
    save_tmp=False,
    save_memory=False,
    remove_missing=True,
    keep_expression_matrix=True,
    modeProcess='intersection',
)
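To put the precision tradeoff in numbers, the sketch below estimates the decimal digits carried by each floating-point format and the memory needed for one dense TF-by-gene matrix. The matrix dimensions are hypothetical and chosen only for illustration; they are not the dimensions of the LCL data set.

```python
import math
import struct

# Bytes per value for C float and double (4 and 8 on standard platforms).
BYTES = {"single": struct.calcsize("f"), "double": struct.calcsize("d")}
# IEEE 754 significand widths, including the implicit leading bit.
SIGNIFICAND_BITS = {"single": 24, "double": 53}


def decimal_digits(precision):
    """Approximate number of decimal digits representable in the format."""
    return SIGNIFICAND_BITS[precision] * math.log10(2)


def matrix_megabytes(n_rows, n_cols, precision):
    """Memory in megabytes for one dense n_rows x n_cols matrix."""
    return n_rows * n_cols * BYTES[precision] / 1e6


# Hypothetical network size, chosen only for illustration.
n_tfs, n_genes = 1000, 20000
single_mb = matrix_megabytes(n_tfs, n_genes, "single")
double_mb = matrix_megabytes(n_tfs, n_genes, "double")
print(f"single: {decimal_digits('single'):.1f} digits, {single_mb:.0f} MB per matrix")
print(f"double: {decimal_digits('double'):.1f} digits, {double_mb:.0f} MB per matrix")
```

Since several matrices of this size are held in memory at once, halving the size of each value can decide whether a network fits on a given GPU.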

In [None]:
panda_obj.save_panda_results('single_gpu_panda.txt')

## 4. Run LIONESS with GPU

LIONESS [2] calls PANDA to estimate a regulatory network for each sample. We can use GPU acceleration to estimate these sample-specific networks by setting the `computing` flag to `gpu`.

In [None]:
lioness_obj = Lioness(panda_obj, computing='gpu', start=1, end=5, save_dir='lioness_output', save_fmt='txt')

In [None]:
lioness_obj.panda_network.shape
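For intuition about how a sample-specific network is obtained, the sketch below applies the LIONESS equation as we understand it from [2]: the estimate for sample q combines the aggregate model built on all N samples with the one built without q. A plain mean stands in for PANDA as the aggregate model here, purely for illustration, and the toy values are made up.

```python
import math


def lioness_estimate(aggregate_all, aggregate_without_q, n_samples):
    """Sample-specific estimate: N * (e_all - e_without_q) + e_without_q."""
    return n_samples * (aggregate_all - aggregate_without_q) + aggregate_without_q


def mean(values):
    return sum(values) / len(values)


# Toy 'edge weights', one per sample; the aggregate model is a plain mean,
# standing in for PANDA purely for illustration.
samples = [1.0, 2.0, 3.0, 6.0]
n = len(samples)
estimates = []
for q in range(n):
    without_q = samples[:q] + samples[q + 1:]
    estimates.append(lioness_estimate(mean(samples), mean(without_q), n))

print(estimates)
```

With a linear aggregate such as the mean, the equation recovers each sample's own value up to floating-point rounding, which is a useful sanity check. For a nonlinear model such as PANDA, the result is an estimate of each sample's contribution to the aggregate network.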

### References
[1] Glass, Kimberly, et al. "Passing messages between biological networks to refine predicted interactions." PLoS ONE 8.5 (2013).

[2] Kuijjer, Marieke Lydia, et al. "Estimating sample-specific regulatory networks." iScience 14 (2019): 226-240.