# Cell type-wise differential expression analysis

### Introduction

This notebook demonstrates how to perform differential expression analysis for each cell type using a pseudobulk approach.<br>
We use our PMCA Jak2 dataset as the example.

### Preparing input data

Differential expression analysis takes raw expression counts as input. In this tutorial, we use the raw count data from our PMCA Jak2 samples, stored in the "data" directory ('PMCA_Jak2_raw_count.h5ad').<br>
The structure of the test data is shown below.

In [1]:
import scanpy as sc
import warnings
warnings.filterwarnings("ignore")

In [2]:
adata = sc.read('data/PMCA_Jak2_raw_count.h5ad')

In [3]:
adata

AnnData object with n_obs × n_vars = 40747 × 27998
    obs: 'library', 'dataset', 'Condition', 'n_genes', 'n_counts', 'percent_mito', 'celltype'
    var: 'gene_ids', 'feature_types', 'genome'
    uns: 'celltype_colors'
    obsm: 'X_umap'
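
Since the analysis expects raw counts, it can be worth confirming that the matrix really holds non-negative integers before going further. The helper below is a minimal sketch and not part of the original workflow: the name `looks_like_raw_counts` is hypothetical, and it assumes the matrix is either a dense NumPy array or a SciPy sparse matrix (detected here by its `nnz` attribute).

```python
import numpy as np


def looks_like_raw_counts(X):
    """Return True if every stored value is a non-negative whole number.

    Hypothetical helper: accepts a dense array or a SciPy sparse matrix.
    """
    # Sparse matrices expose their stored (non-zero) values via `.data`;
    # the `nnz` attribute is used to tell them apart from dense arrays.
    values = X.data if hasattr(X, "nnz") else X
    values = np.asarray(values)
    return bool(np.all(values >= 0) and np.all(values == np.round(values)))


# Toy matrices standing in for adata.X
raw_like = np.array([[0, 3, 1], [2, 0, 7]])
normalized_like = np.array([[0.0, 1.58, 0.69], [1.1, 0.0, 2.08]])

raw_ok = looks_like_raw_counts(raw_like)
normalized_ok = looks_like_raw_counts(normalized_like)
```

With the object loaded above, the call would be `looks_like_raw_counts(adata.X)`; a `False` result suggests the data have already been normalized or log-transformed.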

### Running differential expression analysis

This example script writes the results of the cell type-wise differential expression analysis to txt files, which are saved in the "data" directory.
To analyze your own data, edit the "params" section of the R script.
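
The contents of the R script are not shown in this notebook. As a rough illustration of what pseudobulk aggregation generally means, the sketch below sums raw counts over all cells that share the same cell type and sample. It is written under assumptions and is not the script's actual implementation: the function name `pseudobulk` is hypothetical, `library` is assumed to identify the sample, and `celltype` is taken from the `obs` columns listed above.

```python
import numpy as np
import pandas as pd


def pseudobulk(counts, obs, sample_key="library", celltype_key="celltype"):
    """Sum raw counts over cells sharing the same (cell type, sample) pair.

    Hypothetical helper: `counts` is a dense cells x genes array and
    `obs` is a DataFrame with one row per cell.
    """
    df = pd.DataFrame(np.asarray(counts), index=obs.index)
    # Group rows (cells) by cell type and sample, then add up their counts
    groups = [obs[celltype_key].to_numpy(), obs[sample_key].to_numpy()]
    out = df.groupby(groups).sum()
    out.index.names = [celltype_key, sample_key]
    return out


# Toy data: 4 cells x 3 genes (assumed values, for illustration only)
counts = np.array([
    [1, 0, 2],
    [3, 1, 0],
    [0, 5, 1],
    [2, 2, 2],
])
obs = pd.DataFrame(
    {
        "library": ["s1", "s1", "s2", "s2"],
        "celltype": ["HSC", "HSC", "HSC", "Ery"],
    },
    index=["cell1", "cell2", "cell3", "cell4"],
)

pb = pseudobulk(counts, obs)
```

Each row of `pb` is one pseudobulk profile, which a bulk RNA-seq method can then compare between conditions within a cell type. The sketch uses a dense matrix for readability, so it is suited to small examples rather than the full dataset.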

In [4]:
!R --vanilla --slave < scripts/pseudobulk_DE.R

Attaching SeuratObject
Registered S3 method overwritten by 'SeuratDisk':
  method            from  
  as.sparse.H5Group Seurat
Loading required package: SummarizedExperiment
Loading required package: MatrixGenerics
Loading required package: matrixStats

Attaching package: ‘MatrixGenerics’

The following objects are masked from ‘package:matrixStats’:

    colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
    colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
    colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
    colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
    colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
    colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
    colWeightedMeans, colWeightedMedians, colWeightedSds,
    colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
    rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
    rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs,