# Import module

ImageAnalysis3 is available from [ImageAnalysis3](https://github.com/zhengpuas47/ImageAnalysis3), or from the Zhuang lab's archived [source_tools](https://github.com/ZhuangLab/Chromatin_Analysis_2020_cell/tree/master/sequential_tracing/source).

## ImageAnalysis3 and basic modules

In [1]:
%run "C:\Users\shiwei\Documents\ImageAnalysis3\required_files\Startup_py3.py"
sys.path.append(r"C:\Users\shiwei\Documents")

import ImageAnalysis3 as ia
from ImageAnalysis3 import *
from ImageAnalysis3.classes import _allowed_kwds

import h5py
import ast
import pandas as pd

print(os.getpid())

41276


## Chromatin_analysis_tools etc

See the **functions** in the [AnalysisTool_Chromatin](../../README.md) repository.

In [2]:
# Chromatin_analysis_tools (ATC)
# Get path for the py containing functions
import os
import sys
import importlib
module_path =r'C:\Users\shiwei\Documents\AnalysisTool_Chromatin'
if module_path not in sys.path:
    sys.path.append(module_path)
    
# import relevant modules
import gene_selection 
importlib.reload(gene_selection)
import gene_to_loci
importlib.reload(gene_to_loci)
import gene_activity
importlib.reload(gene_activity)
import loci_1d_features
importlib.reload(loci_1d_features)  

import atac_to_loci
importlib.reload(atac_to_loci)

<module 'atac_to_loci' from 'C:\\Users\\shiwei\\Documents\\AnalysisTool_Chromatin\\atac_to_loci.py'>
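
The import-then-reload pattern above repeats once per module. A minimal sketch of the same idea written as a loop follows. `import_and_reload` is a hypothetical helper, not part of AnalysisTool_Chromatin, and standard-library modules stand in for the analysis modules so that the example runs anywhere.

```python
import importlib


def import_and_reload(module_names):
    """Import each module by name, reload it, and return {name: module}.

    Reloading picks up edits made to a module's source file since it was
    first imported, which helps while the functions are still changing.
    """
    modules = {}
    for name in module_names:
        module = importlib.import_module(name)
        modules[name] = importlib.reload(module)
    return modules


# Standard-library modules stand in for gene_selection, gene_to_loci, etc.
loaded = import_and_reload(["json", "csv"])
print(sorted(loaded))
```

With the real modules, the list would contain names such as `'gene_selection'` and `'gene_to_loci'`, provided `module_path` is already on `sys.path`.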

# Define folders

In [3]:
# main folder for postanalysis
postanalysis_folder = r'L:\Shiwei\postanalysis_2024\v0'
# input files for postanalysis
input_folder = os.path.join(postanalysis_folder, 'resources_from_preprocess')

# output file to be generated
output_main_folder = os.path.join(postanalysis_folder, 'compartment_transcription')

output_analysis_folder = os.path.join(output_main_folder, 'analysis')
output_figure_folder = os.path.join(output_main_folder, 'figures')

# make new folder if needed
make_output_folder = True

if make_output_folder and not os.path.exists(output_analysis_folder):
    os.makedirs(output_analysis_folder)
    print(f'Generating analysis folder: {output_analysis_folder}.')
elif os.path.exists(output_analysis_folder):
    print(f'Use existing analysis folder: {output_analysis_folder}.')
    
if make_output_folder and not os.path.exists(output_figure_folder):
    os.makedirs(output_figure_folder)
    print(f'Generating figure folder: {output_figure_folder}.')
elif os.path.exists(output_figure_folder):
    print(f'Use existing figure folder: {output_figure_folder}.')

Use existing analysis folder: L:\Shiwei\postanalysis_2024\v0\compartment_transcription\analysis.
Use existing figure folder: L:\Shiwei\postanalysis_2024\v0\compartment_transcription\figures.
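
The two near-identical `if`/`elif` blocks above could be folded into one helper. The sketch below is one possible shape, assuming that creating missing parent folders is acceptable. `ensure_folder` is a hypothetical name, and a temporary directory replaces the real analysis drive.

```python
import os
import tempfile


def ensure_folder(path):
    """Create `path` (and any missing parents); return True if it was created."""
    existed = os.path.isdir(path)
    os.makedirs(path, exist_ok=True)
    return not existed


# Demonstrate on a temporary directory instead of the real analysis drive.
with tempfile.TemporaryDirectory() as root:
    figure_folder = os.path.join(root, "compartment_transcription", "figures")
    first_call = ensure_folder(figure_folder)   # folder is created here
    second_call = ensure_folder(figure_folder)  # folder already exists
    folder_exists = os.path.isdir(figure_folder)

print(first_call, second_call, folder_exists)
```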


# Plotting parameters

In [4]:
%matplotlib inline
import matplotlib
matplotlib.rcParams['pdf.fonttype'] = 42
import matplotlib.pyplot as plt
plt.rc('font', family='serif')
plt.rc('font', serif='Arial')

from ImageAnalysis3.figure_tools import _double_col_width, _single_col_width, _font_size, _ticklabel_size,_ticklabel_width

import seaborn as sns
sns.set_context("paper", rc={"font.size":_font_size,"axes.titlesize":_font_size+1,"axes.labelsize":_font_size})  

In [5]:
# Other required plotting parameters
_dpi = 300
_font_size = 7
_page_width = 5.5


## cell type color-codes

In [6]:
# cell labels from RNA-MERFISH and celltype prediction
selected_cell_labels = ['L2/3 IT','L4/5 IT','L5 IT','L6 IT','L5 ET','L5/6 NP','L6 CT','L6b',
                           'Sst','Pvalb','Lamp5','Sncg','Vip',
                           'Astro','Oligo','OPC','Micro','Endo','VLMC','SMC','Peri', 
                           #'other',
                          ]
# cell palette from RNA-MERFISH UMAP and stats
celltype_palette = {'Astro':'lightcoral', 
                    'Endo':'skyblue', 
                    'L2/3 IT':'gold', 
                    'L4/5 IT':'darkorange', 
                    'L5 ET':'mediumseagreen', 
                    'L5 IT':'aqua',
                    'L5/6 NP':'darkgreen',
                    'L6 CT':'brown',
                    'L6 IT':'magenta',
                    'L6b':'blue', 
                    'Lamp5':'orange', 
                    'Micro':'peachpuff',
                    'OPC':'thistle', 
                    'Oligo':'darkviolet',
                    'Peri':'sandybrown',
                    'Pvalb':'springgreen',
                    'SMC':'rosybrown',
                    'Sncg':'darkkhaki',
                    'Sst':'steelblue', 
                    'VLMC':'saddlebrown', 
                    'Vip':'red',
                    'other':'slategray'}
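
Because the label list and the palette are maintained separately, a quick consistency check can catch a label that has no color before any plot is drawn. The sketch below uses a small illustrative subset. `missing_palette_labels` and the `'Unknown'` label are hypothetical and do not appear in the notebook.

```python
def missing_palette_labels(labels, palette):
    """Return the labels that have no color assigned in `palette`."""
    return [label for label in labels if label not in palette]


# Small illustrative subset; 'Unknown' is a made-up label without a color.
example_labels = ["L2/3 IT", "Sst", "Astro", "Unknown"]
example_palette = {"L2/3 IT": "gold", "Sst": "steelblue", "Astro": "lightcoral"}

missing = missing_palette_labels(example_labels, example_palette)
print(missing)
```

Applied to the real objects, the call would be `missing_palette_labels(selected_cell_labels, celltype_palette)`.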


In [7]:
# plotting order based on snRNA transcriptional activity, for use if needed
sorted_cellplot_order_byRNA = ['Micro', 'Oligo', 'Endo', 'OPC', 'Astro', 'Vip', 'Lamp5',
                  'L5/6 NP', 'Sst', 'Sncg', 'Pvalb', 'L4/5 IT', 'L6 CT',
                  'L6 IT', 'L6b', 'L2/3 IT', 'L5 IT', 'L5 ET']

# Load data relevant information

## load and format codebook

The [merged codebook](../resources/merged_codebook.csv) in the repository combines the codebooks of all DNA-MERFISH libraries.

In [8]:
# Load codebook 
codebook_fname = os.path.join(input_folder,'merged_codebook.csv')
codebook_df = pd.read_csv (codebook_fname, index_col=0)

# sort df by chr and chr_order
codebook_df = loci_1d_features.sort_loci_df_by_chr_order (codebook_df)
codebook_df.head()

Unnamed: 0,name,id,NDB_784,NDB_755,NDB_826,NDB_713,NDB_865,NDB_725,NDB_817,NDB_710,...,NDB_479,NDB_562,NDB_608,NDB_460,NDB_563,NDB_592,NDB_368,NDB_436,NDB_629,NDB_604
0,1:3742742-3759944,1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,1:6245958-6258969,2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,1:8740008-8759916,3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1016,1:9627926-9637875,1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1017,1:9799472-9811359,2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0


In [9]:
# Format the chr loci names by
# 1. changing the loci name format
# 2. extracting relevant information such as id, chr, chr_order, and library
from gene_to_loci import loci_pos_format
loci_name_list = list(map(loci_pos_format, codebook_df['name'].tolist()))
loci_name_arr = np.array(loci_name_list)

# keep the relevant columns in a new dataframe and set loci name as index
codebook_df = codebook_df[['name','id','chr','chr_order','library']].copy()  # copy avoids SettingWithCopyWarning
codebook_df['loci_name'] = list(loci_name_arr[:,0])
codebook_df = codebook_df.set_index ('loci_name')

codebook_df.head()

loci_name,name,id,chr,chr_order,library
chr1_3742742_3759944,1:3742742-3759944,1,1,0.0,CTP11
chr1_6245958_6258969,1:6245958-6258969,2,1,1.0,CTP11
chr1_8740008_8759916,1:8740008-8759916,3,1,2.0,CTP11
chr1_9627926_9637875,1:9627926-9637875,1,1,3.0,CTP13
chr1_9799472_9811359,1:9799472-9811359,2,1,4.0,CTP13
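
The two tables above show the renaming from `1:3742742-3759944` to `chr1_3742742_3759944`. A minimal sketch of that string conversion follows. `format_locus_name` is a hypothetical stand-in; the real `loci_pos_format` evidently returns more than one field per locus (the notebook takes column 0 of its result), so its behavior may differ.

```python
def format_locus_name(name):
    """Convert 'chr:start-end' to 'chr<chr>_<start>_<end>'."""
    chrom, span = name.split(":")
    start, end = span.split("-")
    # int() checks that both coordinates are numeric.
    return f"chr{chrom}_{int(start)}_{int(end)}"


# Names copied from the codebook preview above.
names = ["1:3742742-3759944", "1:6245958-6258969"]
formatted = [format_locus_name(n) for n in names]
print(formatted)
```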


## Load gene annotation

The annotation can be generated from snRNA-seq data with the notebook
[external/scripts/genome_annotation/1_genomic_info_for_mop_genes_using_ensembl_transcriptome](../../external/scripts/genome_annotation/1_genomic_info_for_mop_genes_using_ensembl_transcriptome.ipynb).

An example output, [MOp_10x_snRNA_chr_info](../../external/resources/MOp_10x_snRNA_chr_info_NEW_from_transcriptome_FORMAT.csv), is included in the repository.

In [12]:
# load gene annotation (covering all genes from the SMART-seq) for chr locus
# L drive is Crick Pu_SSD_0
scRNA_folder = r'L:\Shiwei\DNA_MERFISH_analysis\10x_nuclei_v3_MOp_AIBS\Analysis_10X_nuclei_v3_AIBS\processed'
gene_annotation_df = pd.read_csv(os.path.join(scRNA_folder, "MOp_10x_snRNA_chr_info_NEW_from_transcriptome_FORMAT.csv"),index_col=0)

gene_annotation_df.head()

gene,chr,start,end,gene_biotype,coding_strand,length,genomic_position
Xkr4,1,3205901,3671498,protein_coding,-1,465597,chr1_3205901_3671498
Gm1992,1,3466587,3513553,antisense,1,46966,chr1_3466587_3513553
Gm37381,1,3905739,3986215,lincRNA,-1,80476,chr1_3905739_3986215
Rp1,1,3999557,4409241,protein_coding,-1,409684,chr1_3999557_4409241
Sox17,1,4490931,4497354,protein_coding,-1,6423,chr1_4490931_4497354


## Load pre-processed snRNA anndata

Data can be generated from the notebook

[external/scripts/sn_rna/2_prepare_and_rename_sn_rna_mop](../../external/scripts/sn_rna/2_prepare_and_rename_sn_rna_mop.ipynb)

In [13]:
# Get loaded adata from other notebook
#%store -r adata
import os
import scanpy as sc
# load from here for saved h5ad
adata = sc.read(os.path.join(scRNA_folder,r'MOp_10x_sn_labeled.h5ad'))

In [14]:
adata.obs.columns

Index(['aggr_num', 'umi.counts', 'gene.counts', 'library_id', 'tube_barcode',
       'Seq_batch', 'Region', 'Lib_type', 'Gender', 'Donor', 'Amp_Name',
       'Amp_Date', 'Amp_PCR_cyles', 'Lib_Name', 'Lib_Date', 'Replicate_Lib',
       'Lib_PCR_cycles', 'Lib_PassFail', 'Cell_Capture', 'Lib_Cells',
       'Mean_Reads_perCell', 'Median_Genes_perCell', 'Median_UMI_perCell',
       'Saturation', 'Live_percent', 'Total_Cells', 'Live_Cells', 'method',
       'exp_component_name', 'mapped_reads', 'unmapped_reads',
       'nonconf_mapped_reads', 'total.reads', 'doublet.score',
       'subclass_label', 'class_label', 'cluster_label', 'cluster_id',
       'n_genes', 'n_genes_by_counts', 'total_counts',
       'pct_counts_in_top_50_genes', 'pct_counts_in_top_100_genes',
       'pct_counts_in_top_200_genes', 'pct_counts_in_top_500_genes', 'leiden',
       'subclass_label_new', 'class_label_new', 'neuron_identity'],
      dtype='object')

In [15]:
# normalize, log1p-transform, and create a new raw layer for marker gene analysis

# retrieve the unnormalized full gene set from the raw layer
adata_ori = adata.raw.to_adata()
#print(adata_ori.X.toarray()[:5,:5])
sc.pp.normalize_total(adata_ori, target_sum=np.median(adata_ori.obs['total_counts']))
#print(adata_ori.X.toarray()[:5,:5])
sc.pp.log1p(adata_ori) # the log1p warning does not apply here; uncomment the prints to inspect if needed
#print(adata_ori.X.toarray()[:5,:5])


# replace the old raw layer since the downstream script will use the normalized and log1p-raw layer
adata_ori.raw = adata_ori
# replace variable as adata will be used later in the script
adata = adata_ori
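
To make the arithmetic of the two preprocessing steps concrete, the sketch below applies them to a toy count matrix using only the standard library. It assumes the target is the median of the per-cell totals, mirroring `target_sum=np.median(...)` above. `normalize_and_log1p` is a hypothetical helper and the counts are made up.

```python
import math
from statistics import median


def normalize_and_log1p(counts):
    """Scale each cell (row) to the median total count, then apply log(1 + x)."""
    totals = [sum(row) for row in counts]
    target = median(totals)
    normalized = []
    for row, total in zip(counts, totals):
        # Cells with zero counts are left at zero rather than divided by zero.
        scale = target / total if total > 0 else 0.0
        normalized.append([math.log1p(value * scale) for value in row])
    return normalized, target


# Toy matrix: 3 cells x 2 genes (made-up counts).
toy_counts = [[1, 3], [2, 6], [10, 30]]
log_norm, target = normalize_and_log1p(toy_counts)
print(target, log_norm)
```

The three toy cells differ only in sequencing depth, so after scaling they end up with the same profile.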



# Process all cellgroups for the markers

## define shared parameters

In [16]:
# the hierarchy level for cell grouping
_groupby = 'subclass_label_new'

# whether to filter marker genes by pts, mean count, etc.
filter_markers = False
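
For orientation, the sketch below shows what a pts-based filter could look like when `filter_markers` is enabled, where pts is the fraction of cells in a group that express a gene. The thresholds 0.5 and 0.2 match the values passed to `gs.filter_pts_for_gene_dataframe` in the processing loop. `passes_pts_filter`, the inclusive comparisons, and the gene records are assumptions made for illustration, and the library's actual logic may differ.

```python
def passes_pts_filter(pts_in_group, pts_in_rest, high_th=0.5, low_th=0.2):
    """Keep a marker expressed in >= high_th of the group and <= low_th of the rest."""
    return pts_in_group >= high_th and pts_in_rest <= low_th


# Made-up (gene, fraction in group, fraction in rest) records.
candidates = [("geneA", 0.80, 0.10), ("geneB", 0.60, 0.45), ("geneC", 0.30, 0.05)]
kept = [gene for gene, in_group, in_rest in candidates
        if passes_pts_filter(in_group, in_rest)]
print(kept)
```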

In [17]:
# the cell groups to be analyzed
sorted_group_order = ['L2/3 IT','L4/5 IT','L5 IT','L6 IT','L5 ET','L5/6 NP','L6 CT','L6b',
                           'Sst','Pvalb','Lamp5','Sncg','Vip',
                           #'Astro','Oligo','OPC','Micro','Endo',#'VLMC','SMC','Peri', 
                           #'other',
                          ]

adata = adata[adata.obs['neuron_identity']=='Neuronal']
np.unique(adata.obs['subclass_label_new'])

array(['L2/3 IT', 'L4/5 IT', 'L5 ET', 'L5 IT', 'L5/6 NP', 'L6 CT',
       'L6 IT', 'L6b', 'Lamp5', 'Pvalb', 'Sncg', 'Sst', 'Vip'],
      dtype=object)

## loop to process each cell group

In [18]:
# re-import functions
import loci_1d_features as lf
import gene_selection as gs
import gene_to_loci as gl

#load_processed_marker_df= False

In [19]:

# compile result as (marker loci of group) by (median of sc for group)
#compiled_df = pd.DataFrame(columns = sorted_group_order)

for _marker_group in sorted_group_order[:]:


    print (f'Process marker gene loci for {_marker_group}.')   
    ##############################################################################
    # 1. get marker loci inds
    # find markers using 200 genes as candidates
    marker_genes_df = gs.rank_genes_groups_into_dataframe (adata, 
                                          _groupby, 
                                          _marker_group,
                                          'rest', 
                                          n_genes=200, 
                                          stat_method='wilcoxon', 
                                          add_control_genes = True,
                                          use_raw = True,
                                          )
    # add score, pts, and mean for markers
    marker_genes_df = gs.add_scores_into_gene_dataframe(adata, marker_genes_df)
    marker_genes_df = gs.add_pts_into_gene_dataframe(adata, marker_genes_df)
    marker_genes_df = gs.add_mean_into_gene_dataframe(adata, marker_genes_df)

    # filter markers
    if filter_markers:
        marker_genes_df_filter = gs.filter_pts_for_gene_dataframe (marker_genes_df, 
                                                              high_th_in_group_1 = 0.5, 
                                                              low_th_in_group_2 = 0.2, 
                                                              control_as_inactive=True)
        marker_genes_df_filter = gs.filter_mean_for_gene_dataframe (marker_genes_df_filter, 
                                                              high_th_in_group_1 = 20, 
                                                              high_th_in_group_2 = None)
        marker_genes_df_filter = gs.filter_foldchange_for_gene_dataframe (marker_genes_df_filter, 
                                                              change_th_in_group_1 = 2, 
                                                              change_th_in_group_2 = None)

    else:
        marker_genes_df_filter = marker_genes_df.copy()

    # get genomic pos and adjacent imaged loci; keep only upregulated and downregulated loci
    sel_marker_genes_df = marker_genes_df_filter[marker_genes_df_filter['Expression_change']!='control']

    sel_marker_genes_df = gl.get_genomic_for_gene_dataframe(sel_marker_genes_df, gene_annotation_df)

    sel_marker_genes_df = gl.get_imaged_loci_near_gene_dataframe (sel_marker_genes_df, 
                                             codebook_df, nearby_type='tss', 
                                             nearby_dist=100*1000, # extend for 100kb
                                             num_threads = 8, 
                                             parallel=True)

    # get inds in the codebook
    im_loci_df  = lf.im_loci_dataframe_from_gene_dataframe (sel_marker_genes_df, sel_cols = None)
    #im_loci_df = lf.sort_loci_df_by_chr_order(im_loci_df)

    im_loci_df  = lf.codebook_chr_order_for_loci_dataframe  (im_loci_df, codebook_df,sel_cols =['chr','chr_order','id'], 
                                               sort_df = True,
                                               sort_by_chr= True)

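    # save markers info
    _marker_savename = _marker_group.replace("/","_")
    _groupby_savename = 'class'
    marker_save_folder = os.path.join(output_analysis_folder, 'marker_neuron')
    os.makedirs(marker_save_folder, exist_ok=True)  # to_csv fails if the folder is missing
    marker_genes_fname = os.path.join(marker_save_folder, f'{_groupby_savename}_{_marker_savename}_vs_rest.csv')
    im_loci_df.to_csv(marker_genes_fname)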

        

Process marker gene loci for L2/3 IT.
Start Gene Selection:
1. Ranking marker genes with Scanpy using wilcoxon.
2. Selecting marker genes for expression change direction.
3. Generating a DataFrame for the selected marker genes.
There are 200 genes selected for upregulated expression.
There are 200 genes selected for downregulated expression.
There are 200 genes selected for control expression.
Complete Gene Selection in 446.814s.
Use existing result from the input adata.
Add Gene Scores from Rank Genes Groups analysis.
Use existing result from the input adata.
Add Gene Expression Percentage.
Add Gene Expression Means.
Multiprocessing for loci matching:
Complete in 9.187s.
Remove loci whose match were not found.
Retrieve all columns from Gene DataFrame
Process marker gene loci for L4/5 IT.
Start Gene Selection:
1. Ranking marker genes with Scanpy using wilcoxon.
2. Selecting marker genes for expression change direction.
3. Generating a DataFrame for the selected marker genes.
There are 200 genes selected for upregulated expression.
There are 200 genes selected for downregulated expression.
There are 200 genes selected for control expression.
Complete Gene Selection in 300.388s.
Use existing result from the input adata.
Add Gene Scores from Rank Genes Groups analysis.
Use existing result from the input adata.
Add Gene Expression Percentage.
Add Gene Expression Means.
Multiprocessing for loci matching:
Complete in 4.766s.
Remove loci whose match were not found.
Retrieve all columns from Gene DataFrame
Process marker gene loci for L5 IT.
Start Gene Selection:
1. Ranking marker genes with Scanpy using wilcoxon.
2. Selecting marker genes for expression change direction.
3. Generating a DataFrame for the selected marker genes.
There are 200 genes selected for upregulated expression.
There are 200 genes selected for downregulated expression.
There are 200 genes selected for control expression.
Complete Gene Selection in 270.422s.
Use existing result from the input adata.
Add Gene Scores from Rank Genes Groups analysis.
Use existing result from the input adata.
Add Gene Expression Percentage.
Add Gene Expression Means.
Multiprocessing for loci matching:
Complete in 4.724s.
Remove loci whose match were not found.
Retrieve all columns from Gene DataFrame
Process marker gene loci for L6 IT.
Start Gene Selection:
1. Ranking marker genes with Scanpy using wilcoxon.
2. Selecting marker genes for expression change direction.
3. Generating a DataFrame for the selected marker genes.
There are 200 genes selected for upregulated expression.
There are 200 genes selected for downregulated expression.
There are 200 genes selected for control expression.
Complete Gene Selection in 269.746s.
Use existing result from the input adata.
Add Gene Scores from Rank Genes Groups analysis.
Use existing result from the input adata.
Add Gene Expression Percentage.
Add Gene Expression Means.
Multiprocessing for loci matching:
Complete in 4.935s.
Remove loci whose match were not found.
Retrieve all columns from Gene DataFrame
Process marker gene loci for L5 ET.
Start Gene Selection:
1. Ranking marker genes with Scanpy using wilcoxon.
2. Selecting marker genes for expression change direction.
3. Generating a DataFrame for the selected marker genes.
There are 200 genes selected for upregulated expression.
There are 200 genes selected for downregulated expression.
There are 200 genes selected for control expression.
Complete Gene Selection in 268.622s.
Use existing result from the input adata.
Add Gene Scores from Rank Genes Groups analysis.
Use existing result from the input adata.
Add Gene Expression Percentage.
Add Gene Expression Means.
Multiprocessing for loci matching:
Complete in 4.632s.
Remove loci whose match were not found.
Retrieve all columns from Gene DataFrame
Process marker gene loci for L5/6 NP.
Start Gene Selection:
1. Ranking marker genes with Scanpy using wilcoxon.
2. Selecting marker genes for expression change direction.
3. Generating a DataFrame for the selected marker genes.
There are 200 genes selected for upregulated expression.
There are 200 genes selected for downregulated expression.
There are 200 genes selected for control expression.
Complete Gene Selection in 269.802s.
Use existing result from the input adata.
Add Gene Scores from Rank Genes Groups analysis.
Use existing result from the input adata.
Add Gene Expression Percentage.
Add Gene Expression Means.
Multiprocessing for loci matching:
Complete in 4.613s.
Remove loci whose match were not found.
Retrieve all columns from Gene DataFrame
Process marker gene loci for L6 CT.
Start Gene Selection:
1. Ranking marker genes with Scanpy using wilcoxon.
2. Selecting marker genes for expression change direction.
3. Generating a DataFrame for the selected marker genes.
There are 200 genes selected for upregulated expression.
There are 200 genes selected for downregulated expression.
There are 200 genes selected for control expression.
Complete Gene Selection in 271.257s.
Use existing result from the input adata.
Add Gene Scores from Rank Genes Groups analysis.
Use existing result from the input adata.
Add Gene Expression Percentage.
Add Gene Expression Means.
Multiprocessing for loci matching:
Complete in 4.695s.
Remove loci whose match were not found.
Retrieve all columns from Gene DataFrame
Process marker gene loci for L6b.
Start Gene Selection:
1. Ranking marker genes with Scanpy using wilcoxon.
2. Selecting marker genes for expression change direction.
3. Generating a DataFrame for the selected marker genes.
There are 200 genes selected for upregulated expression.
There are 200 genes selected for downregulated expression.
There are 200 genes selected for control expression.
Complete Gene Selection in 268.427s.
Use existing result from the input adata.
Add Gene Scores from Rank Genes Groups analysis.
Use existing result from the input adata.
Add Gene Expression Percentage.
Add Gene Expression Means.
Multiprocessing for loci matching:
Complete in 4.651s.
Remove loci whose match were not found.
Retrieve all columns from Gene DataFrame
Process marker gene loci for Sst.
Start Gene Selection:
1. Ranking marker genes with Scanpy using wilcoxon.
2. Selecting marker genes for expression change direction.
3. Generating a DataFrame for the selected marker genes.
There are 200 genes selected for upregulated expression.
There are 200 genes selected for downregulated expression.
There are 200 genes selected for control expression.
Complete Gene Selection in 268.953s.
Use existing result from the input adata.
Add Gene Scores from Rank Genes Groups analysis.
Use existing result from the input adata.
Add Gene Expression Percentage.
Add Gene Expression Means.
Multiprocessing for loci matching:
Complete in 4.623s.
Remove loci whose match were not found.
Retrieve all columns from Gene DataFrame
Process marker gene loci for Pvalb.
Start Gene Selection:
1. Ranking marker genes with Scanpy using wilcoxon.
2. Selecting marker genes for expression change direction.
3. Generating a DataFrame for the selected marker genes.
There are 200 genes selected for upregulated expression.
There are 200 genes selected for downregulated expression.
There are 200 genes selected for control expression.
Complete Gene Selection in 269.176s.
Use existing result from the input adata.
Add Gene Scores from Rank Genes Groups analysis.
Use existing result from the input adata.
Add Gene Expression Percentage.
Add Gene Expression Means.
Multiprocessing for loci matching:
Complete in 4.624s.
Remove loci whose match were not found.
Retrieve all columns from Gene DataFrame
Process marker gene loci for Lamp5.
Start Gene Selection:
1. Ranking marker genes with Scanpy using wilcoxon.
2. Selecting marker genes for expression change direction.
3. Generating a DataFrame for the selected marker genes.
There are 200 genes selected for upregulated expression.
There are 200 genes selected for downregulated expression.
There are 200 genes selected for control expression.
Complete Gene Selection in 268.222s.
Use existing result from the input adata.
Add Gene Scores from Rank Genes Groups analysis.
Use existing result from the input adata.
Add Gene Expression Percentage.
Add Gene Expression Means.
Multiprocessing for loci matching:
Complete in 4.702s.
Remove loci whose match were not found.
Retrieve all columns from Gene DataFrame
Process marker gene loci for Sncg.
Start Gene Selection:
1. Ranking marker genes with Scanpy using wilcoxon.
2. Selecting marker genes for expression change direction.
3. Generating a DataFrame for the selected marker genes.
There are 200 genes selected for upregulated expression.
There are 200 genes selected for downregulated expression.
There are 200 genes selected for control expression.
Complete Gene Selection in 269.080s.
Use existing result from the input adata.
Add Gene Scores from Rank Genes Groups analysis.
Use existing result from the input adata.
Add Gene Expression Percentage.
Add Gene Expression Means.
Multiprocessing for loci matching:
Complete in 4.658s.
Remove loci whose match were not found.
Retrieve all columns from Gene DataFrame
Process marker gene loci for Vip.
Start Gene Selection:
1. Ranking marker genes with Scanpy using wilcoxon.
2. Selecting marker genes for expression change direction.
3. Generating a DataFrame for the selected marker genes.
There are 200 genes selected for upregulated expression.
There are 200 genes selected for downregulated expression.
There are 200 genes selected for control expression.
Complete Gene Selection in 284.307s.
Use existing result from the input adata.
Add Gene Scores from Rank Genes Groups analysis.
Use existing result from the input adata.
Add Gene Expression Percentage.
Add Gene Expression Means.
Multiprocessing for loci matching:
Complete in 10.328s.
Remove loci whose match were not found.
Retrieve all columns from Gene DataFrame
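
The loop above matches each marker gene to imaged loci near its TSS (`nearby_type='tss'`, `nearby_dist=100*1000`). A minimal sketch of that idea follows, under two assumptions that the notebook does not confirm: the TSS is taken as `start` on the + strand and `end` on the - strand, and a locus counts as nearby when its midpoint lies within `nearby_dist` of the TSS. Both function names are hypothetical, and chromosome matching is omitted because every example coordinate is on chr1.

```python
def tss_position(start, end, strand):
    """Return the transcription start site: `start` on the + strand, `end` on the - strand."""
    return start if strand >= 0 else end


def loci_near_tss(tss, loci, nearby_dist=100_000):
    """Return names of loci whose midpoint lies within `nearby_dist` bp of `tss`."""
    hits = []
    for name, locus_start, locus_end in loci:
        midpoint = (locus_start + locus_end) // 2
        if abs(midpoint - tss) <= nearby_dist:
            hits.append(name)
    return hits


# Gene and loci coordinates copied from the tables shown earlier (all on chr1).
xkr4_tss = tss_position(3205901, 3671498, strand=-1)
chr1_loci = [
    ("chr1_3742742_3759944", 3742742, 3759944),
    ("chr1_6245958_6258969", 6245958, 6258969),
]
nearby = loci_near_tss(xkr4_tss, chr1_loci)
print(xkr4_tss, nearby)
```

Under these assumptions, only the first locus falls within 100 kb of the Xkr4 TSS; the real `gl.get_imaged_loci_near_gene_dataframe` may define proximity differently.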
