<p style="font-size:18px; font-weight:bold;"> 2025 Olivia Debnath</p>
<p style="font-size:14px;">Dana-Farber Cancer Institute & Harvard Medical School</p>

**Hypothesis 2 (H₂): Facilitated co-expression & functional dependence**
    
If a lowly expressed gene (POI: Protein of Interest) maintains stable but widespread co-expression (≥10%) with multiple interactors across diverse cell types, its function is cellularly indispensable despite its low abundance.

🔹 Why this hypothesis?
Some genes are not highly expressed, but they are still critical for cellular homeostasis and operate in a facilitated interaction manner, where multiple weakly co-expressed partners compensate for expression variability.

For example:

MLH1 (MutL Homolog 1) is a key DNA mismatch repair (MMR) gene but does not show strong expression peaks in any single cell type.
Instead, MLH1 and its partners (e.g., PMS2, MSH2, MSH6) maintain steady but weak co-expression across multiple proliferating cell states—suggesting functional compensation rather than strong enrichment.

Technical pipeline (adjusted based on previous exploratory analyses):

A POI that is lowly expressed overall (at or below the median % expression across clusters in a tissue) but consistently co-expressed with one or more interactors (≥30%) may remain functionally relevant, since its interactors might compensate for its low abundance. For now, the interactor cut-off is relaxed to 20% to pick up a few more cases for MLH1.

Certain genes, such as MLH1 (mismatch repair), are weakly expressed (often <10%, or even <5%, across most clusters) but still cell-essential because of their role in highly conserved pathways. Instead of relying on strong co-expression (Hypothesis 1), these genes depend on their highly expressed interactors to maintain functionality.

🔍 H₂ filtering strategy: 

1. POI expression threshold:

- Identify cases where POI expression ≥ median % expression of the gene across clusters within the same tissue (adaptive thresholding).
- This allows the detection of functional low-expression genes while ignoring highly abundant ones.
 
2. Interactor rescue requirements:

- At least one interactor must be expressed at ≥30% in the same cell type/state (relaxed to ≥20% for now, as noted above).
- If multiple interactors are highly expressed, it strengthens the functional dependence hypothesis.


3. Rescue strength annotation: (new column: "Rescue Category")

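    - Weak Rescue (10-30%) → Interaction present, but not strong.
    - Moderate Rescue (30-50%) → Likely functionally relevant.
    - High Rescue (50-80%) → Strong compensatory effect.
    - Robust Rescue (80-100%) → Near-total functional compensation.

Note: the code further down currently writes a coarser three-tier "Compensation Category" column instead: Weak (20-29%), Moderate (30-49%), and Robust (≥50%).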
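The four rescue tiers listed above can be sketched as a simple binning step. This is only an illustration: `annotate_rescue` is a hypothetical helper that does not appear in the notebook, and treating each lower bound as inclusive (so that exactly 30% counts as Moderate) is an assumption, since the list leaves the boundaries ambiguous.

```python
import numpy as np
import pandas as pd

def annotate_rescue(interactor_pct):
    """Map interactor % expression to a rescue tier (hypothetical helper)."""
    pct = pd.Series(interactor_pct, dtype=float)
    # Conditions are checked in order, so the highest tier wins.
    # Assumption: lower bounds are inclusive (e.g. 30.0 -> Moderate).
    conditions = [pct >= 80, pct >= 50, pct >= 30, pct >= 10]
    labels = ["Robust Rescue", "High Rescue", "Moderate Rescue", "Weak Rescue"]
    tiers = np.select(conditions, labels, default="No Rescue")
    return pd.Series(tiers, index=pct.index, name="Rescue Category")

example = annotate_rescue([5, 10, 29.9, 30, 50, 80, 100])
print(example.tolist())
```

Ordering the conditions from highest to lowest keeps the tiers mutually exclusive without having to spell out upper bounds.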

In [1]:
import os
print(os.path.exists("./results_11032025/Jess_PPI_21032025/PPI_preprocessed_15042025/Level2/filtered_S0/"))

True


In [2]:
#Define input directory: 
input_dir = "./results_11032025/Jess_PPI_21032025/PPI_preprocessed_15042025/Level2/filtered_S0/"  

#Dynamically find all relevant input files
input_files = sorted([f for f in os.listdir(input_dir) if f.endswith("_filtered_S0_17042025.csv")]) 
print(input_files) 

#Print total count of files
print(f"\n✅ Total Input Files Found: {len(input_files)}")

['ACSF3_PPI_filtered_S0_17042025.csv', 'ACTB_PPI_filtered_S0_17042025.csv', 'ACY1_PPI_filtered_S0_17042025.csv', 'ADIPOQ_PPI_filtered_S0_17042025.csv', 'AGXT_PPI_filtered_S0_17042025.csv', 'AHCY_PPI_filtered_S0_17042025.csv', 'AIPL1_PPI_filtered_S0_17042025.csv', 'ALAS2_PPI_filtered_S0_17042025.csv', 'ALDOA_PPI_filtered_S0_17042025.csv', 'ALOX5_PPI_filtered_S0_17042025.csv', 'AMPD2_PPI_filtered_S0_17042025.csv', 'ANKRD1_PPI_filtered_S0_17042025.csv', 'ANXA11_PPI_filtered_S0_17042025.csv', 'AP2S1_PPI_filtered_S0_17042025.csv', 'APOA1_PPI_filtered_S0_17042025.csv', 'APOD_PPI_filtered_S0_17042025.csv', 'ASNS_PPI_filtered_S0_17042025.csv', 'ATPAF2_PPI_filtered_S0_17042025.csv', 'BAG3_PPI_filtered_S0_17042025.csv', 'BANF1_PPI_filtered_S0_17042025.csv', 'BCL10_PPI_filtered_S0_17042025.csv', 'BFSP2_PPI_filtered_S0_17042025.csv', 'BLK_PPI_filtered_S0_17042025.csv', 'C1QA_PPI_filtered_S0_17042025.csv', 'C1QB_PPI_filtered_S0_17042025.csv', 'C1QC_PPI_filtered_S0_17042025.csv', 'CA8_PPI_filtered_S

In [3]:
#Specify output directory: 
output_dir = "./results_11032025/Jess_PPI_21032025/PPI_preprocessed_15042025/PPI_contextualization/filtered_Step2_H2/"
os.makedirs(output_dir, exist_ok=True)  #Ensure output directory exists

In [4]:
import pandas as pd
import numpy as np

1. POI expression cutoff:

- POI must be ≥ median expression % (across clusters in that tissue).
- We use the tissue-specific median rather than a fixed 30% cutoff, because some genes (e.g., MLH1) have consistently low but functional expression.

2. Interactor expression cutoff:

- At least one interactor must be ≥20% expression (instead of the 30% threshold used in H₁).
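As a rough illustration of the adaptive POI cutoff, the sketch below computes a per-tissue median and flags the clusters at or above it. The toy values and cell types are made up; only the column names follow the ones used in this notebook.

```python
import pandas as pd

# Toy POI rows: one row per (tissue, cell type); values are invented.
toy = pd.DataFrame({
    "Tissue": ["lung", "lung", "lung", "liver", "liver"],
    "Cell Type": ["T cell", "B cell", "macrophage", "hepatocyte", "Kupffer cell"],
    "%Cells Expressing Gene": [2.0, 4.0, 9.0, 1.0, 3.0],
})

# Median of the POI across clusters, computed separately for each tissue.
toy["Tissue_Median_Exp"] = (
    toy.groupby("Tissue")["%Cells Expressing Gene"].transform("median")
)

# A cluster passes when the POI is at or above its own tissue median.
toy["POI_Above_Median"] = toy["%Cells Expressing Gene"] >= toy["Tissue_Median_Exp"]
print(toy)
```

Because the threshold is relative to each tissue, a cluster with only 3% of cells expressing the POI can still pass in a tissue where the gene is generally rare.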

In [5]:
#Applied code from PPI_CellxGene_H2_corr_27022025.ipynb on all PPI cases: 

#**Define Expression Cutoffs for H₂**
WEAK_COMPENSATION = 20   #At least 20% interactor expression
MODERATE_COMPENSATION = 30  #At least 30% interactor expression
STRONG_COMPENSATION = 50  #At least 50% interactor expression

In [6]:
import pandas as pd
import numpy as np
from scipy.stats import zscore
import matplotlib.pyplot as plt

We need to remove cases where both the POI & the interactor are ≥30% expression to prevent redundancy with H3 (strong co-expression & functional disruption). However, if the POI is ≥30% but the interactor is <30%, we still keep those cases.
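The exclusion rule above can be sketched on a toy table. The pair names and percentages are invented for illustration, and `H3_CUTOFF` is a hypothetical constant name for the 30% value stated in the text.

```python
import pandas as pd

H3_CUTOFF = 30  # % of cells, as stated in the rule above (constant name is hypothetical)

pairs = pd.DataFrame({
    "PPI": ["POI-A", "POI-B", "POI-C", "POI-D"],
    "%Cells Expressing POI": [45.0, 45.0, 12.0, 30.0],
    "%Cells Expressing Interactor": [60.0, 22.0, 55.0, 30.0],
})

# A row is H3-redundant only when BOTH partners reach the cutoff.
is_h3 = (
    (pairs["%Cells Expressing POI"] >= H3_CUTOFF)
    & (pairs["%Cells Expressing Interactor"] >= H3_CUTOFF)
)
h2_only = pairs[~is_h3].reset_index(drop=True)
print(h2_only)
```

POI-B survives because only the POI is above 30%, which matches the exception described above.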

In [7]:
def process_H2_file(df, protein_of_interest):
    """ 
    Filters for H₂: Facilitated Co-Operation
    - POI must be expressed at least at the **median expression %** (within its tissue).
    - At least one interactor must be highly expressed (≥20%) in the same cell type.
    - **POI expression is explicitly retained & displayed.**
    """

    print(f"\n🔍STEP-1: Computing tissue median expression for {protein_of_interest}...")

    #Step-1: Compute POI-Specific Median Expression Cutoff
    #Computes median expression % of POI across all cell types in each tissue
    #If POI is absent in a tissue, it won't be processed
    tissue_median_expr = df[df["Gene Symbol"] == protein_of_interest].groupby("Tissue")["%Cells Expressing Gene"].median()

    if tissue_median_expr.empty:
        print(f"⚠️WARNING: {protein_of_interest} has no rows in this file, so no tissue median can be computed. Skipping file.")
        return None

    print(f"📊Tissue-specific median expression for {protein_of_interest}:\n{tissue_median_expr}\n")

    #Merge computed tissue-specific median expression into df for reference
    df = df.merge(tissue_median_expr.rename("Tissue_Median_Exp"), on="Tissue", how="left")

    #Explicitly store POI expression per cell type
    #This ensures that even after filtering, POI expression remains in output
    poi_expression_dict = df[df["Gene Symbol"] == protein_of_interest].set_index(["Tissue", "Cell Type"])["%Cells Expressing Gene"].to_dict()
    df["POI_Expression_%"] = df.apply(lambda row: poi_expression_dict.get((row["Tissue"], row["Cell Type"]), np.nan), axis=1)

    #Ensure POI_Above_Median is explicitly written (True/False instead of NaN)
    # - If POI expression in a cell type is ≥ tissue median, it's marked as True
    df["POI_Above_Median"] = df["POI_Expression_%"] >= df["Tissue_Median_Exp"]
    df["POI_Above_Median"] = df["POI_Above_Median"].fillna(False).astype(bool) #Convert NaN to False where applicable

    #STEP-2: Keep every row (POI and interactors) from clusters where the POI passes the median cutoff
    #POI_Expression_% and POI_Above_Median are cluster-level values, so they are identical for all genes in a cluster
    main_protein_present = df[df["POI_Above_Median"]]

    #Count unique (Tissue, Cell Type) clusters rather than rows, since each cluster holds several genes
    n_clusters_above = len(main_protein_present[["Tissue", "Cell Type"]].drop_duplicates())

    print(f"✅STEP-2: Found {n_clusters_above} clusters where {protein_of_interest} is expressed at or above the tissue median.")
    print(main_protein_present.head(), "\n")

    if main_protein_present.empty:
        print(f"⚠️WARNING: {protein_of_interest} does not surpass median expression threshold in any tissue. Skipping file.")
        return None

    #STEP-3: Identify interactors that are **highly expressed** (≥20%)
    #Filters only interactors (excluding POI itself) where %Cells Expressing Gene is ≥ 20%
    interactors_present = df[(df["Gene Symbol"] != protein_of_interest) & (df["%Cells Expressing Gene"] >= WEAK_COMPENSATION)]

    print(f"✅STEP-3: Found {len(interactors_present)} interactors with ≥20% expression.")
    print(interactors_present.head(), "\n")

    if interactors_present.empty:
        print(f"⚠️WARNING: No interactors reach ≥20% expression. Skipping file.")
        return None

    #STEP-4: Identify valid interactions (POI + at least one interactor in the same cluster)
    #Merges clusters where POI is above median with clusters where an interactor is ≥ 20%
    valid_interactions = pd.merge(
        main_protein_present[["Tissue", "Cell Type"]],
        interactors_present[["Tissue", "Cell Type"]],
        on=["Tissue", "Cell Type"], how="inner"
    ).drop_duplicates()

    print(f"✅STEP-4: Found {len(valid_interactions)} valid POI-interactor pairs.")
    print(valid_interactions.head(), "\n")

    if valid_interactions.empty:
        print(f"⚠️WARNING: No valid POI-interactor pairs found for {protein_of_interest}. Skipping file.")
        return None

    #STEP-5: Merge valid interactions back into the dataset
    #Ensures only rows with a POI-interactor pair are retained
    df = df.merge(valid_interactions, on=["Tissue", "Cell Type"], how="inner")

    print(f"✅STEP-5: Filtered dataset now contains {len(df)} rows.")
    print(df.head(), "\n")

    #STEP-6: Define Compensation Category Based on Interactor Expression
    #Categorizes interactions based on interactor expression level
    conditions = [
        df["%Cells Expressing Gene"] >= STRONG_COMPENSATION,  #Robust Compensation (≥50%)
        df["%Cells Expressing Gene"] >= MODERATE_COMPENSATION,  #Moderate Compensation (30-49%)
        df["%Cells Expressing Gene"] >= WEAK_COMPENSATION  #Weak Compensation (20-29%)
    ]
    categories = ["Robust Compensation (≥50%)", "Moderate Compensation (30-49%)", "Weak Compensation (20-29%)"]

    df["Compensation Category"] = np.select(conditions, categories, default="No Compensation (<20%)")

    #STEP-7: Remove Lowly Expressed Interactors (<20%)
    #Interactions that fail the ≥20% expression cutoff are removed
    df = df[df["Compensation Category"] != "No Compensation (<20%)"]

    print(f"✅STEP-7: Final dataset contains {len(df)} rows after removing non-compensating interactors.")

    return df
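
To see what the inner merge in STEP-4 does in isolation, here is a toy sketch. The gene names and clusters are placeholders, not real data; the point is that a cluster is kept only when it appears in both tables, and that duplicates from multiple qualifying interactors collapse to one row per cluster.

```python
import pandas as pd

# Clusters where the POI passed the median cutoff (toy values).
poi_clusters = pd.DataFrame({
    "Tissue": ["lung", "lung", "liver"],
    "Cell Type": ["T cell", "B cell", "hepatocyte"],
})

# Interactor rows that passed the >=20% cutoff (toy values).
interactor_rows = pd.DataFrame({
    "Tissue": ["lung", "lung", "kidney"],
    "Cell Type": ["T cell", "T cell", "podocyte"],
    "Gene Symbol": ["GENE_A", "GENE_B", "GENE_A"],
})

# Inner merge keeps only clusters present in both tables;
# drop_duplicates collapses the two lung T cell matches into one.
valid = (
    pd.merge(
        poi_clusters,
        interactor_rows[["Tissue", "Cell Type"]],
        on=["Tissue", "Cell Type"],
        how="inner",
    )
    .drop_duplicates()
    .reset_index(drop=True)
)
print(valid)
```

Here only the lung T cell cluster survives: lung B cell and liver hepatocyte have no qualifying interactor, and kidney podocyte has no qualifying POI.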


In [8]:
output_dir = "./results_11032025/Jess_PPI_21032025/PPI_preprocessed_15042025/PPI_contextualization/filtered_Step2_H2/"


for file in input_files:
    input_path = os.path.join(input_dir, file)
    
    print(f"\n📂 Processing file: {file}")

    #Extract POI from filename
    protein_of_interest = file.split("_")[0]
    print(f"🧬 Identified POI: {protein_of_interest}")

    #Read CSV (Fix encoding issues)
    df = pd.read_csv(input_path, encoding="utf-8")  
    print(f"✅ Loaded {file} | Shape: {df.shape}")

    if df.empty:
        print(f"⚠️ Skipping {file} as it's empty.")
        continue

    #Process file using H2 criteria
    processed_df = process_H2_file(df, protein_of_interest)

    if processed_df is None or processed_df.empty:
        print(f"⚠️ WARNING: No valid data after processing {file}. Skipping...")
        continue

    #Save output in Excel format
    output_file = os.path.join(output_dir, file.replace("_S0_17042025", "_S2_H2_24042025").replace(".csv", ".xlsx")) 
    processed_df.to_excel(output_file, index=False, engine= "openpyxl")  

    print(f"✅ Successfully saved H2 output: {output_file}")


📂 Processing file: ACSF3_PPI_filtered_S0_17042025.csv
🧬 Identified POI: ACSF3
✅ Loaded ACSF3_PPI_filtered_S0_17042025.csv | Shape: (8536, 14)

🔍STEP-1: Computing tissue median expression for ACSF3...
📊Tissue-specific median expression for ACSF3:
Tissue
adipose tissue     3.109360
adrenal gland      2.270799
bladder organ      0.532229
blood              2.002172
bone marrow        2.725964
brain              1.130856
breast             0.736955
colon              1.333889
cortex             0.509045
embryo             2.143559
endocrine gland    1.601069
esophagus          1.419061
exocrine gland     1.568245
eye                1.578902
fallopian tube     0.704561
forelimb           0.218341
gallbladder        0.329399
heart              3.107043
hindlimb           0.548948
intestine          0.599917
kidney             2.582015
large intestine    1.109918
liver              1.606673
lung               2.031746
lymph node         2.383717
ovary              0.799124
pancreas          

Key confirmations:

1. All POI rows are first filtered for POI_Above_Median.

df["POI_Above_Median"] = df["POI_Expression_%"] >= df["Tissue_Median_Exp"]
Only True rows move to the next step (Step 2).


2. Only interactors with ≥20% expression are considered.

interactors_present = df[(df["Gene Symbol"] != protein_of_interest) & (df["%Cells Expressing Gene"] >= WEAK_COMPENSATION)]
If no interactors qualify, the file is skipped (Step 3).


3. Final filtering merges valid POI & interactors, ensuring that both conditions are met.

df = df.merge(valid_interactions, on=["Tissue", "Cell Type"], how="inner")
Only valid POI-interactor pairs are retained (Step 5).


4. The output will only contain PPIs where both conditions hold (Steps 6-7).

POI expression is shown in the final file for reference (POI_Expression_%).
Rows failing either condition are removed before saving.
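
These confirmations could also be checked programmatically. The sketch below is one possible sanity check, not part of the original pipeline: `check_h2_output` is a hypothetical helper, and it assumes it is run on the DataFrame returned by process_H2_file, before any columns are renamed.

```python
import pandas as pd

def check_h2_output(df, poi, min_interactor_pct=20):
    """Return a dict of boolean checks on an H2 output table (hypothetical helper)."""
    interactors = df[df["Gene Symbol"] != poi]
    return {
        # Every retained row sits in a cluster where the POI passed the median cutoff.
        "poi_above_median": bool(df["POI_Above_Median"].all()),
        # Every retained interactor reaches the expression cutoff.
        "interactors_pass_cutoff": bool(
            (interactors["%Cells Expressing Gene"] >= min_interactor_pct).all()
        ),
        # POI expression is carried along on every row for reference.
        "poi_expression_present": bool(df["POI_Expression_%"].notna().all()),
    }

# Toy output table with invented values.
toy = pd.DataFrame({
    "Gene Symbol": ["GENE_A", "GENE_B"],
    "%Cells Expressing Gene": [35.0, 22.0],
    "POI_Expression_%": [6.0, 6.0],
    "POI_Above_Median": [True, True],
})
report = check_h2_output(toy, poi="POI_X")
print(report)
```

Returning a dict rather than raising makes it easy to log the result per file inside the batch loop.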

In [10]:
#Just print the moderate & robust compensations in the notebook:

#Find all processed H2 files
h2_files = [f for f in os.listdir(output_dir) if f.endswith("_S2_H2_24042025.xlsx")]
len(h2_files)

174

- We need to ensure that while filtering for Moderate & Robust Compensation, we retain weakly expressed interactors within the same cluster for context.

- Also, instead of completely removing POI as an interactor, let's explicitly flag self-interactions so that they can be reviewed manually.

- Updates for the sorting step:
    - If the POI appears as its own interactor, flag the row in a new "Self_Interaction" column ("Yes"/"No").
    - Keep flagged rows during processing so they show up in the printed preview for review; they are dropped just before saving (Step 11).
    - Sorting logic is unchanged; self-interactions are simply labeled.
    - Filter out H3-redundant cases at the end (Step 9) without affecting the rest of H2.
    - Rename %Cells Expressing Gene → %Cells Expressing Interactor.
    - Rename POI_Expression_% → %Cells Expressing POI.
    - Sort by compensation tier (R → M → W).
In [11]:
#let's retain all relevant columns, including Cell Count, so you have full flexibility
#Self_Interaction == "Yes" rows should be removed just before saving the Excel file.
#Refer to line-16 of PPI_CellxGene_H2_corr_27022025.ipynb 

#Process each file
for file in h2_files:
    file_path = os.path.join(output_dir, file)
    
    #Load the processed file
    df = pd.read_excel(file_path, engine="openpyxl")
    
    print(f"\n📂 Processing file: {file}")
    print(f"✅ Loaded {file} | Shape: {df.shape}")

    #Extract POI from filename
    protein_of_interest = file.split("_")[0]

    #Step 1: Identify clusters where at least one interactor is Moderate or Robust
    valid_clusters = df[df["Compensation Category"].isin(["Robust Compensation (≥50%)", "Moderate Compensation (30-49%)"])][["Tissue", "Cell Type"]].drop_duplicates()

    if valid_clusters.empty:
        print(f"⚠️ No valid interactor-driven Robust or Moderate Compensation cases found in {file}. Skipping...\n")
        continue

    #Step 2: Retain all interactors (including weak ones) in these clusters
    filtered_df = df.merge(valid_clusters, on=["Tissue", "Cell Type"], how="inner")

    print(f"✅ Found {len(filtered_df)} total interactors in clusters with at least one Moderate/Robust Compensation.")

    #Step 3: Construct the PPI column (POI-Interactor Pair)
    filtered_df["PPI"] = protein_of_interest + "-" + filtered_df["Gene Symbol"]  #Format: POI-Interactor (e.g., STXBP1-STX5)

    #Step 4: Assign W, M, R for Compensation Category (Fixing Mapping Issue)
    category_mapping = {
        "Robust Compensation (≥50%)": "R",
        "Moderate Compensation (30-49%)": "M",
        "Weak Compensation (20-29%)": "W"
    }

    #Standardize category column before mapping
    filtered_df["Compensation Category"] = filtered_df["Compensation Category"].astype(str).str.strip()

    #Map category names
    filtered_df["Comp_Category"] = filtered_df["Compensation Category"].map(category_mapping)

    #Handle missing values (should not happen, but just in case)
    filtered_df["Comp_Category"] = filtered_df["Comp_Category"].fillna("Unknown")

    #Step 5: Flag Self-Interactions
    filtered_df["Self_Interaction"] = filtered_df["Gene Symbol"].apply(lambda x: "Yes" if x == protein_of_interest else "No")

    #Step 6: Rename columns for consistency
    filtered_df.rename(columns={
        "%Cells Expressing Gene": "%Cells Expressing Interactor",
        "POI_Expression_%": "%Cells Expressing POI"
    }, inplace=True)

    #Step 7: Select only required columns (PPI-related & Cell Count)
    output_cols = ["Tissue", "Cell Type", "Cell Count", "PPI", "%Cells Expressing POI", "%Cells Expressing Interactor", "Comp_Category", "Self_Interaction"]
    final_df = filtered_df[output_cols].copy()  #Copy so that adding Comp_Rank below does not raise SettingWithCopyWarning

    #Step 8: Sort by Tissue → Cell Type → Compensation (R → M → W)
    comp_rank = {"R": 1, "M": 2, "W": 3}
    final_df["Comp_Rank"] = final_df["Comp_Category"].map(comp_rank)

    sorted_df = final_df.sort_values(by=["Tissue", "Cell Type", "Comp_Rank"],
                                     ascending=[True, True, True]).drop(columns=["Comp_Rank"])

    #Step 9: **Exclude H3 redundant cases (where both POI & Interactor are ≥30%)** => very important to avoid redundancy 
    sorted_df = sorted_df[~((sorted_df["%Cells Expressing POI"] >= 30) & 
                            (sorted_df["%Cells Expressing Interactor"] >= 30))]

    print(f"✅ Removed H3 redundant cases. Final dataset contains {len(sorted_df)} rows.")

    #Step 10: Print first 10 rows for quick verification
    print(sorted_df.head(10)) 

    #Step 11: Remove Self-Interactions before saving the files
    sorted_df = sorted_df[sorted_df["Self_Interaction"] != "Yes"]

    #Step 12: Save the sorted version
    sorted_file_path = os.path.join(output_dir, file.replace("_S2_H2_24042025.xlsx", "_Sorted_S2_H2_24042025.xlsx"))
    sorted_df.to_excel(sorted_file_path, index=False, engine="openpyxl")

    print(f"✅ Saved sorted file (Only PPI + Cell Count, Self-Interactions Removed): {sorted_file_path}\n")



📂 Processing file: EIF2B1_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded EIF2B1_PPI_filtered_S2_H2_24042025.xlsx | Shape: (12, 18)
✅ Found 1 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 1 rows.
    Tissue       Cell Type  Cell Count          PPI  %Cells Expressing POI  \
0  stomach  malignant cell        1040  EIF2B1-ATF5              11.634615   

   %Cells Expressing Interactor Comp_Category Self_Interaction  
0                     56.153846             R               No  
✅ Saved sorted file (Only PPI + Cell Count, Self-Interactions Removed): ./results_11032025/Jess_PPI_21032025/PPI_preprocessed_15042025/PPI_contextualization/filtered_Step2_H2/EIF2B1_PPI_filtered_Sorted_S2_H2_24042025.xlsx



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  final_df["Comp_Rank"] = final_df["Comp_Category"].map(comp_rank)



📂 Processing file: EFHC1_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded EFHC1_PPI_filtered_S2_H2_24042025.xlsx | Shape: (2189, 18)
✅ Found 1980 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 1940 rows.
           Tissue                             Cell Type  Cell Count  \
0  adipose tissue                connective tissue cell      188588   
1  adipose tissue                connective tissue cell      188588   
2  adipose tissue                      contractile cell       16234   
3  adipose tissue                      contractile cell       16234   
5  adipose tissue                      endothelial cell       45303   
4  adipose tissue                      endothelial cell       45303   
7  adipose tissue  endothelial cell of lymphatic vessel        6146   
6  adipose tissue  endothelial cell of lymphatic vessel        6146   
8  adipose tissue                       epithelial cell      109540   
9  adipo




📂 Processing file: PUF60_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded PUF60_PPI_filtered_S2_H2_24042025.xlsx | Shape: (1715, 18)
✅ Found 1331 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 1304 rows.
           Tissue                        Cell Type  Cell Count           PPI  \
0  adipose tissue                           B cell       16003     PUF60-SF1   
1  adipose tissue                           B cell       16003  PUF60-SRSF11   
2  adipose tissue  CD4-positive, alpha-beta T cell       11653     PUF60-SF1   
3  adipose tissue  CD4-positive, alpha-beta T cell       11653  PUF60-SRSF11   
4  adipose tissue  CD8-positive, alpha-beta T cell        5441     PUF60-SF1   
5  adipose tissue  CD8-positive, alpha-beta T cell        5441  PUF60-SRSF11   
6  adipose tissue                           T cell       31314     PUF60-SF1   
7  adipose tissue                           T cell       31314  PUF60-SRSF11  



✅ Saved sorted file (Only PPI + Cell Count, Self-Interactions Removed): ./results_11032025/Jess_PPI_21032025/PPI_preprocessed_15042025/PPI_contextualization/filtered_Step2_H2/PITX1_PPI_filtered_Sorted_S2_H2_24042025.xlsx


📂 Processing file: FOXP3_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded FOXP3_PPI_filtered_S2_H2_24042025.xlsx | Shape: (520, 18)
✅ Found 369 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 366 rows.
           Tissue                        Cell Type  Cell Count         PPI  \
0  adipose tissue  CD4-positive, alpha-beta T cell       11653  FOXP3-FOSB   
1  adipose tissue                           T cell       31314  FOXP3-FOSB   
2  adipose tissue                alpha-beta T cell       20182  FOXP3-FOSB   
3  adipose tissue                   dendritic cell        2534  FOXP3-FOSB   
4  adipose tissue               hematopoietic cell      121993  FOXP3-FOSB   
5  adipose tissue              



📂 Processing file: GLYCTK_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded GLYCTK_PPI_filtered_S2_H2_24042025.xlsx | Shape: (1524, 18)
✅ Found 1389 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 1389 rows.
           Tissue                         Cell Type  Cell Count  \
0  adipose tissue                         adipocyte       72587   
1  adipose tissue       adipocyte of omentum tissue       15436   
3  adipose tissue                    dendritic cell        2534   
2  adipose tissue                    dendritic cell        2534   
4  adipose tissue  fibro/adipogenic progenitor cell       15985   
5  adipose tissue                hematopoietic cell      121993   
6  adipose tissue              innate lymphoid cell        1031   
7  adipose tissue                         leukocyte      118505   
8  adipose tissue                        lymphocyte       50559   
9  adipose tissue                        macro


✅ Saved sorted file (Only PPI + Cell Count, Self-Interactions Removed): ./results_11032025/Jess_PPI_21032025/PPI_preprocessed_15042025/PPI_contextualization/filtered_Step2_H2/SYNGR1_PPI_filtered_Sorted_S2_H2_24042025.xlsx


📂 Processing file: RP2_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded RP2_PPI_filtered_S2_H2_24042025.xlsx | Shape: (222, 18)
✅ Found 65 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 65 rows.
           Tissue                                        Cell Type  \
1  adipose tissue                                       blood cell   
0  adipose tissue                                       blood cell   
2  adipose tissue                             innate lymphoid cell   
3  adipose tissue                              natural killer cell   
5  adipose tissue                                       neutrophil   
4  adipose tissue                                       neutrophil   
6   adrenal gl


✅ Saved sorted file (Only PPI + Cell Count, Self-Interactions Removed): ./results_11032025/Jess_PPI_21032025/PPI_preprocessed_15042025/PPI_contextualization/filtered_Step2_H2/GCM2_PPI_filtered_Sorted_S2_H2_24042025.xlsx


📂 Processing file: IRAK4_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded IRAK4_PPI_filtered_S2_H2_24042025.xlsx | Shape: (6, 18)
✅ Found 1 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 1 rows.
   Tissue   Cell Type  Cell Count          PPI  %Cells Expressing POI  \
0  embryo  glial cell        2770  IRAK4-TRIM7               6.137184   

   %Cells Expressing Interactor Comp_Category Self_Interaction  
0                     32.924188             M               No  
✅ Saved sorted file (Only PPI + Cell Count, Self-Interactions Removed): ./results_11032025/Jess_PPI_21032025/PPI_preprocessed_15042025/PPI_contextualization/filtered_Step2_H2/IRAK4_PPI_filtered_Sorted_S2_H2_24042025.xlsx


📂 Proc




📂 Processing file: EXOC8_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded EXOC8_PPI_filtered_S2_H2_24042025.xlsx | Shape: (1683, 18)
✅ Found 1281 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 1281 rows.
           Tissue                        Cell Type  Cell Count           PPI  \
1  adipose tissue  CD8-positive, alpha-beta T cell        5441   EXOC8-IKZF3   
2  adipose tissue  CD8-positive, alpha-beta T cell        5441    EXOC8-PBX4   
0  adipose tissue  CD8-positive, alpha-beta T cell        5441  EXOC8-BICRAL   
3  adipose tissue  CD8-positive, alpha-beta T cell        5441    EXOC8-PCM1   
4  adipose tissue  CD8-positive, alpha-beta T cell        5441   EXOC8-TCF12   
8  adipose tissue                           T cell       31314    EXOC8-PCM1   
9  adipose tissue                           T cell       31314   EXOC8-TCF12   
5  adipose tissue                           T cell       31314  EXOC8-BICRAL  



📂 Processing file: HBD_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded HBD_PPI_filtered_S2_H2_24042025.xlsx | Shape: (217, 18)
✅ Found 161 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 131 rows.
           Tissue                                          Cell Type  \
0   adrenal gland                                     precursor cell   
1   adrenal gland                                     precursor cell   
8     bone marrow                             CD14-positive monocyte   
9     bone marrow                             CD14-positive monocyte   
10    bone marrow              CD14-positive, CD16-positive monocyte   
11    bone marrow              CD14-positive, CD16-positive monocyte   
13    bone marrow  CD4-positive, CD25-positive, alpha-beta regula...   
14    bone marrow  CD4-positive, CD25-positive, alpha-beta regula...   
12    bone marrow  CD4-positive, CD25-positive, alpha-beta regula...   
15   


✅ Found 156 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 156 rows.
           Tissue                             Cell Type  Cell Count  \
0  adipose tissue                            macrophage       55294   
1  adipose tissue                 mononuclear phagocyte       61597   
2  adipose tissue                          myeloid cell       66382   
3  adipose tissue                     myeloid leukocyte       65250   
4  adipose tissue                neuron associated cell        1900   
5  adipose tissue  professional antigen presenting cell       57968   
6   bladder organ                      contractile cell        5403   
7   bladder organ                    myofibroblast cell        3136   
8           blood                         megakaryocyte       13528   
9     bone marrow                            macrophage        2968   

             PPI  %Cells Expressing POI  %Cells Expressing Intera


✅ Saved sorted file (Only PPI + Cell Count, Self-Interactions Removed): ./results_11032025/Jess_PPI_21032025/PPI_preprocessed_15042025/PPI_contextualization/filtered_Step2_H2/KCTD7_PPI_filtered_Sorted_S2_H2_24042025.xlsx
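
The "Self-Interactions Removed" step mentioned in the save message can be sketched as follows. This is an illustration under stated assumptions, not the notebook's actual code: the helper name `drop_self_interactions` is hypothetical, the `PPI` label is assumed to have the form `POI-INTERACTOR` with no hyphen inside the POI symbol, and the sort keys are a guess at what "sorted" means here. The example rows are taken from the BAG3 output in this log.

```python
import pandas as pd

def drop_self_interactions(df: pd.DataFrame, ppi_col: str = "PPI") -> pd.DataFrame:
    """Drop rows whose PPI label pairs a gene with itself (e.g. BAG3-BAG3)."""
    # partition() splits on the first hyphen and always returns three columns:
    # 0 = text before the hyphen, 1 = the hyphen, 2 = text after it.
    parts = df[ppi_col].str.partition("-")
    return df[parts[0] != parts[2]].copy()

rows = pd.DataFrame({
    "Tissue": ["adrenal gland"] * 3,
    "Cell Type": ["CD4-positive, alpha-beta T cell"] * 3,
    "Cell Count": [1418, 1418, 1418],
    "PPI": ["BAG3-DAZAP2", "BAG3-BAG3", "BAG3-VPS37B"],
})

# Assumed ordering: by PPI label, then by cell count (largest first).
result = (
    drop_self_interactions(rows)
    .sort_values(["PPI", "Cell Count"], ascending=[True, False])
    .reset_index(drop=True)
)
print(result)
```

Gene symbols that themselves contain a hyphen would need a different separator or separate POI and interactor columns.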


📂 Processing file: TMEM43_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded TMEM43_PPI_filtered_S2_H2_24042025.xlsx | Shape: (30, 18)
✅ Found 12 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 12 rows.
   Tissue                     Cell Type  Cell Count             PPI  \
0   brain                     astrocyte     1154803  TMEM43-GRAMD2B   
1   brain        connective tissue cell       50321  TMEM43-GRAMD2B   
2   brain                    fibroblast       36741  TMEM43-GRAMD2B   
3   brain              mature astrocyte       11177  TMEM43-GRAMD2B   
4   brain                    mural cell       37892  TMEM43-GRAMD2B   
5   brain             perivascular cell       40098  TMEM43-GRAMD2B   
6  


✅ Saved sorted file (Only PPI + Cell Count, Self-Interactions Removed): ./results_11032025/Jess_PPI_21032025/PPI_preprocessed_15042025/PPI_contextualization/filtered_Step2_H2/PMP22_PPI_filtered_Sorted_S2_H2_24042025.xlsx


📂 Processing file: CACNB4_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded CACNB4_PPI_filtered_S2_H2_24042025.xlsx | Shape: (260, 18)
✅ Found 125 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 111 rows.
            Tissue                          Cell Type  Cell Count  \
1   adipose tissue   fibro/adipogenic progenitor cell       15985   
0   adipose tissue   fibro/adipogenic progenitor cell       15985   
2   adipose tissue                           monocyte        3769   
4   adipose tissue                    progenitor cell       22033   
3   adipose tissue                    progenitor cell       22033   
5    adrenal gland  endothelial cell of vascular tree        2110   
6            b



📂 Processing file: BAG3_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded BAG3_PPI_filtered_S2_H2_24042025.xlsx | Shape: (909, 18)
✅ Found 568 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 534 rows.
           Tissue                        Cell Type  Cell Count          PPI  \
1  adipose tissue             innate lymphoid cell        1031  BAG3-VPS37B   
0  adipose tissue             innate lymphoid cell        1031  BAG3-DAZAP2   
4  adipose tissue              natural killer cell        1022  BAG3-VPS37B   
2  adipose tissue              natural killer cell        1022  BAG3-ARRDC3   
3  adipose tissue              natural killer cell        1022  BAG3-DAZAP2   
6   adrenal gland  CD4-positive, alpha-beta T cell        1418  BAG3-DAZAP2   
5   adrenal gland  CD4-positive, alpha-beta T cell        1418    BAG3-BAG3   
7   adrenal gland  CD4-positive, alpha-beta T cell        1418  BAG3-VPS37B   
8   adrenal 




📂 Processing file: PRPF31_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded PRPF31_PPI_filtered_S2_H2_24042025.xlsx | Shape: (2482, 18)
✅ Found 2346 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 2346 rows.
            Tissue                        Cell Type  Cell Count  \
2   adipose tissue  CD8-positive, alpha-beta T cell        5441   
0   adipose tissue  CD8-positive, alpha-beta T cell        5441   
1   adipose tissue  CD8-positive, alpha-beta T cell        5441   
4   adipose tissue                alpha-beta T cell       20182   
5   adipose tissue                alpha-beta T cell       20182   
3   adipose tissue                alpha-beta T cell       20182   
7   adipose tissue                   dendritic cell        2534   
6   adipose tissue                   dendritic cell        2534   
11  adipose tissue             innate lymphoid cell        1031   
9   adipose tissue             innate lymphoid




📂 Processing file: SUOX_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded SUOX_PPI_filtered_S2_H2_24042025.xlsx | Shape: (1637, 18)
✅ Found 1352 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 1352 rows.
           Tissue                    Cell Type  Cell Count          PPI  \
1  adipose tissue                    adipocyte       72587  SUOX-FNDC3B   
2  adipose tissue                    adipocyte       72587    SUOX-UBR2   
0  adipose tissue                    adipocyte       72587    SUOX-EYA2   
4  adipose tissue  adipocyte of omentum tissue       15436  SUOX-FNDC3B   
3  adipose tissue  adipocyte of omentum tissue       15436    SUOX-EYA2   
5  adipose tissue  adipocyte of omentum tissue       15436    SUOX-UBR2   
7  adipose tissue       connective tissue cell      188588  SUOX-FNDC3B   
6  adipose tissue       connective tissue cell      188588    SUOX-EYA2   
8  adipose tissue       connective tissue cel




📂 Processing file: ZMYND10_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded ZMYND10_PPI_filtered_S2_H2_24042025.xlsx | Shape: (309, 18)
✅ Found 118 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 104 rows.
          Tissue                                        Cell Type  Cell Count  \
0  adrenal gland                  CD8-positive, alpha-beta T cell        4761   
1  adrenal gland                                           T cell        9720   
2  adrenal gland                         ciliated epithelial cell        1558   
3  adrenal gland  effector memory CD8-positive, alpha-beta T cell        1069   
4  adrenal gland                                       enterocyte        1250   
5  adrenal gland                                    memory T cell        1101   
6          blood                       common lymphoid progenitor        7972   
7          blood                double negative T regulatory cell    



📂 Processing file: LITAF_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded LITAF_PPI_filtered_S2_H2_24042025.xlsx | Shape: (3321, 18)
✅ Found 3221 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 1821 rows.
            Tissue                        Cell Type  Cell Count  \
2   adipose tissue  CD4-positive, alpha-beta T cell       11653   
4   adipose tissue  CD8-positive, alpha-beta T cell        5441   
6   adipose tissue  CD8-positive, alpha-beta T cell        5441   
8   adipose tissue                           T cell       31314   
10  adipose tissue                           T cell       31314   
13  adipose tissue                alpha-beta T cell       20182   
16  adipose tissue                       blood cell        3586   
18  adipose tissue                       blood cell        3586   
20  adipose tissue                   dendritic cell        2534   
22  adipose tissue                   dendritic c



✅ Saved sorted file (Only PPI + Cell Count, Self-Interactions Removed): ./results_11032025/Jess_PPI_21032025/PPI_preprocessed_15042025/PPI_contextualization/filtered_Step2_H2/MCEE_PPI_filtered_Sorted_S2_H2_24042025.xlsx


📂 Processing file: GYPA_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded GYPA_PPI_filtered_S2_H2_24042025.xlsx | Shape: (14, 18)
✅ Found 10 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 10 rows.
           Tissue    Cell Type  Cell Count           PPI  \
0  adipose tissue   blood cell        3586  GYPA-TMEM154   
1           blood  granulocyte       38067  GYPA-TMEM154   
2           blood   neutrophil       36783  GYPA-TMEM154   
3     bone marrow  granulocyte       14877  GYPA-TMEM154   
4     bone marrow   neutrophil       13563  GYPA-TMEM154   
5           heart   blood cell        1292  GYPA-TMEM154   
6            lung   blood cell       24432  GYPA-TMEM154   
7            lung  granul




📂 Processing file: FAM161A_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded FAM161A_PPI_filtered_S2_H2_24042025.xlsx | Shape: (2000, 18)
✅ Found 1312 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 1299 rows.
            Tissue                    Cell Type  Cell Count               PPI  \
0   adipose tissue                    adipocyte       72587     FAM161A-HOOK2   
1   adipose tissue                    adipocyte       72587  FAM161A-KIAA1328   
2   adipose tissue                    adipocyte       72587     FAM161A-PIBF1   
3   adipose tissue  adipocyte of omentum tissue       15436     FAM161A-HOOK2   
4   adipose tissue  adipocyte of omentum tissue       15436     FAM161A-PIBF1   
8   adipose tissue             contractile cell       16234    FAM161A-TBC1D1   
5   adipose tissue             contractile cell       16234  FAM161A-KIAA1328   
6   adipose tissue             contractile cell       16234     FA




📂 Processing file: PIK3CD_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded PIK3CD_PPI_filtered_S2_H2_24042025.xlsx | Shape: (518, 18)
✅ Found 328 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 325 rows.
           Tissue                        Cell Type  Cell Count            PPI  \
0  adipose tissue  CD8-positive, alpha-beta T cell        5441  PIK3CD-PIK3R1   
1  adipose tissue                           T cell       31314  PIK3CD-PIK3R1   
2  adipose tissue                alpha-beta T cell       20182  PIK3CD-PIK3R1   
3  adipose tissue               hematopoietic cell      121993  PIK3CD-PIK3R1   
4  adipose tissue     hematopoietic precursor cell        2353  PIK3CD-PIK3R1   
6  adipose tissue               immature NK T cell        2551  PIK3CD-PIK3R1   
5  adipose tissue               immature NK T cell        2551  PIK3CD-PIK3CD   
7  adipose tissue             innate lymphoid cell        1031  PIK3CD-



📂 Processing file: BFSP2_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded BFSP2_PPI_filtered_S2_H2_24042025.xlsx | Shape: (457, 18)
✅ Found 238 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 234 rows.
           Tissue           Cell Type  Cell Count           PPI  \
0  adipose tissue      dendritic cell        2534   BFSP2-DISC1   
1  adipose tissue  hematopoietic cell      121993   BFSP2-DISC1   
2  adipose tissue           leukocyte      118505   BFSP2-DISC1   
4  adipose tissue           mast cell        2725   BFSP2-DISC1   
3  adipose tissue           mast cell        2725  BFSP2-BICRAL   
5  adipose tissue            monocyte        3769   BFSP2-DISC1   
6  adipose tissue    mononuclear cell      112156   BFSP2-DISC1   
8  adipose tissue      secretory cell        2726   BFSP2-DISC1   
7  adipose tissue      secretory cell        2726  BFSP2-BICRAL   
9   adrenal gland              T cell        9720  



📂 Processing file: MLH1_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded MLH1_PPI_filtered_S2_H2_24042025.xlsx | Shape: (530, 18)
✅ Found 180 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 180 rows.
          Tissue                                        Cell Type  Cell Count  \
0  adrenal gland                  CD4-positive, alpha-beta T cell        1418   
1  adrenal gland                  CD8-positive, alpha-beta T cell        4761   
2  adrenal gland                                           T cell        9720   
3  adrenal gland  effector memory CD8-positive, alpha-beta T cell        1069   
4  adrenal gland                                        leukocyte       14622   
5  adrenal gland                                       lymphocyte       11091   
6  adrenal gland                                    mature T cell        7110   
7  adrenal gland                         mature alpha-beta T cell        70




📂 Processing file: PNKP_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded PNKP_PPI_filtered_S2_H2_24042025.xlsx | Shape: (926, 18)
✅ Found 569 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 569 rows.
           Tissue             Cell Type  Cell Count          PPI  \
0  adipose tissue                T cell       31314   PNKP-IKZF1   
1  adipose tissue        dendritic cell        2534   PNKP-IKZF1   
2  adipose tissue        dendritic cell        2534    PNKP-SNX2   
3  adipose tissue        dendritic cell        2534  PNKP-TBC1D1   
4  adipose tissue      endothelial cell       45303     PNKP-MCC   
5  adipose tissue      endothelial cell       45303  PNKP-TBC1D1   
6  adipose tissue    hematopoietic cell      121993   PNKP-IKZF1   
7  adipose tissue    hematopoietic cell      121993    PNKP-SNX2   
8  adipose tissue    immature NK T cell        2551   PNKP-IKZF1   
9  adipose tissue  innate lymphoid cell    




📂 Processing file: BLK_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded BLK_PPI_filtered_S2_H2_24042025.xlsx | Shape: (1589, 18)
✅ Found 1231 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 1200 rows.
            Tissue                        Cell Type  Cell Count         PPI  \
1   adipose tissue                           B cell       16003   BLK-IKZF3   
6   adipose tissue  CD4-positive, alpha-beta T cell       11653   BLK-STAT3   
4   adipose tissue  CD4-positive, alpha-beta T cell       11653  BLK-PIK3R1   
3   adipose tissue  CD4-positive, alpha-beta T cell       11653   BLK-IKZF3   
5   adipose tissue  CD4-positive, alpha-beta T cell       11653     BLK-SLA   
8   adipose tissue  CD8-positive, alpha-beta T cell        5441  BLK-PIK3R1   
10  adipose tissue  CD8-positive, alpha-beta T cell        5441   BLK-STAT3   
7   adipose tissue  CD8-positive, alpha-beta T cell        5441   BLK-IKZF3   
9   adipose



✅ Saved sorted file (Only PPI + Cell Count, Self-Interactions Removed): ./results_11032025/Jess_PPI_21032025/PPI_preprocessed_15042025/PPI_contextualization/filtered_Step2_H2/NEUROD1_PPI_filtered_Sorted_S2_H2_24042025.xlsx


📂 Processing file: FUCA1_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded FUCA1_PPI_filtered_S2_H2_24042025.xlsx | Shape: (1, 18)
⚠️ No valid interactor-driven Robust or Moderate Compensation cases found in FUCA1_PPI_filtered_S2_H2_24042025.xlsx. Skipping...
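
The "Found N total interactors in clusters with at least one Moderate/Robust Compensation" count and the skip message above suggest a cluster-level filter: keep every interactor row from a (Tissue, Cell Type) cluster that contains at least one qualifying case, and skip the file when nothing survives. A rough sketch, assuming the category labels below (the exact strings are not shown in the log) and a hypothetical helper name:

```python
import pandas as pd

# Assumed label strings; the log only names "Robust" and "Moderate" Compensation.
KEEP = {"Robust Compensation", "Moderate Compensation"}

def keep_compensated_clusters(df: pd.DataFrame) -> pd.DataFrame:
    """Keep all rows of each (Tissue, Cell Type) cluster that has >= 1 KEEP case."""
    flagged = df["Comp_Category"].isin(KEEP)
    # True for every row whose cluster contains at least one flagged row.
    has_hit = flagged.groupby([df["Tissue"], df["Cell Type"]]).transform("any")
    return df[has_hit].copy()

table = pd.DataFrame({
    "Tissue": ["blood", "blood", "lung"],
    "Cell Type": ["neutrophil", "neutrophil", "blood cell"],
    "Comp_Category": ["Robust Compensation", "Weak", "Weak"],  # "Weak" is hypothetical
})

kept = keep_compensated_clusters(table)               # both blood/neutrophil rows
empty = keep_compensated_clusters(table.iloc[[2]])    # single row, no qualifying case

if empty.empty:
    print("No valid Robust or Moderate Compensation cases found. Skipping...")
```

A one-row input with no qualifying category, such as the FUCA1 table with shape (1, 18), would come back empty under this logic and be skipped.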


📂 Processing file: KRT6A_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded KRT6A_PPI_filtered_S2_H2_24042025.xlsx | Shape: (92, 18)
✅ Found 74 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 43 rows.
             Tissue                                          Cell Type  \
0            breast  luminal adaptive secretory precursor cell of m...   
1            breast  luminal adaptive secretory precursor cell of m...   
2            breast     



📂 Processing file: APOD_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded APOD_PPI_filtered_S2_H2_24042025.xlsx | Shape: (1394, 18)
✅ Found 993 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 917 rows.
           Tissue                        Cell Type  Cell Count        PPI  \
0  adipose tissue                           B cell       16003  APOD-CD53   
1  adipose tissue  CD4-positive, alpha-beta T cell       11653  APOD-CD53   
2  adipose tissue  CD8-positive, alpha-beta T cell        5441  APOD-CD53   
3  adipose tissue                           T cell       31314  APOD-CD53   
4  adipose tissue                alpha-beta T cell       20182  APOD-CD53   
6  adipose tissue                       blood cell        3586  APOD-AQP9   
7  adipose tissue                       blood cell        3586  APOD-CD53   
8  adipose tissue                       blood cell        3586  APOD-VAPA   
5  adipose tissue            



📂 Processing file: RAD51D_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded RAD51D_PPI_filtered_S2_H2_24042025.xlsx | Shape: (1447, 18)
✅ Found 1141 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 1141 rows.
           Tissue                                Cell Type  Cell Count  \
0  adipose tissue                          epithelial cell      109540   
2  adipose tissue                       immature NK T cell        2551   
1  adipose tissue                       immature NK T cell        2551   
3  adipose tissue                       immature NK T cell        2551   
4  adipose tissue                               macrophage       55294   
5  adipose tissue                               macrophage       55294   
6  adipose tissue                                mast cell        2725   
7  adipose tissue                    mesenchymal stem cell       29293   
8  adipose tissue  mesenchymal stem cell of adipos




📂 Processing file: FADD_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded FADD_PPI_filtered_S2_H2_24042025.xlsx | Shape: (1424, 18)
✅ Found 1306 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 1306 rows.
            Tissue                        Cell Type  Cell Count         PPI  \
0   adipose tissue  CD8-positive, alpha-beta T cell        5441   FADD-BTG1   
1   adipose tissue                           T cell       31314   FADD-BTG1   
2   adipose tissue                       blood cell        3586   FADD-BTG1   
3   adipose tissue               immature NK T cell        2551   FADD-BTG1   
4   adipose tissue             innate lymphoid cell        1031   FADD-BTG1   
5   adipose tissue                       lymphocyte       50559   FADD-BTG1   
6   adipose tissue              natural killer cell        1022   FADD-BTG1   
7   adipose tissue                       neutrophil        3456   FADD-BTG1   
8    adre


✅ Saved sorted file (Only PPI + Cell Count, Self-Interactions Removed): ./results_11032025/Jess_PPI_21032025/PPI_preprocessed_15042025/PPI_contextualization/filtered_Step2_H2/MIP_PPI_filtered_Sorted_S2_H2_24042025.xlsx


📂 Processing file: SLC30A2_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded SLC30A2_PPI_filtered_S2_H2_24042025.xlsx | Shape: (199, 18)
✅ Found 98 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 98 rows.
           Tissue                                        Cell Type  \
0  adipose tissue                                       macrophage   
1  adipose tissue                            mononuclear phagocyte   
2  adipose tissue                                     myeloid cell   
3  adipose tissue                                myeloid leukocyte   
4  adipose tissue             professional antigen presenting cell   
5           blood                           CD14-positive monocyte   
6        



✅ Saved sorted file (Only PPI + Cell Count, Self-Interactions Removed): ./results_11032025/Jess_PPI_21032025/PPI_preprocessed_15042025/PPI_contextualization/filtered_Step2_H2/UBQLN2_PPI_filtered_Sorted_S2_H2_24042025.xlsx


📂 Processing file: PRKAR1A_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded PRKAR1A_PPI_filtered_S2_H2_24042025.xlsx | Shape: (39, 18)
✅ Found 29 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 18 rows.
   Tissue                                Cell Type  Cell Count  \
3   blood  megakaryocyte-erythroid progenitor cell        1744   
4   blood  megakaryocyte-erythroid progenitor cell        1744   
6   blood                                 platelet       28037   
5   blood                                 platelet       28037   
7   blood                                 platelet       28037   
9   blood                           secretory cell       28161   
8   blood                          



📂 Processing file: EMD_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded EMD_PPI_filtered_S2_H2_24042025.xlsx | Shape: (1376, 18)
✅ Found 1009 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 964 rows.
           Tissue                        Cell Type  Cell Count           PPI  \
0  adipose tissue                           B cell       16003       EMD-REL   
1  adipose tissue  CD8-positive, alpha-beta T cell        5441       EMD-REL   
2  adipose tissue                alpha-beta T cell       20182       EMD-REL   
3  adipose tissue                       blood cell        3586       EMD-REL   
4  adipose tissue                       blood cell        3586  EMD-TRAF3IP3   
5  adipose tissue                   dendritic cell        2534       EMD-REL   
6  adipose tissue               hematopoietic cell      121993       EMD-REL   
8  adipose tissue               immature NK T cell        2551  EMD-TRAF3IP3   
7  




📂 Processing file: CDKN1A_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded CDKN1A_PPI_filtered_S2_H2_24042025.xlsx | Shape: (1019, 18)
✅ Found 785 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 635 rows.
           Tissue                        Cell Type  Cell Count           PPI  \
0  adipose tissue  CD4-positive, alpha-beta T cell       11653  CDKN1A-CCND3   
1  adipose tissue  CD4-positive, alpha-beta T cell       11653    CDKN1A-REL   
2  adipose tissue  CD8-positive, alpha-beta T cell        5441  CDKN1A-CCND3   
3  adipose tissue  CD8-positive, alpha-beta T cell        5441    CDKN1A-REL   
4  adipose tissue                   dendritic cell        2534    CDKN1A-REL   
5  adipose tissue                  epithelial cell      109540  CDKN1A-CCND3   
6  adipose tissue               hematopoietic cell      121993  CDKN1A-CCND3   
7  adipose tissue               hematopoietic cell      121993    CDKN1A-REL  


✅ Found 8 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 8 rows.
        Tissue                                Cell Type  Cell Count  \
0        blood                            megakaryocyte       13528   
1        blood  megakaryocyte-erythroid progenitor cell        1744   
2        blood                                 platelet       28037   
3        blood                           secretory cell       28161   
4  bone marrow    CD14-positive, CD16-positive monocyte        4996   
5  bone marrow                            megakaryocyte        3445   
6        liver                            megakaryocyte        6776   
7       spleen                            megakaryocyte        6196   

           PPI  %Cells Expressing POI  %Cells Expressing Interactor  \
0   GAD1-CMTM5               0.066529                     77.772028   
1   GAD1-CMTM5               0.114679                     53.555046  


✅ Found 5 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 5 rows.
           Tissue       Cell Type  Cell Count           PPI  \
0  exocrine gland  malignant cell        3974  GOSR2-CYB561   
1           liver     hepatoblast       23281     GOSR2-EBP   
2           liver       stem cell       26892     GOSR2-EBP   
3          testis  male germ cell        7259  GOSR2-TMCO5A   
4          testis       spermatid        4009  GOSR2-TMCO5A   

   %Cells Expressing POI  %Cells Expressing Interactor Comp_Category  \
0              12.229492                     36.059386             M   
1               2.800567                     40.251707             M   
2               2.506322                     35.170311             M   
3               7.604353                     58.561785             R   
4              12.871040                     82.564230             R   

  Self_Interaction  
0               No




📂 Processing file: LDHB_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded LDHB_PPI_filtered_S2_H2_24042025.xlsx | Shape: (1771, 18)
✅ Found 1669 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 260 rows.
            Tissue                        Cell Type  Cell Count        PPI  \
1   adipose tissue  CD4-positive, alpha-beta T cell       11653  LDHB-LDHA   
3   adipose tissue                alpha-beta T cell       20182  LDHB-LDHA   
4   adipose tissue             innate lymphoid cell        1031  LDHB-LDHA   
6   adipose tissue         mature alpha-beta T cell       17631  LDHB-LDHA   
7   adipose tissue               myofibroblast cell        1066  LDHB-LDHA   
8   adipose tissue              natural killer cell        1022  LDHB-LDHA   
43   bladder organ  CD8-positive, alpha-beta T cell        4432  LDHB-LDHA   
45   bladder organ                           T cell        7530  LDHB-LDHA   
44   bladder organ 



✅ Saved sorted file (Only PPI + Cell Count, Self-Interactions Removed): ./results_11032025/Jess_PPI_21032025/PPI_preprocessed_15042025/PPI_contextualization/filtered_Step2_H2/POLR1C_PPI_filtered_Sorted_S2_H2_24042025.xlsx


📂 Processing file: NEUROG3_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded NEUROG3_PPI_filtered_S2_H2_24042025.xlsx | Shape: (191, 18)
✅ Found 158 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 158 rows.
           Tissue                        Cell Type  Cell Count            PPI  \
0  adipose tissue  CD4-positive, alpha-beta T cell       11653  NEUROG3-TCF12   
1  adipose tissue                           T cell       31314  NEUROG3-TCF12   
2  adipose tissue                alpha-beta T cell       20182  NEUROG3-TCF12   
3  adipose tissue                       lymphocyte       50559  NEUROG3-TCF12   
4  adipose tissue                       macrophage       55294  NEUROG3-TCF12   
5  adipo



📂 Processing file: OAS1_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded OAS1_PPI_filtered_S2_H2_24042025.xlsx | Shape: (5, 18)
⚠️ No valid interactor-driven Robust or Moderate Compensation cases found in OAS1_PPI_filtered_S2_H2_24042025.xlsx. Skipping...


📂 Processing file: GMPPB_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded GMPPB_PPI_filtered_S2_H2_24042025.xlsx | Shape: (2, 18)
⚠️ No valid interactor-driven Robust or Moderate Compensation cases found in GMPPB_PPI_filtered_S2_H2_24042025.xlsx. Skipping...


📂 Processing file: COQ8A_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded COQ8A_PPI_filtered_S2_H2_24042025.xlsx | Shape: (593, 18)
✅ Found 292 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 292 rows.
           Tissue                                        Cell Type  \
0  adipose tissue                                 mesothelial cell   
1   adrenal gland                  CD4-positive, alpha-beta T cell   
3   ad



📂 Processing file: ACTB_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded ACTB_PPI_filtered_S2_H2_24042025.xlsx | Shape: (2108, 18)
✅ Found 2103 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 40 rows.
             Tissue                               Cell Type  Cell Count  \
43   adipose tissue                  neuron associated cell        1900   
255     bone marrow                 basophilic erythroblast        5253   
329     bone marrow                primitive red blood cell        9416   
351           brain                              blood cell        5199   
357           brain                        contractile cell       15585   
363           brain                        endothelial cell       77438   
437           brain                                pericyte       11396   
461           brain              tissue-resident macrophage      629274   
463           brain  vascular associated smooth


✅ Found 38 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 38 rows.
  Tissue                                  Cell Type  Cell Count          PPI  \
0  brain                        Bergmann glial cell       13074  NR0B1-ESRRG   
1  brain                              Purkinje cell      290303  NR0B1-ESRRG   
2  brain                                  astrocyte     1154803  NR0B1-ESRRG   
3  brain                          cerebellar neuron      361299  NR0B1-ESRRG   
4  brain                            efferent neuron      292676  NR0B1-ESRRG   
5  brain                      hippocampal astrocyte       23584  NR0B1-ESRRG   
6  brain                    hippocampal interneuron        9932  NR0B1-ESRRG   
7  brain                            macroglial cell     4076209  NR0B1-ESRRG   
8  brain  neuron associated cell (sensu Vertebrata)       30577  NR0B1-ESRRG   
9  brain                        neuronal brush ce


✅ Saved sorted file (Only PPI + Cell Count, Self-Interactions Removed): ./results_11032025/Jess_PPI_21032025/PPI_preprocessed_15042025/PPI_contextualization/filtered_Step2_H2/TRIM32_PPI_filtered_Sorted_S2_H2_24042025.xlsx


📂 Processing file: TPM3_PPI_filtered_S2_H2_24042025.xlsx
✅ Loaded TPM3_PPI_filtered_S2_H2_24042025.xlsx | Shape: (195, 18)
✅ Found 176 total interactors in clusters with at least one Moderate/Robust Compensation.
✅ Removed H3 redundant cases. Final dataset contains 71 rows.
           Tissue                                        Cell Type  \
1   adrenal gland                  CD4-positive, alpha-beta T cell   
3   adrenal gland                  CD8-positive, alpha-beta T cell   
4   adrenal gland                  CD8-positive, alpha-beta T cell   
6   adrenal gland                                           T cell   
8   adrenal gland  effector memory CD8-positive, alpha-beta T cell   
9   adrenal gland  effector memory CD8-positive, alpha-beta T cell   
10  adrenal

