## Update QC metrics for blur to better catch poor-quality images

Using the data from the `cellpainting_predicts_cardiac_fibrosis` repository, the QC thresholds will be updated to better detect poor quality images.

In the previous experiment, the existing thresholds worked well under those imaging conditions.
For this experiment, we rerun the QC to find blur thresholds for every channel, using only one side of the distribution.
More negative `PowerLogLogSlope` values indicate blur, so a single lower threshold per channel is enough to catch blurry images.
Values that are more positive (closer to 0) appear to correspond to empty images, but we do not need to catch that condition here.

The QC for saturation has also already been updated to use a universal threshold of 0.10, meaning at most 10% of pixels can be at the maximum value.
This stricter threshold better accounts for FOVs where cells are growing on top of each other.
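As a rough illustration of how a universal saturation cutoff could be applied, the sketch below flags any image where at least one channel exceeds the 0.10 limit. The column names (`frac_max_*`), the helper `flag_saturated`, and the toy values are hypothetical and are not taken from the QC output. The sketch assumes the measurement is stored as a fraction between 0 and 1; if the QC tool reports a percentage instead, the cutoff would need to be rescaled.

```python
import pandas as pd

# Universal cutoff from the text: at most 10% of pixels may sit at the maximum value.
SATURATION_THRESHOLD = 0.10


def flag_saturated(df, columns, threshold=SATURATION_THRESHOLD):
    """Return a boolean Series that is True where any listed column exceeds the threshold."""
    return (df[columns] > threshold).any(axis=1)


# Hypothetical toy data: fraction of pixels at the maximum value per channel.
toy_saturation_df = pd.DataFrame(
    {
        "frac_max_OrigActin": [0.01, 0.25, 0.05],
        "frac_max_OrigDNA": [0.02, 0.03, 0.10],
    }
)

saturated = flag_saturated(toy_saturation_df, list(toy_saturation_df.columns))
print(saturated.tolist())
```

Using a strict `>` comparison means an image sitting exactly at 10% is still kept, which matches the wording that up to 10% of pixels can be at the maximum value.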

In [5]:
import pandas as pd

In [6]:
# URL of the CSV file on GitHub
github_url = "https://raw.githubusercontent.com/WayScience/cellpainting_predicts_cardiac_fibrosis/main/1.preprocessing_data/qc_results/localhost231120090001/Image.csv"

# Load the CSV file into a pandas DataFrame
qc_df = pd.read_csv(github_url)

# Print the shape, then display the first few rows of the DataFrame
print(qc_df.shape)
qc_df.head()

(960, 147)


Unnamed: 0,Channel_OrigActin,Channel_OrigDNA,Channel_OrigER,Channel_OrigMito,Channel_OrigPM,ExecutionTime_01Images,ExecutionTime_02Metadata,ExecutionTime_03NamesAndTypes,ExecutionTime_04Groups,ExecutionTime_05MeasureImageQuality,...,URL_OrigActin,URL_OrigDNA,URL_OrigER,URL_OrigMito,URL_OrigPM,Width_OrigActin,Width_OrigDNA,Width_OrigER,Width_OrigMito,Width_OrigPM
0,-1,-1,-1,-1,-1,0.0,0.0,1.56,0.0,2.11,...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,1104,1104,1104,1104,1104
1,-1,-1,-1,-1,-1,0.0,0.0,1.08,0.0,2.05,...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,1104,1104,1104,1104,1104
2,-1,-1,-1,-1,-1,0.0,0.0,0.79,0.0,2.51,...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,1104,1104,1104,1104,1104
3,-1,-1,-1,-1,-1,0.0,0.0,0.52,0.0,2.01,...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,1104,1104,1104,1104,1104
4,-1,-1,-1,-1,-1,0.0,0.0,0.6,0.0,1.97,...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,file:/home/jenna/CFReT_data/0.download_data/Im...,1104,1104,1104,1104,1104
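Before choosing a one-sided cutoff, it can help to look at the quartiles of the blur metric in each channel. The sketch below is one possible way to do that; the helper `summarize_blur` and the toy values are hypothetical, and it assumes the blur columns share the `ImageQuality_PowerLogLogSlope_` prefix used in the threshold calculation.

```python
import pandas as pd


def summarize_blur(df, prefix="ImageQuality_PowerLogLogSlope_"):
    """Return the 25th, 50th, and 75th percentiles for every column starting with prefix."""
    blur_cols = [c for c in df.columns if c.startswith(prefix)]
    # Transpose so each row is a channel column and each column is a quantile.
    return df[blur_cols].quantile([0.25, 0.5, 0.75]).T


# Hypothetical toy data standing in for the real QC table.
toy_qc_df = pd.DataFrame(
    {
        "ImageQuality_PowerLogLogSlope_OrigDNA": [-2.0, -1.5, -1.0, -0.5, 0.0],
        "Width_OrigDNA": [1104, 1104, 1104, 1104, 1104],
    }
)

blur_summary = summarize_blur(toy_qc_df)
print(blur_summary)
```

With the real data, the same call on `qc_df` would give one row per channel, which makes it easier to see how far the lower tail extends.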


In [4]:
# Calculate thresholds for each channel
channels = ["OrigActin", "OrigDNA", "OrigER", "OrigMito", "OrigPM"]
blur_thresholds = {}

for channel in channels:
    col = f"ImageQuality_PowerLogLogSlope_{channel}"
    # Check if the column exists in the DataFrame
    if col in qc_df.columns:
        # Calculate the 25th and 75th percentiles and the interquartile range (IQR)
        Q1 = qc_df[col].quantile(0.25)
        Q3 = qc_df[col].quantile(0.75)
        IQR = Q3 - Q1
        # One-sided IQR rule: values below Q1 - 1.5 * IQR (very negative) are treated as blurry
        blur_thresholds[channel] = Q1 - 1.5 * IQR

# Display the calculated thresholds
print("Calculated blur thresholds:")
for channel, threshold in blur_thresholds.items():
    print(f"{channel}: {threshold}")

Calculated blur thresholds:
OrigActin: -1.8891791699802942
OrigDNA: -2.2456075474546515
OrigER: -2.2825812279725524
OrigMito: -2.012531942517173
OrigPM: -2.4309820530015642
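Once per-channel thresholds exist, they could be applied along the following lines to flag images that look blurry in at least one channel. This is a minimal sketch, not the pipeline's actual filtering step: the helper `flag_blurry_images` and the toy values are hypothetical, and the thresholds are rounded versions of two of the values printed above.

```python
import pandas as pd


def flag_blurry_images(df, thresholds, prefix="ImageQuality_PowerLogLogSlope_"):
    """Return a boolean Series that is True where any channel falls below its blur threshold."""
    flagged = pd.Series(False, index=df.index)
    for channel, threshold in thresholds.items():
        col = f"{prefix}{channel}"
        # Skip channels that are not present in the table.
        if col in df.columns:
            flagged |= df[col] < threshold
    return flagged


# Hypothetical toy data standing in for the real QC table.
toy_blur_df = pd.DataFrame(
    {
        "ImageQuality_PowerLogLogSlope_OrigActin": [-1.5, -2.3, -1.7],
        "ImageQuality_PowerLogLogSlope_OrigDNA": [-2.0, -2.1, -2.6],
    }
)
toy_thresholds = {"OrigActin": -1.89, "OrigDNA": -2.25}

blurry = flag_blurry_images(toy_blur_df, toy_thresholds)
print(blurry.tolist())
```

Only the lower side is checked, consistent with the decision to ignore the values near 0 that appear to correspond to empty images.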
