# PRTECAN Tutorial

This tutorial demonstrates how to process Tecan plate reader data using the `clophfit.prtecan` module.

What you'll learn:
- Tecan file structure and label blocks
- Building titrations from multiple files (manually and from list file)
- Setting plate scheme, loading additions, background handling
- Inspecting and plotting results
- Brief overview of fitting methods and quality control

In [None]:
# Setup
%load_ext autoreload
%autoreload 2

from pathlib import Path

import arviz as az
import matplotlib.pyplot as plt
import numpy as np

%matplotlib inline

from clophfit import prtecan

# Point to the tests data directory shipped with the repo
data_root = Path("../../tests/Tecan")
l1_dir = data_root / "L1"
l2_dir = data_root / "140220"  # second dataset used by the original tutorial
l4_dir = data_root / "L4"

## 1) Understanding Tecan file structure
Each Tecan file contains global metadata and one or more label blocks (measurement blocks).
Blocks with identical key metadata are equivalent; blocks differing only by Integration Time, Flashes, or Gain are almost equivalent after normalization.
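
To make "almost equivalent after normalization" concrete, here is a minimal sketch of what such a normalization could look like. It assumes the reading scales linearly with Gain, Flashes, and Integration Time; the function name `normalize`, the `scale` factor of 1000, and the sample values are illustrative assumptions, not taken from `clophfit`.

```python
# Assumption: normalized = raw * scale / (gain * flashes * integration_time).
# The scale of 1000 is illustrative only; a real detector need not be linear in gain.
def normalize(raw, gain, flashes, integration_time_us, scale=1000.0):
    """Return raw well readings divided by the acquisition settings."""
    denom = gain * flashes * integration_time_us
    return {well: value * scale / denom for well, value in raw.items()}


# Two hypothetical blocks measuring the same sample at different Gain
block_a = normalize({"A01": 30000.0}, gain=100, flashes=10, integration_time_us=20)
block_b = normalize({"A01": 15000.0}, gain=50, flashes=10, integration_time_us=20)
print(block_a, block_b)
```

Under this assumption the two blocks give the same normalized value even though their raw readings differ, which is why blocks that differ only in these settings can be grouped.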

In [None]:
# Load a single Tecan file and inspect label blocks
tf = prtecan.Tecanfile(l1_dir / "290513_7.2.xls")
lb1, lb2 = tf.labelblocks[1], tf.labelblocks[2]
print("Available label blocks:", list(tf.labelblocks.keys()))
lb1.metadata

In [None]:
print("\nSample Data (A01-B06):")
print({k: v for i, (k, v) in enumerate(lb2.data.items()) if i < 18})

Load additional files to compare block equivalence and demonstrate normalization across Gain differences.

In [None]:
tf1 = prtecan.Tecanfile(l1_dir / "290513_5.5.xls")  # two equivalent blocks
tf2 = prtecan.Tecanfile(
    l1_dir / "290513_8.8.xls"
)  # one equivalent, one almost equivalent

print("tf.lb1 = tf2.lb1 (strict):", lb1 == tf2.labelblocks[1])
print("tf.lb2 = tf2.lb2 (strict):", lb2 == tf2.labelblocks[2])
print("tf.lb2 ~ tf2.lb2 (almost):", lb2.__almost_eq__(tf2.labelblocks[2]))

## 2) Grouping files: manual and convenience constructor
You can group equivalent blocks across files either via `TecanfilesGroup` or by constructing a `Titration` directly.

In [None]:
# Manual grouping
tfg = prtecan.TecanfilesGroup([tf2, tf, tf1])
lbg1 = tfg.labelblocksgroups[1]
print("Well A01 raw:", lbg1.data["A01"])
print("Well A01 normalized:", lbg1.data_nrm["A01"])

# Same using Titration with explicit x (e.g., pH values)
tit_manual = prtecan.Titration([tf2, tf, tf1], x=np.array([8.8, 7.2, 5.5]), is_ph=True)
print(tit_manual)
print("A01 normalized via Titration:", tit_manual.labelblocksgroups[1].data_nrm["A01"])
tit_manual.labelblocksgroups == tfg.labelblocksgroups

## 3) Build a titration from a list file
Using a list file is convenient and less error-prone. The example list/plate files are in `tests/Tecan/L1`.

In [None]:
tit = prtecan.Titration.fromlistfile(l1_dir / "list.pH.csv", is_ph=True)
print("x values (e.g., pH):", tit.x)
lbg1 = tit.labelblocksgroups[1]
lbg2 = tit.labelblocksgroups[2]
print(
    "Temperature in labelblocksgroup 2:",
    [lb.metadata.get("Temperature").value for lb in lbg2.labelblocks],
    lbg2.labelblocks[5].metadata.get("Temperature").unit[0],
)
(lbg1.metadata, lbg2.metadata)

Within each label-block group, normalized data (by Gain, Flashes, Integration Time) are readily available.
When the label-block metadata are not fully identical across files, the non-normalized data may be unavailable (an empty dict `{}`).

In [None]:
# Inspect raw vs normalized for a sample well
well = "H03"
(lbg1.data[well], lbg2.data, lbg1.data_nrm[well], lbg2.data_nrm[well])

## 4) Load plate scheme and additions
The plate scheme defines buffer and control wells; additions define dilution steps.
After loading these, the processed `tit.data[...]` arrays reflect background subtraction and optional dilution correction, depending on `tit.params`.

In [None]:
# Load plate scheme and additions (kept to L1 files for consistency)
tit.load_scheme(l1_dir / "scheme.txt")
print(
    f"Titration with {len(tit.tecanfiles)} files and {len(tit.labelblocksgroups)} label groups"
)
print("Buffer wells:", tit.scheme.buffer)
print("Control wells:", tit.scheme.ctrl)
print("Named groups:", tit.scheme.names)

tit.load_additions(l1_dir / "additions.pH")
print("Additions:", tit.additions)
tit.params.bg_adj = True
tit.params.bg_mth = "meansd"
print("Titration Params:", tit.params)
# Example: compare values in data vs normalized groups (after scheme/additions)
(lbg1.data["H12"], tit.data[1]["H12"], lbg1.data_nrm["H12"], tit.bg[1])

Background handling summary:
- labelblocksgroups[:].data: unchanged raw block data
- labelblocksgroups[:].data_buffersubtracted: background-subtracted
- tit.data: background-subtracted and dilution-corrected (if enabled)
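
As a rough illustration of the background-subtraction step, the sketch below estimates the background at each titration point as the mean over the buffer wells and subtracts it from every well. The well names and values are hypothetical, and only a plain mean is shown; the library's own background methods may differ.

```python
from statistics import mean

# Hypothetical signals: one list per well, one entry per titration point
signals = {
    "A01": [105.0, 98.0, 90.0],  # sample well
    "D01": [10.0, 11.0, 9.0],  # buffer well
    "E01": [12.0, 9.0, 11.0],  # buffer well
}
buffer_wells = ["D01", "E01"]

# Background estimate per titration point: mean over the buffer wells
bg = [mean(point) for point in zip(*(signals[w] for w in buffer_wells))]

# Subtract the background from every well, point by point
subtracted = {
    well: [y - b for y, b in zip(values, bg)] for well, values in signals.items()
}
print(bg)
print(subtracted["A01"])
```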

The order in which dilution correction and the plate scheme are applied can change intermediate results, even when the final results are the same.

Dilution correction compensates for the volume added to each well during the titration: each measured value is multiplied by the dilution factor at that titration point.

The plate scheme describes the layout of the samples on the plate. It identifies the buffer wells used for background estimation and the control wells, grouped by name.
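
The dilution factor can be sketched as follows, assuming the first entry of the additions is the initial sample volume and each later entry is a volume of titrant added at that step. The numbers are hypothetical, and this is an illustration of the idea rather than the library's implementation.

```python
from itertools import accumulate

# Hypothetical additions in µL: the first entry is the initial sample volume,
# the following entries are the titrant volumes added at each step.
additions = [100.0, 2.0, 2.0, 2.0]

# Total volume in the well after each step
volumes = list(accumulate(additions))

# Correction factor: current volume relative to the initial volume
factors = [v / volumes[0] for v in volumes]

# Hypothetical background-subtracted signal at each titration point
signal = [500.0, 480.0, 455.0, 430.0]
corrected = [y * f for y, f in zip(signal, factors)]
print(factors)
print(corrected)
```

Because the factor grows with every addition, later points are scaled up more than earlier ones; the first point is left unchanged.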

In [None]:
# Demonstrate changing background wells and seeing bg estimate
import copy

tit2 = copy.deepcopy(tit)
tit2.params.bg = True
tit2.buffer.wells = ["D01", "E01"]
tit.bg, tit2.bg

## 5) Quick look at fitting and results
The `tit.results` container provides per-label fits; `tit.result_global` combines multiple labels.
Below we only preview access/plotting. For advanced Bayesian/ODR methods, see the dedicated section.
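
For orientation, the parameters `K`, `S0`, and `S1` that appear in the result tables can be read through a single-site model. The sketch below assumes a common functional form for a pH titration, with `S1` as the low-pH plateau and `S0` as the high-pH plateau; it is an assumption for illustration (including the name `ph_model`), not a copy of the model code in `clophfit`.

```python
def ph_model(ph, K, S0, S1):
    """Signal of a single-site protonation equilibrium at a given pH.

    Assumed form: S1 is the plateau at low pH, S0 the plateau at high pH.
    """
    ratio = 10.0 ** (K - ph)  # protonated / deprotonated population ratio
    return (S0 + S1 * ratio) / (1.0 + ratio)


# Hypothetical parameters
K, S0, S1 = 7.0, 100.0, 900.0
for ph in (5.0, 7.0, 9.0):
    print(ph, round(ph_model(ph, K, S0, S1), 1))
```

At `ph == K` the signal sits halfway between the two plateaus, which is what makes `K` interpretable as a pKa in this form.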

In [None]:
tit.bg_err

In [None]:
# Access result objects and figures
well = "D10"
single1 = tit.results[1][well]
single2 = tit.results[2][well]
glob = tit.result_global[well]
odr = tit.result_odr[well]

# Display figures inline
print(f"Reduced X2: {single2.result.redchi:.3f}")
single2.figure

In [None]:
print(f"Reduced X2: {glob.result.redchi:.3f}")
glob.figure

In [None]:
print(f"Sum of squares: {odr.mini.sum_square:.3f}")
odr.figure

In [None]:
tit.results[1].dataframe.head()

In [None]:
tit.result_mcmc

In [None]:
tit.result_multi_mcmc

In [None]:
rp = tit.result_mcmc[well]
rp.figure

In [None]:
az.plot_trace(
    rp.mini, var_names=["x_true", "K", "x_diff"], divergences=False, combined=True
)

In [None]:
# 5.1 Bayesian fitting with PyMC
tit.params.mcmc = "single"
result_mcmc = tit.result_mcmc[well]

print("MCMC Results:")
print(f"K: {result_mcmc.result.params['K'].value:.2f}")
print(
    f"95% HDI: [{result_mcmc.result.params['K'].min:.2f}, {result_mcmc.result.params['K'].max:.2f}]"
)

# Plot trace
az.plot_trace(result_mcmc.mini, var_names=["K", "x_true"]);

## 6) Quality control and utilities
A few helper plots are useful to quickly assess experiment consistency (buffer, temperature).

In [None]:
# Buffer plot
buf_plot = tit.buffer.plot(nrm=False)
buf_plot.figure

In [None]:
# Temperature plot
temp_plot = tit.plot_temperature()
temp_plot

In [None]:
import pandas as pd
import seaborn as sns

df1 = pd.read_csv(l2_dir / "fit1-1.csv", index_col=0)
# merged_df = tit.result_dfs[1][["K", "sK"]].merge(df1, left_index=True, right_index=True)
merged_df = tit.result_global.dataframe[["K", "sK"]].merge(
    df1, left_index=True, right_index=True
)

sns.jointplot(merged_df, x="K_y", y="K_x", ratio=3, space=0.4)

If a fit fails for a well, the well key is still present in the results.

### Posterior

In [None]:
from clophfit.fitting import plotting

np.random.seed(0)  # noqa: NPY002
remcee = glob.mini.emcee(
    burn=100,
    steps=2000,
    workers=8,
    thin=10,
    nwalkers=30,
    progress=False,
    is_weighted=True,
)

f = plotting.plot_emcee(remcee.flatchain)
print(remcee.flatchain.quantile([0.03, 0.97])["K"].to_list())

In [None]:
samples = remcee.flatchain[["K"]]
# Convert the dictionary of flatchains to an ArviZ InferenceData object
samples_dict = {key: np.array(val) for key, val in samples.items()}
idata = az.from_dict(posterior=samples_dict)
k_samples = idata.posterior["K"].to_numpy()
percentile_value = np.percentile(k_samples, 3)
print(f"Value at which the probability of being higher is 97%: {percentile_value}")

az.plot_forest(k_samples)
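
The relation between a percentile and the probability of exceeding it can be checked on synthetic samples. The sketch below uses only the standard library; the Gaussian samples are a stand-in for posterior draws of `K`, and the `quantile` helper is a hypothetical nearest-rank implementation rather than the interpolation NumPy or pandas use.

```python
import random

random.seed(0)
# Synthetic stand-in for posterior samples of K
samples = sorted(random.gauss(7.0, 0.1) for _ in range(10_000))


def quantile(sorted_values, q):
    """Nearest-rank quantile of an already sorted list."""
    index = min(int(q * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[index]


low, high = quantile(samples, 0.03), quantile(samples, 0.97)
frac_above_low = sum(s > low for s in samples) / len(samples)
print(f"94% equal-tailed interval: [{low:.3f}, {high:.3f}]")
print(f"fraction of samples above the 3rd percentile: {frac_above_low:.2f}")
```

The 3rd and 97th percentiles bound a 94% equal-tailed interval, which is not necessarily the same as a highest-density interval when the posterior is skewed.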

### Combining

In [None]:
tit.result_global.compute_all()

In [None]:
with sns.axes_style("darkgrid"):
    g = sns.pairplot(
        tit.result_global.dataframe[["S1_y2", "S0_y2", "K", "S1_y1", "S0_y1"]],
        hue="S1_y1",
        palette="Reds",
        corner=True,
        diag_kind="kde",
    )

In [None]:
df_ctr = tit.results[1].dataframe
for name, wells in tit.scheme.names.items():
    for well in wells:
        df_ctr.loc[well, "ctrl"] = name

df_ctr.loc[df_ctr["ctrl"].isna(), "ctrl"] = "U"

sns.set_style("whitegrid")
g = sns.PairGrid(
    df_ctr,
    x_vars=["K", "S1_1", "S0_1"],
    y_vars=["K", "S1_1", "S0_1"],
    hue="ctrl",
    palette="Set1",
    diag_sharey=False,
)
g.map_lower(plt.scatter)
g.map_upper(sns.kdeplot, fill=True)
g.map_diag(sns.kdeplot)
g.add_legend()

In [None]:
tit.result_global["A04"].figure

In [None]:
keys_unk = tit.fit_keys - set(tit.scheme.ctrl)
res_unk = tit.result_global.dataframe.loc[list(keys_unk)].sort_index()
res_unk["well"] = res_unk.index

f = plt.figure(figsize=(24, 14))
# Make the PairGrid
g = sns.PairGrid(
    res_unk,
    x_vars=["K", "S1_y2", "S0_y2"],
    y_vars="well",
    height=12,
    aspect=0.4,
)
# Draw a dot plot using the stripplot function
g.map(sns.stripplot, size=14, orient="h", palette="Set2", edgecolor="auto")

# Use semantically meaningful titles for the columns
titles = ["$pK_a$", "B$_{neutral}$", "B$_{anionic}$"]

for ax, title in zip(g.axes.flat, titles, strict=False):
    # Set a different title for each axes
    ax.set(title=title)

    # Make the grid horizontal instead of vertical
    ax.xaxis.grid(False)
    ax.yaxis.grid(True)

sns.despine(left=True, bottom=True)

## 7) Background method comparison
Different background methods may slightly shift baselines; inspect the impact on a single well.

In [None]:
methods = ["mean", "meansd", "fit"]
well = "D10"
fig, axes = plt.subplots(1, len(methods), figsize=(16, 4), sharey=True)
for ax, method in zip(axes, methods, strict=False):
    tit.params.bg_mth = method
    ax.plot(tit.x, tit.data[1][well], "o-", label=method)
    ax.axhline(0, color="gray", ls="--", lw=1)
    ax.set_title(f"method: {method}")
    ax.set_xlabel("pH")
axes[0].set_ylabel("Signal")
plt.tight_layout()

You can choose how the data are pre-processed through the `tit.params` flags:
- `bg`: subtract the background
- `dil`: correct for dilution (e.g. when titrant without protein is added during a titration)
- `nrm`: normalize for gain, number of flashes and integration time

In [None]:
# Compare normalized-only and fully processed data for one well
well = "D10"
data = {
    "pH": tit.x,
    "Signal (normalized)": tit.labelblocksgroups[1].data_nrm[well],
    "Signal (processed)": tit.data[1][well],
}

plt.figure(figsize=(10, 5))
plt.plot(data["pH"], data["Signal (normalized)"], "o-", label="Normalized")
plt.plot(data["pH"], data["Signal (processed)"], "s-", label="Processed")
plt.xlabel("pH")
plt.ylabel("Fluorescence")
plt.title(f"Data Processing Pipeline for Well {well}")
plt.legend()
plt.grid(True)

## Cl titration analysis

In [None]:
cl_an = prtecan.Titration.fromlistfile(l2_dir / "list.cl.csv", is_ph=False)
cl_an.load_scheme(l2_dir / "scheme.txt")
cl_an.scheme

In [None]:
from clophfit import prtecan

cl_an.load_additions(l2_dir / "additions.cl")
print(cl_an.x)
cl_an.x = prtecan.calculate_conc(cl_an.additions, 1000)
cl_an.x
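
The conversion from additions to concentrations can be sketched in plain Python. This assumes the first addition is the initial sample volume (containing no titrant) and every later addition is stock solution at a known concentration. The helper `conc_from_additions` and its inputs are hypothetical and are not the implementation of `prtecan.calculate_conc`.

```python
from itertools import accumulate


def conc_from_additions(additions, conc_stock):
    """Titrant concentration after each addition.

    Assumes additions[0] is the initial sample volume (no titrant) and the
    remaining entries are volumes of stock solution at conc_stock.
    """
    totals = list(accumulate(additions))  # total volume after each step
    titrant = [v - totals[0] for v in totals]  # cumulative stock volume
    return [conc_stock * t / v for t, v in zip(titrant, totals)]


# Hypothetical additions (µL) and a stock concentration of 1000 (e.g. mM)
conc = conc_from_additions([100.0, 2.0, 2.0, 6.0], 1000.0)
print(conc)
```

The concentration is not simply proportional to the added volume, because every addition also increases the total volume in the well.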

In [None]:
fres = cl_an.result_global["D10"]
print(fres.is_valid(), fres.result.bic, fres.result.redchi)
fres.figure

## 8) Batch export (optional)
You can export processed data and fit results using `TecanConfig`.
Note: adjust paths and toggles (png, fit, comb) as needed.

In [None]:
tit.params

In [None]:
tit.params.bg_mth = "meansd"
tit.params.mcmc = None
tit.result_global.compute_all()

In [None]:
tit.results[1].compute_all()
tit.results[2].compute_all()

tit.result_odr.compute_all()

In [None]:
from tempfile import mkdtemp

out_dir = Path(mkdtemp())
conf = prtecan.TecanConfig(
    out_fp=out_dir, comb=False, lim=None, title="FullAnalysis", fit=True, png=True
)
tit.export_data_fit(conf)
print("Exported to:", out_dir)
# list(out_dir.glob('*'))[:10]
# print("Contents:", *[f.name for f in out_dir.glob("*")], sep="\n- ")

---
Tips for development vs tutorial hygiene:
- Keep a scratch notebook (e.g., `docs/tutorials/prtecan_dev.ipynb`) for experiments.
- Avoid `os.chdir`; use Path objects relative to repository root as in this notebook.
- When a feature stabilizes, port minimal, clear examples into the main tutorial and keep heavy testing in `tests/`.

In [None]:
# Ratio of the mean signal in label 1 to label 2 for every fitted well
for k in tit.fit_keys:
    print(k, np.nanmean(tit.data[1][k]) / np.nanmean(tit.data[2][k]))