<a href="https://colab.research.google.com/github/hubertstanczak/2dcos_toolkit/blob/main/notebooks/2dcos_toolkit_colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 2DCOS Analysis Toolkit for CD
Perform complete Two-Dimensional Correlation Spectroscopy (2DCOS) analysis  
on Far-UV Circular Dichroism (CD) data, including MRE conversion, preprocessing,  
correlation map computation, peak detection, and result export.

In [None]:
#@title **Setup**
#@markdown Prepare the environment for analysis in Google Colab. Run once before starting.

import os
import sys
import subprocess
from pathlib import Path

repo_url = "https://github.com/hubertstanczak/2dcos_toolkit.git"
repo_name = "2dcos_toolkit"

os.chdir("/content")

did_clone = False
if not Path(repo_name).exists():
    print("Cloning repository...")
    subprocess.check_call(["git", "clone", "--depth", "1", repo_url, repo_name])
    did_clone = True

os.chdir(repo_name)

src_path = str(Path.cwd() / "src")
if src_path not in sys.path:
    sys.path.insert(0, src_path)

from dcos_toolkit.api import (
    setup_logging,
    init_session,
    load_input_data_and_parse,
    compute_mre,
    compute_2dcos,
    visualize_session,
    package_results,
)

setup_logging(level="INFO", style="colab")

print("Setup complete.")


In [None]:
#@title **Input Data**
#@markdown Upload one or more CD data files in **.csv**, **.xlsx**, **.xls** or a **.zip** archive containing such files.
#@markdown
#@markdown **Expected table format:**
#@markdown - **Row 1:** perturbation values
#@markdown - **Column 1:** wavelength **λ (nm)**
#@markdown - **Body:** CD values (**mdeg**)
#@markdown
#@markdown **JASCO:** reads CD signal from Channel 1.

from pathlib import Path
import shutil
from google.colab import files

job_name = "my_analysis"  #@param {type:"string"}

input_dir  = Path("data/input_cd")
output_dir = Path("data/results")

SESSION = init_session(job_name=job_name, output_dir=str(output_dir))
SESSION.input_dir = input_dir

SESSION.input_dir.mkdir(parents=True, exist_ok=True)
SESSION.output_dir.mkdir(parents=True, exist_ok=True)

print("Upload CD files (.csv/.xlsx/.xls) or a .zip archive containing such files.")
uploaded = files.upload()

if not uploaded:
    print("File upload cancelled.")
else:

    for item in SESSION.input_dir.iterdir():
        try:
            if item.is_dir():
                shutil.rmtree(item)
            else:
                item.unlink()
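        except OSError as e:
            # Report files that could not be cleared instead of failing silently.
            print(f"Could not remove {item}: {e}")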


    for name, data in uploaded.items():
        (SESSION.input_dir / name).write_bytes(data)

    try:
        load_input_data_and_parse(SESSION, paths=[SESSION.input_dir])
    except (FileNotFoundError, RuntimeError, ValueError) as e:
        print(str(e))

In [None]:
#@title **CD parameters for MRE calculation**
#@markdown ## Formula for Mean Residue Ellipticity
#@markdown $$[\theta] = \frac{M \cdot \theta_{\text{obs}}}{10 \cdot c \cdot l \cdot n}$$
#@markdown **Where:**
#@markdown - **M** – molar mass of the peptide *(g/mol)*
#@markdown - **θ_obs** – observed ellipticity *(mdeg)*
#@markdown - **c** – concentration of the sample *(mg/mL)*
#@markdown - **l** – optical path length *(cm)*
#@markdown - **n** – number of peptide bonds *(≈ number of residues)*
#@markdown ---
#@markdown ### Parameter description
#@markdown All uploaded CD files are assumed to correspond to the **same sample**.
#@markdown The parameters below (**n, c, l, M**) are applied to **all uploaded datasets**.
#@markdown For different peptides with different physical parameters,
#@markdown please run the notebook separately for each peptide.
#@markdown
#@markdown - `residues_number` (**n**) – number of peptide bonds / residues
#@markdown - `concentration` (**c**) – peptide concentration in **mg/mL**
#@markdown - `path_length` (**l**) – cuvette optical path length in **cm**
#@markdown - `molar_mass` (**M**) – peptide molar mass in **g/mol**

molar_mass     = 1   #@param {type:"number"}
concentration  = 1   #@param {type:"number"}
path_length    = 1   #@param {type:"number"}
residues_number = 1  #@param {type:"number"}

#@markdown ---
#@markdown ### Plot generator
#@markdown If enabled, 1D spectra will be shown and saved for **all** datasets.
generate_mre_plot = True   #@param {type:"boolean"}
use_mre_for_plot  = True  #@param {type:"boolean"}
#@markdown - If `True`, plots show calculated **MRE** [θ].
#@markdown - If `False`, plots show **original CD** in mdeg.


try:
    compute_mre(
        SESSION,
        residues_number=residues_number,
        concentration_mg_ml=concentration,
        path_length_mm=path_length,  # NOTE: the form above asks for cm; confirm the unit this keyword expects
        molar_mass_g_mol=molar_mass,
        generate_plot=generate_mre_plot,
        use_mre_for_plot=use_mre_for_plot,
    )
except (RuntimeError, ValueError, TypeError) as e:
    print(str(e))

In [None]:
#@title **2DCOS Calculation**
#@markdown Compute **synchronous (Φ)** and **asynchronous (Ψ)** 2D correlation maps for all parsed datasets.
#@markdown ---
#@markdown **Input signal**
use_mre_for_2dcos = True  #@param {type:"boolean"}
#@markdown - If `True`, use **MRE** (requires the MRE step to be completed first).
#@markdown - If `False`, use **raw CD**.

#@markdown ---
#@markdown **Reference spectrum**
reference_type = "mean"  #@param ["mean", "first", "last", "none"]
#@markdown - **mean**: subtract the mean spectrum (recommended default)
#@markdown - **first**: subtract the first spectrum
#@markdown - **last**: subtract the last spectrum
#@markdown - **none**: no subtraction (uses the original spectra)


try:
    compute_2dcos(
        SESSION,
        use_mre_for_2dcos=use_mre_for_2dcos,
        reference_type=reference_type,
    )
except (RuntimeError, ValueError) as e:
    print(str(e))

In [None]:
#@title **2DCOS Visualization**
#@markdown Generate high-resolution **combined figures** (Φ + Ψ) with optional peak annotations.
#@markdown ---
#@markdown ### Plot settings

colormap = "jet" #@param ["jet", "coolwarm", "seismic", "bwr", "RdBu_r"]

#@markdown ### Peak mirroring
mark_mirror_peaks = True  #@param {type:"boolean"}
#@markdown - If `True`, peaks are marked on both symmetric halves of the map (mirrored across the diagonal).
#@markdown - If `False`, peaks are marked only on one half; mirrored positions are not shown.

#@markdown ---
#@markdown ### Peak annotation
#@markdown Select the number of most prominent peaks in each category (0–5). Use small values for readability.
#@markdown
#@markdown **Synchronous Φ**
n_sync_diag_peaks = "1"          #@param ["0","1","2","3","4","5"]
#@markdown - Diagonal (auto-peaks)

n_sync_cross_max_peaks = "1"     #@param ["0","1","2","3","4","5"]
#@markdown - Positive cross-peaks (Φ > 0)

n_sync_cross_min_peaks = "1"     #@param ["0","1","2","3","4","5"]
#@markdown - Negative cross-peaks (Φ < 0)

#@markdown **Asynchronous Ψ**
n_async_cross_max_peaks = "1"    #@param ["0","1","2","3","4","5"]
#@markdown - Positive cross-peaks (Ψ > 0)

n_async_cross_min_peaks = "1"    #@param ["0","1","2","3","4","5"]
#@markdown - Negative cross-peaks (Ψ < 0)


try:
    visualize_session(
        SESSION,
        colormap=colormap,
        mark_mirror_peaks=mark_mirror_peaks,
        n_sync_diag_peaks=n_sync_diag_peaks,
        n_sync_cross_max_peaks=n_sync_cross_max_peaks,
        n_sync_cross_min_peaks=n_sync_cross_min_peaks,
        n_async_cross_max_peaks=n_async_cross_max_peaks,
        n_async_cross_min_peaks=n_async_cross_min_peaks,
    )
except (RuntimeError, ValueError) as e:
    print(str(e))

In [None]:
#@title **Export Results**
#@markdown Package selected files into a ZIP file (and download it in Colab).

include_input_file     = True  #@param {type:"boolean"}
#@markdown - Include the original input CD files.

include_mre            = True  #@param {type:"boolean"}
#@markdown - Include MRE tables.

include_mre_plot       = True  #@param {type:"boolean"}
#@markdown - Include 1D MRE plots.

include_2dcos          = True  #@param {type:"boolean"}
#@markdown - Include 2D-COS matrices.

include_2dcos_plot     = True  #@param {type:"boolean"}
#@markdown - Include 2D-COS map plots.

from google.colab import files  

try:
    zip_path = package_results(
        SESSION,
        include_input_file=include_input_file,
        include_mre=include_mre,
        include_mre_plot=include_mre_plot,
        include_2dcos=include_2dcos,
        include_2dcos_plot=include_2dcos_plot,
    )
    files.download(str(zip_path))
except (RuntimeError, ValueError) as e:
    print(str(e))


# How to use this notebook
---

### Data requirements
**Supported formats** - `.csv`, `.xls`, `.xlsx` or `.zip` containing such files.

1. **JASCO Exports** - Native ASCII exports are supported (Channel 1 data is extracted automatically).

2. **Spreadsheets** - require a specific table layout:
- Row 1 (header) = perturbation values (e.g. temperature),
- Column 1 = wavelength (nm),
- Columns 2...N = CD values (mdeg).
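
As an illustration of the spreadsheet layout above, the following sketch writes a small table in that shape using only the Python standard library. The temperature, wavelength, and CD values are made up for the example, and the top-left label `wavelength_nm` and the comma delimiter are assumptions; the layout rules above do not specify them.

```python
import csv
import io

# Made-up example values, for illustration only.
temperatures = [20, 40, 60]        # Row 1: perturbation values
wavelengths = [190, 200, 210]      # Column 1: wavelength (nm)
cd_mdeg = [                        # Body: CD values (mdeg), one row per wavelength
    [12.5, 10.1, 7.8],
    [-3.2, -2.9, -2.1],
    [-15.0, -12.4, -9.6],
]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
# The label in the top-left cell is an assumption of this sketch.
writer.writerow(["wavelength_nm", *temperatures])
for wavelength, row in zip(wavelengths, cd_mdeg):
    writer.writerow([wavelength, *row])

table_text = buffer.getvalue()
print(table_text)
```

Writing `table_text` to a `.csv` file would give one column per perturbation value and one row per wavelength.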

---

### Analysis workflow
1. **Setup** - Initialize the environment, clone the repository, and load backend functions.

2. **Input data** - Upload raw data file(s) via the Colab file picker. Unsupported file formats are automatically skipped.

3. **CD parameters for MRE calculation** - Convert the raw CD signal to MRE.
- Required parameters: molar mass, concentration, path length, number of residues.

4. **2DCOS Calculation** - Compute Synchronous ($\Phi$) and Asynchronous ($\Psi$) correlation maps using a selected reference spectrum.

5. **2DCOS Visualization** - Generate high-resolution combined contour plots with customizable colormaps and optional automatic peak annotation.

6. **Export Results** - Select desired output artifacts and download a ZIP archive. Results for each sample are organized into dedicated subfolders.
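
The MRE conversion in step 3 uses the formula given in the parameters cell, $[\theta] = \frac{M \cdot \theta_{\text{obs}}}{10 \cdot c \cdot l \cdot n}$. The sketch below applies it to a single value. It is a standalone illustration, not the toolkit's `compute_mre`; the function name `mean_residue_ellipticity` and the sample values are hypothetical.

```python
def mean_residue_ellipticity(theta_obs_mdeg, molar_mass_g_mol,
                             concentration_mg_ml, path_length_cm,
                             n_peptide_bonds):
    """Return the mean residue ellipticity in deg·cm²·dmol⁻¹.

    Hypothetical helper that applies [θ] = M·θ_obs / (10·c·l·n).
    """
    denominator = 10.0 * concentration_mg_ml * path_length_cm * n_peptide_bonds
    if denominator <= 0:
        raise ValueError("concentration, path length and n must all be positive")
    return molar_mass_g_mol * theta_obs_mdeg / denominator


# Hypothetical sample: M = 2000 g/mol, c = 0.1 mg/mL, l = 0.1 cm, n = 20,
# observed ellipticity of -20 mdeg at one wavelength.
mre = mean_residue_ellipticity(-20.0, 2000.0, 0.1, 0.1, 20)
print(f"{mre:.1f}")  # approximately -20000
```

Because the same four parameters are applied to every uploaded dataset, the conversion is a constant scale factor on each spectrum; it changes the magnitude of the correlation maps but not the sign pattern.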

---

### Interpretation Guide (Noda's Rules)

**Correlation Maps Definitions:**


| Map Type | Diagonal (Auto-peaks) | Off-Diagonal (Cross-peaks) |
| :--- | :--- | :--- |
| **Synchronous ($\Phi$)** | **Always Positive.**<br>Represents how strongly the signal at that wavelength responds to the perturbation<br>*(High intensity = significant structural change)* | **Correlations (In-phase / Out-of-phase).**<br>• **Positive (+):** Signals change in the **same direction** (both increase or both decrease).<br>• **Negative (-):** Signals change in **opposite directions** (one increases, the other decreases). |
| **Asynchronous ($\Psi$)** | **Always Zero.**<br>By definition, there is no asynchronous signal on the diagonal. | **Sequential Order.**<br>Non-zero peaks indicate that the two signals change **out of phase** (at different stages of the perturbation)<br>*(Used to determine the sequence of events)*. |
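
The properties in the table can be checked on a small example. The sketch below follows the textbook generalized 2D correlation formulation, with the synchronous map as the covariance of the dynamic spectra and the asynchronous map obtained through the Hilbert-Noda transformation matrix. It is written in plain Python for transparency and is not necessarily how `compute_2dcos` is implemented; the function names and the data are hypothetical.

```python
import math


def dynamic_spectra(spectra, reference="mean"):
    """Subtract a reference spectrum from every row (one row per perturbation point)."""
    m, n = len(spectra), len(spectra[0])
    if reference == "mean":
        ref = [sum(row[i] for row in spectra) / m for i in range(n)]
    elif reference == "first":
        ref = list(spectra[0])
    elif reference == "last":
        ref = list(spectra[-1])
    elif reference == "none":
        ref = [0.0] * n
    else:
        raise ValueError(f"unknown reference: {reference!r}")
    return [[row[i] - ref[i] for i in range(n)] for row in spectra]


def correlation_maps(spectra, reference="mean"):
    """Return (sync, async_map) as n x n nested lists for m spectra of n points."""
    dyn = dynamic_spectra(spectra, reference)
    m, n = len(dyn), len(dyn[0])
    if m < 2:
        raise ValueError("at least two spectra are required")
    # Hilbert-Noda matrix: zero on the diagonal, 1 / (pi * (k - j)) elsewhere.
    noda = [[0.0 if j == k else 1.0 / (math.pi * (k - j)) for k in range(m)]
            for j in range(m)]
    # Transformed dynamic spectra: row j is sum_k noda[j][k] * dyn[k].
    transformed = [[sum(noda[j][k] * dyn[k][i] for k in range(m)) for i in range(n)]
                   for j in range(m)]
    scale = 1.0 / (m - 1)
    sync = [[scale * sum(dyn[j][a] * dyn[j][b] for j in range(m)) for b in range(n)]
            for a in range(n)]
    async_map = [[scale * sum(dyn[j][a] * transformed[j][b] for j in range(m))
                  for b in range(n)] for a in range(n)]
    return sync, async_map


# Made-up data: 4 perturbation points (rows) x 3 wavelengths (columns).
example = [
    [1.0, 2.0, 0.0],
    [2.0, 3.0, 1.0],
    [4.0, 3.0, 3.0],
    [7.0, 2.0, 8.0],
]
sync, async_map = correlation_maps(example)
```

For this input, `sync` is symmetric with a positive diagonal, while `async_map` is antisymmetric with a diagonal that is zero up to rounding, matching the two rows of the table.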


**Determining the order of events for a pair of wavelengths $(\lambda_1, \lambda_2)$ under increasing temperature**

Compare the signs of the **Cross-peaks** at the same coordinate $(\lambda_1, \lambda_2)$ in both maps:

> **Noda's Rules:**
> * **Same Signs** ($\Phi$ and $\Psi$ are both $+$ or both $-$) $\rightarrow$ Change at $\lambda_1$ occurs BEFORE $\lambda_2$.
> * **Opposite Signs** (one is $+$, the other is $-$) $\rightarrow$ Change at $\lambda_2$ occurs BEFORE $\lambda_1$. 
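
The sign comparison above can be written as a small helper. This is a minimal sketch with a hypothetical function name; it takes the $\Phi$ and $\Psi$ values read at the same coordinate $(\lambda_1, \lambda_2)$. Returning "undetermined" when either value is zero is an assumption of the sketch, since the rules above only cover non-zero cross-peaks.

```python
def order_of_events(phi, psi):
    """Apply the sign rules to the cross-peak values at (lambda1, lambda2)."""
    # Assumption of this sketch: a zero in either map gives no sequential order.
    if phi == 0 or psi == 0:
        return "undetermined"
    same_sign = (phi > 0) == (psi > 0)
    return "lambda1 before lambda2" if same_sign else "lambda2 before lambda1"


print(order_of_events(phi=0.8, psi=0.3))    # same signs
print(order_of_events(phi=-0.8, psi=0.3))   # opposite signs
```

In practice, values close to zero are usually dominated by noise, so a threshold on the peak magnitude before applying the rules is a sensible extension.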