# Dotplot Visualization Tutorial

This notebook explores all visualisation options provided by the `DotPlotter` class.

A *dot plot* (or *dotplot*) is a classic bioinformatics visualisation that places one sequence on
each axis and marks every pair of positions where the two share a subsequence. Each dot represents
a shared k-mer; diagonal runs of dots indicate conserved regions. Inversions, which match on the
reverse-complement strand, appear as anti-diagonal lines.
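
To make the idea of "one dot per shared k-mer" concrete, here is a minimal pure-Python sketch. It is only an illustration of the concept and is not how `rusty_dot` computes matches internally; the function names `kmer_positions` and `shared_kmer_dots` are hypothetical helpers introduced for this example.

```python
from collections import defaultdict


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def kmer_positions(seq, k):
    """Map each k-mer in seq to the list of its start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return positions


def shared_kmer_dots(query, target, k):
    """Return (forward, reverse) lists of (query_pos, target_pos) dots.

    A forward dot means the k-mer itself occurs in the target; a reverse
    dot means its reverse complement occurs in the target.
    """
    target_index = kmer_positions(target, k)
    forward, reverse = [], []
    for i in range(len(query) - k + 1):
        kmer = query[i:i + k]
        forward.extend((i, j) for j in target_index.get(kmer, ()))
        reverse.extend((i, j) for j in target_index.get(revcomp(kmer), ()))
    return forward, reverse


query = "GATTACAGGCTTACCGA"
k = 5

# Comparing a sequence with itself: forward dots fall on the main diagonal.
forward, _ = shared_kmer_dots(query, query, k)
print("self comparison, first forward dots:", forward[:3])

# Comparing with the reverse complement: reverse dots fall on an anti-diagonal.
_, reverse = shared_kmer_dots(query, revcomp(query), k)
print("vs reverse complement, first reverse dots:", reverse[:3])
```

In the self comparison the forward dots include every `(i, i)` pair, which is the main diagonal. Against the reverse complement, the reverse dots include pairs where `i + j` is constant, which is what draws an anti-diagonal line.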

In [None]:
import tempfile
import os

from rusty_dot import SequenceIndex
from rusty_dot.dotplot import DotPlotter

## 1. Build a test index

We create three artificial sequences with different overlap patterns:

In [None]:
# Helper to create a reverse complement
def revcomp(seq):
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]

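unit   = "ACGTACGTACGT"  # 12 bp repeat unit (ACGT is its own reverse complement)
seq_a  = unit * 10                         # 120 bp, the reference
seq_b  = "T" + unit * 9 + "T"             # 110 bp, repeat block shifted by 1
seq_c  = revcomp(unit * 5) + unit * 5     # 120 bp, first half reverse-complemented
                                           # (equals seq_a here, since the unit is palindromic)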

idx = SequenceIndex(k=8)
idx.add_sequence("reference", seq_a)
idx.add_sequence("shifted",   seq_b)
idx.add_sequence("partial_inv", seq_c)

print(f"Index: {idx}")
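
Before plotting, it can help to check the test sequences themselves. The sketch below repeats the definitions from the cell above without the index, prints each length, and tests whether the repeat unit survives reverse complementation unchanged. The `alt_unit` string is a hypothetical alternative introduced only for comparison; it is not part of the tutorial's data.

```python
def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


unit = "ACGTACGTACGT"
seq_a = unit * 10
seq_b = "T" + unit * 9 + "T"
seq_c = revcomp(unit * 5) + unit * 5

for name, seq in [("reference", seq_a), ("shifted", seq_b), ("partial_inv", seq_c)]:
    print(f"{name:12s} {len(seq):4d} bp")

# ACGT reads the same on both strands, so reverse-complementing the
# unit gives the unit back and seq_c ends up identical to seq_a.
print("unit is its own reverse complement:", revcomp(unit) == unit)
print("partial_inv equals reference:", seq_c == seq_a)

# Hypothetical non-palindromic unit, for comparison only.
alt_unit = "ACGGTCATTGCA"
print("alt_unit is its own reverse complement:", revcomp(alt_unit) == alt_unit)
```

Because `ACGT` is a reverse-complement palindrome, forward and reverse-strand matches coincide for these sequences. A unit such as the hypothetical `alt_unit` would be needed to produce an inversion that differs from the reference.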

## 2. All-vs-all dotplot (default settings)

Calling `DotPlotter.plot()` without `query_names` or `target_names` produces an all-vs-all grid
of every sequence in the index.

In [None]:
plotter = DotPlotter(idx)

with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as fh:
    all_vs_all_path = fh.name

plotter.plot(
    output_path=all_vs_all_path,
    title="All vs All",
)
print(f"Saved: {all_vs_all_path}  ({os.path.getsize(all_vs_all_path)} bytes)")
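
The cells in this notebook reserve output paths with `NamedTemporaryFile(delete=False)`, so the files stay on disk afterwards. If automatic cleanup is preferred, one alternative is `tempfile.TemporaryDirectory`. In the sketch below a placeholder write stands in for the `plotter.plot(...)` call so that it runs on its own; the file name `all_vs_all.png` is just an example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    out_path = os.path.join(tmpdir, "all_vs_all.png")

    # In the notebook this would be:
    #     plotter.plot(output_path=out_path, title="All vs All")
    # A placeholder file is written here instead.
    with open(out_path, "wb") as fh:
        fh.write(b"placeholder")

    existed_inside = os.path.exists(out_path)
    print("exists inside the block:", existed_inside)

# The directory and everything in it are removed when the block exits.
print("exists after the block:", os.path.exists(out_path))
```

This suits throwaway previews; keep the `delete=False` pattern when the image needs to outlive the cell.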

## 3. Subset: specific query and target sets

Pass `query_names` and `target_names` to restrict the grid to a subset of sequences.

In [None]:
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as fh:
    subset_path = fh.name

plotter.plot(
    query_names=["reference", "shifted"],
    target_names=["partial_inv"],
    output_path=subset_path,
    title="Reference & Shifted vs Partial Inversion",
)
print(f"Subset plot saved: {subset_path}")

## 4. Single-pair dotplot

`plot_single` renders one comparison panel with its own figure size and title.

In [None]:
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as fh:
    single_path = fh.name

plotter.plot_single(
    query_name="reference",
    target_name="shifted",
    output_path=single_path,
    figsize=(5, 5),
    title="reference vs shifted",
)
print(f"Single-pair plot saved: {single_path}")

## 5. Customising dot appearance

All plotting methods accept `dot_size` and `dot_color` to control the appearance of match lines.

In [None]:
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as fh:
    styled_path = fh.name

plotter.plot(
    output_path=styled_path,
    dot_size=1.5,
    dot_color="crimson",
    dpi=200,
    title="Custom style: crimson, dpi=200",
)
print(f"Styled plot saved: {styled_path}")

## 6. Controlling merge behaviour

When `merge=True` (the default), consecutive co-linear k-mer hits are merged into single lines.
Set `merge=False` to display every individual k-mer hit as its own point, which is useful for
inspecting raw k-mer density.

In [None]:
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as fh:
    unmerged_path = fh.name

plotter.plot_single(
    query_name="reference",
    target_name="shifted",
    output_path=unmerged_path,
    merge=False,
    title="reference vs shifted (unmerged k-mer hits)",
)
print(f"Unmerged plot saved: {unmerged_path}")
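
The merging described in this section can be pictured with a small pure-Python sketch. It assumes forward-strand hits only and treats two hits as mergeable when they lie on the same diagonal at consecutive query positions; the library's actual merge rules may differ. `merge_forward_hits` is a hypothetical helper, not part of `rusty_dot`.

```python
from collections import defaultdict


def merge_forward_hits(hits, k):
    """Merge runs of consecutive same-diagonal k-mer hits into segments.

    hits: iterable of (query_start, target_start) pairs.
    Returns a sorted list of half-open segments
    (query_start, query_end, target_start, target_end).
    """
    by_diagonal = defaultdict(list)
    for q, t in hits:
        by_diagonal[t - q].append(q)

    segments = []
    for diag, q_starts in by_diagonal.items():
        q_starts = sorted(set(q_starts))
        run_start = prev = q_starts[0]
        for q in q_starts[1:]:
            if q != prev + 1:
                # Gap on this diagonal: close the current run.
                segments.append((run_start, prev + k, run_start + diag, prev + diag + k))
                run_start = q
            prev = q
        segments.append((run_start, prev + k, run_start + diag, prev + diag + k))
    return sorted(segments)


hits = [(0, 5), (1, 6), (2, 7), (10, 3), (11, 4), (20, 20)]
segments = merge_forward_hits(hits, k=8)
for seg in segments:
    print(seg)
```

Six individual hits collapse into three segments here, one per uninterrupted diagonal run. With `merge=False` the plot would instead show each of the six hits separately.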

## 7. Output resolution

Use the `dpi` parameter to set the resolution (dots per inch) of the saved image.
Higher values give sharper, larger files; 300 is a common choice for print-quality figures.

In [None]:
for dpi in [72, 150, 300]:
    with tempfile.NamedTemporaryFile(suffix=f"_dpi{dpi}.png", delete=False) as fh:
        path = fh.name
    plotter.plot_single(
        "reference", "shifted",
        output_path=path,
        dpi=dpi,
        title=f"DPI = {dpi}",
    )
    size_kb = os.path.getsize(path) / 1024
    print(f"DPI={dpi:4d}  file size={size_kb:.1f} kB  path={path}")
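
As a rough guide to what the three `dpi` values mean in pixels, the arithmetic is simply inches times dots per inch. The sketch assumes the saved image keeps the full `(6, 6)` inch default figure size listed in the summary table, with no cropping of margins; `pixel_dimensions` is a hypothetical helper.

```python
def pixel_dimensions(figsize, dpi):
    """Return (width_px, height_px) for a figure size given in inches."""
    width_in, height_in = figsize
    return round(width_in * dpi), round(height_in * dpi)


for dpi in (72, 150, 300):
    width_px, height_px = pixel_dimensions((6, 6), dpi)
    print(f"dpi={dpi:3d} -> {width_px} x {height_px} px")
```

Doubling the DPI quadruples the pixel count, which is why file size grows quickly in the loop above.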

## 8. Panel size control

For all-vs-all grids, `figsize_per_panel` controls the size (in inches) of each subplot panel.

In [None]:
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as fh:
    large_path = fh.name

plotter.plot(
    output_path=large_path,
    figsize_per_panel=6.0,   # each panel is 6×6 inches
    title="Large panels (6 inches each)",
)
print(f"Large-panel plot saved: {large_path}")
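
To estimate how large a grid figure becomes, the sketch below multiplies the panel size by the number of rows and columns. It assumes the total size is exactly panels times `figsize_per_panel`, with queries as rows and targets as columns as in the summary table; any extra space the library adds for titles or labels is ignored. `grid_figure_size` is a hypothetical helper.

```python
def grid_figure_size(n_queries, n_targets, figsize_per_panel=4.0):
    """Estimate (width, height) in inches for a grid of dotplot panels.

    Assumes targets are laid out as columns and queries as rows,
    with no additional margins.
    """
    width = n_targets * figsize_per_panel
    height = n_queries * figsize_per_panel
    return width, height


# All-vs-all with the three sequences in this notebook.
print("default panels:", grid_figure_size(3, 3))
print("6-inch panels: ", grid_figure_size(3, 3, figsize_per_panel=6.0))

# The subset from section 3: two queries against one target.
print("subset grid:   ", grid_figure_size(2, 1))
```

Under these assumptions the 6-inch all-vs-all grid is about 18 inches on a side, so it is worth lowering `dpi` or the panel size when the index holds many sequences.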

## Summary of DotPlotter parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `query_names` | `None` | List of query sequence names (rows); `None` = all |
| `target_names` | `None` | List of target sequence names (columns); `None` = all |
| `output_path` | `"dotplot.png"` | Output file path |
| `figsize_per_panel` | `4.0` | Side length in inches of each subplot panel (`plot` grids only) |
| `figsize` | `(6, 6)` | Total figure size for `plot_single` |
| `dot_size` | `0.5` | Line/marker size for each match |
| `dot_color` | `"blue"` | Colour of match lines |
| `merge` | `True` | Merge co-linear k-mer runs into blocks |
| `title` | `None` | Figure title |
| `dpi` | `150` | Output image resolution |