# English TextRecognitionDataGenerator Capabilities


This notebook demonstrates how to use the **TextRecognitionDataGenerator (TRDG)** package to synthesize
English text images. Each section highlights a different feature, and every generated sample is displayed inline
so you can inspect the output directly in the notebook.



## Setup

Run the next cell to import the helpers used throughout the demonstration. It also searches upward from the
current directory for the `trdg` folder, so the notebook can locate the repository assets (fonts, dictionaries,
background images) as long as it is executed from somewhere inside the repository.


In [None]:
import math
import random
import sys
from contextlib import contextmanager
from pathlib import Path
from types import SimpleNamespace
from unittest.mock import patch

import numpy as np
import matplotlib.pyplot as plt


plt.rcParams["figure.dpi"] = 150
random.seed(12345)
np.random.seed(12345)


def locate_repo_root(start: Path) -> Path:
    if (start / "trdg").exists():
        return start
    for parent in start.parents:
        if (parent / "trdg").exists():
            return parent
    raise RuntimeError("Could not find the trdg directory from the current location.")


NOTEBOOK_DIR = Path().resolve()
REPO_ROOT = locate_repo_root(NOTEBOOK_DIR)
if str(REPO_ROOT) not in sys.path:
    sys.path.insert(0, str(REPO_ROOT))
TRDG_ROOT = REPO_ROOT / "trdg"
EN_DICT_PATH = TRDG_ROOT / "dicts" / "en.txt"


from trdg.generators import (
    GeneratorFromDict,
    GeneratorFromRandom,
    GeneratorFromStrings,
    GeneratorFromWikipedia,
)
from trdg.utils import load_fonts, load_dict, mask_to_bboxes, draw_bounding_boxes


def collect_samples(generator, count):
    results = []
    iterator = iter(generator)
    for _ in range(count):
        try:
            image_content, label = next(iterator)
        except StopIteration:
            break
        if isinstance(image_content, tuple):
            image, mask = image_content
            sample = {"image": image, "label": label, "mask": mask}
        else:
            sample = {"image": image_content, "label": label}
        results.append(sample)
    return results


def show_samples(samples, cols=3, title=None):
    if not samples:
        print("No samples to display.")
        return
    rows = math.ceil(len(samples) / cols)
    fig, axes = plt.subplots(rows, cols, figsize=(cols * 3.5, rows * 3.5))
    axes = np.atleast_2d(axes)
    for ax, sample in zip(axes.flat, samples):
        image = sample["image"]
        arr = np.array(image)
        if image.mode in ("L", "1") or arr.ndim == 2:
            ax.imshow(arr, cmap="gray")
        else:
            ax.imshow(arr)
        label = sample.get("label")
        if label:
            ax.set_title(label, fontsize=9)
        ax.axis("off")
    for ax in axes.flat[len(samples):]:
        ax.axis("off")
    if title:
        fig.suptitle(title, fontsize=12, y=1.02)
    plt.tight_layout()
    plt.show()


def show_image_and_mask(sample, title=None):
    image = sample["image"]
    mask = sample.get("mask")
    if mask is None:
        print("Sample does not include a mask.")
        return
    fig, axes = plt.subplots(1, 2, figsize=(6, 3))
    axes[0].imshow(np.array(image))
    axes[0].set_title("Rendered image")
    axes[0].axis("off")
    axes[1].imshow(np.array(mask))
    axes[1].set_title("Character mask")
    axes[1].axis("off")
    if title:
        fig.suptitle(title, fontsize=12)
    plt.tight_layout()
    plt.show()


@contextmanager
def wikipedia_stub(summary):
    dummy_page = SimpleNamespace(summary=summary)
    with patch("trdg.string_generator.wikipedia.random", return_value="Synthetic intelligence"):
        with patch("trdg.string_generator.wikipedia.page", return_value=dummy_page):
            with patch("trdg.string_generator.wikipedia.set_lang", return_value=None):
                yield



## English resources

TRDG ships with language-specific dictionaries and fonts. The next cell shows how many English words and fonts are
available out of the box.


In [None]:

en_fonts = load_fonts("en")
english_words = load_dict(str(EN_DICT_PATH))

print(f"Loaded {len(en_fonts)} English fonts. A few examples:")
for font_path in sorted(en_fonts)[:5]:
    print(" •", Path(font_path).name)

print(f"\nEnglish dictionary entries: {len(english_words)}")
print("Sample words:", ", ".join(english_words[:10]))



## Core generators

TRDG provides several high-level generators that control how the source text is produced. Regardless of the generator,
all rendering options (fonts, colors, distortions, backgrounds, etc.) work the same way.
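
To make the `length` and `allow_variable` arguments used in the next cells concrete, here is a minimal sketch of how a dictionary-based string could be assembled. It is an approximation written for illustration, not TRDG's implementation; `compose_from_words` and the tiny `words` list are hypothetical stand-ins.

```python
import random


def compose_from_words(words, length, allow_variable, rng):
    """Join randomly chosen words into one string.

    Hypothetical helper: with allow_variable=True the number of words is
    drawn from 1..length, otherwise exactly `length` words are used.
    """
    n_words = rng.randint(1, length) if allow_variable else length
    return " ".join(rng.choice(words) for _ in range(n_words))


rng = random.Random(12345)  # local RNG, so the notebook's global seed is untouched
words = ["synthetic", "text", "image", "font", "noise"]  # stand-in dictionary

fixed = compose_from_words(words, 3, False, rng)
variable = [compose_from_words(words, 3, True, rng) for _ in range(5)]
print(fixed)
print(variable)
```

With `allow_variable=False` every string has exactly three words; with `allow_variable=True` the word count varies between one and three.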


### Dictionary-based strings

In [None]:

dict_gen = GeneratorFromDict(
    count=6,
    length=3,
    allow_variable=True,
    language="en",
    size=48,
)
dict_samples = collect_samples(dict_gen, 6)
show_samples(dict_samples, cols=3, title="Dictionary words (variable length)")


### Random character streams

In [None]:

random_gen = GeneratorFromRandom(
    count=6,
    length=3,
    allow_variable=True,
    language="en",
    size=48,
    use_letters=True,
    use_numbers=True,
    use_symbols=True,
)
random_samples = collect_samples(random_gen, 6)
show_samples(random_samples, cols=3, title="Mixed letters, digits, and symbols")


### Fixed list of strings

In [None]:

phrases = [
    "OpenAI builds practical AI tools",
    "Synthetic data accelerates OCR",
    "Fonts affect recognition accuracy",
    "Control spacing and alignment",
    "Noise helps robust training",
    "Augment text with distortions",
]
strings_gen = GeneratorFromStrings(
    strings=phrases,
    count=len(phrases),
    language="en",
    size=46,
)
string_samples = collect_samples(strings_gen, len(phrases))
show_samples(string_samples, cols=3, title="Custom phrases")


### Sentences from Wikipedia

In [None]:

summary_text = (
    "TextRecognitionDataGenerator can synthesize countless variations of English text images. "
    "These artificial examples are useful when building optical character recognition models. "
    "The generator supports backgrounds, distortions, strokes, and even handwritten synthesis."
)

with wikipedia_stub(summary_text):
    wiki_gen = GeneratorFromWikipedia(
        count=3,
        minimum_length=5,
        language="en",
        size=44,
    )
    wiki_samples = collect_samples(wiki_gen, 3)

show_samples(wiki_samples, cols=3, title="Wikipedia sentences (stubbed data)")
print("Example sentence:\n", wiki_samples[0]["label"])



## Layout controls

TRDG exposes parameters to control orientation, alignment, margins, and automatic cropping.
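
As a rough mental model for alignment on a fixed-width canvas, the sketch below computes where a text block of known width would be pasted horizontally. The helper `horizontal_offset` and the text width of 300 pixels are assumptions made for illustration; TRDG's own placement logic may differ in detail.

```python
def horizontal_offset(canvas_width, text_width, alignment, margin_left, margin_right):
    """Return the x coordinate at which the text block would be pasted.

    alignment follows the convention used in this notebook:
    0 = left, 1 = center, 2 = right.
    """
    if alignment == 0:
        return margin_left
    if alignment == 1:
        return (canvas_width - text_width) // 2
    if alignment == 2:
        return canvas_width - text_width - margin_right
    raise ValueError(f"Unknown alignment: {alignment}")


# 420 px canvas, 20 px side margins, and a made-up 300 px wide text block.
offsets = {a: horizontal_offset(420, 300, a, 20, 20) for a in (0, 1, 2)}
print(offsets)  # {0: 20, 1: 60, 2: 100}
```

The same arithmetic explains why alignment has no visible effect when the canvas is only as wide as the text plus its margins.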


In [None]:

orientation_samples = []
orientation_samples += collect_samples(
    GeneratorFromStrings(
        ["Horizontal orientation"],
        count=1,
        language="en",
        size=56,
        orientation=0,
    ),
    1,
)
vertical_sample = collect_samples(
    GeneratorFromStrings(
        ["Vertical"],
        count=1,
        language="en",
        size=56,
        orientation=1,
    ),
    1,
)[0]
vertical_sample["label"] = "Vertical orientation"
orientation_samples.append(vertical_sample)
show_samples(orientation_samples, cols=2, title="Horizontal vs vertical text")


In [None]:

alignment_labels = {0: "Left alignment", 1: "Center alignment", 2: "Right alignment"}
alignment_samples = []
for alignment in (0, 1, 2):
    sample = collect_samples(
        GeneratorFromStrings(
            [alignment_labels[alignment]],
            count=1,
            language="en",
            size=48,
            width=420,
            alignment=alignment,
            margins=(20, 20, 20, 20),
        ),
        1,
    )[0]
    sample["label"] = alignment_labels[alignment]
    alignment_samples.append(sample)
show_samples(alignment_samples, cols=3, title="Canvas alignment with fixed width")


In [None]:

margins_and_fit = []
base = collect_samples(
    GeneratorFromStrings(
        ["Default margins"],
        count=1,
        language="en",
        size=60,
        width=360,
    ),
    1,
)[0]
wide = collect_samples(
    GeneratorFromStrings(
        ["Custom margins"],
        count=1,
        language="en",
        size=60,
        width=360,
        margins=(40, 60, 40, 60),
    ),
    1,
)[0]
wide["label"] = "Margins=(40,60,40,60)"
fit_sample = collect_samples(
    GeneratorFromStrings(
        ["Fit trims margins"],
        count=1,
        language="en",
        size=60,
        margins=(40, 60, 40, 60),
        fit=True,
    ),
    1,
)[0]
fit_sample["label"] = "fit=True"
margins_and_fit.extend([base, wide, fit_sample])
show_samples(margins_and_fit, cols=3, title="Margins and the fit option")



## Backgrounds

Set `background_type` to choose the canvas behind the text: `0` for Gaussian noise, `1` for plain white, `2` for
quasicrystal patterns, or `3` for photographic backgrounds stored in `trdg/images`.
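
To illustrate what a Gaussian noise background amounts to, the following sketch fills a small grayscale grid with normally distributed pixel values using only the standard library. The function name and the `mean` and `std` defaults are assumptions chosen for illustration, not values read from TRDG.

```python
import random


def gaussian_noise_rows(height, width, mean=235.0, std=10.0, seed=0):
    """Build a height x width grid of grayscale values drawn from N(mean, std).

    Values are rounded and clipped to the valid 0..255 pixel range.
    """
    rng = random.Random(seed)
    return [
        [min(255, max(0, round(rng.gauss(mean, std)))) for _ in range(width)]
        for _ in range(height)
    ]


noise = gaussian_noise_rows(height=4, width=8)
for row in noise:
    print(row)
```

A bright mean with a small standard deviation yields a light, slightly grainy canvas on which dark text stays readable.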


In [None]:

background_types = [
    (0, "Gaussian noise"),
    (1, "Plain white"),
    (2, "Quasicrystal"),
    (3, "Photographic"),
]
background_samples = []
for bg_type, label in background_types:
    sample = collect_samples(
        GeneratorFromStrings(
            [label],
            count=1,
            language="en",
            size=48,
            background_type=bg_type,
        ),
        1,
    )[0]
    sample["label"] = label
    background_samples.append(sample)
show_samples(background_samples, cols=4, title="Built-in background generators")



## Distortion, skew, and blur

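Set `distorsion_type` (the library's own spelling) to apply sine, cosine, or random warping, use `skewing_angle` with `random_skew` to rotate the text, and use `blur` with `random_blur` to apply a fixed or randomized Gaussian blur.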
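
The sketch below shows the idea behind a sine warp and a randomized skew with plain arithmetic. It assumes that a warp shifts each image column vertically by a sine-shaped offset and that a random skew draws an angle between `-skewing_angle` and `skewing_angle`; the helper names, amplitude, and period are hypothetical.

```python
import math
import random


def sine_offsets(width, amplitude=8.0, period=90.0):
    """Vertical shift (in pixels) applied to each image column for a sine warp."""
    return [round(amplitude * math.sin(2 * math.pi * x / period)) for x in range(width)]


def pick_skew_angle(skewing_angle, random_skew, rng):
    """Use the angle as given, or draw one from [-skewing_angle, skewing_angle]."""
    if random_skew:
        return rng.randint(-skewing_angle, skewing_angle)
    return skewing_angle


rng = random.Random(0)
offsets = sine_offsets(width=180)
angles = [pick_skew_angle(15, True, rng) for _ in range(5)]

print(offsets[:6], min(offsets), max(offsets))
print(pick_skew_angle(15, False, rng), angles)
```

Swapping `math.sin` for `math.cos` gives the cosine variant, which only changes the phase of the wave.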


In [None]:

distortion_labels = {
    0: "No distortion",
    1: "Sine",
    2: "Cosine",
    3: "Random",
}
distortion_samples = []
for dist_type in range(4):
    sample = collect_samples(
        GeneratorFromStrings(
            [distortion_labels[dist_type]],
            count=1,
            language="en",
            size=52,
            distorsion_type=dist_type,
            distorsion_orientation=2,
        ),
        1,
    )[0]
    sample["label"] = distortion_labels[dist_type]
    distortion_samples.append(sample)
show_samples(distortion_samples, cols=4, title="Distortion modes")


In [None]:

skew_samples = []
fixed_skew = collect_samples(
    GeneratorFromStrings(
        ["Fixed skew"],
        count=1,
        language="en",
        size=52,
        skewing_angle=15,
        random_skew=False,
    ),
    1,
)[0]
fixed_skew["label"] = "15° skew"
skew_samples.append(fixed_skew)
random_skew_sample = collect_samples(
    GeneratorFromStrings(
        ["Random skew"],
        count=1,
        language="en",
        size=52,
        skewing_angle=15,
        random_skew=True,
    ),
    1,
)[0]
random_skew_sample["label"] = "Randomized"
skew_samples.append(random_skew_sample)
show_samples(skew_samples, cols=2, title="Skew control")


In [None]:

blur_samples = []
for blur, label in [(0, "No blur"), (2, "Blur radius 2"), (4, "Random blur ≤ 4")]:
    sample = collect_samples(
        GeneratorFromStrings(
            [label],
            count=1,
            language="en",
            size=48,
            blur=blur,
            random_blur=(label.startswith("Random")),
        ),
        1,
    )[0]
    sample["label"] = label
    blur_samples.append(sample)
show_samples(blur_samples, cols=3, title="Gaussian blur options")



## Color, stroke, and image modes

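Customize the text color with `text_color`, draw outline strokes with `stroke_width` and `stroke_fill`, and convert the final image to grayscale or another Pillow mode with `image_mode`. When `text_color` holds two comma-separated hex values, TRDG picks a color from that range rather than drawing a gradient.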
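
A `text_color` value such as `"#1e3a8a,#60a5fa"` names two endpoints. The sketch below shows one plausible way to turn such a specification into a single RGB color by sampling each channel between the endpoints. `hex_to_rgb` and `pick_color` are hypothetical helpers, and the per-channel sampling is an assumption about the behavior rather than a copy of TRDG's code.

```python
import random


def hex_to_rgb(value):
    """Convert '#rrggbb' to an (r, g, b) tuple of ints."""
    value = value.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def pick_color(text_color, rng):
    """Return one RGB color for a 'single' or 'low,high' hex specification."""
    endpoints = [hex_to_rgb(part.strip()) for part in text_color.split(",")]
    first, last = endpoints[0], endpoints[-1]
    # Sample every channel independently between the two endpoints.
    return tuple(rng.randint(min(a, b), max(a, b)) for a, b in zip(first, last))


rng = random.Random(0)
single = pick_color("#282828", rng)          # one value: always that color
ranged = pick_color("#1e3a8a,#60a5fa", rng)  # two values: somewhere in between
print(single, ranged)
```

Because a single color is sampled per image, the variation shows up across samples, not within one rendered line.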


In [None]:

color_samples = []
default_color = collect_samples(
    GeneratorFromStrings(
        ["Default color"],
        count=1,
        language="en",
        size=52,
    ),
    1,
)[0]
default_color["label"] = "Default"
color_samples.append(default_color)
blue_gradient = collect_samples(
    GeneratorFromStrings(
        ["Blue range"],
        count=1,
        language="en",
        size=52,
        text_color="#1e3a8a,#60a5fa",
    ),
    1,
)[0]
blue_gradient["label"] = "text_color range"
color_samples.append(blue_gradient)
stroked = collect_samples(
    GeneratorFromStrings(
        ["Stroke width"],
        count=1,
        language="en",
        size=52,
        stroke_width=3,
        stroke_fill="#f9fafb",
    ),
    1,
)[0]
stroked["label"] = "Stroke width 3"
color_samples.append(stroked)
gray_mode = collect_samples(
    GeneratorFromStrings(
        ["Grayscale mode"],
        count=1,
        language="en",
        size=52,
        image_mode="L",
        text_color="#111827,#f3f4f6",
    ),
    1,
)[0]
gray_mode["label"] = "image_mode='L'"
color_samples.append(gray_mode)
show_samples(color_samples, cols=4, title="Color controls and output modes")



## Spacing controls

`space_width` scales the width of whitespace characters, `character_spacing` inserts extra pixels between letters,
and `word_split=True` renders and masks the text word by word instead of character by character (useful for scripts that rely on ligatures).
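
To see how the two numeric options affect the width of a rendered line, the sketch below estimates the width from per-character widths. The helper `line_width` and all pixel values are made up for illustration; real widths depend on the font and size.

```python
def line_width(text, glyph_widths, space_px, space_width=1.0, character_spacing=0):
    """Estimate the rendered width of `text` in pixels.

    glyph_widths maps each non-space character to a width, and space_px is
    the font's natural space width (both hypothetical here).
    """
    scaled_space = int(space_px * space_width)
    widths = [scaled_space if ch == " " else glyph_widths[ch] for ch in text]
    # Extra pixels sit between neighbouring characters: n characters, n - 1 gaps.
    gaps = max(len(text) - 1, 0)
    return sum(widths) + character_spacing * gaps


glyphs = {ch: 20 for ch in "abcdefghijklmnopqrstuvwxyz"}  # every letter 20 px wide
base = line_width("ab cd", glyphs, space_px=10)
wide_space = line_width("ab cd", glyphs, space_px=10, space_width=1.8)
spaced = line_width("ab cd", glyphs, space_px=10, character_spacing=4)
print(base, wide_space, spaced)  # 90 98 106
```

`space_width` only changes the gaps between words, while `character_spacing` widens every gap in the line.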


In [None]:

space_samples = []
for factor in [0.6, 1.0, 1.8]:
    sample = collect_samples(
        GeneratorFromStrings(
            [f"space_width={factor}"],
            count=1,
            language="en",
            size=46,
            space_width=factor,
        ),
        1,
    )[0]
    sample["label"] = f"space_width={factor}"
    space_samples.append(sample)
show_samples(space_samples, cols=3, title="Space width multiplier")


In [None]:

character_spacing_samples = []
for spacing in [0, 4, 8]:
    sample = collect_samples(
        GeneratorFromStrings(
            [f"character_spacing={spacing}"],
            count=1,
            language="en",
            size=46,
            character_spacing=spacing,
        ),
        1,
    )[0]
    sample["label"] = f"character_spacing={spacing}"
    character_spacing_samples.append(sample)
show_samples(character_spacing_samples, cols=3, title="Character spacing in pixels")


In [None]:

phrase = "word split controls groupings"
without_split = collect_samples(
    GeneratorFromStrings(
        [phrase],
        count=1,
        language="en",
        size=50,
        output_mask=True,
        word_split=False,
    ),
    1,
)[0]
with_split = collect_samples(
    GeneratorFromStrings(
        [phrase],
        count=1,
        language="en",
        size=50,
        output_mask=True,
        word_split=True,
    ),
    1,
)[0]

boxes_no_split = mask_to_bboxes(without_split["mask"])
boxes_with_split = mask_to_bboxes(with_split["mask"])

boxed_no_split = without_split["image"].copy()
draw_bounding_boxes(boxed_no_split, boxes_no_split, color="red")
boxed_with_split = with_split["image"].copy()
draw_bounding_boxes(boxed_with_split, boxes_with_split, color="lime")

word_split_samples = [
    {"image": boxed_no_split, "label": "word_split=False"},
    {"image": boxed_with_split, "label": "word_split=True"},
]
show_samples(word_split_samples, cols=2, title="Bounding boxes from character masks")



## Masks and bounding boxes

Setting `output_mask=True` returns the per-character mask alongside the rendered image. You can reuse the helper
`mask_to_bboxes` to recover bounding boxes programmatically.
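
The idea behind recovering boxes from a mask can be shown on a toy grid. The sketch below assumes a simplified mask in which `0` marks background and `k > 0` marks the pixels of the k-th character; TRDG's actual mask is an image and its encoding may differ, so `label_grid_to_bboxes` is only an illustration of the principle, not a replacement for `mask_to_bboxes`.

```python
def label_grid_to_bboxes(grid):
    """Return {label: (left, top, right, bottom)} for every non-zero label.

    `grid` is a list of rows. Coordinates are inclusive pixel indices.
    """
    boxes = {}
    for y, row in enumerate(grid):
        for x, label in enumerate(row):
            if label == 0:
                continue  # background pixel
            if label not in boxes:
                boxes[label] = (x, y, x, y)
            else:
                left, top, right, bottom = boxes[label]
                boxes[label] = (min(left, x), min(top, y), max(right, x), max(bottom, y))
    return boxes


toy_mask = [
    [0, 1, 1, 0, 0, 2, 0],
    [0, 1, 0, 0, 2, 2, 0],
    [0, 1, 1, 0, 0, 2, 0],
]
toy_boxes = label_grid_to_bboxes(toy_mask)
print(toy_boxes)  # {1: (1, 0, 2, 2), 2: (4, 0, 5, 2)}
```

Each box is simply the extent of the pixels that share a label, which is why merging labels per word yields word-level boxes.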


In [None]:

mask_sample = collect_samples(
    GeneratorFromStrings(
        ["Bounding boxes"],
        count=1,
        language="en",
        size=58,
        output_mask=True,
        stroke_width=1,
    ),
    1,
)[0]

show_image_and_mask(mask_sample, title="Image and mask output")

boxes = mask_to_bboxes(mask_sample["mask"])
boxed_image = mask_sample["image"].copy()
draw_bounding_boxes(boxed_image, boxes, color="orange")
show_samples([{"image": boxed_image, "label": "mask_to_bboxes"}], cols=1, title="Overlaid bounding boxes")



## Optional handwritten synthesis

Enabling handwriting requires the additional dependencies listed in `requirements-hw.txt` (notably TensorFlow).
The code below tries to generate a sample and prints a friendly message if the optional stack is missing.
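
If you prefer to check for the optional stack before attempting generation, one possible approach is to ask Python whether the module can be found without importing it. The helper `has_module` is hypothetical, and the sketch assumes the handwriting model is importable under the top-level name `tensorflow`.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module `name` can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


handwriting_available = has_module("tensorflow")
print("TensorFlow found:", handwriting_available)
```

Such a check is cheap, whereas the `try`/`except` approach also catches failures that occur while the model itself is loading.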


In [None]:

try:
    handwritten_samples = collect_samples(
        GeneratorFromStrings(
            ["Handwritten text demo"],
            count=1,
            language="en",
            size=48,
            is_handwritten=True,
            text_color="#111111,#444444",
        ),
        1,
    )
    show_samples(handwritten_samples, cols=1, title="Handwriting model output")
except Exception as exc:
    print("Handwritten generation requires the optional TensorFlow-based model.")
    print("Install dependencies from requirements-hw.txt and rerun this cell if you need it.")
    print("Original error:", exc)



## Next steps

You can combine any of the illustrated options to build more elaborate augmentation pipelines. When integrating TRDG
into larger workflows, use the generators directly inside your training loops to avoid storing every synthetic image on disk.
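
As a minimal sketch of that pattern, the snippet below groups any iterator of `(image, label)` pairs into batches on the fly. `batches` is a hypothetical helper, and `fake_generator` is a stand-in that takes the place of a TRDG generator so the example runs without the library installed.

```python
from itertools import islice


def batches(sample_iter, batch_size):
    """Group an iterator of (image, label) pairs into batches of up to batch_size."""
    iterator = iter(sample_iter)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        images, labels = zip(*batch)
        yield list(images), list(labels)


def fake_generator(count):
    """Stand-in for a TRDG generator: yields (image, label) pairs."""
    for i in range(count):
        yield f"<image {i}>", f"label {i}"


sizes = []
for images, labels in batches(fake_generator(10), batch_size=4):
    # A real loop would convert `images` to tensors and run a training step here.
    sizes.append(len(images))
print(sizes)  # [4, 4, 2]
```

Nothing is written to disk: each batch is produced, consumed, and discarded before the next one is rendered.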
