## Testing Playground

You've probably noticed this project lacks a unit testing setup. Honestly, I found unit testing tricky for this project and felt that visually inspecting the outputs, such as the markdown files or visualizations, was generally more trustworthy.

But I get the worry about accidentally breaking things when contributing code, so I made this notebook to help with that. It's a work in progress, aimed at letting you quickly see the specific outputs you care about for smoother development.

Previously, my basic testing meant using a `test.py` file to generate a few markdown files and then checking them manually. Or, for a deeper look, I'd run `cli.py` and wait a while to see everything, which isn't quick on my laptop.

This notebook aims to streamline that process, letting you test and inspect targeted parts of the output without the fear of breaking things.

**Before you begin, put the `conversations.json` file somewhere close by, for example in a `./data/` folder.**

**Make sure to restart the kernel and clear all outputs before committing changes, to ensure personal data isn't accidentally included.**
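
If you'd rather not rely on remembering to clear outputs by hand, the same cleanup can be done on the notebook file itself. Here's a minimal sketch using only the standard library. It assumes the standard v4 notebook JSON layout (a top-level `cells` list where code cells carry `outputs` and `execution_count`), and `strip_outputs` is a hypothetical helper, not something that exists in this project.

```python
from __future__ import annotations

import json
from typing import Any


def strip_outputs(notebook: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a v4 notebook dict with code-cell outputs removed."""
    cleaned = json.loads(json.dumps(notebook))  # cheap deep copy of JSON data
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []  # drop anything printed or displayed
            cell["execution_count"] = None  # reset the In [n] counter
    return cleaned


# Tiny in-memory notebook, just to show the effect.
example = {
    "cells": [
        {"cell_type": "markdown", "source": "notes"},
        {
            "cell_type": "code",
            "source": "print('hi')",
            "outputs": [{"output_type": "stream", "text": "hi\n"}],
            "execution_count": 3,
        },
    ],
}
print(strip_outputs(example)["cells"][1]["outputs"])
```

To use it on a real file, you'd load the `.ipynb` with `json.loads`, pass the result through `strip_outputs`, and write it back with `json.dumps`.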

In [None]:
"""Playground for testing and debugging."""

from __future__ import annotations

from pathlib import Path
from typing import TYPE_CHECKING, Callable

from controllers.file_system import conversation_set_from_json, save_conversation

if TYPE_CHECKING:
    from models.conversation import Conversation
    from models.conversation_set import ConversationSet

conversations_path: Path = Path("data") / "conversations.json"  # adjust path if needed
output_path: Path = Path("output")
output_path.mkdir(exist_ok=True)

conversation_set: ConversationSet = conversation_set_from_json(
    json_filepath=conversations_path,
)

In [None]:
def clear_output() -> None:
    """Clear files from the output folder (subfolders are left untouched)."""
    for file in output_path.glob(pattern="*"):
        if file.is_file():  # unlink() raises on directories
            file.unlink()

In [None]:
clear_output()  # run this whenever you want to clear the output folder

## Markdown
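
The helper in this section accepts any callable that maps a conversation to an integer, so new criteria are easy to try. Here's a minimal, self-contained sketch of that ranking pattern. `FakeConvo` and `top_n` are hypothetical stand-ins made up for illustration; they are not part of the project.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class FakeConvo:
    """Hypothetical stand-in for the project's Conversation model."""

    title: str
    messages: int


def top_n(
    convos: list[FakeConvo],
    attr_func: Callable[[FakeConvo], int],
    count: int = 5,
) -> list[FakeConvo]:
    """Return the `count` items with the highest attr_func value."""
    return sorted(convos, key=attr_func, reverse=True)[:count]


sample = [FakeConvo("a", 3), FakeConvo("b", 10), FakeConvo("c", 7)]
top_two = top_n(sample, attr_func=lambda c: c.messages, count=2)
print([c.title for c in top_two])  # prints ['b', 'c']
```

Swapping in a different `attr_func` is all it takes to rank by something else, which is the same idea the cells in this section use with the real model.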

In [None]:
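from statistics import median


# Utility function to get statistics and save top conversations based on a criterion
def get_top_convos(
    attr_func: Callable[[Conversation], int],
    description: str,
    count: int = 5,
) -> None:
    """Get statistics and save top conversations based on a criterion."""
    stats: list[int] = [attr_func(c) for c in conversation_set.conversation_list]
    if not stats:  # nothing to summarize; also avoids dividing by zero below
        print(f"No conversations found for {description}.")
        return
    avg_stat: float = sum(stats) / len(stats)
    # median() averages the two middle values when the count is even
    median_stat: float = median(stats)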
    max_stat: int = max(stats)

    print(
        f"Average {description}: {avg_stat}\n"
        f"Median {description}: {median_stat}\n"
        f"Max {description}: {max_stat}\n",
    )

    convos_sorted_by_attr: list[Conversation] = sorted(
        conversation_set.conversation_list,
        key=attr_func,
        reverse=True,
    )

    for convo in convos_sorted_by_attr[:count]:
        print(
            f"id: {convo.id}\n"
            f"title: {convo.title}\n"
            f"{description}: {attr_func(convo)}\n",
        )
        file_path: Path = output_path / f"{convo.sanitized_title()}.md"
        save_conversation(conversation=convo, filepath=file_path)
        print(f"saved to '{file_path.resolve()}'\n")

In [None]:
get_top_convos(attr_func=lambda c: c.leaf_count(), description="leaf count")

In [None]:
get_top_convos(attr_func=lambda c: c.message_count(), description="message count")

In [None]:
get_top_convos(
    attr_func=lambda c: len(c.content_types()),
    description="content type count",
)

In [None]:
get_top_convos(attr_func=lambda c: len(c.used_plugins()), description="plugin count")

## Data Visualization

### Word Clouds
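
The cell in this section picks a week, font, and colormap at random, so each run looks different. If you want to reproduce a particular combination while debugging, one option is a seeded `random.Random` instance. This is just a sketch: `pick_style` is a hypothetical helper, and the option lists are made-up placeholders, not the values returned by `get_font_names()` or `get_colormap_names()`.

```python
from __future__ import annotations

from random import Random

# Placeholder options, for illustration only.
fonts: list[str] = ["FontA", "FontB", "FontC"]
colormaps: list[str] = ["map1", "map2", "map3"]


def pick_style(seed: int) -> tuple[str, str]:
    """Pick a (font, colormap) pair that is repeatable for a given seed."""
    rng = Random(seed)  # private generator, leaves the global one untouched
    return rng.choice(fonts), rng.choice(colormaps)


first = pick_style(seed=42)
second = pick_style(seed=42)
print(first == second)  # prints True: same seed, same picks
```

Printing the seed next to the font and colormap names makes it easy to regenerate a word cloud that looked off.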

In [None]:
from random import choice

from controllers.data_analysis import wordcloud_from_conversation_set
from utils.utils import get_colormap_names, get_font_names

if TYPE_CHECKING:
    from datetime import datetime

    from PIL.Image import Image

weeks_dict: dict[datetime, ConversationSet] = conversation_set.grouped_by_week()

week: datetime = choice(seq=list(weeks_dict.keys()))

sample_conv_set: ConversationSet = weeks_dict[week]

font_name: str = choice(seq=get_font_names())

font_path: str = f"assets/fonts/{font_name}.ttf"

colormap: str = choice(seq=get_colormap_names())


img: Image = wordcloud_from_conversation_set(
    conv_set=sample_conv_set,
    font_path=font_path,
    colormap=colormap,
)

print(f"font: {font_name}\ncolormap: {colormap}\n")

img.show()

### Graphs

In [None]:
from controllers.data_analysis import weekwise_graph_from_conversation_set

if TYPE_CHECKING:
    from matplotlib.figure import Figure


fig: Figure = weekwise_graph_from_conversation_set(conv_set=sample_conv_set)