## Testing Playground

You've probably noticed this project has no unit testing setup. Honestly, I found unit testing tricky for this project, and visually inspecting the outputs (Markdown files and visualizations) felt more trustworthy.

Still, I understand the worry about accidentally breaking things when contributing code, so I made this notebook to help. It's a work in progress, and its goal is to let you quickly generate and inspect the specific outputs you care about while developing.

Previously, my basic testing meant using a `test.py` file to generate a few Markdown files and checking them manually. For a deeper look, I'd run `cli.py` and wait for it to generate everything, which isn't quick on my laptop.

This notebook aims to streamline that process, letting you test and inspect targeted parts of the output without the fear of breaking things.

**Before you begin, put your `conversations.json` file somewhere close by, such as a `./data/` folder.**

**Make sure to restart the kernel and clear all outputs before committing changes, to ensure personal data isn't accidentally included.**
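
If you'd rather not rely on remembering the menu action, the clearing step can also be scripted. The following is only a minimal sketch: it assumes the notebook is stored in the standard version 4 JSON layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`), and both function names are made up for illustration rather than part of this project.

```python
import json
from pathlib import Path


def strip_outputs(notebook: dict) -> dict:
    """Blank the outputs and execution counts of every code cell, in place."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def strip_outputs_in_file(notebook_path: Path) -> None:
    """Rewrite the notebook at notebook_path with all outputs removed."""
    notebook = json.loads(notebook_path.read_text(encoding="utf-8"))
    cleaned = strip_outputs(notebook)
    notebook_path.write_text(
        json.dumps(cleaned, indent=1, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
```

Calling `strip_outputs_in_file(...)` with the path to this notebook before committing would then remove any rendered conversation text. Markdown cells are left untouched.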

In [None]:
"""Playground for testing and debugging."""

from __future__ import annotations

from pathlib import Path
from typing import TYPE_CHECKING, Callable

from convoviz.models import ConversationSet

if TYPE_CHECKING:
    from convoviz.models import Conversation

conversations_path = Path("data") / "conversations.json"  # adjust path if needed
output_path = Path("output")
output_path.mkdir(exist_ok=True)

collection: ConversationSet = ConversationSet.from_json(conversations_path)

In [None]:
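def clear_output() -> None:
    """Delete every file in the output folder (subfolders are left alone)."""
    for file in output_path.glob("*"):
        if file.is_file():  # unlink() raises an error on directories
            file.unlink()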

In [None]:
clear_output()  # run this whenever you want to clear the output folder

## Markdown

In [None]:
from statistics import mean, median


# Utility function to print statistics and save conversations based on a criterion
def get_top_convos(
    attr_func: Callable[[Conversation], int],
    description: str,
    count: int = 5,
) -> None:
    """Print statistics and save the top conversations based on a criterion."""
    stats = [attr_func(c) for c in collection.array]
    avg_stat = mean(stats)  # raises StatisticsError if there are no conversations
    median_stat = median(stats)  # averages the two middle values for even-length input
    max_stat = max(stats)

    print(
        f"Average {description}: {avg_stat}\n"
        f"Median {description}: {median_stat}\n"
        f"Max {description}: {max_stat}\n",
    )

    convos_sorted_by_attr = sorted(
        collection.array,
        key=attr_func,
        reverse=True,
    )

    for convo in convos_sorted_by_attr[:count]:
        print(
            f"id: {convo.conversation_id}\n"
            f"title: {convo.title}\n"
            f"{description}: {attr_func(convo)}\n",
        )
        file_path = output_path / f"{convo.title}.md"
        convo.save(file_path)
        print(f"saved to '{file_path.resolve()}'\n")
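
One thing to watch for in the cell above: the file name is built straight from `convo.title`, and titles can contain characters such as `/` or `:` that are not valid in file names on every platform. Whether `convo.save` already handles this is not something this notebook shows, so here is a minimal sketch of a sanitizing helper. The name `safe_filename` and the `"untitled"` fallback are hypothetical, not part of `convoviz`.

```python
import re


def safe_filename(title: str, fallback: str = "untitled") -> str:
    """Replace characters that are invalid in file names on common platforms."""
    # Swap reserved punctuation and control characters for underscores,
    # then trim the leading/trailing spaces and dots that Windows rejects.
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", title).strip(" .")
    return cleaned or fallback
```

With a helper like this, the path could be built as `output_path / f"{safe_filename(convo.title)}.md"`. Note that two conversations with the same title would still overwrite each other's file.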

In [None]:
get_top_convos(lambda c: c.leaf_count, "leaf count")

In [None]:
get_top_convos(lambda c: c.message_count("assistant"), "assistant message count")

In [None]:
get_top_convos(lambda c: len(c.content_types), "content type count")

In [None]:
get_top_convos(lambda c: len(c.used_plugins), "plugin count")

## Data Visualization

### Word Clouds

In [None]:
from random import choice

from convoviz.utils import colormaps, font_names

week_groups = collection.group_by_week()

week = choice(list(week_groups.keys()))

sample_conv_set = week_groups[week]

font_name = choice(font_names())

font_path = f"convoviz/assets/fonts/{font_name}.ttf"

colormap = choice(colormaps())


img = sample_conv_set.wordcloud(font_path=font_path, colormap=colormap)

print(f"font: {font_name}\ncolormap: {colormap}\n")

img.show()
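
Because the week, font, and colormap are all picked at random, a word cloud that looks wrong can be hard to reproduce. One option, sketched here with stand-in lists rather than real `convoviz` data, is to draw all three choices from a private seeded generator. `pick_sample` is a hypothetical helper, and the seed value is arbitrary.

```python
from __future__ import annotations

from random import Random


def pick_sample(
    seed: int,
    weeks: list[str],
    fonts: list[str],
    cmaps: list[str],
) -> tuple[str, str, str]:
    """Pick one week, font, and colormap using a private, seeded generator."""
    rng = Random(seed)  # separate from the global random state
    return rng.choice(weeks), rng.choice(fonts), rng.choice(cmaps)


# Stand-in values for illustration only.
example_weeks = ["week-a", "week-b", "week-c"]
example_fonts = ["font-a", "font-b"]
example_cmaps = ["cmap-a", "cmap-b", "cmap-c"]

first = pick_sample(7, example_weeks, example_fonts, example_cmaps)
second = pick_sample(7, example_weeks, example_fonts, example_cmaps)
print(first == second)  # the same seed gives the same picks
```

Printing the seed next to the font and colormap makes it possible to regenerate the exact same image later, as long as the input lists keep the same order.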

### Graphs

In [None]:
fig = sample_conv_set.week_barplot("Prompts per day")