# Fuzz Archives: Artist-Centered Graph

*Powered by **Traverse** and Visualized with **Cosmograph***

Build a similarity graph where **each node is a unique artist** and edges
connect artists that share genre/style tags.  Tags from all of an artist's
records are merged together, so prolific artists accumulate a rich tag
profile.  The more tags two artists share, the stronger the link.
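
To make the weighting concrete, here is a minimal sketch of the idea described above: merge tags per artist, then weight each artist pair by the number of tags they share. The helper names (`artist_tag_profiles`, `shared_tag_edges`) and the toy records are hypothetical and are not part of Traverse. The all-pairs loop is quadratic and only meant for illustration.

```python
from collections import defaultdict
from itertools import combinations


def artist_tag_profiles(records):
    """Merge the tags from every record into one tag set per artist."""
    profiles = defaultdict(set)
    for artist, tags in records:
        profiles[artist].update(tags)
    return dict(profiles)


def shared_tag_edges(profiles, min_shared_tags=1):
    """Return (artist_a, artist_b, weight) for each pair sharing enough tags."""
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        weight = len(profiles[a] & profiles[b])  # number of shared tags
        if weight >= min_shared_tags:
            edges.append((a, b, weight))
    return edges


# Hypothetical toy data: one (artist, tags) pair per record.
records = [
    ("Artist A", ["psych", "garage"]),
    ("Artist A", ["fuzz", "psych"]),
    ("Artist B", ["fuzz", "garage"]),
    ("Artist C", ["jazz"]),
]
profiles = artist_tag_profiles(records)
edges = shared_tag_edges(profiles)
print(edges)  # [('Artist A', 'Artist B', 2)]
```

Artist A's two records merge into a three-tag profile, which is why it ends up sharing two tags with Artist B even though no single record of A carries both.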

**Prerequisites:**
```bash
pip install -e ".[dev]"
cd src/traverse/cosmograph/app && npm install && npm run build
```

## 1. Configuration

In [None]:
from pathlib import Path

# Point this at your local copy of the records CSV.
RECORDS_CSV = Path(r"C:\Users\xtrem\Documents\Datasets\records.csv")
OUT_DIR = Path("_out_artist")
FORCE = False  # set True to rebuild cache

# Tuning parameters
MAX_NODES = 0               # 0 = unlimited — keep every artist
MIN_SHARED_TAGS = 1         # min shared tags to create an edge (1 = any overlap)
MAX_EDGES = 0               # 0 = unlimited (per-node cap handles sparsity)
MAX_EDGES_PER_NODE = 2      # keep only top-K edges per artist (minimizes edges)
MAX_TAG_DEGREE = 200        # sample tags shared by more artists than this
SAMPLE_HIGH_DEGREE = True   # True = sample down; False = skip entirely
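
The two pruning knobs above can be sketched roughly as follows. This is an illustration under stated assumptions, not the logic inside `build_artist_graph`: `cap_tag_degree` and `top_k_per_node` are hypothetical helpers, and the sketch assumes an edge survives if it ranks in the top K for *either* of its endpoints.

```python
import random
from collections import defaultdict


def cap_tag_degree(tag_to_artists, max_tag_degree, sample_high_degree, seed=0):
    """Limit how many artists a single tag may connect.

    Tags at or under the cap pass through unchanged. Tags over the cap are
    sampled down to the cap, or dropped if sampling is off.
    """
    rng = random.Random(seed)
    capped = {}
    for tag, artists in tag_to_artists.items():
        if len(artists) <= max_tag_degree:
            capped[tag] = list(artists)
        elif sample_high_degree:
            capped[tag] = rng.sample(list(artists), max_tag_degree)
        # otherwise the tag is skipped entirely
    return capped


def top_k_per_node(edges, k):
    """Keep an edge if it is among the k heaviest for either endpoint (assumed rule)."""
    by_node = defaultdict(list)
    for edge in edges:
        a, b, _weight = edge
        by_node[a].append(edge)
        by_node[b].append(edge)
    kept = set()
    for node_edges in by_node.values():
        node_edges.sort(key=lambda e: e[2], reverse=True)
        kept.update(node_edges[:k])
    return sorted(kept)


# Hypothetical toy edges: (artist_a, artist_b, shared_tag_count)
toy_edges = [("a", "b", 3), ("a", "c", 2), ("a", "d", 1), ("b", "c", 1)]
pruned = top_k_per_node(toy_edges, k=1)
print(pruned)  # [('a', 'b', 3), ('a', 'c', 2), ('a', 'd', 1)]
```

Under the "either endpoint" rule a node can still end up with more than K edges, because its neighbors may each nominate it. A stricter "both endpoints" rule would give a hard cap at the cost of isolating more nodes.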

## 2. Build or Load Graph

Build the artist-centered similarity graph from the records CSV, or load it
from cache if one exists.  The cache lives in its own `_out_artist/` directory
so it does not collide with other graph caches.

In [None]:
from traverse.graph.artist_graph import build_artist_graph
from traverse.graph.cache import GraphCache

cache = GraphCache(
    cache_dir=OUT_DIR,
    build_fn=lambda: build_artist_graph(
        RECORDS_CSV,
        max_nodes=MAX_NODES,
        min_shared_tags=MIN_SHARED_TAGS,
        max_edges=MAX_EDGES,
        max_edges_per_node=MAX_EDGES_PER_NODE,
        max_tag_degree=MAX_TAG_DEGREE,
        sample_high_degree=SAMPLE_HIGH_DEGREE,
    ),
    force=FORCE,
)
graph, records_df = cache.load_or_build()

n_pts = len(graph["points"])
n_lks = len(graph["links"])
total_items = n_pts + n_lks
print(f"Graph: {n_pts:,} nodes, {n_lks:,} edges ({total_items:,} total items)")
print(f"Records: {len(records_df):,} rows")
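
For readers unfamiliar with the load-or-build pattern used by the cache above, here is a generic sketch of it. It is an assumption-laden illustration: the real `GraphCache` file layout and return values are not shown in this notebook, and `load_or_build` below is a hypothetical standalone function that stores a single JSON file.

```python
import json
from pathlib import Path


def load_or_build(cache_path, build_fn, force=False):
    """Return the cached result if present, otherwise build and cache it."""
    cache_path = Path(cache_path)
    if cache_path.exists() and not force:
        # Cache hit: skip the expensive build entirely.
        return json.loads(cache_path.read_text(encoding="utf-8"))
    result = build_fn()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(result), encoding="utf-8")
    return result
```

Passing `force=True` mirrors the `FORCE` flag in the configuration cell: the build function runs again and the cached file is overwritten.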

## 3. Community Detection

Run Louvain community detection and add community labels to each artist node.

In [None]:
from collections import Counter
from traverse.graph.community import add_communities, CommunityAlgorithm

graph = add_communities(graph, CommunityAlgorithm.LOUVAIN, seed=42)

comm_counts = Counter(pt["community"] for pt in graph["points"])
print(f"{len(comm_counts)} communities:")
for comm_id, count in comm_counts.most_common():
    print(f"  Community {comm_id}: {count} nodes")
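
Louvain searches for a partition with high modularity, so computing modularity by hand is a quick way to judge how well separated the communities are. The sketch below is a standalone illustration, not Traverse's implementation; `modularity` is a hypothetical helper that assumes the same `source`/`target`/`weight` link layout used elsewhere in this notebook and ignores self-loops.

```python
from collections import defaultdict


def modularity(links, community_of):
    """Weighted modularity Q = sum_c [ L_c / m - (d_c / 2m)^2 ].

    L_c is the total weight of links inside community c, d_c is the summed
    weighted degree of its nodes, and m is the total link weight.
    """
    m = 0.0
    internal = defaultdict(float)
    degree = defaultdict(float)
    for lk in links:
        w = lk.get("weight", 1)
        cs = community_of[lk["source"]]
        ct = community_of[lk["target"]]
        m += w
        degree[cs] += w
        degree[ct] += w
        if cs == ct:
            internal[cs] += w
    if m == 0:
        return 0.0
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Toy graph: two triangles joined by a single bridge link.
pairs = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_links = [{"source": s, "target": t} for s, t in pairs]
toy_partition = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
q = modularity(toy_links, toy_partition)
print(round(q, 4))  # 0.3571
```

On the real graph, the partition could be read from the node dicts, for example `{pt["id"]: pt["community"] for pt in graph["points"]}`.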

## 4. Sanity Check

Sample an artist and its neighbors to verify the graph makes sense.

In [None]:
import random

sample_pt = random.choice(graph["points"])
sample_id = sample_pt["id"]
print(f"Sample node: {sample_pt}")
print()

# Find neighbors
id_to_pt = {pt["id"]: pt for pt in graph["points"]}
neighbors = []
for lk in graph["links"]:
    w = lk.get("weight", 1)
    if lk["source"] == sample_id:
        neighbors.append((lk["target"], w))
    elif lk["target"] == sample_id:
        neighbors.append((lk["source"], w))

neighbors.sort(key=lambda x: x[1], reverse=True)
print(f"{len(neighbors)} neighbors (top 10 by shared tags):")
for nid, w in neighbors[:10]:
    npt = id_to_pt.get(nid, {})
    print(f"  w={w}: {npt.get('label', nid)}")
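
The loop above scans every link for a single sample node. When checking many nodes, it may be cheaper to build an adjacency index once. The following is a small sketch assuming the same link layout; `build_adjacency` is a hypothetical helper, not part of Traverse.

```python
from collections import defaultdict


def build_adjacency(links):
    """Map each node id to its (neighbor_id, weight) pairs, heaviest first."""
    adjacency = defaultdict(list)
    for lk in links:
        w = lk.get("weight", 1)
        adjacency[lk["source"]].append((lk["target"], w))
        adjacency[lk["target"]].append((lk["source"], w))
    for nbrs in adjacency.values():
        nbrs.sort(key=lambda pair: pair[1], reverse=True)
    return dict(adjacency)


# Hypothetical toy links in the notebook's source/target/weight layout.
toy_links = [
    {"source": "a", "target": "b", "weight": 1},
    {"source": "a", "target": "c", "weight": 4},
    {"source": "b", "target": "c"},  # missing weight defaults to 1
]
adjacency = build_adjacency(toy_links)
print(adjacency["a"])  # [('c', 4), ('b', 1)]
```

Nodes without any links do not appear in the index, so `adjacency.get(node_id, [])` is the safe lookup.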

## 5. Export & Serve

Export the community graph JSON and start the Cosmograph server.

In [None]:
from traverse.graph.adapters_cosmograph import CosmographAdapter
from traverse.cosmograph.server import serve, _default_dist_dir

n_pts = len(graph["points"])
n_lks = len(graph["links"])
print(f"Graph: {n_pts:,} nodes, {n_lks:,} edges")

# Browser safety check — Cosmograph handles ~500K edges max in-browser
if n_lks > 500_000:
    print(f"WARNING: {n_lks:,} edges is too many for browser visualization.")
    print("Consider increasing MIN_SHARED_TAGS, lowering MAX_EDGES_PER_NODE, "
          "or setting a nonzero MAX_EDGES/MAX_NODES cap.")
    print("Skipping export. Adjust params and re-run.")
else:
    meta = {"clusterField": "community", "title": "Fuzz Archives"}
    out_path = _default_dist_dir() / "cosmo_artists_community.json"
    CosmographAdapter.write(graph, out_path, meta=meta)
    print()
    print("Starting server — open in browser:")
    print("  http://127.0.0.1:8080/?data=/cosmo_artists_community.json")
    print()
    print("Press Ctrl+C (or interrupt the kernel) to stop.")
    serve(port=8080)
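
If the edge count exceeds the browser budget and rebuilding is not convenient, one option is to keep only the heaviest links before exporting. This is a sketch under assumptions: `trim_to_edge_budget` is a hypothetical helper, it relies only on the `points`/`links` dict layout used in this notebook, and it leaves any node that loses all of its links in place as an isolated point.

```python
import heapq


def trim_to_edge_budget(graph, max_links):
    """Return a copy of the graph keeping at most max_links heaviest links."""
    links = graph["links"]
    if len(links) <= max_links:
        return graph
    kept = heapq.nlargest(max_links, links, key=lambda lk: lk.get("weight", 1))
    return {**graph, "links": kept}


# Hypothetical toy graph in the notebook's points/links layout.
toy_graph = {
    "points": [{"id": "a"}, {"id": "b"}, {"id": "c"}],
    "links": [
        {"source": "a", "target": "b", "weight": 1},
        {"source": "a", "target": "c", "weight": 5},
        {"source": "b", "target": "c", "weight": 3},
    ],
}
trimmed = trim_to_edge_budget(toy_graph, max_links=2)
print([lk["weight"] for lk in trimmed["links"]])  # [5, 3]
```

Because this trims after community detection, the community labels still reflect the full graph. Rebuilding with stricter parameters keeps the two consistent.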