# DraCor API: first steps

This short chapter shows how to list DraCor corpora and peek at a few plays via the HTTP API using a tiny helper (`utils/dracor_client.py`).

> **Learning goals**
> - Call the DraCor API from Python.
> - List available corpora and preview plays.
>
> **Requirements**
> - Basic Python; internet connection.
>
> **What you'll do**
> 1) Import a small API helper. 2) Fetch corpora. 3) Count plays in a few corpora and plot the counts.
>
> **Exercise (at the end)**
> - Repeat for different corpora; compare sizes and languages.
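
The contents of `utils/dracor_client.py` are not shown in this chapter. As a rough idea of what such a helper might look like, here is a minimal sketch that uses only the standard library. The base URL, the `http_get_json` function, the `fetch` parameter, and the `"plays"` key are assumptions made for illustration; the real helper may be organised differently.

```python
import json
from typing import Any, Callable
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the public DraCor API; adjust if your helper differs.
API_BASE = "https://dracor.org/api/v1"


def http_get_json(url: str, timeout: float = 30.0) -> Any:
    """Fetch a URL and decode its JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def corpora(fetch: Callable[[str], Any] = http_get_json) -> list:
    """Return the list of corpus records."""
    return fetch(f"{API_BASE}/corpora")


def corpus_plays(corpus_id: str, fetch: Callable[[str], Any] = http_get_json) -> list:
    """Return the play records of one corpus.

    Assumption: the corpus endpoint returns a JSON object whose "plays"
    key holds the list of plays; an empty list is returned if it is missing.
    """
    data = fetch(f"{API_BASE}/corpora/{quote(corpus_id, safe='')}")
    return data.get("plays", [])
```

Passing a custom `fetch` function (for example one that returns canned data) makes it possible to try the two functions without network access.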

In [None]:
# Make project root importable so `from utils import dracor_client` works
import sys
from pathlib import Path

# The notebook may be started from the project root or from docs/ (or the
# book's build dir), so look for utils/ in the current directory and its parent.
cwd = Path.cwd()
candidates = [
    cwd,                           # when running at project root
    cwd.parent,                    # when running in docs/
    # `__file__` exists only when this code runs as a script, not in a notebook
    Path(__file__).resolve().parents[1] if "__file__" in globals() else None,
]
for p in [c for c in candidates if c]:
    if (p / "utils" / "dracor_client.py").exists():
        sys.path.insert(0, str(p))
        break


In [None]:
# Setup (pretty display and imports)
import pandas as pd
pd.set_option("display.max_colwidth", 80)
pd.set_option("display.precision", 0)

from utils import dracor_client as dc
from datetime import datetime, timezone
# datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp
print("Last run:", datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC"))

In [None]:
# 1) Fetch all corpora
corpora = dc.corpora()
len(corpora), corpora[:2]  # quick sanity check (length + first 2 items)

In [None]:
# 2) Put corpora into a tidy DataFrame
df_corpora = pd.DataFrame(corpora)
# Different corpora may expose different metadata; keep safe columns.
keep = [c for c in ["id", "name", "description", "languages"] if c in df_corpora.columns]
df_corpora[keep].head().style.set_properties(**{"text-align": "left"})

## How many plays per corpus?
Let’s sample a handful of corpora and count how many plays each contains.

In [None]:
sample = corpora[:6]  # take the first six corpora for a quick preview
rows = []
for c in sample:
    cid = c.get("id")
    try:
        plays = dc.corpus_plays(cid)
        rows.append({"corpus": cid, "n_plays": len(plays)})
    except Exception as e:
        rows.append({"corpus": cid, "n_plays": None, "error": str(e)[:60]})
pd.DataFrame(rows)

In [None]:
# 3) Small bar chart of play counts for the sampled corpora (largest first)
import matplotlib.pyplot as plt
counts = pd.DataFrame(rows).dropna(subset=["n_plays"]).sort_values("n_plays", ascending=False)
plt.figure(figsize=(6, 3.5))
plt.bar(counts["corpus"], counts["n_plays"])
plt.title("Play counts (sampled corpora)")
plt.ylabel("# plays")
plt.xticks(rotation=45, ha="right")
plt.tight_layout()
plt.show()

## Exercise

1. Change `sample = corpora[:6]` to something like `sample = [c for c in corpora if c['id'] in ['ger', 'rus', 'ita']]`.
2. Re-run the counting cell. Which corpus has the most plays?
3. Optional: fetch the plays for one corpus (`dc.corpus_plays('ger')`) and list the first 5 titles as a table.
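
For the optional step 3, one possible approach is sketched below. It assumes that each play record is a dict that may carry a `title` key, which this chapter does not confirm, so inspect one record first. The function `first_titles` and the demo records are hypothetical and serve only to show the shape of the result.

```python
def first_titles(plays, n=5):
    """Return up to `n` rows holding the title of each play record.

    Assumption: records are dicts; a record without a "title" key
    yields None instead of raising KeyError.
    """
    return [{"title": p.get("title")} for p in plays[:n]]


# Made-up records, not real DraCor data.
demo = [{"title": "Play A"}, {"title": "Play B"}, {"name": "no-title"}]
rows_demo = first_titles(demo, n=2)
```

With real data, `pd.DataFrame(first_titles(dc.corpus_plays('ger')))` would render the rows as a table, provided the records do carry a `title` key.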

> **Takeaways**
> - You can access DraCor corpora and plays with a few lines of code.
> - Keeping results in tidy DataFrames makes quick summaries and plots trivial.
> - In later chapters we’ll navigate plays, characters, and networks in more detail.