A typed, ergonomic, read-only async Python client for the SCP Foundation Wiki's Crom GraphQL API.
thaumiel wraps Crom's GraphQL endpoint in a small, fully-typed surface: fetch pages, filter and sort them with a Python DSL, page through results without touching cursors, and budget your rate-limit quota — all async, all type-checked.
- Fully typed: Frozen Pydantic v2 models (Page, Author, Attribution, ...), checked under pyright strict.
- Ergonomic filter DSL: Build server-side filters with Python operators: (F.rating >= 100) & (F.tag == "scp"). Illegal filters raise at build time, not at the server.
- Automatic pagination: pages() is an async iterator that follows Crom's cursors for you; fetch_page_batch() exposes them when you want manual control.
- Costly-field provenance: Opt into expensive fields per call, and tell "not requested" apart from "server returned null" via page.requested(...).
- Quota estimation: estimate_* predicts a call's point cost before you spend it.
- Typed errors and optional retry: A ThaumielError hierarchy plus a configurable RetryPolicy with exponential backoff.
Requires Python 3.14+.
pip install thaumiel

import asyncio

from thaumiel import AsyncClient

async def main() -> None:
    async with AsyncClient() as client:
        # Crom stores SCP wiki URLs with the http:// scheme.
        page = await client.page("http://scp-wiki.wikidot.com/scp-173")
        if page is None:
            return
        print(page.title, page.rating)
        print(page.tags[:3])

asyncio.run(main())

SCP-173 10752.0
('autonomous', 'ectoentropic', 'euclid')

page() returns None (not an exception) when nothing matches, and takes either a url or a wikidot_id.
pages() streams every match, following pagination automatically. Combine F accessors into a filter and pass a Sort:
import asyncio

from thaumiel import AsyncClient, F, Sort, SortKey

async def main() -> None:
    # Highest-rated SCP articles on the English wiki.
    query = F.url.starts_with("http://scp-wiki.wikidot.com") & (F.tag == "scp")
    async with AsyncClient() as client:
        shown = 0
        async for page in client.pages(
            filter=query, sort=Sort.by(SortKey.RATING), page_size=5
        ):
            print(f"{page.rating:>6.0f} {page.title}")
            shown += 1
            if shown == 5:
                break

asyncio.run(main())

 10752 SCP-173
  7145 ●●|●●●●●|●●|●
  5544 SCP-049
  5240 SCP-____-J
  4790 SCP-096

Count matches without fetching them:
await client.count_pages(F.tag == "scp")  # -> 69916

Need the cursor yourself (checkpointing, UI paging)? fetch_page_batch() returns one PageBatch with .pages, .end_cursor, and .has_next_page.

Each F accessor exposes only the operators its field supports; an unsupported operator or a wrong-typed value raises InvalidPredicateError immediately.
| Accessor | Field type | Operators |
|---|---|---|
| F.url | prefix string | == != .starts_with() |
| F.title | string (case-insensitive) | == != .eq_lower() .neq_lower() .starts_with() .starts_with_lower() |
| F.author | string (case-insensitive) | same as F.title; matches an attribution's display name |
| F.category | string | == != |
| F.rating | int | == != < <= > >= |
| F.created_at | datetime | == != < <= > >= |
| F.is_hidden, F.is_user_page | bool | == != |
| F.tag | tag set | == (has) != (lacks) .all_of() .any_of() .none_of() |

Combine predicates with & (and), | (or), and ~ (not).
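
As a rough illustration of how such a DSL can work, the sketch below overloads `&`, `|`, and `~` so that they build a tree rather than evaluate anything, and rejects wrong-typed values as soon as a comparison is written. All class names here (`Predicate`, `Compare`, `And`, `Or`, `Not`, `IntField`, `TagField`) are hypothetical, and a plain `TypeError` stands in for the library's own error type; thaumiel's internals may differ.

```python
from dataclasses import dataclass


class Predicate:
    """Base node: the operators build a tree, they do not evaluate anything."""

    def __and__(self, other: "Predicate") -> "Predicate":
        return And(self, other)

    def __or__(self, other: "Predicate") -> "Predicate":
        return Or(self, other)

    def __invert__(self) -> "Predicate":
        return Not(self)


@dataclass(frozen=True)
class Compare(Predicate):
    field: str
    op: str
    value: object


@dataclass(frozen=True)
class And(Predicate):
    left: Predicate
    right: Predicate


@dataclass(frozen=True)
class Or(Predicate):
    left: Predicate
    right: Predicate


@dataclass(frozen=True)
class Not(Predicate):
    inner: Predicate


class IntField:
    """Hypothetical accessor for an integer field; only >= is supported here."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __ge__(self, value: int) -> Compare:
        # Reject wrong-typed values while the filter is being built.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{self.name} expects an int, got {value!r}")
        return Compare(self.name, "gte", value)


class TagField:
    """Hypothetical accessor for a tag set; == means "has this tag"."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __eq__(self, value: object) -> Compare:  # type: ignore[override]
        if not isinstance(value, str):
            raise TypeError(f"{self.name} expects a str, got {value!r}")
        return Compare(self.name, "has", value)


rating, tag = IntField("rating"), TagField("tag")
tree = (rating >= 100) & ~(tag == "scp")
print(tree)
```

Giving each accessor only the dunder methods its field supports is one way to make an unsupported operator fail on the line where the filter is written, before any request is sent.
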
Warning
Because & | ~ are Python's bitwise operators, they bind tighter than comparisons such as == and >=. Parenthesize every comparison:

(F.rating >= 100) & (F.tag == "scp")  # correct
F.rating >= 100 & F.tag == "scp"      # WRONG: parsed as F.rating >= (100 & F.tag) == "scp"

A predicate lowers to Crom's GraphQL input only when a request is issued, but you can inspect it:

(F.rating >= 100).compile().model_dump(by_alias=True, exclude_unset=True)
# {'onWikidotPage': {'rating': {'gte': 100}}}

Some fields cost extra rate-limit points and are opt-in per call. A field you don't request stays None; some can be None even when requested, so page.requested(...) disambiguates.

from thaumiel import CostlyField

page = await client.page(
    "http://scp-wiki.wikidot.com/scp-173",
    source=True,
    attributions=True,
)
print(len(page.source))  # 1680
print(page.summary)  # None
print(page.requested(CostlyField.SUMMARY))  # False: we never asked for it
credit = page.attributions[0]
print(credit.type.value, credit.user_display_name)  # AUTHOR Moto42

Crom meters usage in points (reported via the x-ratelimit-remaining header; the ceiling is 300000). Estimate before you spend; costly fields in pages() are billed per page:

from thaumiel import estimate_count, estimate_page, estimate_pages
estimate_page(source=True, attributions=True) # 4
estimate_count() # 2
estimate_pages(page_size=100, source=True)  # 200

Every error subclasses ThaumielError:

from thaumiel import GraphQLError, RateLimitError, TransportError
try:
    page = await client.page(url)
except RateLimitError as exc:  # HTTP 429 (a subclass of TransportError)
    ...
except TransportError as exc:  # other HTTP/network failure; .status_code, .cause
    ...
except GraphQLError as exc:  # query-level errors; .errors
    ...

Every call is read-only and idempotent, so retrying is safe. RetryPolicy backs off exponentially on rate limits (and optionally on 5xx):

from thaumiel import AsyncClient, RetryPolicy
policy = RetryPolicy(max_attempts=4, backoff=0.5)
async with AsyncClient() as client:
    # Pass a factory, not a coroutine: a retry needs a fresh awaitable.
    page = await policy.run(lambda: client.page("http://scp-wiki.wikidot.com/scp-173"))

from thaumiel import AsyncClient

client = AsyncClient(
    user_agent="my-app/1.0 (me@example.com)",  # good Crom etiquette
    timeout=30.0,
)

For full control (connection limits, event hooks, observing quota headers), inject your own httpx.AsyncClient. thaumiel will not close a client it did not create:

import httpx
from thaumiel import AsyncClient
http = httpx.AsyncClient(headers={"User-Agent": "my-app/1.0"})
client = AsyncClient(http_client=http)
# ... use client ...
await http.aclose()  # you own it; you close it

More end-to-end scripts live in examples/.
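
The earlier note that policy.run() wants a factory rather than a coroutine follows from how Python coroutines behave: a coroutine object can be awaited only once. The sketch below shows a generic retry loop with exponential backoff built on that idea. It is not thaumiel's RetryPolicy: `run_with_retry`, `Transient`, `make_flaky`, and `reuse_is_rejected` are hypothetical names, and the doubling schedule is an assumption about what "exponential" means here.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import TypeVar

T = TypeVar("T")


class Transient(Exception):
    """Hypothetical stand-in for a retryable failure such as a rate limit."""


async def run_with_retry(
    factory: Callable[[], Awaitable[T]],
    max_attempts: int = 4,
    backoff: float = 0.5,
) -> T:
    """Await factory() until it succeeds or max_attempts is exhausted."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        attempt += 1
        try:
            # Each call to factory() creates a brand-new awaitable.
            return await factory()
        except Transient:
            if attempt == max_attempts:
                raise
            # Assumed schedule: backoff, 2*backoff, 4*backoff, ...
            await asyncio.sleep(backoff * 2 ** (attempt - 1))


def make_flaky(failures: int) -> Callable[[], Awaitable[str]]:
    """Build a call that raises Transient `failures` times, then succeeds."""
    calls = 0

    async def call() -> str:
        nonlocal calls
        calls += 1
        if calls <= failures:
            raise Transient(f"attempt {calls} failed")
        return f"ok after {calls} calls"

    return call


async def reuse_is_rejected() -> bool:
    """Show that one coroutine object cannot be awaited a second time."""
    coro = make_flaky(0)()
    await coro
    try:
        await coro
    except RuntimeError:
        return True
    return False


if __name__ == "__main__":
    print(asyncio.run(run_with_retry(make_flaky(2), backoff=0.01)))
    print(asyncio.run(reuse_is_rejected()))
```

Passing a zero-argument callable keeps the retry loop independent of what is being retried: the loop decides when to try again, and the caller decides how a fresh attempt is created.
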
- Read-only: thaumiel offers no writes.
- Async only: There is no synchronous client.
- Wikidot pages only: pages() skips non-Wikidot nodes (e.g. RuFoundation), so it can yield fewer rows than count_pages reports for the same filter.
- Curated filter surface: Only the fields in the table above are filterable, and some support equality only.
- Quota-bound: Requests cost points against Crom's quota; budget with estimate_*.
- Alpha: While on 0.x, the public API may change before 1.0.
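
As a worked example of budgeting, the arithmetic below reuses only figures quoted earlier in this README: the 300000-point ceiling, the 200-point estimate for estimate_pages(page_size=100, source=True), and the sample count of 69916. It assumes that the 200 points cover one batch of 100 pages, that a final partial batch is charged like a full one, and that estimates match what Crom actually bills, so treat the result as an upper bound rather than a guarantee. The helper names are hypothetical.

```python
import math

QUOTA_CEILING = 300_000  # point ceiling quoted in the text
BATCH_COST = 200         # estimate_pages(page_size=100, source=True)
PAGE_SIZE = 100
MATCHES = 69_916         # sample count_pages(F.tag == "scp") result


def crawl_cost(matches: int, page_size: int, batch_cost: int) -> int:
    """Upper bound on points: every batch is charged as if it were full."""
    return math.ceil(matches / page_size) * batch_cost


def max_rows(quota: int, page_size: int, batch_cost: int) -> int:
    """Most rows a given quota can cover, counted in whole batches."""
    return (quota // batch_cost) * page_size


cost = crawl_cost(MATCHES, PAGE_SIZE, BATCH_COST)
print(cost, f"{cost / QUOTA_CEILING:.0%}")             # 140000 47%
print(max_rows(QUOTA_CEILING, PAGE_SIZE, BATCH_COST))  # 150000
```

Since pages() can yield fewer rows than count_pages reports, the count serves here only as a ceiling on the number of rows to plan for.
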
MIT — see LICENSE.
