# reading the a11yhood data

this document describes the existing approaches for reading data from the a11yhood `supabase` database.

## TLDR

* seed scripts are for local dev/test data and assume a writable database
* direct `supabase` access needs shared auth via `SUPABASE_URL` and `SUPABASE_KEY`
* the `a11yhood` rest api is the lowest friction option when credentials are unavailable
* the public api may enforce rate limits, so cache responses and pace bulk pulls, for example with a caching client like `hishel`
* a bulk archive would be best for offline or large pulls


In [1]:
import hishel.httpx, httpx, pandas, operator, os, supabase, dotenv  


there appear to be three ways to read the data:

1. run the `seed_scripts` base scripts
2. use the `supabase` api directly
3. scrape the `https://a11yhood-backend.cs.washington.edu/api/products` url

## seed scripts

- Runtime: Python 3 in the backend env (run via `uv run python ...` or `python ...`).
- Dependencies: database access plus the backend packages (notably `python-dotenv`, `sqlalchemy`, and `supabase` used across the scripts).
- Environment variables: `ENV_FILE` (defaults to `.env.test`), `DATABASE_URL` (required by DB-backed scripts), and optional `LIBRARYTHING_API_KEY` (for LibraryThing config).
- System constraints: database/schema must exist and be writable; scripts expect local dev/test configuration and will read from the env file when present.
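
The environment handling listed above can be sketched roughly as follows. This is an illustration, not the seed scripts' actual code: the helper names `load_env_file` and `resolve_seed_config` are hypothetical, and only the variable names and the `.env.test` default come from the list above.

```python
import os
from typing import Mapping, Optional


def load_env_file(path: str) -> None:
    """Load variables from `path` into os.environ (assumes python-dotenv is installed)."""
    import dotenv  # listed above as a backend dependency

    dotenv.load_dotenv(path)


def resolve_seed_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect the settings a DB-backed seed script needs (hypothetical helper)."""
    env = os.environ if env is None else env
    database_url = env.get("DATABASE_URL")
    if not database_url:
        # DB-backed scripts cannot run without a database to write to
        raise RuntimeError("DATABASE_URL is required by DB-backed seed scripts")
    return {
        "env_file": env.get("ENV_FILE", ".env.test"),  # default named in the list above
        "database_url": database_url,
        "librarything_api_key": env.get("LIBRARYTHING_API_KEY"),  # optional
    }
```

A script would typically call `load_env_file(os.environ.get("ENV_FILE", ".env.test"))` first and then `resolve_seed_config()`, so a missing `DATABASE_URL` fails early with a clear message.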

## connecting with `supabase` directly

the seed scripts are wrappers around `supabase` so we still need auth to access the data.

In [2]:
# creates a supabase client from SUPABASE_URL and SUPABASE_KEY (read from the environment or a local .env file)
import os, supabase, operator, dotenv
dotenv.load_dotenv()
client: supabase.Client = supabase.create_client(*operator.itemgetter("SUPABASE_URL", "SUPABASE_KEY")(os.environ))
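
Once a client exists, reading rows might look like the sketch below. The table name `products` is an assumption borrowed from the REST endpoint path, not a confirmed schema name, and `connect` and `fetch_rows` are hypothetical helpers. The imports are deferred so the query helper can be exercised without credentials.

```python
import os


def connect():
    """Create a client from SUPABASE_URL / SUPABASE_KEY, as in the cell above."""
    import dotenv
    import supabase

    dotenv.load_dotenv()
    return supabase.create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])


def fetch_rows(client, table: str = "products", limit: int = 5) -> list:
    """Return up to `limit` rows from `table` ("products" is an assumed table name)."""
    response = client.table(table).select("*").limit(limit).execute()
    return response.data  # list of dicts, one per row
```

With valid credentials, `fetch_rows(connect())` should return a small list of row dicts that can be passed straight to `pandas.DataFrame`.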

## scraping with `httpx` and `hishel`

Expect rate limits on the public endpoint; use caching and backoff for repeat or large requests.
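
As a rough illustration of the backoff half of that advice, the sketch below retries on HTTP 429. It assumes the endpoint signals rate limiting with status 429 and an optional numeric `Retry-After` header, which this document has not verified; `get_with_backoff` is a hypothetical helper that works with any object exposing `status_code` and `headers`, such as an `httpx` response.

```python
import random
import time
from typing import Callable


def get_with_backoff(
    send: Callable[[], object],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
):
    """Call `send()` until it returns a non-429 response or attempts run out.

    Expects max_attempts >= 1. Returns the last response, which may still be a 429.
    """
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        # Honor Retry-After when it is whole seconds; otherwise use exponential backoff with jitter.
        retry_after = response.headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)
        else:
            delay = base_delay * 2 ** attempt + random.uniform(0, 0.5)
        sleep(delay)
    return response
```

With a synchronous client this could be called as `get_with_backoff(lambda: client.get(url))`. In an async notebook cell the same loop applies, with `await asyncio.sleep(delay)` in place of `time.sleep`.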

In [3]:

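# top-level `async with`/`await` works in a notebook cell; in a plain script, wrap this in a coroutine and call asyncio.run
async with hishel.httpx.AsyncCacheClient() as client:
    response = await client.get("https://a11yhood-backend.cs.washington.edu/api/products")
    response.raise_for_status()  # fail loudly on an error status instead of parsing an error body
    display((data := pandas.DataFrame(response.json())).head())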

Unnamed: 0,name,description,source,source_url,type,image_url,image_alt,external_id,tags,source_last_updated,...,banned_at,average_rating,rating_count,display_rating,source_rating,source_rating_count,computed_rating,stars,urls,editor_ids
0,Table-mount Vise (0-20mm range),I designed this vise as a side project but it ...,Thingiverse,https://www.thingiverse.com/thing:7298351,Fabrication,https://cdn.thingiverse.com/assets/f2/d6/7c/71...,,7298351,"[assistive device, engineering, movable, screw...",2026-02-18T16:31:43Z,...,,,0,,,,,0,[],[]
1,Plushie Toy Crutches,All of the toy crutch models I could find on T...,Thingiverse,https://www.thingiverse.com/thing:7298096,Fabrication,https://cdn.thingiverse.com/assets/81/40/05/db...,,7298096,"[play pretend, plushie, plushie accessories, p...",2026-02-18T06:21:39Z,...,,,0,,,,,0,[],[]
2,Ascension,A full-featured intelligent voice assistant bu...,GitHub,https://github.com/Ascension-Yugi/Ascension,Software,https://avatars.githubusercontent.com/u/140940...,,1158367912,[C#],2026-02-15T08:57:57Z,...,,,0,,,3.0,,3,[],[]
3,indextts2-rust,High-performance Rust implementation of IndexT...,GitHub,https://github.com/DevMan57/indextts2-rust,Software,https://avatars.githubusercontent.com/u/228857...,,1136766559,[Rust],2026-02-11T11:49:52Z,...,,,0,,,3.0,,3,[],[]
4,gesture-control-with-voice-command,my first big project in python,GitHub,https://github.com/Zaniac25/gesture-control-wi...,Software,https://avatars.githubusercontent.com/u/150893...,,1032011228,[Python],2026-02-16T16:56:34Z,...,,,0,,,3.0,,3,[],[]
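
Until a bulk archive exists, one way to avoid repeated pulls is to keep a local snapshot of the records. The sketch below assumes the endpoint returns a JSON list of objects, as the DataFrame above suggests, and stores it as gzip-compressed JSON Lines; the helper names and the file name are hypothetical.

```python
import gzip
import json
from pathlib import Path


def write_snapshot(records, path) -> Path:
    """Write one JSON object per line to a gzip-compressed file."""
    path = Path(path)
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path


def read_snapshot(path) -> list:
    """Read the records back as a list of dicts, skipping blank lines."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

For example, `write_snapshot(response.json(), "a11yhood_products.jsonl.gz")` after a successful pull, then `pandas.DataFrame(read_snapshot("a11yhood_products.jsonl.gz"))` for offline work.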


## conclusion

Direct `supabase` access is the fastest route for quick reads once you have credentials, the seed scripts populate local dev/test data, and the public REST endpoint works when credentials are unavailable.
Choose an approach based on the access level you need, the latency you can accept, and how much setup you can tolerate.