Eduba House Recommendation Engine

A learning prototype recommender and insights engine for Eduba House, a connection venture (book clubs, founder talks, workshops, gatherings) built on a 5-layer Connection Framework: Trust → Productive Friction → The Container → Repeat Architecture → Scale Paradox.

One model, three jobs:

People matching — given a member, suggest other members they're likely to connect with.
Event suggestions — given a member, suggest upcoming events they'd attend (respecting their relationship stage).
Management insights — framework-grounded operational views: churn risk, empty rooms, clique traps, scale-collapse risk, topic supply/demand, segments.

Quick start

# 1. Install
pip install -r requirements.txt

# 2. Generate the synthetic dataset (~4 seconds, seeded, reproducible)
python -m src.generate

# 3. Train and compare all four recommenders end-to-end
python -m src.evaluate

# 4. Run the framework-grounded operational report
python -m src.insights

That's the whole loop. The four CSVs in data/synthetic/ (members, events, attendance, connections) are regenerated from seed=42 — they're gitignored on purpose.

What you'll see

`python -m src.evaluate`

Precision@k / recall@k across four methods on the same chronological split:

method	event P@5	event R@5	match P@5	match R@5
popularity (baseline)	0.018	0.041	0.022	0.043
content (interest cosine + stage gate)	0.118	0.304	0.012	0.026
collaborative (ALS + Adamic-Adar)	0.026	0.047	0.246	0.700
hybrid (late-fusion)	0.118	0.304	0.245	0.699

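Content leads event recs: its P@5 is about 6.5× the popularity baseline (0.118 vs. 0.018). Collaborative leads matching: its P@5 is about 11× the baseline (0.246 vs. 0.022) and about 20× content (0.012). Hybrid ties content on events and lands within 0.001 of collaborative on matching, so one model covers both jobs.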
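For readers new to the metrics in the table, here is a minimal sketch of precision@k and recall@k for a single member. The function name and the event ids are illustrative assumptions; the actual harness in src/evaluate.py may compute and average these differently.

```python
def precision_recall_at_k(recommended, relevant, k=5):
    """Return (precision@k, recall@k) for one member.

    recommended: ranked list of item ids (best first).
    relevant:    set of held-out items the member actually chose.
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ids: five recommended events, two held-out attended events.
recs = ["e17", "e03", "e42", "e08", "e21"]
held_out = {"e03", "e99"}
print(precision_recall_at_k(recs, held_out, k=5))  # (0.2, 0.5)
```

One hit in five slots gives P@5 = 0.2; one of two held-out events recovered gives R@5 = 0.5.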

`python -m src.insights`

Eight named views, all derived from the Connection Framework:

Churn risk — members active in [T-6, T-3] but silent in [T-3, T]; not Recognition-stage
Empty Room — high-friction events (level ≥ 3) that didn't half-fill
Shallow Pool — series with return rate below 40%
Clique Trap — closed sub-communities (< 5% events shared with outsiders)
Founder Bottleneck — any host hosting > 30% of attended events
Scale Collapse — events over capacity, or series whose average attendance exceeds the 20-person small-group cap
Topic supply vs. demand — under- and over-served interests
Member segments — KMeans clusters on ALS user-factors, labeled by modal persona
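As an illustration of how one of these views can be expressed, the churn-risk rule above might be sketched as follows. This is not the code in src/insights.py: it assumes the window unit is months, treats the boundary T-3 as part of the recent window, and uses the Recognition cutoff (fewer than 3 events) from the stage definitions later in this README. The function name and the data are hypothetical.

```python
from collections import defaultdict


def churn_risk(attendance, now, min_events=3):
    """Flag members active in [now-6, now-3) but silent in [now-3, now].

    attendance: iterable of (member_id, month_index) pairs.
    Members with fewer than min_events total are treated as
    Recognition-stage and skipped (assumed cutoff).
    """
    months = defaultdict(list)
    for member, month in attendance:
        months[member].append(month)

    flagged = []
    for member, seen in months.items():
        if len(seen) < min_events:
            continue  # Recognition-stage: excluded from churn risk
        earlier = any(now - 6 <= m < now - 3 for m in seen)
        recent = any(now - 3 <= m <= now for m in seen)
        if earlier and not recent:
            flagged.append(member)
    return sorted(flagged)


# Hypothetical attendance log, with "now" at month 12.
attendance = [
    ("a", 6), ("a", 7), ("a", 8),   # active earlier, then silent
    ("b", 7), ("b", 8), ("b", 11),  # still active recently
    ("c", 7),                       # only one event: Recognition-stage
]
print(churn_risk(attendance, now=12))  # ['a']
```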

The headline number: ALS recovers 66% of the hidden persona structure without ever seeing persona labels. The embeddings are learning real community structure, not memorizing.
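The README does not say which metric sits behind that percentage. One plausible reading, given that clusters are labeled by modal persona, is a purity-style score: label each cluster with its most common persona and count how many members match their cluster's label. The sketch below shows that calculation under this assumption; the function name, cluster ids, and persona names are hypothetical.

```python
from collections import Counter, defaultdict


def modal_labels_and_purity(cluster_ids, personas):
    """Label each cluster by its most common persona and return
    (labels, purity), where purity is the share of members whose
    persona equals their cluster's modal persona."""
    by_cluster = defaultdict(list)
    for cluster, persona in zip(cluster_ids, personas):
        by_cluster[cluster].append(persona)

    labels = {}
    matched = 0
    for cluster, members in by_cluster.items():
        persona, count = Counter(members).most_common(1)[0]
        labels[cluster] = persona
        matched += count
    return labels, matched / len(personas)


# Hypothetical data: 8 members in 2 clusters, 2 personas.
clusters = [0, 0, 0, 0, 1, 1, 1, 1]
personas = ["founder", "founder", "founder", "reader",
            "reader", "reader", "reader", "founder"]
labels, purity = modal_labels_and_purity(clusters, personas)
print(labels, purity)  # {0: 'founder', 1: 'reader'} 0.75
```

Other choices, such as adjusted Rand index or normalized mutual information, would give different numbers for the same clustering, so the metric matters when comparing runs.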

Why this is interesting

Eduba House isn't a generic events platform — it operates on a deliberate connection-architecture thesis. That thesis is baked into the system:

Container intimacy is signal. Co-attendance in an 8-seat dinner is worth ~5× co-attendance in a 40-seat panel. The collaborative recommender weights ALS confidence and graph edges by 1/capacity, encoding this directly.
Members have a stage, not just interests. Recognition (0–2 events) → Acquaintance (3–9) → Friendship (10–19) → Community (20+). The content recommender gates high-friction events against the member's current stage so it doesn't push a vulnerability retreat at someone who's done two coffee meetups.
Insights map to named failure modes. Empty Room, Shallow Pool, Clique Trap, Founder Bottleneck, Scale Collapse — these are the framework's own taxonomy, not generic dashboard metrics.
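The first two points can be sketched in a few lines. The 1/capacity weight and the stage boundaries come from the text above; everything else is an assumption for illustration, in particular the `MAX_FRICTION` mapping from stage to the highest friction level allowed, which this README does not specify. The real values live in src/config.py.

```python
from collections import defaultdict
from itertools import combinations


def intimacy_weight(capacity):
    """Smaller rooms count for more: weight is 1 / capacity."""
    return 1.0 / capacity


def co_attendance_edges(events):
    """events: list of (capacity, attendee_ids).
    Returns {(a, b): summed intimacy weight} for each co-attending pair."""
    edges = defaultdict(float)
    for capacity, attendees in events:
        w = intimacy_weight(capacity)
        for a, b in combinations(sorted(attendees), 2):
            edges[(a, b)] += w
    return dict(edges)


def stage(events_attended):
    """Map an attendance count to the member's relationship stage."""
    if events_attended <= 2:
        return "Recognition"
    if events_attended <= 9:
        return "Acquaintance"
    if events_attended <= 19:
        return "Friendship"
    return "Community"


# Hypothetical gate: highest friction level each stage may be shown.
MAX_FRICTION = {"Recognition": 1, "Acquaintance": 2,
                "Friendship": 3, "Community": 4}


def allowed(events_attended, friction_level):
    """True if an event's friction level fits the member's stage."""
    return friction_level <= MAX_FRICTION[stage(events_attended)]


# An 8-seat dinner weighs 5x a 40-seat panel.
print(intimacy_weight(8) / intimacy_weight(40))
# Two coffee meetups do not unlock a level-3 event; twelve events do.
print(allowed(2, 3), allowed(12, 3))  # False True
```

Summing weights per pair means two members who keep meeting in small rooms build a much heavier edge than two who only share large panels.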

Project layout

.
├── CLAUDE.md                # full spec — schema, personas, formula, definitions, API contract
├── DEVELOPMENT_NOTES.md     # running decision log
├── README.md                # this file
├── requirements.txt
├── data/synthetic/          # generated CSVs (gitignored — reproducible from seed)
└── src/
    ├── config.py            # single source of truth: sizes, weights, thresholds
    ├── generate.py          # personas → members → events → attendance → connections
    ├── evaluate.py          # chronological split, edge holdout, P@k / R@k harness
    ├── insights.py          # the eight framework-grounded views
    └── methods/
        ├── base.py          # Recommender API contract
        ├── popularity.py    # baseline
        ├── content.py       # interest cosine + stage-friction gate
        ├── collaborative.py # implicit ALS + Adamic-Adar on intimacy-weighted graph
        └── embeddings.py    # late-fusion hybrid

For the full specification — schema types, persona definitions, attendance-probability formula, evaluation methodology — see CLAUDE.md. For the running narrative of decisions and what changed when, see DEVELOPMENT_NOTES.md.

Tweaking the dataset

Everything tunable lives in src/config.py: dataset size, persona definitions, generator weights, friction levels, insight thresholds. Change a value, re-run `python -m src.generate`, then re-run `python -m src.evaluate`. The whole loop takes under 10 seconds.

The current settings (1500 members, 500 events, seed 42) produce a dataset where 77% of connection edges link same-persona members vs. 17% chance — the hidden signal recommenders are graded against without ever being shown.
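To make the comparison concrete, here is a rough sketch of how a same-persona edge share and a chance baseline could be computed. It assumes "chance" means the probability that two distinct members drawn at random share a persona; the README does not state how its 17% figure was derived. The function names and data are hypothetical.

```python
from collections import Counter


def same_persona_share(edges, persona_of):
    """Fraction of connection edges whose endpoints share a persona."""
    same = sum(1 for a, b in edges if persona_of[a] == persona_of[b])
    return same / len(edges)


def chance_share(persona_of):
    """Probability that two distinct random members share a persona."""
    counts = Counter(persona_of.values())
    n = len(persona_of)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


# Hypothetical data: four members, two personas, three edges.
persona_of = {"a": "founder", "b": "founder", "c": "reader", "d": "reader"}
edges = [("a", "b"), ("c", "d"), ("a", "c")]
print(same_persona_share(edges, persona_of))  # 2 of 3 edges match
print(chance_share(persona_of))               # 4 of 12 ordered pairs match
```

The gap between the two numbers is the homophily signal: the larger it is, the more persona structure there is for a recommender to recover from edges alone.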

Status

Prototype. Synthetic data only; no real Eduba House data flows through this yet. The methods, eval harness, and insights work end-to-end and produce interpretable results. Next likely directions: a joint-trained PyTorch embedding model to exceed late-fusion, real-data integration once Eduba House has enough operational history, and a UI for the insights output.
