Crowdsourcing

Chronic Tinkerer edited this page May 6, 2026 · 1 revision

Status: Planned. Not yet implemented. This page documents the design direction so consumer addons and contributors can make compatible choices today.

The pitch

LibCodex's bundled seed is generated offline from DBC dumps and Wowhead scrapes. That covers static data well, but a lot of useful information only surfaces when a real player actually plays:

  • Level-scaling labels. Quest names that read differently depending on your character's level.
  • Faction-specific names. Some NPCs show different names to Alliance and Horde players.
  • Spawn-only data. NPC level ranges, classifications, and spawn coordinates for content that's gated behind quests, instances, or rare timers.
  • Loot tables. Drop sources confirmed empirically rather than scraped.
  • Connected realms. The active realm graph changes as Blizzard connects and merges realms.

The Runtime adapter already captures all of this into LibCodexDB per character. The crowdsource pipeline is the planned mechanism for feeding that captured data back into the published bundle, so every user benefits from every other user's exploration.

Sketch of the pipeline

The current direction (subject to revision):

  1. Desktop uploader app. A small native tool that reads WTF/.../SavedVariables/LibCodex.lua, anonymizes per-character state, and uploads it to a server.
  2. Server-side merge. Aggregate uploaded fragments across users. De-duplicate by id. Apply the same merge rules as the in-game library: bundled / handcrafted entries are protected; runtime data fills gaps.
  3. Bake-and-publish. Fold high-confidence aggregated data back into the bake tool's input set. The next published release of LibCodex carries it as bundled data for everyone.
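
As a rough illustration of step 2, here is a minimal Python sketch of a gap-filling merge keyed by id. Nothing here is LibCodex code: the function name `merge_fragments`, the dict-of-dicts payload shape, and the field names in the usage note are assumptions made for the example. Letting the first runtime report win is only a placeholder, since conflict resolution is still an open question.

```python
from typing import Any

Entry = dict[str, Any]


def merge_fragments(
    bundled: dict[int, Entry], fragments: list[dict[int, Entry]]
) -> dict[int, Entry]:
    """Fold uploaded runtime fragments into a copy of the bundled catalog.

    Hypothetical helper: entries are de-duplicated by id, values that
    already exist are never overwritten, and runtime data only fills gaps.
    """
    # Copy each entry so the bundled input is left untouched.
    merged = {entry_id: dict(entry) for entry_id, entry in bundled.items()}
    for fragment in fragments:
        for entry_id, reported in fragment.items():
            target = merged.setdefault(entry_id, {"sources": []})
            contributed = False
            for field, value in reported.items():
                if field == "sources":
                    continue  # provenance is recorded below, not copied
                if field not in target:
                    target[field] = value
                    contributed = True
            sources = target.get("sources", [])
            # Record provenance only when the fragment actually added data.
            if contributed and "runtime" not in sources:
                target["sources"] = [*sources, "runtime"]
    return merged
```

Given a bundled entry `{1: {"name": "Hogger", "sources": ["bundled"]}}` and an upload that reports a different name plus a `level`, the merged entry keeps the bundled name, gains the level, and lists both `"bundled"` and `"runtime"` in `sources`.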

The library code makes no assumptions about this pipeline yet. There's nothing to configure in the addon today.

How this affects schema decisions today

Crowdsourcing changes which data shapes are good and bad to add to a module. The current rules of thumb:

  • Keep per-character state OFF the catalog. Per-character flags like "this character has completed this quest" or "this character has this mount" do not belong in the shared bundle. They go in the consumer addon's own SavedVariables, not LibCodex's.
  • Per-faction / per-realm / per-region data is fine when it's a property of the world, not of a player. NPC names that differ between Alliance and Horde characters describe the world. A character's reputation amount with a faction does not.
  • Provenance matters. Every entry's sources array carries where the data came from ("bundled", "runtime", custom adapter names). Crowdsource aggregation will use this to weight conflicting reports.
  • Locked fields stick. _handcrafted = true and _locked = { "field1", ... } mark fields that runtime data can never overwrite. Crowdsource ingestion will respect the same locks.
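
The lock and provenance rules above could be honored by an ingestion step along these lines. This is a sketch under stated assumptions, not the library's merge code: `is_protected` and `apply_report` are hypothetical names, and it assumes one particular reading of the flags, where `_locked` protects the listed fields and `_handcrafted` protects every field the entry already sets while still allowing gaps to be filled.

```python
from typing import Any

Entry = dict[str, Any]


def is_protected(entry: Entry, field: str) -> bool:
    """Return True if incoming data must not write this field (assumed rules)."""
    if field in entry.get("_locked", ()):
        return True
    # Assumption: a handcrafted entry protects whatever it already defines.
    return bool(entry.get("_handcrafted")) and field in entry


def apply_report(entry: Entry, report: Entry, source: str) -> Entry:
    """Return a copy of entry with the unprotected fields of report applied."""
    updated = dict(entry)
    changed = False
    for field, value in report.items():
        if field.startswith("_") or field == "sources":
            continue  # reports cannot set locks or provenance directly
        if is_protected(entry, field):
            continue
        if updated.get(field) != value:
            updated[field] = value
            changed = True
    # Append the source name once, and only if it contributed something.
    if changed and source not in updated.get("sources", []):
        updated["sources"] = [*updated.get("sources", []), source]
    return updated
```

With an entry that has `_locked = ["name"]`, a report carrying both a new name and a new level changes only the level and appends the reporting source to `sources`.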

Privacy considerations (non-binding, just direction)

The plan is to anonymize per-character data before upload: strip realm and character names from the payload and keep only the world-describing fields. The library does not enforce this in code yet because the upload mechanism doesn't exist. When it lands, uploading will be opt-in, so it stays off unless the user enables it.
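
One way the anonymization step might look is an allow-list filter, sketched below in Python. The payload shape (`character`, `realm`, `entries`) and the field names in `WORLD_FIELDS` are invented for illustration, since the real upload format has not been designed. An allow-list is used rather than a deny-list so that any field nobody thought about is dropped by default.

```python
from typing import Any

# Hypothetical allow-list of world-describing fields; not a LibCodex schema.
WORLD_FIELDS = frozenset(
    {"name", "level_min", "level_max", "classification", "coords", "faction"}
)


def anonymize(payload: dict[str, Any]) -> dict[str, Any]:
    """Return an upload-safe copy containing only allow-listed entry fields.

    Top-level keys such as character and realm names are never copied;
    only the "entries" table survives, filtered field by field.
    """
    clean_entries = {}
    for entry_id, entry in payload.get("entries", {}).items():
        kept = {k: v for k, v in entry.items() if k in WORLD_FIELDS}
        if kept:  # skip entries that had nothing world-describing
            clean_entries[entry_id] = kept
    return {"entries": clean_entries}
```

A payload that carries `character` and `realm` keys plus a per-entry flag like `completed_by_me` comes out with only the `entries` table and only the allow-listed fields inside it.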

Questions worth keeping open

  • Conflict resolution. When two users report different values for the same field, what wins? Newer wins? Most-common wins? Highest-trust source wins?
  • Trust model. Do uploaders need accounts? Anonymous? Reputation-weighted?
  • Push cadence. How often does the published bundle re-bake? Per major patch? On-demand?
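
To make the conflict-resolution options concrete, the three candidate policies can be written as interchangeable functions over the same list of reports. This is a Python sketch only: the `Report` type, its `timestamp` and `trust` fields, and the function names are assumptions for the example, and no policy has been chosen.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Hashable


@dataclass(frozen=True)
class Report:
    """One user's reported value for a single field (hypothetical shape)."""
    value: Hashable
    timestamp: int  # e.g. seconds since epoch when the value was observed
    trust: float    # uploader trust weight; how it is computed is undecided


def newest_wins(reports: list[Report]) -> Hashable:
    return max(reports, key=lambda r: r.timestamp).value


def most_common_wins(reports: list[Report]) -> Hashable:
    # Ties go to the value that was reported first in the list.
    return Counter(r.value for r in reports).most_common(1)[0][0]


def highest_trust_wins(reports: list[Report]) -> Hashable:
    return max(reports, key=lambda r: r.trust).value
```

The same four reports can yield three different winners, for example when the newest report, the majority value, and the most trusted uploader all disagree. That is why the choice of policy, and the trust model behind it, is worth settling early.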

If you have opinions, file an issue labeled crowdsource. The direction is open enough that early feedback will shape the implementation.
