dasmig/birth-generator

Birth Generator for C++

Requires C++23 (e.g., -std=c++23 for GCC/Clang, /std:c++latest for MSVC).

API Reference · Usage Guide · Releases

Features

  • Demographically Plausible Birthdays. Generates random birthdays using a multi-axis pipeline: country-specific age pyramids, monthly birth seasonality, and weekday deficit from C-section rates.

  • Population-Weighted Country Selection. Countries are selected with probability proportional to their population, mirroring real-world demographics — a birth from India or China is far more likely than one from Iceland.

  • Rich Birth Data. Every generated birth includes: ISO date (YYYY-MM-DD), year, month, day, age, biological sex, weekday, life expectancy remaining, generational cohort label, and country code.

  • UN WPP 2024 Data. Age pyramids and life expectancy sourced from the United Nations World Population Prospects 2024 revision via the PPgp/wpp2024 R package.

  • Monthly Seasonality. Latitude-based sinusoidal model — Northern Hemisphere peaks in September, Southern Hemisphere peaks in March.

  • Weekday Deficit. Weekend births are less probable in countries with high C-section / scheduled delivery rates, reflecting real-world hospital scheduling patterns.

  • Deterministic Seeding. Per-call get_birth(seed) for reproducible results, generator-level seed() / unseed() for deterministic sequences, and birth::seed() for replaying a previous generation.

  • Uniform Selection. Switch to equal-probability country selection with weighted(false).

  • Multi-Instance Support. Construct independent bthg instances with their own data and random engine.
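
The weighted and uniform country selection modes listed above can be illustrated with the standard library alone. This is a minimal sketch under stated assumptions, not the library's implementation: `country_entry` and `pick_country` are hypothetical names, and populations passed to it are whatever the caller supplies.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Hypothetical record; the library's own country type is not shown here.
struct country_entry {
    std::string code;
    double population;
};

// Return an index into `countries`. With weighted == true, each entry is
// chosen with probability proportional to its population; otherwise every
// entry is equally likely.
inline std::size_t pick_country(const std::vector<country_entry>& countries,
                                bool weighted, std::mt19937_64& engine)
{
    assert(!countries.empty());
    if (!weighted) {
        std::uniform_int_distribution<std::size_t> uniform(0, countries.size() - 1);
        return uniform(engine);
    }
    std::vector<double> weights;
    weights.reserve(countries.size());
    for (const auto& c : countries) {
        weights.push_back(c.population);
    }
    std::discrete_distribution<std::size_t> dist(weights.begin(), weights.end());
    return dist(engine);
}
```

With two entries whose populations differ by several orders of magnitude, the weighted mode returns the larger one almost every time, while the uniform mode splits roughly evenly.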

Integration

birthgen.hpp is the main header released here. It depends on random.hpp, which must sit in the same directory. Add

#include <dasmig/birthgen.hpp>

// For convenience.
using bthg = dasmig::bthg;

to the files in which you want to generate births, and set the necessary compiler switches to enable C++23 (e.g., -std=c++23 for GCC and Clang, /std:c++latest for MSVC).

Additionally, you must supply the birth generator with the resources folder, which contains full/ and/or lite/ subdirectories with the three TSV data files. The folder is also available in the release.

Usage

#include <dasmig/birthgen.hpp>
#include <iostream>

// For convenience.
using bthg = dasmig::bthg;

// Manually load a specific dataset tier if necessary.
bthg::instance().load(dasmig::dataset::lite);  // ~195 sovereign states
// OR
bthg::instance().load(dasmig::dataset::full);  // ~235 countries & territories

// Generate a random birth (population-weighted country selection).
auto b = bthg::instance().get_birth();
std::cout << b.date_string() << ", "
          << b.country_code << ", age " << +b.age
          << ", " << b.cohort << '\n';

// Generate a birth from a specific country.
auto us = bthg::instance().get_birth("US");
std::cout << "US: " << us << '\n';           // implicit string conversion

// Request a specific sex.
auto m = bthg::instance().get_birth("BR", dasmig::sex::male);

// Request a specific birth year.
auto y = bthg::instance().get_birth("JP", dasmig::year_t{1990});

// Request both sex and year.
auto sy = bthg::instance().get_birth("DE", dasmig::sex::female, dasmig::year_t{1985});

// Request an age range (e.g. adults 18–65).
auto ar = bthg::instance().get_birth("US", dasmig::age_range{18, 65});

// Access all available fields.
std::cout << "Sex:      " << (b.bio_sex == dasmig::sex::male ? "M" : "F") << '\n';
std::cout << "Weekday:  " << +b.weekday << " (0=Sun..6=Sat)" << '\n';
std::cout << "LE left:  " << b.le_remaining << " years" << '\n';

// Deterministic generation — same seed always produces the same birth.
auto seeded = bthg::instance().get_birth(42);

// Replay a previous birth using its seed.
auto replay = bthg::instance().get_birth(seeded.seed());

// Seed the engine for a deterministic sequence.
bthg::instance().seed(100);
// ... generate births ...
bthg::instance().unseed(); // restore non-deterministic state

// Switch to uniform random selection (equal probability per country).
bthg::instance().weighted(false);
auto uniform = bthg::instance().get_birth();
bthg::instance().weighted(true);

// Independent instance — separate data and random engine.
bthg my_gen;
my_gen.load("path/to/resources/lite");
auto c = my_gen.get_birth();

For the complete feature guide — fields, seeding, weighting, and more — see the Usage Guide.
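
To make the seeding calls in the usage example more concrete, here is one way a per-call seed, a generator-level seed, and replay can fit together. It is a sketch of the general pattern only, not the library's code: `draw`, `seeded_source`, and the `bound` parameter are hypothetical, and the "value" is just a number standing in for a generated birth.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical result type: it remembers the seed that produced it so the
// same value can be regenerated later.
struct draw {
    std::uint64_t seed;
    std::uint64_t value;
};

class seeded_source {
public:
    // Per-call seed: a throwaway engine is built from `seed`, so the result
    // depends only on the seed and not on any earlier calls.
    static draw generate(std::uint64_t seed, std::uint64_t bound = 100)
    {
        assert(bound > 0);
        std::mt19937_64 local(seed);
        return draw{seed, local() % bound};
    }

    // No explicit seed: take the next seed from the generator-level engine.
    draw generate() { return generate(engine_()); }

    // Generator-level seeding for a deterministic sequence of draws.
    void seed(std::uint64_t s) { engine_.seed(s); }

    // Return to a non-deterministic state using the platform entropy source.
    void unseed() { engine_.seed(std::random_device{}()); }

private:
    std::mt19937_64 engine_{std::random_device{}()};
};
```

Because every draw records its own seed, any result from a seeded or unseeded sequence can be replayed in isolation by passing that seed back to the per-call overload.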

Generation Pipeline

Each call to get_birth() runs this pipeline:

  1. Country — select from loaded countries (population-weighted or uniform).
  2. Sex — male or female, weighted by the country's M:F population ratio (or fixed if specified).
  3. Age — drawn from the country-specific age pyramid, optionally clamped to a range (or derived from a fixed year).
  4. Birth year: reference_year − age.
  5. Month — drawn from latitude-based seasonal weights.
  6. Day — uniform within the month, then rejection-sampled for weekday deficit (weekend births rejected with probability proportional to C-section rate).
  7. Weekday — computed from the final date using std::chrono.
  8. Life expectancy remaining: max(0, LE_at_birth − age).
  9. Cohort label — Greatest Generation through Generation Alpha.

Data

The birth data is sourced from:

| Source | License | Contribution |
|---|---|---|
| PPgp/wpp2024 (UN WPP 2024) | CC BY 3.0 IGO | Age pyramids, life expectancy at birth |
| REST Countries v3.1 | Open Source | ISO codes, latitude, independence status |
| WHO Global Health Observatory | Reference | C-section rates by country |

See LICENSE_DATA.txt for details.

To regenerate datasets:

python scripts/prepare_births.py             # generate both tiers
python scripts/prepare_births.py --tier lite # lite only
python scripts/prepare_births.py --tier full # full only

Related Libraries

| Library | Description |
|---|---|
| name-generator | Culturally appropriate full names |
| nickname-generator | Gamer-style nicknames |
| biodata-generator | Procedural human physical characteristics |
| city-generator | Weighted city selection by population |
| country-generator | Weighted country selection by population |
| entity-generator | ECS-based entity generation |