Requires C++23 (e.g., `-std=c++23` for GCC/Clang, `/std:c++latest` for MSVC).
API Reference · Usage Guide · Releases
- Demographically Plausible Characteristics. Generates random human physical traits using a multi-stage pipeline: height from country/sex-specific Gaussian distributions, BMI from log-normal distributions, and categorical sampling for phenotypic traits.
- Eight Physical Traits. Every generated profile includes: height (cm), weight (kg), BMI, eye colour, hair colour, Fitzpatrick skin type, ABO/Rh blood type, and handedness.
- Country-Specific Distributions. Height means, BMI values, eye/hair/skin distributions, blood type frequencies, and left-handedness rates all vary by country, reflecting real-world population statistics.
- Data-Driven. Height and BMI from NCD-RisC via Our World in Data, blood types from published population studies, phenotypic traits from Katsara & Nothnagel 2019, VISAGE consortium, and Papadatou-Pastou 2020.
- Deterministic Seeding. Per-call `get_biodata(seed)` for reproducible results, generator-level `seed()` / `unseed()` for deterministic sequences, and `biodata::seed()` for replaying a previous generation.
- Multi-Instance Support. Construct independent `bdg` instances with their own data and random engine.
- Typed Enumerations. `eye_color`, `hair_color`, `skin_type`, `blood_type`, and `handedness` enums with string conversion helpers.
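To illustrate what a typed enumeration with a string conversion helper can look like, here is a minimal sketch. The `sketch` namespace, the enumerator names, and the labels are assumptions based on the three eye-colour categories this README mentions; the library's actual definitions may differ.

```cpp
#include <string_view>

namespace sketch {

// Hypothetical enum mirroring the eye-colour categories named in this README.
enum class eye_color { blue, intermediate, brown };

// Map each enumerator to a printable label.
constexpr std::string_view eye_color_str(eye_color c) noexcept {
    switch (c) {
        case eye_color::blue:         return "blue";
        case eye_color::intermediate: return "intermediate";
        case eye_color::brown:        return "brown";
    }
    return "unknown"; // reached only for values outside the enumeration
}

} // namespace sketch
```

Because the helper is `constexpr`, the mapping can be checked at compile time, and a `switch` without a `default` lets most compilers warn when a new enumerator is added but not handled.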
`biodatagen.hpp` is the single required file released here. You also need `random.hpp` in the same directory. Add

```cpp
#include <dasmig/biodatagen.hpp>

// For convenience.
using bdg = dasmig::bdg;
```

to the files in which you want to generate biodata, and set the necessary switches to enable C++23 (e.g., `-std=c++23` for GCC and Clang).
Additionally you must supply the biodata generator with the resources folder containing full/ and/or lite/ subdirectories with the TSV data file, also available in the release.
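The column layout of the TSV data file is not described here, so the following is only a generic sketch of how one tab-separated record can be split into fields. `split_tsv` is a hypothetical helper, not the library's loader.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

namespace sketch {

// Split one TSV record into its tab-separated fields.
// Empty fields are kept so that column positions stay aligned.
inline std::vector<std::string> split_tsv(std::string_view line) {
    std::vector<std::string> fields;
    std::size_t start = 0;
    while (true) {
        const std::size_t tab = line.find('\t', start);
        if (tab == std::string_view::npos) {
            fields.emplace_back(line.substr(start)); // last (or only) field
            break;
        }
        fields.emplace_back(line.substr(start, tab - start));
        start = tab + 1;
    }
    assert(!fields.empty()); // even an empty line yields one empty field
    return fields;
}

} // namespace sketch
```

Keeping empty fields matters for data like this, where a country may have no value for a given trait and the remaining columns must not shift.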
```cpp
#include <dasmig/biodatagen.hpp>
#include <cstdint>
#include <iostream>

// For convenience.
using bdg = dasmig::bdg;

// Manually load a specific dataset tier if necessary.
bdg::instance().load(dasmig::dataset::lite); // ~111 countries (best coverage)
// OR
bdg::instance().load(dasmig::dataset::full); // ~197 countries (gap-filled)

// Generate random biodata (uniform country selection).
auto b = bdg::instance().get_biodata();
std::cout << b << '\n'; // implicit string conversion

// Generate biodata for a specific country.
auto us = bdg::instance().get_biodata("US");
std::cout << "Height: " << us.height_cm << " cm\n";
std::cout << "Weight: " << us.weight_kg << " kg\n";
std::cout << "BMI: " << us.bmi << "\n";

// Request a specific sex.
auto m = bdg::instance().get_biodata("BR", dasmig::sex::male);

// Access typed enum fields.
std::cout << "Eyes: " << dasmig::biodata::eye_color_str(b.eyes) << '\n';
std::cout << "Hair: " << dasmig::biodata::hair_color_str(b.hair) << '\n';
std::cout << "Skin: " << dasmig::biodata::skin_type_str(b.skin) << '\n';
std::cout << "Blood: " << dasmig::biodata::blood_type_str(b.blood) << '\n';
std::cout << "Hand: " << dasmig::biodata::handedness_str(b.hand) << '\n';

// Deterministic generation: the same seed always produces the same result.
auto seeded = bdg::instance().get_biodata("US", std::uint64_t{42});

// Replay a previous generation using its seed.
auto replay = bdg::instance().get_biodata("US", seeded.seed());

// Seed the engine for a deterministic sequence.
bdg::instance().seed(100);
// ... generate biodata ...
bdg::instance().unseed(); // restore non-deterministic state

// Independent instance with its own data and random engine.
bdg my_gen;
my_gen.load("path/to/resources/lite");
auto c = my_gen.get_biodata("JP");
```

For the complete feature guide, covering fields, seeding, enums, and more, see the Usage Guide.
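The idea behind per-call seeding and replay can be sketched with the standard `<random>` facilities alone. This is an illustration of the concept, not the library's implementation: `sample`, `draw_height`, and the default mean and standard deviation are hypothetical placeholders.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

namespace sketch {

// Hypothetical stand-in for one generated value plus the seed that produced it.
struct sample {
    double height_cm;
    std::uint64_t seed;
};

// Per-call seeding: a fresh engine is built from the seed, so the same seed
// reproduces the same draw when built against the same standard library.
// The default mean and SD are placeholders, not real population data.
inline sample draw_height(std::uint64_t seed, double mean_cm = 170.0, double sd_cm = 7.0) {
    assert(sd_cm > 0.0);
    std::mt19937_64 engine{seed};
    std::normal_distribution<double> height{mean_cm, sd_cm};
    return {height(engine), seed};
}

// Unseeded call: choose a random seed first and store it for later replay.
inline sample draw_height() {
    std::random_device rd;
    const std::uint64_t seed = (static_cast<std::uint64_t>(rd()) << 32) | rd();
    return draw_height(seed);
}

} // namespace sketch
```

Storing the seed alongside the result is what makes replay possible: passing `result.seed` back into the seeded overload yields the same value again. Note that the standard does not guarantee identical `std::normal_distribution` output across different standard library implementations, only within one.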
Each call to `get_biodata()` runs this pipeline:

- Sex: 50/50, or forced via the `sex` parameter.
- Height: Gaussian distribution using country/sex-specific mean and standard deviation from NCD-RisC anthropometric data.
- BMI: Log-normal distribution from country/sex-specific mean, modelling the natural right-skew of BMI.
- Weight: Derived as `BMI × height_m²`.
- Eye Colour: Categorical sampling from country-specific blue/intermediate/brown distribution.
- Hair Colour: Categorical sampling from country-specific black/brown/blond/red distribution.
- Skin Type: Categorical sampling from Fitzpatrick I–VI distribution.
- Blood Type: Categorical sampling from ABO/Rh frequencies (O+, A+, B+, AB+, O−, A−, B−, AB−).
- Handedness: Bernoulli sampling from country-specific left-handedness rate.
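The stages above can be sketched with the standard `<random>` distributions. This is a minimal illustration under stated assumptions, not the library's code: `params`, `profile`, and `generate` are hypothetical names, sex selection is assumed to have already picked the parameter set, only one categorical trait is shown, and the BMI standard deviation is an assumed input (the pipeline description only mentions a mean). The log-normal parameters are derived from the arithmetic mean m and standard deviation s as σ² = ln(1 + s²/m²) and μ = ln(m) − σ²/2. As a worked example of the weight step, a BMI of 22 at 175 cm gives 22 × 1.75² = 67.375 kg.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <random>
#include <vector>

namespace sketch {

// Hypothetical per-country, per-sex parameters; real values come from the data files.
struct params {
    double height_mean_cm;
    double height_sd_cm;
    double bmi_mean;
    double bmi_sd;                   // assumed input, see the note above
    std::vector<double> eye_weights; // blue, intermediate, brown
    double left_handed_rate;         // probability in [0, 1]
};

struct profile {
    double height_cm;
    double bmi;
    double weight_kg;
    int eye_index;    // index into eye_weights
    bool left_handed;
};

// Weight follows from the BMI definition: kg = BMI * metres^2.
inline double weight_from_bmi(double bmi, double height_cm) {
    const double height_m = height_cm / 100.0;
    return bmi * height_m * height_m;
}

inline profile generate(const params& p, std::uint64_t seed) {
    assert(p.height_sd_cm > 0.0 && p.bmi_mean > 0.0 && p.bmi_sd > 0.0);
    std::mt19937_64 engine{seed};

    // Height: Gaussian around the country/sex mean.
    std::normal_distribution<double> height{p.height_mean_cm, p.height_sd_cm};

    // BMI: log-normal whose arithmetic mean and SD match bmi_mean and bmi_sd.
    const double sigma2 =
        std::log(1.0 + (p.bmi_sd * p.bmi_sd) / (p.bmi_mean * p.bmi_mean));
    const double mu = std::log(p.bmi_mean) - sigma2 / 2.0;
    std::lognormal_distribution<double> bmi{mu, std::sqrt(sigma2)};

    // Eye colour: categorical draw proportional to the weights.
    std::discrete_distribution<int> eyes(p.eye_weights.begin(), p.eye_weights.end());

    // Handedness: Bernoulli draw with the left-handedness rate.
    std::bernoulli_distribution left{p.left_handed_rate};

    profile out{};
    out.height_cm = height(engine);
    out.bmi = bmi(engine);
    out.weight_kg = weight_from_bmi(out.bmi, out.height_cm);
    out.eye_index = eyes(engine);
    out.left_handed = left(engine);
    return out;
}

} // namespace sketch
```

Deriving weight from BMI and height, rather than sampling it independently, keeps the three values mutually consistent for every generated profile.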
| Trait | Source | Coverage |
|---|---|---|
| Height (mean, SD) | NCD-RisC via OWID | 202 countries |
| BMI (mean) | WHO GHO via OWID | 197 countries |
| Blood type | Published population studies (Wikipedia compilation) | 124 countries |
| Eye colour | Katsara & Nothnagel 2019 + regional estimates | 70 countries |
| Hair colour | VISAGE consortium + regional estimates | 62 countries |
| Skin tone | WHO UV guidance + ethnic composition estimates | 81 countries |
| Handedness | Papadatou-Pastou et al. 2020 meta-analysis | 75 countries |

| Tier | Countries | Description |
|---|---|---|
| `lite` | ~111 | Countries with specific data for at least one phenotypic trait |
| `full` | ~197 | All countries with height data; phenotypic gaps filled with regional defaults |
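Gap filling with regional defaults can be pictured as a two-step lookup: use the country-specific value when one exists, otherwise fall back to the value for the country's region. The sketch below is an assumption about how such a fallback might be structured; `trait_tables` and `resolve_rate` are hypothetical names, and the actual region assignments and defaults live in the data files.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_map>

namespace sketch {

// Hypothetical lookup tables for a single trait.
struct trait_tables {
    std::unordered_map<std::string, double> by_country;       // country code -> rate
    std::unordered_map<std::string, std::string> region_of;   // country code -> region
    std::unordered_map<std::string, double> regional_default; // region -> rate
};

// Return the country-specific value when present, otherwise the regional default.
// std::nullopt means neither source has data for this country.
inline std::optional<double> resolve_rate(const trait_tables& t, const std::string& country) {
    assert(!country.empty());
    if (auto it = t.by_country.find(country); it != t.by_country.end()) {
        return it->second;
    }
    if (auto region = t.region_of.find(country); region != t.region_of.end()) {
        if (auto def = t.regional_default.find(region->second);
            def != t.regional_default.end()) {
            return def->second;
        }
    }
    return std::nullopt;
}

} // namespace sketch
```

In these terms, a `lite`-style dataset would correspond to countries that have at least one country-specific trait entry, while a `full`-style dataset would also resolve the remaining countries through the regional fallback.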
```shell
# Example
make

# Tests
make test

# Code coverage
make coverage

# API docs
make docs
```

Tested with:

- Clang 18+ (`-std=c++23`)
- GCC 14+ (`-std=c++23`)
- MSVC 19.38+ (`/std:c++latest`)
| Dependency | Version | Bundled | Purpose |
|---|---|---|---|
| effolkronium/random | 1.4.1 | Yes (`random.hpp`) | Thread-safe RNG wrapper |
| Catch2 | 3.x | Yes (amalgamated) | Unit testing |
| Library | Description |
|---|---|
| name-generator | Culturally appropriate full names |
| nickname-generator | Gamer-style nicknames |
| birth-generator | Demographically plausible birthdays |
| city-generator | Weighted city selection by population |
| country-generator | Weighted country selection by population |
| entity-generator | ECS-based entity generation |
This library is released under the MIT License.
MIT License
Copyright (c) 2020-2026 Diego Dasso Migotto

