dasmig/city-generator

City Generator for C++

Requires C++23 (e.g., -std=c++23 for GCC/Clang, /std:c++latest for MSVC).

API Reference · Usage Guide · Releases

Features

  • Population-Weighted Generation. Cities are selected with probability proportional to their population, mirroring real-world demographic distributions.

  • Rich City Data. Every generated city includes 16 GeoNames fields: name, coordinates, country, administrative divisions, population, elevation, timezone, and more.

  • Country Filtering. Generate cities from a specific country with get_city("BR").

  • GeoNames Data. Ships with two dataset tiers — full (~200k cities, pop ≥ 500) and lite (~25k major cities, pop ≥ 15,000) — from GeoNames, licensed under CC BY 4.0.

  • Deterministic Seeding. Per-call get_city(seed) for reproducible results, generator-level seed() / unseed() for deterministic sequences, and city::seed() for replaying a previous generation.

  • Uniform Selection. Switch to equal-probability selection with weighted(false) — every city has the same chance regardless of population.

  • Multi-Instance Support. Construct independent cg instances with their own data and random engine — ideal for embedding inside other generators.
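The difference between population-weighted and uniform selection can be pictured with the standard library alone. The sketch below is an illustration under assumptions, not the library's implementation: sample_city and pick_index are hypothetical names, and the real city type carries many more fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>
#include <vector>

// Hypothetical record for illustration only.
struct sample_city {
    std::string name;
    std::string country_code;
    std::uint64_t population;
};

// Pick an index into `cities`.
// weighted == true : chance of each entry is proportional to its population
//                    (at least one population must be greater than zero).
// weighted == false: every entry is equally likely.
std::size_t pick_index(const std::vector<sample_city>& cities,
                       std::mt19937_64& engine,
                       bool weighted = true)
{
    assert(!cities.empty());

    if (!weighted) {
        std::uniform_int_distribution<std::size_t> dist(0, cities.size() - 1);
        return dist(engine);
    }

    // Use the populations as weights for a discrete distribution.
    std::vector<double> weights;
    weights.reserve(cities.size());
    for (const auto& c : cities) {
        weights.push_back(static_cast<double>(c.population));
    }
    std::discrete_distribution<std::size_t> dist(weights.begin(), weights.end());
    return dist(engine);
}
```

With this shape, a city of 1,000,000 inhabitants is drawn about a thousand times more often than one of 1,000 in weighted mode, while both are drawn equally often in uniform mode.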

Integration

citygen.hpp is the only header required and is available in the release. Add

#include <dasmig/citygen.hpp>

// For convenience.
using cg = dasmig::cg;

to each file that generates cities, and set the compiler switches that enable C++23 (e.g., -std=c++23 for GCC and Clang, /std:c++latest for MSVC).

You must also supply the city generator with the resources folder, which contains full/cities.tsv and/or lite/cities.tsv and is also available in the release.
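Before handing a dataset to the generator, it can help to confirm that the resources folder really has the expected layout. The following is a minimal sketch using only std::filesystem; find_dataset is a hypothetical helper and not part of citygen.hpp.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <optional>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: return <resources>/<tier>/cities.tsv if that file
// exists and can be opened for reading, otherwise std::nullopt.
// `tier` is expected to be "full" or "lite".
std::optional<fs::path> find_dataset(const fs::path& resources,
                                     const std::string& tier)
{
    assert(!tier.empty());

    const fs::path candidate = resources / tier / "cities.tsv";

    // The error_code overload avoids exceptions on unreadable paths.
    std::error_code ec;
    if (!fs::is_regular_file(candidate, ec)) {
        return std::nullopt;
    }

    // Also make sure the file can actually be opened.
    std::ifstream probe(candidate);
    if (!probe.is_open()) {
        return std::nullopt;
    }
    return candidate;
}
```

A caller could try "lite" first and fall back to "full", then pass the resulting path to the path-based load() shown in the Usage section.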

Usage

#include <dasmig/citygen.hpp>
#include <iostream>

// For convenience.
using cg = dasmig::cg;

// Manually load a specific dataset tier if necessary (pick one).
cg::instance().load(dasmig::dataset::lite);     // ~25k major cities
// cg::instance().load(dasmig::dataset::full);  // ~200k cities

// Generate a random city (population-weighted).
auto city = cg::instance().get_city();
std::cout << city.name << ", " << city.country_code
          << " (pop. " << city.population << ")" << std::endl;

// Generate a city from a specific country.
auto br_city = cg::instance().get_city("BR");

// Access all available fields.
std::cout << city.latitude << ", " << city.longitude << std::endl;
std::cout << city.timezone << std::endl;
std::cout << city.elevation << " m" << std::endl;

// Deterministic generation — same seed always produces the same city.
auto seeded = cg::instance().get_city(42);

// Replay a previous city using its seed.
auto replay = cg::instance().get_city(seeded.seed());

// Seed the engine for a deterministic sequence.
cg::instance().seed(100);
// ... generate cities ...
cg::instance().unseed(); // restore non-deterministic state

// Switch to uniform random selection (equal probability).
cg::instance().weighted(false);
auto uniform_city = cg::instance().get_city();  // any city equally likely
cg::instance().weighted(true);  // restore population-weighted

// Independent instance — separate data and random engine.
cg my_gen;
my_gen.load("path/to/cities.tsv");
auto c = my_gen.get_city();

For the complete feature guide — fields, filtering, weighting, and more — see the Usage Guide.
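The three seeding levels used in the snippet above (per-call seed, generator-level seed() / unseed(), and replay from a stored seed) can be modelled with a small standalone class. This is a sketch of one way such a scheme could work, not a description of the library's internals; picker and pick are hypothetical names, and indices stand in for cities.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical result: the chosen index plus the seed that produced it.
struct pick {
    std::size_t index;
    std::uint64_t seed;
};

class picker {
public:
    explicit picker(std::size_t count) : count_(count) { assert(count > 0); }

    // Per-call seed: a fresh engine is built from `seed`, so the same seed
    // gives the same index (for a given standard library implementation).
    pick get(std::uint64_t seed) const
    {
        std::mt19937_64 engine(seed);
        std::uniform_int_distribution<std::size_t> dist(0, count_ - 1);
        return {dist(engine), seed};
    }

    // No explicit seed: draw a per-call seed from the generator-level engine
    // and record it in the result so the pick can be replayed later.
    pick get() { return get(engine_()); }

    // Make the sequence of unseeded get() calls deterministic.
    void seed(std::uint64_t value) { engine_.seed(value); }

    // Return to a non-deterministic sequence.
    void unseed() { engine_.seed(std::random_device{}()); }

private:
    std::size_t count_;
    std::mt19937_64 engine_{std::random_device{}()};
};
```

Storing the per-call seed in the result is what makes replay cheap: any earlier pick can be reproduced from a single integer without re-running the whole sequence.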

Data

The city data is sourced from the GeoNames geographical database. Two tiers are provided:

| Tier | Source | Cities |
|------|--------|--------|
| full | cities500.zip | ~200k (pop ≥ 500) |
| lite | cities15000.zip | ~25k (pop ≥ 15,000) |

The data is licensed under CC BY 4.0. See LICENSE_DATA.txt for details.

To regenerate datasets:

python scripts/prepare_geonames.py            # generate both tiers
python scripts/prepare_geonames.py --tier lite # lite only
python scripts/prepare_geonames.py --tier full # full only

Related Libraries

| Library | Description |
|---------|-------------|
| name-generator | Culturally appropriate full names |
| nickname-generator | Gamer-style nicknames |
| birth-generator | Demographically plausible birthdays |
| biodata-generator | Procedural human physical characteristics |
| country-generator | Weighted country selection by population |
| entity-generator | ECS-based entity generation |