# Generating Populations and Species
Assuming the civilization is on the verge of moving from phase 0 to phase 1, we need to generate:
* baseline species
* n populations (pops)
* population variance (within the norms of the species)
* allegiance groups to factions

Starting with an example of user form data:

In [147]:

import os
import pickle
import sys

import altair as alt
import pandas as pd
from numpy import random, round, interp, linspace
from sklearn.cluster import KMeans


In [74]:

data = {
    'population_conformity':0.4,
    'population_literacy':0.7,
    'population_aggression':0.5,
    'population_constitution':0.4,
    'starting_pop': 100
}

def build_species(data):
    species = {}
    for attr in ['population_conformity', 'population_literacy', 'population_aggression', 'population_constitution']:
        species[attr] = data[attr]
    return species
    
species = build_species(data)
species


{'population_conformity': 0.4,
 'population_literacy': 0.7,
 'population_aggression': 0.5,
 'population_constitution': 0.4}

## Creating groups of populations, all with different attributes

Populations vary based on conformity: some populations are more literate, some are more aggressive.

In [75]:
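# Spread of each attribute across pops: lower conformity means more variation.
pop_std = .2 * (1-species['population_conformity'])
print("the population standard deviation: ", pop_std)

def vary_pops(species):
    # Draw each attribute from a normal distribution centred on the species baseline.
    pop = {}
    for k in species:
        # abs() folds any negative draw back to a positive value
        pop[k] = abs(round(random.normal(species[k], pop_std),3))
    return pop

pops = pd.DataFrame([vary_pops(species) for i in range(data['starting_pop'])])
pops

the population standard deviation:  0.12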


| | population_conformity | population_literacy | population_aggression | population_constitution |
|---|---|---|---|---|
| 0 | 0.618 | 0.670 | 0.758 | 0.333 |
| 1 | 0.278 | 0.645 | 0.512 | 0.512 |
| 2 | 0.486 | 0.665 | 0.476 | 0.485 |
| 3 | 0.370 | 0.634 | 0.462 | 0.362 |
| 4 | 0.375 | 0.679 | 0.672 | 0.379 |
| ... | ... | ... | ... | ... |
| 95 | 0.434 | 0.776 | 0.583 | 0.466 |
| 96 | 0.372 | 0.648 | 0.415 | 0.575 |
| 97 | 0.239 | 0.432 | 0.635 | 0.228 |
| 98 | 0.504 | 0.655 | 0.413 | 0.498 |
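The spread rule above is `0.2 * (1 - conformity)`, so a species with conformity 1.0 produces identical pops and a species with conformity 0.0 gets the widest spread of 0.2. As a rough sketch, the same draw can be done in one vectorized call. The helper name `generate_pops` and the seed are hypothetical, and clipping to the range 0 to 1 is an assumption of this sketch, swapped in for the `abs()` used in the notebook.

```python
import numpy as np
import pandas as pd

def generate_pops(species, n_pops, seed=None):
    """Draw n_pops rows around the species baseline (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Same spread rule as the notebook: less conformity means more variation.
    pop_std = 0.2 * (1 - species["population_conformity"])
    attrs = list(species)
    baseline = np.array([species[a] for a in attrs])
    # One normal draw per pop and attribute, centred on the baseline.
    values = rng.normal(loc=baseline, scale=pop_std, size=(n_pops, len(attrs)))
    # Assumption: keep attributes inside [0, 1] by clipping instead of abs().
    values = np.clip(values, 0.0, 1.0).round(3)
    return pd.DataFrame(values, columns=attrs)

species_sketch = {
    "population_conformity": 0.4,
    "population_literacy": 0.7,
    "population_aggression": 0.5,
    "population_constitution": 0.4,
}
pops_sketch = generate_pops(species_sketch, n_pops=100, seed=42)
print(pops_sketch.describe().loc[["mean", "std"]])
```

Clipping keeps every attribute on the same 0 to 1 scale as the form data, at the cost of piling a few extreme draws up at the boundaries.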


In [79]:
kmeans = KMeans(n_clusters=2).fit(pops)
pops['faction_no'] = kmeans.labels_

In [81]:
chart = alt.Chart(pops).mark_circle().encode(x='population_literacy',y='population_aggression',color='faction_no:N')
chart

The end result is that population factions are grouped by the ideological differences of the people in them, which gives each faction a distinct culture.

In [84]:
# Label the centers with the columns the model was actually fitted on.
pd.DataFrame(kmeans.cluster_centers_, columns=kmeans.feature_names_in_)

| | population_conformity | population_literacy | population_aggression | population_constitution | faction_no |
|---|---|---|---|---|---|
| 0 | 0.40052 | 0.63176 | 0.57382 | 0.33972 | 1.0 |
| 1 | 0.40468 | 0.78764 | 0.45122 | 0.4434 | 0.0 |


Each cluster center can represent the 'zeitgeist' of its faction's culture.
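One possible way to turn a cluster center into a readable 'zeitgeist' is to find the attribute that deviates most from the species baseline. The sketch below uses the attribute values from the cluster-center output above. The helper `describe_zeitgeist` is hypothetical and not part of the original notebook.

```python
import pandas as pd

species_baseline = {
    "population_conformity": 0.4,
    "population_literacy": 0.7,
    "population_aggression": 0.5,
    "population_constitution": 0.4,
}

# Cluster centers copied from the output above (attribute columns only).
centers = pd.DataFrame(
    [
        [0.40052, 0.63176, 0.57382, 0.33972],
        [0.40468, 0.78764, 0.45122, 0.44340],
    ],
    columns=list(species_baseline),
)

def describe_zeitgeist(center, baseline):
    """Return the attribute furthest from the baseline and its direction."""
    deltas = center - pd.Series(baseline)
    attr = deltas.abs().idxmax()
    direction = "higher" if deltas[attr] > 0 else "lower"
    return attr, direction

for faction_no, center in centers.iterrows():
    attr, direction = describe_zeitgeist(center, species_baseline)
    print(f"faction {faction_no}: {direction} {attr.replace('population_', '')}")
```

With these centers, faction 0 stands out for higher aggression and faction 1 for higher literacy, which matches the split visible in the chart axes.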

## Deciding the right number of factions

In [91]:
faction_range = range(1,10)

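# Fit on the attribute columns only, so the earlier faction labels don't influence the clustering.
potential_factions = [KMeans(n_clusters=i).fit(pops.drop(columns='faction_no')) for i in faction_range]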

For each candidate number of factions, use the inertia (the within-cluster sum of squared distances) to judge which grouping makes more sense.

In [100]:
potential_clusters = pd.DataFrame([[len(i.cluster_centers_), i.inertia_] for i in potential_factions],columns=["n-clusters","inertia"])
alt.Chart(potential_clusters).mark_line().encode(x="n-clusters",y="inertia")
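Inertia tends to keep shrinking as clusters are added, so the curve rarely shows a clear minimum, and pops drawn around a single baseline may not show a sharp elbow either. The following sketch illustrates this on stand-in data. The synthetic `sample`, the seeds, and `n_init=10` are assumptions of this example, not values from the notebook.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Stand-in for the pops table: 100 rows, 4 attributes around one baseline.
sample = rng.normal(loc=[0.4, 0.7, 0.5, 0.4], scale=0.12, size=(100, 4))

inertias = {}
for k in range(1, 10):
    # Several restarts per k so a poor initialisation does not distort the curve.
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(sample)
    inertias[k] = model.inertia_

for k, inertia in inertias.items():
    print(k, round(inertia, 3))
```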

I feel like there isn't a way to pick an ideal number of factions procedurally, so until I think of something better I'm just going to go with an arbitrary range based on `population_conformity`.

The number of factions is proportional to `1-population_conformity`, scaled over `n_steps` steps.

In [142]:
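n_steps = 6

def get_n_factions(n_steps):
    # Map (1 - conformity) from [0, 1] onto the integers 0 .. n_steps-1.
    x = interp(
        1 - data["population_conformity"],
        linspace(0, 1, num=n_steps),
        list(range(n_steps)),
    )
    # KMeans needs at least one cluster, so never return 0.
    return max(1, int(round(x)))

get_n_factions(n_steps)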

3
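To see how the mapping behaves across the whole conformity range, here is a small sketch that takes conformity as a parameter instead of reading it from `data`. The name `n_factions_for` is hypothetical. The sketch shows the raw, unclamped mapping.

```python
import numpy as np

def n_factions_for(conformity, n_steps=6):
    """Map 1 - conformity onto the integers 0 .. n_steps - 1 (hypothetical helper)."""
    x = np.interp(
        1 - conformity,
        np.linspace(0, 1, num=n_steps),  # evenly spaced breakpoints on [0, 1]
        np.arange(n_steps),              # faction counts 0 .. n_steps - 1
    )
    return int(np.round(x))

for conformity in (0.0, 0.4, 0.8, 1.0):
    print(conformity, n_factions_for(conformity))
```

With `n_steps = 6`, a conformity of 0.4 gives 3 factions and a conformity of 0.0 gives 5. A conformity of 1.0 maps to 0, which `KMeans` cannot use as `n_clusters`, so a lower bound of 1 is a sensible guard.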

In [144]:
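# Drop the old labels before refitting so they are not treated as an attribute.
kmeans = KMeans(n_clusters=get_n_factions(n_steps)).fit(pops.drop(columns='faction_no'))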
pops['faction_no'] = kmeans.labels_

In [145]:
chart = alt.Chart(pops).mark_circle().encode(x='population_literacy',y='population_aggression',color='faction_no:N')
chart

So individual `pops` vary slightly, relative to `population_conformity`, and `population_conformity` also determines the number of factions in the group.

## Other population attributes

In [157]:
# I got the syllables by parsing a global list of city names.
with open("../../web/app/creators/specs/syllables.p", "rb") as f:
    syllables = pickle.load(f)

def make_word(n):
    syl = random.choice(syllables, n)
    word = "".join(syl)
    return word.capitalize()

make_word(2)


'Hoomow'
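The pickled syllable file is not reproduced here, so the sketch below uses a short, made-up syllable list to show the same idea in a self-contained form. The list contents and the optional `rng` argument are assumptions added for this example.

```python
import numpy as np

# Hypothetical stand-in for the pickled syllable list.
sample_syllables = ["wa", "o", "pur", "tex", "gla", "hoo", "mow"]

def make_word_sketch(n, rng=None):
    """Join n randomly chosen syllables into a capitalized name."""
    if rng is None:
        rng = np.random.default_rng()
    chosen = rng.choice(sample_syllables, size=n)
    return "".join(chosen).capitalize()

print(make_word_sketch(2, np.random.default_rng(7)))
```

Passing a seeded generator makes the names repeatable, which helps when the same civilization needs to be regenerated.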

In [162]:
factions = [{'id':i,'name':make_word(2)} for i in range(kmeans.n_clusters)]
factions

[{'id': 0, 'name': 'Waopur'},
 {'id': 1, 'name': 'Hwabhuj'},
 {'id': 2, 'name': 'Texgla'}]

## Creating edges for the graph

In [169]:
pops.to_dict('records')

[{'population_conformity': 0.618,
  'population_literacy': 0.67,
  'population_aggression': 0.758,
  'population_constitution': 0.333,
  'faction_no': 1},
 {'population_conformity': 0.278,
  'population_literacy': 0.645,
  'population_aggression': 0.512,
  'population_constitution': 0.512,
  'faction_no': 0},
 {'population_conformity': 0.486,
  'population_literacy': 0.665,
  'population_aggression': 0.476,
  'population_constitution': 0.485,
  'faction_no': 0},
 {'population_conformity': 0.37,
  'population_literacy': 0.634,
  'population_aggression': 0.462,
  'population_constitution': 0.362,
  'faction_no': 1},
 {'population_conformity': 0.375,
  'population_literacy': 0.679,
  'population_aggression': 0.672,
  'population_constitution': 0.379,
  'faction_no': 1},
 {'population_conformity': 0.377,
  'population_literacy': 0.869,
  'population_aggression': 0.652,
  'population_constitution': 0.602,
  'faction_no': 2},
 {'population_conformity': 0.289,
  'population_literacy': 0.702,
