See [here](https://github.com/gaow/SEQLinkagePaper/blob/3bdfd1092b75e82f94cd53b98c964b7e1f38d4a9/simulations/LinkagePowerCalc.py#L306) for what I did before. It might be too difficult to read as a beginner, so let me try to simplify it here.

## A `list` data structure

In [91]:
demographic_summary = [(x+1,0) for x in range(4)] # list comprehension

In [92]:
demographic_summary

[(1, 0), (2, 0), (3, 0), (4, 0)]

In [93]:
demographic_summary[0]

(1, 0)

In [96]:
demographic_summary[3][1]

0

A list of tuples is awkward for storing one-to-one mapping data. Let's convert it to a `dict`.

## A `dict` data structure

In [97]:
demographic_summary = dict(demographic_summary)

In [98]:
demographic_summary

{1: 0, 2: 0, 3: 0, 4: 0}

In [88]:
demographic_summary[4]

0
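To make the difference between the two structures concrete, here is a minimal sketch. The helper `lookup_in_pairs` and the names `pairs` and `mapping` are made up for illustration and are not part of the original notebook. With a list of tuples you have to scan for the key yourself, while a `dict` looks it up directly, and `dict.get` lets you supply a default for a key that is missing.

```python
# Hypothetical names for illustration only.
pairs = [(x + 1, 0) for x in range(4)]  # [(1, 0), (2, 0), (3, 0), (4, 0)]


def lookup_in_pairs(pairs, key):
    """Scan the list of (key, value) tuples and return the matching value."""
    for k, v in pairs:
        if k == key:
            return v
    raise KeyError(key)


mapping = dict(pairs)  # {1: 0, 2: 0, 3: 0, 4: 0}

print(lookup_in_pairs(pairs, 3))  # 0, found by scanning the list
print(mapping[3])                 # 0, found by direct key lookup
print(mapping.get(8, 0.0))        # 0.0, key 8 is absent so the default is returned
```

Note that `mapping[8]` would raise a `KeyError` here, because only the keys 1 to 4 exist.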

The following proportions of families by number of offspring come from the literature:

In [99]:
demographic_summary[1] = 14081/33942
demographic_summary[2] = 12853/33942
demographic_summary[3] = 5028/33942
demographic_summary[4] = 1980/33942

In [100]:
demographic_summary

{1: 0.4148547522243828,
 2: 0.3786753874256084,
 3: 0.1481350539155029,
 4: 0.05833480643450592}

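But we want to renormalize it so that every sampled family has at least 2 offspring. This is straightforward: divide each remaining proportion by `1 - demographic_summary[1]`, then set the single-offspring proportion to 0.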

In [44]:
for k in demographic_summary:
    if k == 1:
        continue
    else:
        # `/=` is shorthand for: demographic_summary[k] = demographic_summary[k] / (1 - demographic_summary[1])
        demographic_summary[k] /= (1 - demographic_summary[1])
demographic_summary[1] = 0

In [45]:
list(demographic_summary.values())

[0, 0.4, 0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
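The same renormalization can also be written without modifying the dictionary in place, which makes it easy to check that the result is still a probability distribution. This is only a sketch: the function name `condition_on_min_offspring` and the variables `counts`, `props` and `conditional` are hypothetical, and it uses the literature counts quoted earlier. Dividing by the total kept mass is equivalent to dividing by `1 - props[1]` as long as the input proportions sum to 1.

```python
import math


def condition_on_min_offspring(props, min_offspring=2):
    """Return a new dict of proportions conditional on family size >= min_offspring."""
    kept_mass = sum(p for k, p in props.items() if k >= min_offspring)
    return {
        k: (p / kept_mass if k >= min_offspring else 0.0)
        for k, p in props.items()
    }


# Counts from the literature cell above, keyed by number of offspring.
counts = {1: 14081, 2: 12853, 3: 5028, 4: 1980}
total = sum(counts.values())
props = {k: c / total for k, c in counts.items()}

conditional = condition_on_min_offspring(props)

# Sanity checks: single-offspring families are excluded and the rest sum to 1.
assert conditional[1] == 0.0
assert math.isclose(sum(conditional.values()), 1.0)
print(conditional)
```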

Now we draw `n` pedigrees (10 here) from this multinomial distribution:

In [77]:
import numpy as np
n = 10
data = np.random.multinomial(n, list(demographic_summary.values()))
data

array([0, 6, 4, 0, 0, 0, 0, 0, 0, 0])
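Because the draw is random, rerunning the cell gives different counts each time. If you want reproducible results, one option (not what the original cell does) is NumPy's `Generator` interface with a fixed seed. The seed value and the name `pvals` below are arbitrary choices for illustration; the probabilities are copied from the output shown above.

```python
import numpy as np

# Probabilities copied from the output above: one entry per family size 1..10.
pvals = [0, 0.4, 0.6, 0, 0, 0, 0, 0, 0, 0]
n = 10  # number of pedigrees to draw

rng = np.random.default_rng(seed=2024)  # arbitrary seed, for reproducibility only
draw = rng.multinomial(n, pvals)

# The counts always add up to n, and categories with probability 0 get no draws.
assert draw.sum() == n
assert draw[0] == 0
print(draw)
```

Creating a new generator with the same seed and repeating the call reproduces exactly the same counts.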

Now we also make `data` a dictionary, so that each count is mapped to its family size:

In [78]:
data = dict(zip(demographic_summary.keys(), data))

In [79]:
data

{1: 0, 2: 6, 3: 4, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0}

Now that we have the number of families of each size, the naive (and, at any real scale, impractical) way to generate the pedigree structures is to write them by hand. E.g., for one pedigree with 2 offspring:

```
FAM1 0 0 M1 2 0
FAM1 0 0 F1 1 0
FAM1 M1 F1 O1 1 0
FAM1 M1 F1 O2 2 0
```

But let's do it programmatically:

In [86]:
# FIXME: write this to a file using the `open()` function
num_fam = 0
for fam_type in data:
    if fam_type == 1:
        # skip single-offspring families
        continue
    for _ in range(data[fam_type]):
        num_fam += 1
        fam_id = f'FAM{num_fam}'
        mid = f'M{num_fam}'
        fid = f'F{num_fam}'
        # founders: both parent fields are 0
        print(f"{fam_id}\t0\t0\t{mid}\t2\t0")
        print(f"{fam_id}\t0\t0\t{fid}\t1\t0")
        # offspring: one line per child, pointing at the two founders
        for j in range(fam_type):
            sid = f"O{j+1}"
            # FIXME: make sex random
            sex = 1
            print(f"{fam_id}\t{mid}\t{fid}\t{sid}\t{sex}\t0")

FAM1	0	0	M1	2	0
FAM1	0	0	F1	1	0
FAM1	M1	F1	O1	1	0
FAM1	M1	F1	O2	1	0
FAM2	0	0	M2	2	0
FAM2	0	0	F2	1	0
FAM2	M2	F2	O1	1	0
FAM2	M2	F2	O2	1	0
FAM3	0	0	M3	2	0
FAM3	0	0	F3	1	0
FAM3	M3	F3	O1	1	0
FAM3	M3	F3	O2	1	0
FAM4	0	0	M4	2	0
FAM4	0	0	F4	1	0
FAM4	M4	F4	O1	1	0
FAM4	M4	F4	O2	1	0
FAM5	0	0	M5	2	0
FAM5	0	0	F5	1	0
FAM5	M5	F5	O1	1	0
FAM5	M5	F5	O2	1	0
FAM6	0	0	M6	2	0
FAM6	0	0	F6	1	0
FAM6	M6	F6	O1	1	0
FAM6	M6	F6	O2	1	0
FAM7	0	0	M7	2	0
FAM7	0	0	F7	1	0
FAM7	M7	F7	O1	1	0
FAM7	M7	F7	O2	1	0
FAM7	M7	F7	O3	1	0
FAM8	0	0	M8	2	0
FAM8	0	0	F8	1	0
FAM8	M8	F8	O1	1	0
FAM8	M8	F8	O2	1	0
FAM8	M8	F8	O3	1	0
FAM9	0	0	M9	2	0
FAM9	0	0	F9	1	0
FAM9	M9	F9	O1	1	0
FAM9	M9	F9	O2	1	0
FAM9	M9	F9	O3	1	0
FAM10	0	0	M10	2	0
FAM10	0	0	F10	1	0
FAM10	M10	F10	O1	1	0
FAM10	M10	F10	O2	1	0
FAM10	M10	F10	O3	1	0
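One possible way to address the two `FIXME` notes in the loop is sketched below: the lines go to a file opened with `open()`, and each offspring's sex is drawn at random with the standard library's `random` module. The function name `write_pedigrees`, the file name `pedigrees.txt`, the `seed` parameter and the equal chance for sex codes 1 and 2 are all assumptions made for this example, not something the original code specifies.

```python
import os
import random
import tempfile


def write_pedigrees(data, path, seed=None):
    """Write one tab-separated line per individual; return the number of families."""
    rng = random.Random(seed)  # seeded so the random sexes are reproducible
    num_fam = 0
    with open(path, "w") as out:
        for fam_type, count in data.items():
            if fam_type == 1:
                continue  # skip single-offspring families
            for _ in range(count):
                num_fam += 1
                fam_id, mid, fid = f"FAM{num_fam}", f"M{num_fam}", f"F{num_fam}"
                # founders: both parent fields are 0
                out.write(f"{fam_id}\t0\t0\t{mid}\t2\t0\n")
                out.write(f"{fam_id}\t0\t0\t{fid}\t1\t0\n")
                for j in range(fam_type):
                    sex = rng.choice([1, 2])  # assumption: both codes equally likely
                    out.write(f"{fam_id}\t{mid}\t{fid}\tO{j + 1}\t{sex}\t0\n")
    return num_fam


# Usage with the counts from the draw above, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pedigrees.txt")
    n_written = write_pedigrees({1: 0, 2: 6, 3: 4}, path, seed=1)
    with open(path) as f:
        lines = f.read().splitlines()

print(n_written, len(lines))  # 10 44
```

Six families of 4 individuals plus four families of 5 individuals give the 44 lines, the same number as in the printed output above.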
