# Serialized Population Demo

## Motivation
- No need to redo the burn-in
    - save the dtk state to a file
    - load the file and continue the simulation

## Configuration

## Created Files
- state-00100.dtk
- state-00300.dtk


## Load and Continue

## Manipulating a dtk file

- file access, writing, reading, compression
- https://github.com/InstituteforDiseaseModeling/DtkTrunk/tree/master/Scripts/serialization 


## PyCharm Demo
- run eradication.exe and generate a .dtk file
- open the .dtk file in PyCharm
- show node content in the debugger

## Demo
- load dtk file at timestep 1
- create more individuals by copying
- change age
- add infections
- https://github.com/tfischle-idmod/SerializedPopulation-Tools

In [9]:
import sys
sys.path.append("C:\\Users\\tfischle\\Github\\SerializedPopulation-Tools")
import change_serialized_population as csp
import pathlib
import plotly
import plotly.graph_objs as go

plotly.offline.init_notebook_mode(connected=True)

In [10]:
serialized_file = "state-00001.dtk"
path = pathlib.PureWindowsPath(r"C:\Users\tfischle\Desktop\Eradication_2.21\output", serialized_file)

ser_pop = csp.SerializedPopulation(path)
node_0 = ser_pop.nodes[0]

In [11]:
# plot m_age 
ind_values = csp.getPropertyValues(node_0.individualHumans, "m_age")

layout = go.Layout( title="m_age", 
                           bargap=0.2, bargroupgap=0.1, 
                           xaxis=dict(title="age in days"),
                           yaxis=dict(title="#individuals"))

data = [go.Histogram(x=ind_values)]
plotly.offline.iplot (
    { 'data': data, 
      'layout': layout}
)

In [12]:
# plot m_age of individuals with at least one infection
fct = lambda ind: len(ind["infections"]) > 0
ind_values = csp.getPropertyValues(node_0.individualHumans, "m_age", filter_fct=fct)

data = [go.Histogram(x=ind_values)]
plotly.offline.iplot ({ 'data': data, 'layout': layout})
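What the filtered call above does can be pictured in plain Python. This is a minimal sketch, not the code in `change_serialized_population`: the helper name `property_values` and the toy individuals are assumptions made for illustration, and it assumes individuals behave like dictionaries.

```python
def property_values(individuals, prop, filter_fct=lambda ind: True):
    """Return `prop` for every individual accepted by `filter_fct`."""
    return [ind[prop] for ind in individuals if filter_fct(ind)]


# Toy individuals with made-up values, shaped like the dictionaries used above.
individuals = [
    {"m_age": 1200.0, "infections": []},
    {"m_age": 8000.0, "infections": [{"suid": {"id": 1}}]},
    {"m_age": 31000.0, "infections": [{"suid": {"id": 2}}, {"suid": {"id": 3}}]},
]

has_infection = lambda ind: len(ind["infections"]) > 0
infected_ages = property_values(individuals, "m_age", filter_fct=has_infection)
print(infected_ages)
```

The resulting list can be passed to `go.Histogram(x=...)` in the same way as `ind_values`.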

In [17]:
# load the infection from its JSON file and print it
import json
path_infection = pathlib.PureWindowsPath(r"C:\Users\tfischle\Desktop\Eradication_2.21\infection.json")
with open(path_infection, "r") as file:
    infection = json.load(file)
    print(json.dumps(infection, indent=4))

{
    "__class__": "Infection",
    "suid": {
        "id": 31
    },
    "duration": 7,
    "total_duration": 9.7328,
    "incubation_timer": 0,
    "infectious_timer": 9.7328,
    "infectiousness": 3.5,
    "infectiousnessByRoute": [],
    "StateChange": 0,
    "infection_strain": {
        "cladeID": 0,
        "geneticID": 0
    },
    "m_is_symptomatic": 1,
    "m_is_newly_symptomatic": 0
}


In [5]:
# load infection and append it to individuals
ser_pop.addInfection(node_0, path_infection, lambda ind: ind["m_age"] > 30000)
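The idea behind copying an infection into selected individuals can be sketched without the tool. Everything here is an assumption for illustration: the helper `add_infection`, the toy data, and in particular the choice to give every copy its own `suid`. The source does not show whether `addInfection` does this.

```python
import copy


def add_infection(individuals, template, next_suid, filter_fct):
    """Append a copy of `template` to each individual accepted by `filter_fct`.

    Returns the next unused suid so the caller can keep ids unique.
    """
    for ind in individuals:
        if filter_fct(ind):
            infection = copy.deepcopy(template)  # never share one dict between individuals
            infection["suid"]["id"] = next_suid
            next_suid += 1
            ind["infections"].append(infection)
    return next_suid


# Shortened infection record and toy individuals (made-up values).
template = {"__class__": "Infection", "suid": {"id": 31}, "duration": 7}
individuals = [
    {"m_age": 12000, "infections": []},
    {"m_age": 31000, "infections": []},
    {"m_age": 35000, "infections": []},
]

next_free = add_infection(individuals, template, 100, lambda ind: ind["m_age"] > 30000)
print([len(ind["infections"]) for ind in individuals], next_free)
```

The deep copy matters: appending the same dictionary to several individuals would make a later edit to one infection show up in all of them.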

In [6]:
# plot m_age of infected individuals again, now including the added infections
fct = lambda ind: len(ind["infections"]) > 0
ind_values = csp.getPropertyValues(node_0.individualHumans, "m_age", filter_fct=fct)

data = [go.Histogram(x=ind_values)]
plotly.offline.iplot ({ 'data': data, 'layout': layout})

Note: reading and writing are done at the node level

In [7]:
serialized_output_file = "dtk-added_infections.dtk"
output_path = pathlib.PureWindowsPath(r"C:\Users\tfischle\Desktop\Eradication_2.21", serialized_output_file)
ser_pop.close()
ser_pop.write(output_path)

## Current Restrictions
- Individuals cannot be "created", only copied
- Infections cannot be "created", only copied

In [None]:


Use Cases:
    - Generate (part of) a population with certain attributes
    - Query Individuals with certain attributes
        - Follow individuals with certain attributes over a longer time and change attributes
    - Update to next dtk version
    
There seems to be a need for more high-level functions
- Edward mentioned a table listing, per node_id, the attributes an individual at that node should have
- make the file compatible with a newer dtk version

- Researchers will mainly interact with individuals
- pandas is well suited to displaying, filtering, and manipulating low-level data
- Create a population with certain attributes
    - Use individuals from the SP file as the source
        - e.g. 1000 copies of individual [23], 2000 copies of individual [42], etc.
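Building a population from copies of a few source individuals could look roughly like this. It is a sketch under assumptions: `make_copies` is a hypothetical helper, the individuals are toy dictionaries, and the `suid` field on individuals is assumed to mirror the one shown for infections.

```python
import copy


def make_copies(source, count, first_suid):
    """Return `count` deep copies of `source`, each with its own suid."""
    copies = []
    for offset in range(count):
        clone = copy.deepcopy(source)
        clone["suid"] = {"id": first_suid + offset}
        copies.append(clone)
    return copies


# Toy stand-ins for two source individuals taken from a serialized population.
source_a = {"suid": {"id": 23}, "m_age": 3650.0, "infections": []}
source_b = {"suid": {"id": 42}, "m_age": 14600.0, "infections": []}

population = (make_copies(source_a, 1000, first_suid=1)
              + make_copies(source_b, 2000, first_suid=1001))
print(len(population), population[0]["suid"], population[-1]["suid"])
```

Attributes of the clones can then be changed individually, since no clone shares nested data with its source.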
    
    
Workflow
1) use pandas to display and filter data
    - How to display/access nested data
2) copy individuals, change attributes, add to population 
3) save

Alternative workflow
1) use pandas to display and filter data
2) select some individuals as source
3) create population
4) replace (all) existing individuals
5) save
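For the open question of how to display nested data, one option is to flatten the nested `infections` lists into one row per infection before handing them to pandas. The sketch below uses only the standard library; `infection_rows` and the toy data are assumptions for illustration.

```python
def infection_rows(individuals, meta=("m_age",)):
    """Return one flat row per infection, carrying selected fields of its host."""
    rows = []
    for index, ind in enumerate(individuals):
        for infection in ind["infections"]:
            row = {"individual": index}
            row.update({key: ind[key] for key in meta})
            row.update({"infection." + key: value for key, value in infection.items()})
            rows.append(row)
    return rows


# Toy individuals with made-up values; the first one has no infections.
individuals = [
    {"m_age": 1200.0, "infections": []},
    {"m_age": 31000.0, "infections": [{"duration": 7, "infectiousness": 3.5},
                                      {"duration": 2, "infectiousness": 1.0}]},
]

rows = infection_rows(individuals)
for row in rows:
    print(row)
```

A list of flat dictionaries like `rows` can be passed directly to `pandas.DataFrame(rows)`. `pandas.json_normalize` with its `record_path` and `meta` arguments gives a similar result in one call.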



In [None]:
## Change a Serialized Population
import pathlib

import matplotlib.pyplot as plt
import pandas
from change_serialized_population import *
import plotly
import plotly.graph_objs as go
import utils

plotly.offline.init_notebook_mode(connected=True)

# load a serialized population file
path = pathlib.PureWindowsPath(r"C:\Users\tfischle\Github\DtkTrunk_master\Regression\Generic\13_Generic_Individual_Properties")
serialized_file = "state-00015.dtk"

# join with the path operator instead of concatenating strings
ser_pop = SerializedPopulation(str(path / serialized_file))
node0 = ser_pop.nodes[0]

In [None]:
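# pandas.json_normalize is the public name since pandas 1.0
df = pandas.json_normalize(ser_pop.nodes[0].individualHumans)
df.head()['infections']

In [None]:
# one row per infection instead of one row per individual
df = pandas.json_normalize(ser_pop.nodes[0].individualHumans, 'infections')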
df.tail()

In [None]:
df = pandas.DataFrame(ser_pop.nodes[0].individualHumans)
df = df.assign(temp_f = pandas.DataFrame(i for i in df['infections']))

df.to_html("table.html")
#df.head()

In [None]:
#df = pandas.io.json.json_normalize(ser_pop.nodes[0].individualHumans,'infections',['m_age'])
#df = pandas.io.json.json_normalize(ser_pop.nodes[0].individualHumans)
#df = df[df.m_age < 100]
#df.to_html("table.html")
df = pandas.DataFrame(ser_pop.nodes[0].individualHumans)
all_names_index = df.set_index(['m_age','infections']).sort_index()
all_names_index


# Plot Number of Individuals, Age

In [None]:
# plot m_age 
ind_values = getPropertyValues(node0.individualHumans, "m_age")

data = [go.Histogram(x=ind_values)]
plotly.offline.iplot ({ 'data': data, 'layout': go.Layout( title="m_age", bargap=0.2, bargroupgap=0.1)})

# Plot Number of Individuals, Gender

In [None]:
ind_values = getPropertyValues(node0.individualHumans, "m_gender")

data = [go.Histogram(x=ind_values)]
plotly.offline.iplot ({ 'data': data, 'layout': go.Layout( title="m_gender", bargap=0.2, bargroupgap=0.1)})

# Only Plot Age of Individuals with gender = 0

In [None]:
fct = lambda ind: ind["m_gender"] == 0
ind_values = getPropertyValues(node0.individualHumans, "m_age", filter_fct=fct)

data = [go.Histogram(x=ind_values)]
plotly.offline.iplot ({ 'data': data, 'layout': go.Layout( title="m_age", bargap=0.2, bargroupgap=0.1)})

# Age of Individuals with at least one infection

In [None]:
fct = lambda ind: len(ind["infections"]) >= 1
ind_values = getPropertyValues(node0.individualHumans, "m_age", filter_fct=fct)

data = [go.Histogram(x=ind_values)]
plotly.offline.iplot ({ 'data': data, 'layout': go.Layout( title="m_age", bargap=0.2, bargroupgap=0.1)})

# Count number of Infections

In [None]:
# plot the number of infections per individual
select_fct = lambda ind: len(ind["infections"])
filter_fct = lambda ind: True
ind_values = getPropertyValues2(node0.individualHumans, select_fct = select_fct, filter_fct=filter_fct)

data = [go.Histogram(x=ind_values)]
plotly.offline.iplot ({ 'data': data, 'layout': go.Layout( title="number of infections", bargap=0.2, bargroupgap=0.1)})
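The same tally can be produced without a plot. A minimal sketch using `collections.Counter` on toy individuals (made-up data, assumed to be dictionary-like as above):

```python
from collections import Counter

# Toy individuals: two uninfected, one with one infection, one with two.
individuals = [
    {"infections": []},
    {"infections": [{"suid": {"id": 1}}]},
    {"infections": []},
    {"infections": [{"suid": {"id": 2}}, {"suid": {"id": 3}}]},
]

# Maps "number of infections" to "number of individuals with that many".
infections_per_individual = Counter(len(ind["infections"]) for ind in individuals)
total_infections = sum(n * count for n, count in infections_per_individual.items())
print(dict(infections_per_individual), total_infections)
```

Comparing such counts before and after an edit is a quick check that infections were added to the intended individuals.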

# Add Infections

In [None]:
new_infection = createInfection("Generic", ser_pop.getNextInfectionSuid())
print(new_infection)
add(node0.individualHumans, "infections", new_infection, lambda ind: ind["m_age"] > 40000)

select_fct = lambda ind: len(ind["infections"])
ind_values = getPropertyValues2(node0.individualHumans, select_fct = select_fct)

data = [go.Histogram(x=ind_values)]
plotly.offline.iplot ({ 'data': data, 'layout': go.Layout( title="number of infections", bargap=0.2, bargroupgap=0.1)})

# Find

In [None]:
print(find("age", ser_pop.nodes))
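How a search like this can work on nested data is sketched below. This is not the tool's `find`; the helper `find_keys`, its return format, and the toy node are assumptions. It walks dictionaries and lists recursively and reports the path of every key whose name contains the search string.

```python
def find_keys(name, data, path="root"):
    """Return the path of every dictionary key that contains `name`.

    Keys are assumed to be strings, as they are in JSON-like data.
    """
    hits = []
    if isinstance(data, dict):
        for key, value in data.items():
            child = f"{path}.{key}"
            if name in key:
                hits.append(child)
            hits.extend(find_keys(name, value, child))
    elif isinstance(data, list):
        for index, item in enumerate(data):
            hits.extend(find_keys(name, item, f"{path}[{index}]"))
    return hits


# Toy node list with made-up values.
nodes = [{"individualHumans": [{"m_age": 1200.0, "infections": []},
                               {"m_age": 31000.0, "infections": []}]}]
print(find_keys("age", nodes, path="nodes"))
```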

# Change Age Distribution

In [None]:
fct = randomGauss   # random.gauss(1000, 10)
distribution = utils.createDistribution("m_age", len(node0.individualHumans), fct)
setAttributes(distribution, node0.individualHumans)

ind_values = getPropertyValues(node0.individualHumans, "m_age")

data = [go.Histogram(x=ind_values)]
plotly.offline.iplot ({ 'data': data, 'layout': go.Layout( title="m_age", bargap=0.2, bargroupgap=0.1)})
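Replacing the age distribution amounts to drawing one new value per individual and writing it back. The sketch below uses the standard `random` module with the mean and standard deviation from the comment above (1000 and 10). The helper `assign_gauss_ages`, the seed, and the clamping at zero are assumptions for illustration, not part of the tool.

```python
import random


def assign_gauss_ages(individuals, mean, sigma, seed=None):
    """Overwrite m_age of every individual with a Gaussian draw, clamped at 0."""
    rng = random.Random(seed)  # a seeded generator keeps the edit reproducible
    for ind in individuals:
        ind["m_age"] = max(0.0, rng.gauss(mean, sigma))  # ages cannot be negative


# Toy population; only the field that is rewritten is included.
individuals = [{"m_age": 0.0} for _ in range(1000)]
assign_gauss_ages(individuals, mean=1000, sigma=10, seed=42)

ages = [ind["m_age"] for ind in individuals]
print(min(ages), sum(ages) / len(ages), max(ages))
```

Passing a fixed seed makes it possible to regenerate exactly the same modified population later.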

In [None]:
# set m_age to 700 days for every individual older than 1000 days
filter_fct = lambda ind: ind["m_age"] > 1000
attribute_age = {"m_age": 700 }

setAttribute(attribute_age, node0.individualHumans, filter_fct = filter_fct)

ind_values = getPropertyValues(node0.individualHumans, "m_age")

data = [go.Histogram(x=ind_values)]
plotly.offline.iplot ({ 'data': data, 'layout': go.Layout( title="m_age", bargap=0.2, bargroupgap=0.1)})

# Close and Save

In [None]:
ser_pop.close()
ser_pop.write()