# A simple 4-layer contact network based on survey data

Assume we want to build a simple contact network model consisting of 4 location types: households, school classes, work places and cities.
To increase realism, we use survey data to create the population of agents and to define some of the location properties.

In [1]:
import random

import pop2net as p2n
from pop2net.data_fakers.soep import soep_faker

In this example, we only use fake survey data (but of course you should use real survey data here):

In [2]:
df_soep = soep_faker.soep(size=1000)
df_soep.head()

    age  gender  work_hours_day  nace2_division   hid   pid
0  37.0  female       11.950542              43  6996   184
1  24.0    male        5.665439              84  8673  2105
2  51.0    male        3.937453              99  8673  2705
3  24.0    male             8.0              99  4557  8909
4  81.0  female             0.0              -2  2239   269


The first contact layer `Home` is a location class where agents of one household meet each other for 12 hours.
We use the agent attribute `hid` (household ID), which is provided by the survey data, to group the agents into their *empirical* households.

In [3]:
class Home(p2n.LocationDesigner):
    def split(self, agent):
        """Group the agents by their household id."""
        return agent.hid

    def weight(self, agent):
        """Weight the connection between the agent and the Home by 12."""
        return 12
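
To make the effect of `split()` more concrete, here is a minimal plain-Python sketch of grouping by household ID. It does not use pop2net and is not meant to mirror its internals; it only shows the resulting grouping for three of the survey rows shown above.

```python
from collections import defaultdict

# Three rows from the survey table above, reduced to person id and household id.
rows = [
    {"pid": 184, "hid": 6996},
    {"pid": 2105, "hid": 8673},
    {"pid": 2705, "hid": 8673},
]

# Group person ids by household id: one group corresponds to one Home.
households = defaultdict(list)
for row in rows:
    households[row["hid"]].append(row["pid"])

households = dict(households)
print(households)
```

Agents 2105 and 2705 share `hid` 8673, so they end up in the same group, while agent 184 is alone in household 6996.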

The second layer models work places. The agents are grouped by their NACE2 division, which is provided in the survey data, and each work place is meant to hold about 10 agents (`n_agents = 10`).
The connection is weighted by each agent's empirical work hours from the survey data.

In [4]:
class Work(p2n.LocationDesigner):
    n_agents = 10

    def filter(self, agent):
        """Ignore agents that have 0 work hours or an invalid NACE2 value."""
        return agent.work_hours_day > 0 and agent.nace2_division > 0

    def split(self, agent):
        """Group agents by NACE2 division."""
        return agent.nace2_division

    def weight(self, agent):
        """Weight the connection between the agent and the Work instance
        by the agent's daily work hours."""
        return agent.work_hours_day
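
The combination of `filter()`, `split()` and `n_agents` can be pictured as "filter, group, then cut each group into chunks". The following plain-Python sketch illustrates that idea under simplifying assumptions. The function `build_workplaces` and its parameter `max_size` are hypothetical names for this illustration, and pop2net may distribute leftover agents differently than this simple chunking does.

```python
from collections import defaultdict


def build_workplaces(agents, max_size=10):
    """Return a list of (nace2_division, members) tuples (illustrative only)."""
    # Step 1: keep only agents with positive work hours and a valid division.
    eligible = [
        a for a in agents
        if a["work_hours_day"] > 0 and a["nace2_division"] > 0
    ]

    # Step 2: group the remaining agents by NACE2 division.
    by_division = defaultdict(list)
    for a in eligible:
        by_division[a["nace2_division"]].append(a)

    # Step 3: cut each group into chunks of at most max_size agents.
    workplaces = []
    for division, members in by_division.items():
        for start in range(0, len(members), max_size):
            workplaces.append((division, members[start:start + max_size]))
    return workplaces


# Made-up agents: 25 in division 43, one non-worker, one in division 84.
agents = [{"pid": i, "work_hours_day": 8.0, "nace2_division": 43} for i in range(25)]
agents.append({"pid": 25, "work_hours_day": 0.0, "nace2_division": -2})
agents.append({"pid": 26, "work_hours_day": 4.0, "nace2_division": 84})

workplaces = build_workplaces(agents)
sizes = [(division, len(members)) for division, members in workplaces]
print(sizes)
```

In this toy input, the non-working agent is dropped, division 43 is cut into three work places, and division 84 gets a single small one.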

The third location type is the city.
We build 2 cities.
Using `stick_together()`, we make sure that agents of the same household live in the same city.

In [5]:
class City(p2n.LocationDesigner):
    n_locations = 2
    
    def stick_together(self, agent):
        """Keep agents of the same household together when assigning the agents to cities."""
        return agent.hid
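
One way to picture `stick_together()` is that households, not individual agents, are distributed over the cities. The sketch below is a plain-Python illustration under that assumption. The helper `assign_cities` and its balancing rule (always fill the currently smallest city) are hypothetical and are not taken from pop2net.

```python
from collections import defaultdict


def assign_cities(agents, n_cities=2):
    """Distribute whole households over n_cities lists (illustrative only)."""
    # Collect the members of each household first.
    households = defaultdict(list)
    for a in agents:
        households[a["hid"]].append(a["pid"])

    # Place each household as one block into the currently smallest city.
    cities = [[] for _ in range(n_cities)]
    for members in households.values():
        smallest = min(cities, key=len)
        smallest.extend(members)
    return cities


# Made-up agents in three households A, B and C.
agents = [
    {"pid": 1, "hid": "A"},
    {"pid": 2, "hid": "A"},
    {"pid": 3, "hid": "B"},
    {"pid": 4, "hid": "C"},
    {"pid": 5, "hid": "C"},
]

cities = assign_cities(agents)
print(cities)
```

Because each household is moved as a block, no household is ever split across two cities.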

The fourth contact layer models schools that consist of multiple classrooms, each containing agents of the same age.
Using `nest()`, we ensure that children from the same city attend the same school.

In [6]:
class School(p2n.LocationDesigner):
    n_agents = 15  # Set the number of agents to 15.

    def filter(self, agent):
        """Ignore agents younger than 6 or older than 18."""
        return 6 <= agent.age <= 18

    def split(self, agent):
        """Group the agents by age."""
        return agent.age

    def weight(self, agent):
        """Weight the connection between the agent and the School by 6."""
        return 6

    def nest(self):
        """Nest this location type within the location type City."""
        return City
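
The effect of nesting can be pictured as grouping by a combined key of city and age, so that a class never mixes pupils from different cities. The plain-Python sketch below illustrates only this effect. Whether pop2net implements `nest()` this way is an assumption, and the `city` field is a made-up stand-in for the agent's city affiliation.

```python
from collections import defaultdict

# Made-up agents; "city" stands in for the City location the agent belongs to.
pupils = [
    {"pid": 1, "age": 7, "city": 0},
    {"pid": 2, "age": 7, "city": 1},
    {"pid": 3, "age": 7, "city": 0},
    {"pid": 4, "age": 12, "city": 1},
    {"pid": 5, "age": 40, "city": 1},  # outside school age, filtered out
]

# Group school-age agents by (city, age): one group per class candidate.
classes = defaultdict(list)
for p in pupils:
    if 6 <= p["age"] <= 18:
        classes[(p["city"], p["age"])].append(p["pid"])

classes = dict(classes)
print(classes)
```

The two 7-year-olds from city 0 share a class, while the 7-year-old from city 1 is kept separate.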

Create the necessary pop2net objects:

In [7]:
model = p2n.Model()
creator = p2n.Creator(model=model)
inspector = p2n.NetworkInspector(model=model)

Next, we build the network.
About 100 rows are sampled from `df_soep` and translated into agents.
The argument `sample_level` ensures that we always sample complete households, so the final number of agents can slightly exceed `n_agents`.
Using the argument `location_designers`, we define which contact layers are used to build the network.

In [8]:
creator.create(
    df=df_soep,
    n_agents=100,
    sample_level="hid",
    location_designers=[
        Home,
        City,
        Work,
        School,
    ],
)

inspector.plot_networks(location_color="label")
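
Sampling at the household level explains why the resulting population can be slightly larger than `n_agents`: whole households are added until the target is reached. The sketch below illustrates this in plain Python. It assumes that households are drawn with replacement, which is a guess rather than documented pop2net behaviour, and `sample_households` is a hypothetical helper. A real implementation would also have to give repeated draws of the same household fresh IDs, which is omitted here.

```python
import random
from collections import defaultdict


def sample_households(rows, n_agents, seed=0):
    """Draw whole households until at least n_agents rows are collected."""
    rng = random.Random(seed)

    # Index the rows by household id.
    households = defaultdict(list)
    for row in rows:
        households[row["hid"]].append(row)
    hids = list(households)

    # Add complete households (with replacement) until the target is reached.
    sample = []
    while len(sample) < n_agents:
        sample.extend(households[rng.choice(hids)])
    return sample


# Made-up data: 10 households with 3 members each.
rows = [{"pid": i, "hid": i // 3} for i in range(30)]

sample = sample_households(rows, n_agents=10)
print(len(sample))
```

With households of size 3 and a target of 10, the sample ends up with 12 rows: the last household pushes the count past the target.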

In [9]:
inspector.eval_affiliations()



______________________________________
Number of locations
______________________________________

location_label
Home      35
Work      27
School     6
City       2
Name: count, dtype: int64


______________________________________
Number of agents per location
______________________________________

                     mean       std   min   25%   50%    75%   max
location_label                                                    
City            51.000000  0.000000  51.0  51.0  51.0  51.00  51.0
Home             2.914286  1.314432   1.0   2.0   3.0   4.00   6.0
School           1.333333  0.516398   1.0   1.0   1.0   1.75   2.0
Work             1.518519  0.893152   1.0   1.0   1.0   2.00   5.0


______________________________________
Number of affiliated locations per agent
______________________________________

mean    2.480392
std     0.502083
min     2.000000
25%     2.000000
50%     2.000000
75%     3.000000
max     3.000000
Name: n_affiliated_locations, dtype: float64


You may have noticed that some of the school classes and work places are underfilled because the overall population is too small.
Let's create a new network model and increase the population size to 5000 agents:

In [10]:
model = p2n.Model()
creator = p2n.Creator(model=model)
inspector = p2n.NetworkInspector(model=model)

creator.create(
    df=df_soep,
    n_agents=5000,
    sample_level="hid",
    location_designers=[
        Home,
        City,
        Work,
        School,
    ],
)

(AgentList (5002 objects), LocationList (1973 objects))

The table below shows that the population is now large enough to fill most locations to their intended size: the median school class has 15 agents and the median work place has 10.

In [11]:
inspector.eval_affiliations()



______________________________________
Number of locations
______________________________________

location_label
Home      1715
Work       224
School      32
City         2
Name: count, dtype: int64


______________________________________
Number of agents per location
______________________________________

                       mean       std     min     25%     50%     75%     max
location_label                                                               
City            2501.000000  0.000000  2501.0  2501.0  2501.0  2501.0  2501.0
Home               2.916618  1.425938     1.0     2.0     3.0     4.0     8.0
School            15.156250  2.677377     8.0    15.0    15.0    17.0    21.0
Work               9.687500  1.860764     2.0    10.0    10.0    10.0    14.0


______________________________________
Number of affiliated locations per agent
______________________________________

mean    2.530788
std     0.501100
min     2.000000
25%     2.000000
50%     3.000000
75%     3.00