## SAMPLE SYNTHETIC POPULATION 

###### CREATE A SYNTHETIC POPULATION OF n (=500k) AGENTS (.PICKLE FILE) 

Sample a population of *n* agents.
For each country, the output .pickle file contains the following elements:

- **AGE**: a np.array of length = n giving the age of each sampled agent
- **HOUSEHOLDS**: a np.array of length = n indicating the type of household of each sampled agent
- **HOUSEHOLDS_TOT**: a np.array indicating the cumulative number of households grouped by type of household
- **DIABETES**: a np.array of length = n indicating whether each agent has diabetes (values 1 or 0)
- **HYPERTENSION**: a np.array of length = n indicating whether each agent has hypertension (values 1 or 0)
- **AGE-GROUPS**: a tuple of 101 np.arrays, one per age value from 0 to 100; the array at position i holds the indices of the agents whose age == i
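
In the sampling function further down, `age_groups` is built with `np.where`, so each entry is an array of agent positions rather than a single count. The sketch below reproduces that construction on a toy `age` array (the six ages are made-up illustrative values, and `age_counts` is a name introduced here) and shows one way to derive per-age counts from it.

```python
import numpy as np

# Toy ages for six hypothetical agents (illustrative values only)
age = np.array([0, 34, 34, 100, 7, 34], dtype='int64')

n_age = 101  # one slot per age value, 0 to 100

# Entry i holds the positions of the agents whose age equals i
age_groups = tuple(np.where(age == i)[0] for i in range(n_age))

# Per-age counts can be recovered from the lengths of the index arrays
age_counts = np.array([len(idx) for idx in age_groups])

print(age_groups[34])  # positions of the agents aged 34
print(age_counts[34])  # how many agents are aged 34
```

Keeping the indices rather than only the counts makes it possible to select all agents of a given age directly, for example `age[age_groups[34]]`.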

This Notebook uses functions defined in the **Sample_Households** and **Sample_Comorbidities** Jupyter Notebooks, so both Notebooks must be imported before the functions can be used. The Notebook first samples the household structures and retrieves the agents' ages. It then assigns each agent its comorbidity status.  

### TABLE OF CONTENTS     
[1. Sample Population Function](#fn)       
[2. Sample, Export and Save Synthetic Population](#pop_ex)    
[3. Read Sampled Population](#read)      

In [1]:
import pandas as pd
import numpy as np
import ipynb
import pickle
import os
import json
from ipynb.fs.full.Sample_Households import *  # import the Sample Households script and functions
from ipynb.fs.full.Sample_Comorbidities import * # import the Sample Comorbidities script and functions

b_dir = './' # set base directory 

### 1.  Sample Population Function
<a id="fn"></a>

In [2]:
def sample_population(country,n = 10000000):
    '''FUNCTION to sample a synthetic population'''

    #np.random.seed(seed) # set the seed 
    n = int(n) # set the number of agents to simulate
     
    # Create households for the specific country --> RETURNS: Households, Age, Households_tot for each agent 
    # Functions from the Sample_Households Script 
    print("Making households... ")
    if country == 'Italy':
        households, age, households_tot = sample_households_italy(n)
    elif country == 'Spain':
        households, age, households_tot = sample_households_spain(n)
    elif country == 'Germany':
        households, age, households_tot = sample_households_germany(n)
    elif country == 'France':
        households, age, households_tot = sample_households_france(n)
    else: 
        # function for the country is not defined -- stop here, otherwise the lines below fail with a NameError
        raise ValueError('Function not defined for country %s' % country)
        
    households = households.astype('int64') # set the type as int64
    age = age.astype('int64') # set the type as int64
    print("Done.")
    
    # Create age_groups --> RETURNS: Tuple of index arrays; entry i holds the positions of the agents whose age is i
    print("Creating age group sector array... ")
    n_age = 101 
    age_groups = tuple([np.where(age == i)[0] for i in range(0, n_age)]) # gives the position of the elements for a given age value 
    print("Done.")

    # Sample comorbidities
    # Functions from the Sample_Comorbidities Script
    print("Sampling comorbidities... ")
    diabetes, hypertension = None, None # initialize diabetes and hypertension
    diabetes, hypertension = sample_joint_comorbidities(age, country) # sample diabetes and hypertension for the country
    diabetes = diabetes.astype('int64') # set the type as int64
    hypertension = hypertension.astype('int64') # set the type as int64
    print("Done.")
    
    # Save and export to a pickle file 
    # Save in the indicated base directory (b_dir)
    print("Saving... ")
    out_path = os.path.join(b_dir, '{}_population_{}.pickle'.format(country, int(n)))
    with open(out_path, 'wb') as f: # the with-block closes the file once the dump is complete
        pickle.dump((age, households, households_tot, diabetes, hypertension, age_groups), f, protocol=4)
    print("Done.")
    print('####')
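
As a possible alternative to the `if`/`elif` chain above, the country-specific samplers could be looked up in a dictionary, so that an unsupported country is rejected before any sampling starts. This is only a sketch: `get_household_sampler` and `_toy_sampler` are hypothetical names introduced here, and the toy sampler stands in for `sample_household_*` functions such as `sample_households_italy` so that the block runs on its own.

```python
def get_household_sampler(country, samplers):
    """Hypothetical helper: return the sampler registered for `country`, or fail early."""
    try:
        return samplers[country]
    except KeyError:
        raise ValueError('Function not defined for country %s' % country) from None


def _toy_sampler(n):
    """Stand-in for the real samplers; returns placeholder households, age, households_tot."""
    return [0] * n, [30] * n, [n]


# In the Notebook the values would be sample_households_italy, sample_households_spain, etc.
samplers = {'Italy': _toy_sampler, 'Spain': _toy_sampler}

households, age, households_tot = get_household_sampler('Italy', samplers)(3)
print(households, age, households_tot)
```

With this layout, supporting another country means adding one dictionary entry instead of another branch.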

### 2.  Sample, Export and Save Synthetic Population
<a id="pop_ex"></a>

To run the synthetic population sampling, uncomment the sample_population() calls in the following cell.

In [3]:
## SIMULATE SYNTHETIC POPULATION ITALY; 500k
#sample_population('Italy',500000)

## SIMULATE SYNTHETIC POPULATION SPAIN; 500k
#sample_population('Spain',500000)

## SIMULATE SYNTHETIC POPULATION GERMANY; 500k
#sample_population('Germany',500000)

## SIMULATE SYNTHETIC POPULATION FRANCE; 500k
#sample_population('France',500000)

### 3.  Read Sampled Population
<a id="read"></a>

For further analysis of the generated synthetic population, the pickle file of interest can be imported by changing the following cell to the 'Code' type and executing it. 
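
A minimal sketch of such a read step is given here, assuming the file was written by `sample_population()` with the file name pattern and tuple order used in section 1. The helper name `load_population` and the dictionary keys are choices made for this illustration, not part of the original Notebooks.

```python
import os
import pickle


def load_population(country, n, b_dir='./'):
    """Hypothetical helper: load one population file written by sample_population()."""
    path = os.path.join(b_dir, '{}_population_{}.pickle'.format(country, int(n)))
    with open(path, 'rb') as f:
        # Same element order as the tuple passed to pickle.dump in sample_population()
        age, households, households_tot, diabetes, hypertension, age_groups = pickle.load(f)
    return {
        'age': age,
        'households': households,
        'households_tot': households_tot,
        'diabetes': diabetes,
        'hypertension': hypertension,
        'age_groups': age_groups,
    }


# Example usage (requires that the Italy file has been generated first):
# pop = load_population('Italy', 500000)
# print(pop['age'].shape)
```

Returning a dictionary keeps the six elements addressable by name, which avoids relying on their position in the pickled tuple during later analysis.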