Interactive sim
Install the environment kernel from a terminal:

1) conda activate gates_bep

2) Then run this command: ipython kernel install --name gates_bep --user

NOTE: each time-step takes about 12 seconds

In [1]:
from IPython.display import display, HTML  # public import location for display and HTML
display(HTML("<style>.container { width:100% !important; }</style>")) #widens the notebook cells to 100% of the screen width

In [2]:
import pandas as pd
from numpy.random import randn
from numpy.random import seed
from numpy import mean
from numpy import std
from numpy import cov
from matplotlib import pyplot
from scipy.stats import pearsonr
from scipy.stats import spearmanr
from scipy import stats  # public module; provides stats.spearmanr used further down

# 1. Set up the interactive sim: import InteractiveContext, which runs interactive sims, and make the specs

In [3]:
#This is the interactive context module from vivarium that runs the simulation 

from vivarium import InteractiveContext

In [4]:
!make_specs #uses the model_spec.in to create the yaml files 

#india_nocorr.yaml does not have the correlation structure for wasting/stunting and birthweight.  

In [5]:
#look for the file path of the yaml files (feb 2020 model)

%pwd 
# %cd ..
# %ls
%cd /ihme/homes/nicoly/vivarium_gates_bep/src/vivarium_gates_bep/model_specifications
%ls
# %cd /ihme/homes/nicoly/vivarium_gates_bep/src/vivarium_gates_bep/src/vivarium_gates_bep/model_specifications
# ! runs the command in a temporary subshell, so changes such as cd do not persist
# % is an IPython magic and acts on the running interpreter

/ihme/homes/nicoly/vivarium_gates_bep/src/vivarium_gates_bep/model_specifications
branches/             india_nocorr.yaml  mali.yaml      pakistan.yaml
bw_risk_corr_spec.in  india.yaml         model_spec.in  tanzania.yaml


In [6]:
%pwd
%ls

branches/             india_nocorr.yaml  mali.yaml      pakistan.yaml
bw_risk_corr_spec.in  india.yaml         model_spec.in  tanzania.yaml


In [11]:
# go into the branches folder to look at the specs; this is where to change the scenario names
# %less scenarios.yaml <-- this doesn't work, so override it manually from a terminal with nano

In [12]:
#look at the contents of the india.yaml file

%less india.yaml
#check the scenario under maternal_supplementation. Currently at baseline; change it to run the interactive sim on other scenarios

Last git pull for vivarium_gates_bep repo: October 1, 2020

These are the specs for interactive sim (eg. India)

    input_data:
        location: India
        input_draw_number: 0
        artifact_path: /share/costeffectiveness/artifacts/vivarium_gates_bep/india.hdf
    interpolation:
        order: 0
        extrapolate: True
    randomness:
        map_size: 1_000_000
        key_columns: ['entrance_time', 'age']
        random_seed: 0
        
    population:
        population_size: 10_000
        age_start: 0
        age_end: 0
        
    maternal_supplementation:
        scenario: 'baseline'

# 1. DEFINE SIM with InteractiveContext

In [8]:
#Define sim (use full filepath)
#currently running on no correlation structures

sim_india = InteractiveContext('/ihme/homes/nicoly/vivarium_gates_bep/src/vivarium_gates_bep/model_specifications/india.yaml') #gives model before time-steps

2020-10-03 13:58:07.974 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.1.population_manager.metrics as modifier to metrics
2020-10-03 13:58:07.982 | DEBUG    | vivarium.framework.artifact.manager:_load_artifact:66 - Running simulation from artifact located at /share/costeffectiveness/artifacts/vivarium_gates_bep/india.hdf.
2020-10-03 13:58:07.983 | DEBUG    | vivarium.framework.artifact.manager:_load_artifact:67 - Artifact base filter terms are ['draw == 0', "location == 'India' | location == 'Global'"].
2020-10-03 13:58:07.984 | DEBUG    | vivarium.framework.artifact.manager:_load_artifact:68 - Artifact additional filter terms are None.
2020-10-03 13:58:08.501 | DEBUG    | vivarium.framework.values:_register_value_producer:323 - Registering value pipeline protein_energy_malnutrition.disability_weight
2020-10-03 13:58:08.502 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering disability_weight.1.protein_energy_malnu

2020-10-03 13:58:33.864 | DEBUG    | vivarium.framework.values:_register_value_producer:323 - Registering value pipeline diarrheal_diseases.excess_mortality_rate.population_attributable_fraction
2020-10-03 13:58:33.865 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering mortality_rate.2.state.diarrheal_diseases.adjust_mortality_rate as modifier to mortality_rate
2020-10-03 13:58:34.337 | DEBUG    | vivarium.framework.values:_register_value_producer:323 - Registering value pipeline diarrheal_diseases.remission_rate
2020-10-03 13:58:34.338 | DEBUG    | vivarium.framework.values:_register_value_producer:323 - Registering value pipeline diarrheal_diseases.remission_rate.paf
2020-10-03 13:58:34.807 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering cause_specific_mortality_rate.3.disease_model.measles.adjust_cause_specific_mortality_rate as modifier to cause_specific_mortality_rate
2020-10-03 13:58:34.809 | DEBUG    | vivarium.frame

2020-10-03 13:58:45.294 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.15.disease_observer.measles.metrics as modifier to metrics
2020-10-03 13:58:45.296 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.16.disease_observer.lower_respiratory_infections.metrics as modifier to metrics
2020-10-03 13:58:45.297 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.17.disease_observer.protein_energy_malnutrition.metrics as modifier to metrics
2020-10-03 13:58:45.298 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.18.risk_observer.child_growth_failure.metrics as modifier to metrics
2020-10-03 13:58:45.299 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.19.risk_observer.low_birth_weight_and_short_gestation.metrics as modifier to metrics
2020-10-03 13:58:45.309 | DEBUG    | vivarium.framework.values:

In [9]:
#Define sim (use full filepath)

sim_pakistan = InteractiveContext('/ihme/homes/nicoly/vivarium_gates_bep/src/vivarium_gates_bep/model_specifications/pakistan.yaml') #gives model before time-steps

2020-10-03 13:59:16.831 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.1.population_manager.metrics as modifier to metrics
2020-10-03 13:59:16.838 | DEBUG    | vivarium.framework.artifact.manager:_load_artifact:66 - Running simulation from artifact located at /share/costeffectiveness/artifacts/vivarium_gates_bep/pakistan.hdf.
2020-10-03 13:59:16.839 | DEBUG    | vivarium.framework.artifact.manager:_load_artifact:67 - Artifact base filter terms are ['draw == 0', "location == 'Pakistan' | location == 'Global'"].
2020-10-03 13:59:16.840 | DEBUG    | vivarium.framework.artifact.manager:_load_artifact:68 - Artifact additional filter terms are None.
2020-10-03 13:59:17.376 | DEBUG    | vivarium.framework.values:_register_value_producer:323 - Registering value pipeline protein_energy_malnutrition.disability_weight
2020-10-03 13:59:17.377 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering disability_weight.1.protein_energy

2020-10-03 13:59:42.990 | DEBUG    | vivarium.framework.values:_register_value_producer:323 - Registering value pipeline diarrheal_diseases.excess_mortality_rate
2020-10-03 13:59:42.992 | DEBUG    | vivarium.framework.values:_register_value_producer:323 - Registering value pipeline diarrheal_diseases.excess_mortality_rate.population_attributable_fraction
2020-10-03 13:59:42.994 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering mortality_rate.2.state.diarrheal_diseases.adjust_mortality_rate as modifier to mortality_rate
2020-10-03 13:59:43.482 | DEBUG    | vivarium.framework.values:_register_value_producer:323 - Registering value pipeline diarrheal_diseases.remission_rate
2020-10-03 13:59:43.483 | DEBUG    | vivarium.framework.values:_register_value_producer:323 - Registering value pipeline diarrheal_diseases.remission_rate.paf
2020-10-03 13:59:43.948 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering cause_specific_mortality_

2020-10-03 13:59:54.837 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.14.disease_observer.diarrheal_diseases.metrics as modifier to metrics
2020-10-03 13:59:54.838 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.15.disease_observer.measles.metrics as modifier to metrics
2020-10-03 13:59:54.840 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.16.disease_observer.lower_respiratory_infections.metrics as modifier to metrics
2020-10-03 13:59:54.841 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.17.disease_observer.protein_energy_malnutrition.metrics as modifier to metrics
2020-10-03 13:59:54.843 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.18.risk_observer.child_growth_failure.metrics as modifier to metrics
2020-10-03 13:59:54.843 | DEBUG    | vivarium.framework.values:register_value_

In [7]:
#Define sim (use full filepath)

sim_mali = InteractiveContext('/ihme/homes/nicoly/vivarium_gates_bep/src/vivarium_gates_bep/model_specifications/mali.yaml')

2020-10-03 13:56:58.971 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.1.population_manager.metrics as modifier to metrics
2020-10-03 13:56:58.989 | DEBUG    | vivarium.framework.artifact.manager:_load_artifact:66 - Running simulation from artifact located at /share/costeffectiveness/artifacts/vivarium_gates_bep/mali.hdf.
2020-10-03 13:56:58.991 | DEBUG    | vivarium.framework.artifact.manager:_load_artifact:67 - Artifact base filter terms are ['draw == 0', "location == 'Mali' | location == 'Global'"].
2020-10-03 13:56:58.992 | DEBUG    | vivarium.framework.artifact.manager:_load_artifact:68 - Artifact additional filter terms are None.
2020-10-03 13:56:59.572 | DEBUG    | vivarium.framework.values:_register_value_producer:323 - Registering value pipeline protein_energy_malnutrition.disability_weight
2020-10-03 13:56:59.573 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering disability_weight.1.protein_energy_malnutr

2020-10-03 13:57:25.083 | DEBUG    | vivarium.framework.values:_register_value_producer:323 - Registering value pipeline diarrheal_diseases.excess_mortality_rate.population_attributable_fraction
2020-10-03 13:57:25.084 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering mortality_rate.2.state.diarrheal_diseases.adjust_mortality_rate as modifier to mortality_rate
2020-10-03 13:57:25.618 | DEBUG    | vivarium.framework.values:_register_value_producer:323 - Registering value pipeline diarrheal_diseases.remission_rate
2020-10-03 13:57:25.620 | DEBUG    | vivarium.framework.values:_register_value_producer:323 - Registering value pipeline diarrheal_diseases.remission_rate.paf
2020-10-03 13:57:26.099 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering cause_specific_mortality_rate.3.disease_model.measles.adjust_cause_specific_mortality_rate as modifier to cause_specific_mortality_rate
2020-10-03 13:57:26.101 | DEBUG    | vivarium.frame

2020-10-03 13:57:36.984 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.15.disease_observer.measles.metrics as modifier to metrics
2020-10-03 13:57:36.985 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.16.disease_observer.lower_respiratory_infections.metrics as modifier to metrics
2020-10-03 13:57:36.987 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.17.disease_observer.protein_energy_malnutrition.metrics as modifier to metrics
2020-10-03 13:57:36.988 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.18.risk_observer.child_growth_failure.metrics as modifier to metrics
2020-10-03 13:57:36.989 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.19.risk_observer.low_birth_weight_and_short_gestation.metrics as modifier to metrics
2020-10-03 13:57:36.990 | DEBUG    | vivarium.framework.values:

In [10]:
#Define sim (use full filepath)

sim_tanzania = InteractiveContext('/ihme/homes/nicoly/vivarium_gates_bep/src/vivarium_gates_bep/model_specifications/tanzania.yaml') #gives model before time-steps

2020-10-03 14:00:25.764 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.1.population_manager.metrics as modifier to metrics
2020-10-03 14:00:25.771 | DEBUG    | vivarium.framework.artifact.manager:_load_artifact:66 - Running simulation from artifact located at /share/costeffectiveness/artifacts/vivarium_gates_bep/tanzania.hdf.
2020-10-03 14:00:25.772 | DEBUG    | vivarium.framework.artifact.manager:_load_artifact:67 - Artifact base filter terms are ['draw == 0', "location == 'Tanzania' | location == 'Global'"].
2020-10-03 14:00:25.773 | DEBUG    | vivarium.framework.artifact.manager:_load_artifact:68 - Artifact additional filter terms are None.
2020-10-03 14:00:26.312 | DEBUG    | vivarium.framework.values:_register_value_producer:323 - Registering value pipeline protein_energy_malnutrition.disability_weight
2020-10-03 14:00:26.314 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering disability_weight.1.protein_energy

2020-10-03 14:00:51.781 | DEBUG    | vivarium.framework.values:_register_value_producer:323 - Registering value pipeline diarrheal_diseases.excess_mortality_rate.population_attributable_fraction
2020-10-03 14:00:51.782 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering mortality_rate.2.state.diarrheal_diseases.adjust_mortality_rate as modifier to mortality_rate
2020-10-03 14:00:52.265 | DEBUG    | vivarium.framework.values:_register_value_producer:323 - Registering value pipeline diarrheal_diseases.remission_rate
2020-10-03 14:00:52.266 | DEBUG    | vivarium.framework.values:_register_value_producer:323 - Registering value pipeline diarrheal_diseases.remission_rate.paf
2020-10-03 14:00:52.750 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering cause_specific_mortality_rate.3.disease_model.measles.adjust_cause_specific_mortality_rate as modifier to cause_specific_mortality_rate
2020-10-03 14:00:52.752 | DEBUG    | vivarium.frame

2020-10-03 14:01:03.486 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.15.disease_observer.measles.metrics as modifier to metrics
2020-10-03 14:01:03.487 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.16.disease_observer.lower_respiratory_infections.metrics as modifier to metrics
2020-10-03 14:01:03.489 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.17.disease_observer.protein_energy_malnutrition.metrics as modifier to metrics
2020-10-03 14:01:03.490 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.18.risk_observer.child_growth_failure.metrics as modifier to metrics
2020-10-03 14:01:03.491 | DEBUG    | vivarium.framework.values:register_value_modifier:373 - Registering metrics.19.risk_observer.low_birth_weight_and_short_gestation.metrics as modifier to metrics
2020-10-03 14:01:03.492 | DEBUG    | vivarium.framework.values:

# 2. Now can run the sim using TIME-STEPS

## India

In [None]:
#run for sim_india

NUM_STEPS = 400

# run time-steps
for i in range(NUM_STEPS):
    sim_india.step()

2020-10-03 14:02:40.316 | DEBUG    | vivarium.framework.engine:step:140 - 2020-07-02 00:00:00
2020-10-03 14:02:49.061 | DEBUG    | vivarium.framework.engine:step:140 - 2020-07-03 00:00:00
2020-10-03 14:02:58.249 | DEBUG    | vivarium.framework.engine:step:140 - 2020-07-04 00:00:00
2020-10-03 14:03:07.280 | DEBUG    | vivarium.framework.engine:step:140 - 2020-07-05 00:00:00
2020-10-03 14:03:16.113 | DEBUG    | vivarium.framework.engine:step:140 - 2020-07-06 00:00:00
2020-10-03 14:03:24.930 | DEBUG    | vivarium.framework.engine:step:140 - 2020-07-07 00:00:00
2020-10-03 14:03:33.947 | DEBUG    | vivarium.framework.engine:step:140 - 2020-07-08 00:00:00
2020-10-03 14:03:43.103 | DEBUG    | vivarium.framework.engine:step:140 - 2020-07-09 00:00:00
2020-10-03 14:03:52.236 | DEBUG    | vivarium.framework.engine:step:140 - 2020-07-10 00:00:00
2020-10-03 14:04:01.363 | DEBUG    | vivarium.framework.engine:step:140 - 2020-07-11 00:00:00
2020-10-03 14:04:09.886 | DEBUG    | vivarium.framework.engi

## Pakistan

In [None]:
#run for sim_pakistan

NUM_STEPS = 400

# run time-steps
for i in range(NUM_STEPS):
    sim_pakistan.step()

## Mali

In [None]:
#run for sim_mali

NUM_STEPS = 400
# 400 for 1 year
# 32 for 1 month

# run time-steps
for i in range(NUM_STEPS):
    sim_mali.step()

## Tanzania

In [None]:
#run for sim_tanzania

NUM_STEPS = 400

# run time-steps
for i in range(NUM_STEPS):
    sim_tanzania.step()
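
As a rough planning aid, the arithmetic behind these loops can be sketched as follows. It assumes that the roughly 12 seconds per time-step noted at the top of this notebook holds for every step and every country, and that the four sims run one after another. `estimate_runtime_minutes` is a hypothetical helper, not part of vivarium.

```python
NUM_STEPS = 400          # time-steps per sim, as in the cells above
SECONDS_PER_STEP = 12    # assumption: rough figure from the note at the top
NUM_SIMS = 4             # India, Pakistan, Mali, Tanzania


def estimate_runtime_minutes(num_steps, seconds_per_step, num_sims=1):
    """Estimate wall-clock minutes when the sims run sequentially."""
    return num_steps * seconds_per_step * num_sims / 60


print(estimate_runtime_minutes(NUM_STEPS, SECONDS_PER_STEP))            # 80.0 (one sim)
print(estimate_runtime_minutes(NUM_STEPS, SECONDS_PER_STEP, NUM_SIMS))  # 320.0 (all four)
```

Under these assumptions, stepping all four countries through 400 time-steps takes a bit over five hours, so it may be worth testing the later cells on a short run (for example 32 steps) first.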

# 3. Get the population dataframe after the model has run the time-steps in section 2

In [None]:
#get the population table (one row of attributes per simulant) using the get_population() method

pop_india = sim_india.get_population()
pop_pakistan = sim_pakistan.get_population()
pop_mali = sim_mali.get_population()
pop_tanzania = sim_tanzania.get_population()

#check columns for one country
pop_mali.head()

#10,000 simulants were specified in the yaml file; some will have died during the time-steps (this is draw 0)

# Explore pop df
## note: explore the pop column alive (tells us whether the simulant has died or is still alive). This is an important column because we want to analyse only the alive population at 6 months

Usage pattern (replace countryname with a country, as in the next cell): pop_countryname.alive.head()

In [None]:
pop_india.alive.head()

In [None]:
#explore the pop column birth_weight

pop_india.birth_weight.head()

# 4. Get pipelines for the variables of interest (a pipeline is a callable object, so it does not need to be re-created; only the values coming out of it change)

In [11]:
#list the key names of the pipeline

sim_india.list_values()

['metrics',
 'protein_energy_malnutrition.disability_weight',
 'disability_weight',
 'cause_specific_mortality_rate',
 'protein_energy_malnutrition.excess_mortality_rate',
 'protein_energy_malnutrition.excess_mortality_rate.population_attributable_fraction',
 'mortality_rate',
 'child_wasting.exposure',
 'child_wasting.propensity',
 'diarrheal_diseases.incidence_rate',
 'diarrheal_diseases.incidence_rate.paf',
 'measles.incidence_rate',
 'measles.incidence_rate.paf',
 'lower_respiratory_infections.incidence_rate',
 'lower_respiratory_infections.incidence_rate.paf',
 'child_stunting.propensity',
 'child_stunting.exposure',
 'affected_unmodeled.csmr',
 'affected_unmodeled.csmr.population_attributable_fraction',
 'all_causes.mortality_hazard',
 'all_causes.mortality_hazard.population_attributable_fraction',
 'diarrheal_diseases.dwell_time',
 'diarrheal_diseases.disability_weight',
 'diarrheal_diseases.excess_mortality_rate',
 'diarrheal_diseases.excess_mortality_rate.population_attributab

In [None]:
#pipelines of interest: 'child_stunting.exposure', 'child_wasting.exposure'
#get_value returns the pipeline object itself, not its values
#a pipeline is a callable: call it with a population index to compute the current values for those simulants

#pipeline for india sim

pipe_wasting_india = sim_india.get_value('child_wasting.exposure')
pipe_stunting_india = sim_india.get_value('child_stunting.exposure')
pipe_lbwsg_india = sim_india.get_value('low_birth_weight_and_short_gestation.exposure')

In [None]:
#pipeline for pakistan sim

pipe_wasting_pakistan = sim_pakistan.get_value('child_wasting.exposure')
pipe_stunting_pakistan = sim_pakistan.get_value('child_stunting.exposure')
pipe_lbwsg_pakistan = sim_pakistan.get_value('low_birth_weight_and_short_gestation.exposure')

In [None]:
#pipeline for mali sim

pipe_wasting_mali = sim_mali.get_value('child_wasting.exposure')
pipe_stunting_mali = sim_mali.get_value('child_stunting.exposure')
pipe_lbwsg_mali = sim_mali.get_value('low_birth_weight_and_short_gestation.exposure')

In [None]:
#pipeline for tanzania sim

pipe_wasting_tanzania = sim_tanzania.get_value('child_wasting.exposure')
pipe_stunting_tanzania = sim_tanzania.get_value('child_stunting.exposure')
pipe_lbwsg_tanzania = sim_tanzania.get_value('low_birth_weight_and_short_gestation.exposure')

In [None]:
#skip_post_processor=True skips the conversion of the z-scores to cats 1-4 (the pipeline is a callable, so it doesn't need to be re-created before each call)

#explore

pipe_wasting_india(pop_india.index, skip_post_processor=True).head()

In [None]:
#stunting pipeline

pipe_stunting_india(pop_india.index, skip_post_processor=True).head()
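
To make the idea of a pipeline concrete, here is a toy stand-in. It is not vivarium's implementation: `ToyPipeline`, the stored values, and the category cut-offs are all hypothetical. The point is only that the pipeline object is defined once, that its output depends on the index passed in, and that `skip_post_processor=True` returns the raw values instead of the categories.

```python
class ToyPipeline:
    """Hypothetical stand-in for a value pipeline (not the vivarium class)."""

    def __init__(self, source, post_processor):
        self.source = source                  # maps simulant id -> raw value
        self.post_processor = post_processor  # maps raw value -> category

    def __call__(self, index, skip_post_processor=False):
        raw = {i: self.source[i] for i in index}
        if skip_post_processor:
            return raw
        return {i: self.post_processor(v) for i, v in raw.items()}


def to_category(z):
    # Assumed cut-offs, for illustration only; cat1 is the most severe.
    if z < -3:
        return 'cat1'
    if z < -2:
        return 'cat2'
    if z < -1:
        return 'cat3'
    return 'cat4'


z_scores = {0: -3.4, 1: -0.2, 2: -1.7}   # made-up raw values
pipe = ToyPipeline(z_scores, to_category)

print(pipe([0, 2]))                            # {0: 'cat1', 2: 'cat3'}
print(pipe([0, 2], skip_post_processor=True))  # {0: -3.4, 2: -1.7}
```

The same `pipe` object serves every call; passing a different index (for example only the alive simulants) changes which values come back.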

# 5a. Create mask to filter alive/dead population 

In [None]:
# create mask; read more about masking here ->http://danielandreasen.github.io/:about/2015/01/19/masks-in-python/

#(pop.alive == 'alive').all()

mask_alive_india = pop_india.alive == 'alive'
mask_alive_pakistan = pop_pakistan.alive == 'alive'
mask_alive_mali = pop_mali.alive == 'alive'
mask_alive_tanzania = pop_tanzania.alive == 'alive'

#mask_alive[:5] = False
mask_alive_india.head()

In [None]:
#test mask alive

pop_india[mask_alive_india].head()

#pop[boolean series] --> if you pass a boolean Series inside the square brackets of a dataframe, you get the rows where the boolean is True
# the boolean Series is a separate object, so make sure its index matches the dataframe's index

# 5b. create index for alive population and dead population 

In [None]:
#another way to do it

alive_pop_india = pop_india.loc[mask_alive_india, :]
dead_pop_india = pop_india.loc[~mask_alive_india, :]

alive_pop_pakistan = pop_pakistan.loc[mask_alive_pakistan, :]
dead_pop_pakistan = pop_pakistan.loc[~mask_alive_pakistan, :]

alive_pop_mali = pop_mali.loc[mask_alive_mali, :]
dead_pop_mali = pop_mali.loc[~mask_alive_mali, :]

alive_pop_tanzania = pop_tanzania.loc[mask_alive_tanzania, :]
dead_pop_tanzania = pop_tanzania.loc[~mask_alive_tanzania, :]


#----------------notes------------------------------------------------
# pop.loc[mask_alive, ['mother_malnourished', 'tracked']].head() 
# .loc indexes the two dimensions of a dataframe: loc[rows, columns]
# here I'm saying I only want the rows selected by my mask, and the colon means all columns
# .loc makes it clearer which filters I am using for rows and for columns
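
A small, self-contained illustration of the mask and `.loc` notes above, using a made-up four-row dataframe rather than the simulation population. The column values and index labels are invented for the example.

```python
import pandas as pd

# Made-up population table standing in for the output of get_population()
pop = pd.DataFrame(
    {'alive': ['alive', 'dead', 'alive', 'alive'],
     'birth_weight': [3100.0, 2400.0, 2950.0, 3300.0]},
    index=[10, 11, 12, 13],
)

mask_alive = pop.alive == 'alive'   # boolean Series that shares pop's index

alive_pop = pop.loc[mask_alive, :]               # rows where the mask is True, all columns
dead_pop = pop.loc[~mask_alive, :]               # ~ flips the mask
alive_bw = pop.loc[mask_alive, 'birth_weight']   # same rows, one column only

print(list(alive_pop.index))   # [10, 12, 13]
print(list(dead_pop.index))    # [11]
print(alive_bw.mean())
```

Because the mask carries the same index labels as `pop`, the alive and dead subsets always partition the original rows.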

In [None]:
alive_pop_india.index

In [None]:
dead_pop_india.index

In [None]:
pop_india.index[mask_alive_india]

#by default, this gives us a pandas Index with the labels of all rows where mask_alive is True
#pop.index[~mask_alive] gives the labels where mask_alive is False (~ is the complement operator)

# 5c. CREATE alive index to filter out the alive population only at time step

In [None]:
#CREATE ALIVE INDEX

alive_index_india = alive_pop_india.index
alive_index_pakistan = alive_pop_pakistan.index
alive_index_mali = alive_pop_mali.index
alive_index_tanzania = alive_pop_tanzania.index

# 6. Create dataframe with interested variables

In [None]:
#get the dataframe with individual data to look at correlation between birthweight and wasting/stunting z scores

india = pd.DataFrame({'stunting_india': pipe_stunting_india(alive_index_india, skip_post_processor=True),
                    'wasting_india': pipe_wasting_india(alive_index_india, skip_post_processor=True),
                    'lbwsg_india': pipe_lbwsg_india(alive_index_india, skip_post_processor=True).birth_weight,
                    'birth_weight_india': pop_india.loc[alive_index_india].birth_weight,
                    'mom_status_india': pop_india.loc[alive_index_india].mother_malnourished,
                    'ifa_status_india': pop_india.loc[alive_index_india].baseline_maternal_supplementation_type,
                    'alive_status_india': pop_india.loc[alive_index_india].alive })
len(india)

##note: number who died is 10,000 - len(india)

In [None]:
#get the dataframe with individual data to look at correlation between birthweight and wasting/stunting z scores

mali = pd.DataFrame({'stunting_mali': pipe_stunting_mali(alive_index_mali, skip_post_processor=True),
                    'wasting_mali': pipe_wasting_mali(alive_index_mali, skip_post_processor=True),
                    'lbwsg_mali': pipe_lbwsg_mali(alive_index_mali, skip_post_processor=True).birth_weight,
                    'birth_weight_mali': pop_mali.loc[alive_index_mali].birth_weight,
                    'mom_status_mali': pop_mali.loc[alive_index_mali].mother_malnourished,
                    'ifa_status_mali': pop_mali.loc[alive_index_mali].baseline_maternal_supplementation_type,
                    'alive_status_mali': pop_mali.loc[alive_index_mali].alive })
len(mali)

##note: number who died is 10,000 - len(mali)

In [None]:
#PAKISTAN

#get the dataframe with individual data to look at correlation between birthweight and wasting/stunting z scores

pakistan = pd.DataFrame({'stunting_pakistan': pipe_stunting_pakistan(alive_index_pakistan, skip_post_processor=True),
                    'wasting_pakistan': pipe_wasting_pakistan(alive_index_pakistan, skip_post_processor=True),
                    'lbwsg_pakistan': pipe_lbwsg_pakistan(alive_index_pakistan, skip_post_processor=True).birth_weight,
                    'birth_weight_pakistan': pop_pakistan.loc[alive_index_pakistan].birth_weight,
                    'mom_status_pakistan': pop_pakistan.loc[alive_index_pakistan].mother_malnourished,
                    'ifa_status_pakistan': pop_pakistan.loc[alive_index_pakistan].baseline_maternal_supplementation_type,
                    'alive_status_pakistan': pop_pakistan.loc[alive_index_pakistan].alive })
len(pakistan)


##note: number who died is 10,000 - len(pakistan)

In [None]:
#TANZANIA

#get the dataframe with individual data to look at correlation between birthweight and wasting/stunting z scores

tanzania = pd.DataFrame({'stunting_tanzania': pipe_stunting_tanzania(alive_index_tanzania, skip_post_processor=True),
                    'wasting_tanzania': pipe_wasting_tanzania(alive_index_tanzania, skip_post_processor=True),
                    'lbwsg_tanzania': pipe_lbwsg_tanzania(alive_index_tanzania, skip_post_processor=True).birth_weight,
                    'birth_weight_tanzania': pop_tanzania.loc[alive_index_tanzania].birth_weight,
                    'mom_status_tanzania': pop_tanzania.loc[alive_index_tanzania].mother_malnourished,
                    'ifa_status_tanzania': pop_tanzania.loc[alive_index_tanzania].baseline_maternal_supplementation_type,
                    'alive_status_tanzania': pop_tanzania.loc[alive_index_tanzania].alive })
len(tanzania)

##note: number who died is 10,000 - len(tanzania)

In [None]:
india.head()

# 7. Z-SCORES for stunting and wasting for each country

# ----------------------------------------------------------
## Individual z-score at 32 days



In [None]:
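def getZscores(df, name):
    # name is the country suffix used in the column names, e.g. 'india'
    # (passed explicitly instead of being looked up in globals(), which is fragile)
    # this code treats the raw pipeline values as z-score + 10, so subtract 10
    df['stunting_z_score_%s' % name] = df['stunting_%s' % name] - 10
    df['wasting_z_score_%s' % name] = df['wasting_%s' % name] - 10
    return df

india = getZscores(india, 'india')
pakistan = getZscores(pakistan, 'pakistan')
mali = getZscores(mali, 'mali')
tanzania = getZscores(tanzania, 'tanzania')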

india.head()
# mali['stunting_z_score_mali'] = mali['stunting_mali'] -  10
# mali['wasting_z_score_mali'] = mali['wasting_mali'] -  10
# mali.head()

# These are z-scores between 27 days and 1 year (32 days)

In [None]:
# print('INTERACTIVE SIM Z-SCORES -no correlation ')

# print('\n\n')

# print('stunting z-score in INDIA at 32 days =' + str(india.stunting_z_score_india.mean()))
# #print('stunting z-score in PAKISTAN at 32 days =' + str(pakistan.stunting_z_score_pakistan.mean()))
# #print('stunting z-score in MALI at 32 days =' + str(mali.stunting_z_score_mali.mean()))
# #print('stunting z-score in TANZANIA at 32 days =' + str(tanzania.stunting_z_score_tanzania.mean()))

# print('\n\n')

# print('wasting z-score in INDIA at 32 days =' + str(india.wasting_z_score_india.mean()))
# #print('wasting z-score in PAKISTAN at 32 days =' + str(pakistan.wasting_z_score_pakistan.mean()))
# #print('wasting z-score in MALI at 32 days =' + str(mali.wasting_z_score_mali.mean()))
# #print('wasting z-score in TANZANIA at 32 days =' + str(tanzania.wasting_z_score_tanzania.mean()))

#print(mali.wasting_z_score_mali.mean())

# These are z-scores between 1 and 2 years (400 days)

In [None]:
!whoami
!date

In [None]:
print('INTERACTIVE SIM Z-SCORES at 400 days')

print('\n\n')

print('stunting z-score in INDIA at 400 days =' + str(india.stunting_z_score_india.mean()))
print('stunting z-score in PAKISTAN at 400 days =' + str(pakistan.stunting_z_score_pakistan.mean()))
print('stunting z-score in MALI at 400 days =' + str(mali.stunting_z_score_mali.mean()))
print('stunting z-score in TANZANIA at 400 days =' + str(tanzania.stunting_z_score_tanzania.mean()))

print('\n\n')

print('wasting z-score in INDIA at 400 days =' + str(india.wasting_z_score_india.mean()))
print('wasting z-score in PAKISTAN at 400 days =' + str(pakistan.wasting_z_score_pakistan.mean()))
print('wasting z-score in MALI at 400 days =' + str(mali.wasting_z_score_mali.mean()))
print('wasting z-score in TANZANIA at 400 days =' + str(tanzania.wasting_z_score_tanzania.mean()))

#print(mali.wasting_z_score_mali.mean())

In [None]:
print(mali.wasting_mali.head())
print(mali.wasting_mali.mean())

## =============================== CGF ends here ================================================

## birthweight by mom status

In [None]:
# NOTE: df1 is not defined above; the df1 cells in this part expect a single-country
# dataframe with unsuffixed columns (mom_status, lbwsg, stunting, wasting)
gb = df1.groupby('mom_status')

print('birthweight from malnourished mothers = ' + str(gb.get_group('malnourished').lbwsg.mean()))
print('birthweight from normal mothers = ' + str(gb.get_group('normal').lbwsg.mean()))
print('birthweight difference = ' + str(gb.get_group('normal').lbwsg.mean()-gb.get_group('malnourished').lbwsg.mean()))

# TODO: update this with the birthweight shift of 160 g reported in the literature

In [None]:
gb2 = mali.groupby('mom_status_mali')  # mali is the dataframe built in section 6 (df2 is not defined above)

print('birthweight from malnourished mothers = ' + str(gb2.get_group('malnourished').lbwsg_mali.mean()))
print('birthweight from normal mothers = ' + str(gb2.get_group('normal').lbwsg_mali.mean()))
print('birthweight difference = ' + str(gb2.get_group('normal').lbwsg_mali.mean()-gb2.get_group('malnourished').lbwsg_mali.mean()))

# TODO: update this with the birthweight shift of 160 g reported in the literature

In [None]:
pyplot.scatter(df1.mom_status, df1.lbwsg)

## stunting z-score by mom status

In [None]:
gb2 = mali.groupby('mom_status_mali')  # mali is the dataframe built in section 6 (df2 is not defined above)

mal_stunted_mali = gb2.get_group('malnourished').stunting_mali.mean()-10
norm_stunted_mali = gb2.get_group('normal').stunting_mali.mean()-10
stunted_diff_mali = mal_stunted_mali - norm_stunted_mali


print('stunting z-score from malnourished mothers = ' + str(mal_stunted_mali))
print('stunting z-score from normal mothers = ' + str(norm_stunted_mali))
print('stunting z-score difference = ' + str(stunted_diff_mali))

In [None]:
gb = df1.groupby('mom_status')

mal_stunted = gb.get_group('malnourished').stunting.mean()-10
norm_stunted = gb.get_group('normal').stunting.mean()-10
stunted_diff = (mal_stunted) - (norm_stunted)


print('stunting z-score from malnourished mothers = ' + str(mal_stunted))
print('stunting z-score from normal mothers = ' + str(norm_stunted))
print('stunting z-score difference = ' + str(stunted_diff))

In [None]:
#scatter plot of z scores for stunting by mom status

pyplot.scatter(df1.mom_status, df1.stunting-10)

## wasting z-score by mom status

In [None]:
gb = df1.groupby('mom_status')

mal_wasted = gb.get_group('malnourished').wasting.mean() -10
norm_wasted = gb.get_group('normal').wasting.mean() -10
wasted_diff = (mal_wasted) - (norm_wasted)


print('wasting z-score from malnourished mothers = ' + str(mal_wasted))
print('wasting z-score from normal mothers = ' + str(norm_wasted))
print('wasting z-score difference = ' + str(wasted_diff))

In [None]:
pyplot.scatter(df1.mom_status, df1.wasting-10)

## Make scatter plots to visually inspect correlation

In [None]:
pyplot.scatter(mali.lbwsg_mali, mali.wasting_mali-10)  # mali is the dataframe built in section 6
pyplot.show()

In [None]:
#get individual birthweight and individual WLZ - overall

pyplot.scatter(df1.lbwsg, df1.wasting-10)
pyplot.show()

In [None]:
#get individual birthweight and individual LAZ - overall

pyplot.scatter(df1.lbwsg, df1.stunting-10)
pyplot.show()

In [None]:
mask_mommal = df1.mom_status == 'malnourished'

In [None]:
#lbw and wasting among normal mothers

pyplot.scatter(df1.loc[~mask_mommal, 'lbwsg'], df1.loc[~mask_mommal, 'wasting'] - 10)

In [None]:
#lbw and wasting among malnourished 

pyplot.scatter(df1.loc[mask_mommal, 'lbwsg'], df1.loc[mask_mommal, 'wasting'] - 10)

# 8. Calculate covariance and the Spearman correlation coefficient

Covariance formula

cov(X, Y) = sum((x - mean(X)) * (y - mean(Y))) / (n - 1)

In [None]:
# mali is the dataframe built in section 6 (df2 is not defined above)
bw_mali = mali.lbwsg_mali
laz_mali = mali.stunting_mali-10
wlz_mali = mali.wasting_mali-10
n_mali = len(mali)


In [None]:
bw = df1.lbwsg
laz = df1.stunting-10
wlz = df1.wasting-10
n = len(df1)


In [None]:
#Covariance (numpy's cov returns the 2x2 covariance matrix; the off-diagonal entries are cov(bw, laz))

covariance_laz = cov(bw, laz)
print(covariance_laz)

In [None]:
covariance_wlz = cov(bw, wlz)
print(covariance_wlz)

In [None]:
print(stats.spearmanr(bw_mali, laz_mali))

In [None]:
print('spearman correlation coefficient for birthweight and laz at 6 months:')
print(stats.spearmanr(bw, laz))

print('spearman correlation coefficient for birthweight and wlz at 6 months:')
print(stats.spearmanr(bw, wlz))

Pearson formula
	
Pearson's correlation coefficient = covariance(X, Y) / (stdv(X) * stdv(Y))

Spearman formula

Spearman's correlation coefficient = covariance(rank(X), rank(Y)) / (stdv(rank(X)) * stdv(rank(Y)))
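
The covariance, Pearson and Spearman formulas can be checked by hand with a short pure-Python sketch. The birthweights and z-scores below are made up for illustration; in the notebook itself `numpy.cov` and `scipy.stats.spearmanr` do this work.

```python
from statistics import mean


def covariance(x, y):
    """Sample covariance with the 1/(n-1) normalisation."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    return covariance(x, y) / (covariance(x, x) ** 0.5 * covariance(y, y) ** 0.5)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up birthweights (g) and z-scores, for illustration only
bw = [2500, 3000, 3500, 4000, 2800]
laz = [-2.0, -1.0, -0.5, 0.5, -1.5]

print(covariance(bw, laz))
print(pearson(bw, laz))
print(spearman(bw, laz))
```

In this toy data the two variables are ordered identically, so the Spearman coefficient is 1 even though the Pearson coefficient is slightly below 1; Spearman only looks at the ordering, not at the spacing of the values.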

# Abie's proposal

0) initialize simulant attributes for birthweight and gestational age

1) find a pseudo-propensity for birthweight between 0 and 1, which effectively says where the individual lies in the birthweight distribution

- e.g. if simulant 0 has a birthweight of 3371 g, figure out which percentile of the birthweight distribution this simulant lies in

2) initialize a LAZ score correlated with birthweight, with a Spearman correlation of 0.4 (from the MAL-ED data)
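
One possible way to sketch the steps above is shown below. This is not the method that was implemented: using a Gaussian copula to induce the correlation is an assumption, the birthweights are randomly generated stand-ins, and every function name is hypothetical. Only the target Spearman correlation of 0.4 comes from the proposal.

```python
import math
import random
from statistics import NormalDist

TARGET_SPEARMAN = 0.4   # from the proposal above
STD_NORMAL = NormalDist()


def rank(values):
    """0-based rank of each value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for position, i in enumerate(order):
        r[i] = position
    return r


def pseudo_propensity(values):
    """Percentile position of each value in the sample, strictly inside (0, 1)."""
    return [(r + 0.5) / len(values) for r in rank(values)]


def correlated_propensity(propensity, spearman_rho, rng):
    """Draw a second propensity correlated with the first (Gaussian copula, an assumption)."""
    # For a bivariate normal, Pearson r and Spearman rho_s satisfy r = 2 * sin(pi * rho_s / 6).
    r = 2 * math.sin(math.pi * spearman_rho / 6)
    out = []
    for p in propensity:
        z = STD_NORMAL.inv_cdf(p)
        z_new = r * z + math.sqrt(1 - r * r) * rng.gauss(0, 1)
        out.append(STD_NORMAL.cdf(z_new))
    return out


def spearman(x, y):
    """Spearman coefficient for samples without ties."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rank(x), rank(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


rng = random.Random(0)
# Step 0: made-up birthweights in grams (the sim would supply the real ones)
birth_weight = [rng.gauss(3000, 450) for _ in range(5000)]
# Step 1: where each simulant lies in the birthweight distribution
bw_propensity = pseudo_propensity(birth_weight)
# Step 2: a LAZ propensity correlated with birthweight
laz_propensity = correlated_propensity(bw_propensity, TARGET_SPEARMAN, rng)

print(round(spearman(bw_propensity, laz_propensity), 2))
```

The resulting LAZ propensity would then be mapped through the stunting exposure distribution to obtain a LAZ score; how that mapping is done in the model is outside this sketch.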



