<img  src="https://www.bcm.edu/themes/custom/bcm_bootstrap_subtheme/css/../images/BCM-90x90.svg" alt="Drawing" style="height: 100px;float: left;"/>


# Mask or no mask: A decision guide for teachers, parents, and students

* August, 2021
* Institute for Clinical & Translational Research, Baylor College of Medicine

# Context

With the increasing incidence of [COVID-19 infections in the Greater Houston Area](https://www.tmc.edu/coronavirus-updates/daily-new-covid-19-positive-cases-for-the-greater-houston-area/) and students returning to school, informed decisions must be made about recommending continuous mask wearing for teachers and students.

As of today (August 21),
1. An executive order is in place that [prohibits Texas government-funded entities from mandating mask wearing on premises](https://gov.texas.gov/news/post/governor-abbott-issues-executive-order-prohibiting-government-entities-from-mandating-masks).
2. [Houston ISD](https://www.houstonisd.org/) is one of [a handful of school districts in the Houston area](https://www.khou.com/article/news/health/coronavirus/masks-houston-school-districts-covid-coronavirus/285-25154fcd-97ae-4d8b-9444-be77a0b58f89) that do have a mask-wearing mandate in place.
3. Colleges in our area have established differing policies. Notably, Rice University has [COVID policies that mandate mask wearing indoors](https://coronavirus.rice.edu/policies).

To provide quantitative insight into these mixed stances, we ran a targeted simulation using our [COVID-19 Outbreak Simulator](https://ictr.github.io/covid19-outbreak-simulator/) to compare the number of COVID-cases that develop in schools that mandate or do not mandate mask wearing.

**Disclaimer**: This report was prepared by researchers in the [Institute for Clinical & Translational Research](https://www.bcm.edu/research/research-offices/institute-for-clinical-translational-research), Baylor College of Medicine. It uses a limited model to predict the impact of mask wearing on COVID-19 incidence in public schools in the Greater Houston Area, and does not necessarily reflect the impact of other site-specific COVID-prevention policies that may be in place at each of these schools.

# Conclusions

The delta variant is highly infectious. Given Houston's existing high regional community infection rate, we found that:

1. Almost all schools will have at least 1 school-acquired infection.
2. If masks are worn at all times, `0.8%` of high school students will become infected at school. This risk is, however, lower than the risk of getting infected from the community (`2.8%`). The proportions of school-acquired and community-acquired infections are higher for elementary schools (`1.0%` and `3.0%` respectively) because students in elementary schools are not vaccinated.
3. If masks are not worn at all, `15.4%` of high school students will be infected at school within a month, and about one fifth (`21.0%`) of all students in elementary schools will be infected. These figures assume that no post-infection contact tracing or testing is performed, so asymptomatic carriers continue to attend school and infect others.
4. The proportion of students who will be isolated after showing symptoms will be `1.9%` for high school students and `2.1%` for elementary school students if masks are worn. About `7.7%` of students in high schools and `9.7%` of students in elementary schools will be isolated if masks are not worn.
5. If the community infection rate drops to 1/4 of the current level (around `500` confirmed cases per day in Harris County or `750` in the Greater Houston Area), far fewer students will be isolated due to COVID-19 infections: around `0.5%` if masks are worn at all times, and `2.5%` if masks are not worn.

These results reflect our assumptions that

1. `60%` of teachers are vaccinated, `30%` of high school students are vaccinated, and no students in elementary schools are vaccinated.
2. Students in high schools have more physical interactions whereas students in elementary schools mostly interact with their own classmates.
3. Around `20%` of all students and teachers have already been infected and recovered from COVID.

Please see the `Methods` section for details.

# Results

The following table lists:

1. **Community infection rate (per million)**, which is assumed to be `1600/million` now; we also provide results for `400/million` for comparison purposes.
2. **School**, which can be either high school or elementary school. The two differ in school size and student movement patterns.
3. **Mask wearing**, which can be no mask, with mask, or non-mandatory (partial mask), for which we assume that 50% of students and teachers wear masks.
4. **Proportion of schools that have no infection** at all in a month.
5. **Average proportion of students who will get infected from the community** (home). This proportion depends on the community infection rate, the vaccination coverage, and the proportion of students who have already been infected and remain immune to reinfection.
6. **Average proportion of students who will get infected from school**.
7. **Average proportion of students who will get isolated after showing COVID symptoms**, whether they were infected in the community or at school.

In [68]:
## Collect the pickled results of all contexts into a pandas DataFrame.
## `path` is not imported here; it is assumed to be supplied by the notebook's SoS kernel.

import pickle
import os
import pandas as pd

def get_results(contexts):
    all_res = []
    for context in contexts:
        res_file = path(f'{context["name"]}.pickle')
        if not res_file.exists():
            print(f'Missing {context["name"]}')
            continue
        with open(res_file, 'rb') as infile:
            loaded = pickle.load(infile)
            #if not 'num_replicates' in loaded or loaded["num_replicates"] < context["num_replicates"]:
            #    print(f'WRONG {context["name"]}')
                #os.remove(res_file)
                #continue
            res = {
                'Community Infection Rate (per million)': int(context['community_infection_rate'] * 1000000),
                'School': context['school'],
                'Mask wearing': context['mask_wearing'],
#                 'Vaccination coverage (%)': int(context['vac_coverage'] * 100),
                
            }
            res.update(loaded)
            res.pop('num_replicates', None)
            all_res.append(res)

    all_res = pd.DataFrame(all_res)
    return all_res

pd.options.display.float_format = '{:,.1f}'.format


In [72]:
%preview  res 

res = get_results(all_contexts)

with pd.ExcelWriter(path(f'results.xlsx'), engine='xlsxwriter') as writer: 
    res.to_excel(writer, index=False, sheet_name='All')
    
    workbook  = writer.book
    
    fmt_num = workbook.add_format({'num_format': '0.00'})
    header_format = workbook.add_format({
        'bold': True,
        'text_wrap': True})
    
    worksheet = writer.sheets["All"]
    # the sheet has seven columns (A:G); the four numeric result columns are D:G
    worksheet.set_column('D:G', None, fmt_num)
    #worksheet.set_row(0, 0, header_format)

| | Community Infection Rate (per million) | School | Mask wearing | Proportion of schools with no infection % | Avg proportion of students infected from community % | Avg proportion of students infected from school % | Avg proportion of students isolated % |
|---|---|---|---|---|---|---|---|
| 0 | 1600 | high | with_mask | 0.8 | 2.8 | 0.8 | 1.9 |
| 1 | 1600 | high | partial_mask | 0.0 | 2.8 | 4.2 | 3.5 |
| 2 | 1600 | high | no_mask | 0.0 | 2.7 | 15.4 | 7.7 |
| 3 | 1600 | elementary | with_mask | 2.7 | 3.0 | 1.0 | 2.1 |
| 4 | 1600 | elementary | partial_mask | 0.0 | 2.9 | 5.8 | 4.2 |
| 5 | 1600 | elementary | no_mask | 0.0 | 2.8 | 21.0 | 9.7 |
| 6 | 400 | high | with_mask | 32.1 | 0.7 | 0.2 | 0.5 |
| 7 | 400 | high | partial_mask | 7.8 | 0.7 | 1.0 | 0.8 |
| 8 | 400 | high | no_mask | 2.9 | 0.7 | 3.9 | 1.8 |
| 9 | 400 | elementary | with_mask | 40.0 | 0.8 | 0.3 | 0.5 |
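
To make the comparison concrete, the sketch below re-enters the school-acquired infection percentages shown above for the current community infection rate (`1600/million`) and computes how many times more school-acquired infections occur without masks than with masks. The variable names `school_pct` and `ratios` are introduced here for illustration only.

```python
# School-acquired infection percentages copied from the table above
# (community infection rate = 1600 per million).
school_pct = {
    ("high", "with_mask"): 0.8,
    ("high", "partial_mask"): 4.2,
    ("high", "no_mask"): 15.4,
    ("elementary", "with_mask"): 1.0,
    ("elementary", "partial_mask"): 5.8,
    ("elementary", "no_mask"): 21.0,
}

# Ratio of infections without masks to infections with masks, per school type.
ratios = {
    school: school_pct[(school, "no_mask")] / school_pct[(school, "with_mask")]
    for school in ("high", "elementary")
}

for school, ratio in ratios.items():
    print(f"{school}: {ratio:.1f} times more school-acquired infections without masks")
```

Under the model's assumptions, this corresponds to roughly a 19-fold difference for high schools and a 21-fold difference for elementary schools.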


# Methods

We assume that all students and teachers live in the same "community" that is subject to a fixed "community infection rate", which is the probability that a person is infected in the community on a given day. People who are infected in the community still go to school, with or without masks, and have the same probability of infecting others. People who show symptoms are isolated for 10 days before they go back to school, at which time they have recovered and are largely immune to reinfection. We **do not yet model other post-infection reactions such as contact tracing and testing**, which are being employed by many schools and universities to prevent the spread of the coronavirus.

We assume 

* **Community infection rate (CIR)**: CIR is the actual rate of infection in the community and is used to model the probability with which people are infected. The actual number differs from the confirmed case numbers reported by public health agencies because of reporting delays and the adequacy of testing (see https://ourworldindata.org/covid-models for an expanded discussion). As of writing, there are
    * around `2000` confirmed cases per day in Harris County, which has `4.7` million people, or
    * around `3000` confirmed cases per day in the Greater Houston Area, which has a population of about `9` million,
    
  so the raw CIR is about `400 / million`. Using a popular method to adjust for under-reporting, namely multiplying the number of confirmed cases by a factor of 4, we use **1600 / million** as the CIR for the Greater Houston Area. Note that the estimate from [The Institute for Health Metrics and Evaluation (IHME) an independent global health research center at the University of Washington](https://covid19.healthdata.org/global?view=cumulative-deaths&tab=trend) is around 2121 per million for the state of Texas.

* **Vaccination coverage**: According to government reports, 45.7% of people in Texas are fully vaccinated. We expect adults (teachers) to have higher vaccination rates and assume that
    * `60%` of teachers and staff are vaccinated in both high schools and elementary schools.
    * `30%` of students are vaccinated in high schools.
    * Students in elementary schools are not vaccinated.
  
* **Size of schools**: According to online information, we assume that
    * High schools have 800 students with a 15:1 student-to-teacher ratio (50 teachers, 50 staff).
    * Elementary schools have 500 students with a 15:1 student-to-teacher ratio (30 teachers, 30 staff).
    
* **Transmissibility of virus**: The highly contagious delta variant has become the most prevalent variant of the coronavirus in the country, and also in the state of Texas. For the delta variant, we assume
    * a reproduction number of around `6.5` for symptomatic cases (range 5-8);
    * a reproduction number of around `4.9` for asymptomatic carriers (range 3.75-6), because the simulation treats asymptomatic carriers as 75% as infectious as symptomatic carriers;
    * an average incubation period of `4.4` days;
    * that `30%` of people remain asymptomatic after being infected;
    * that `20%` of the population (teachers and students) have already been infected and have recovered. They are less susceptible to another infection (`50%` reduction) and have a lower viral load (`50%` reduction) if they are infected.
    
* **Infection patterns**:
    * We assume that students and teachers will spend half of the "interacting" time at home and half of the time at school (8 hours each), so only about half of the infection events will happen at school. 
    * We assume that students in elementary schools are divided into `25` classes of `20` students each. Because students in elementary schools usually interact only with their classmates and teachers, we assume that each student interacts with the rest of the class (`19` students), `3` teachers, and `5` random students whom they meet in hallways or cafeterias.
    * We assume that students in high schools are well mixed so everyone can infect any other student, teacher, or staff.

* **Post-infection reaction**:
    * We assume that no testing is performed and that students are sent home and isolated for 10 days only after they have shown symptoms of COVID-19.
    * Teachers and students are allowed to go back to school after 10 days. They are assumed to have recovered, but can be infected again.
    
* **Efficacy of vaccine**: Most Houstonians were vaccinated with the Pfizer vaccine, which is reported to
    * have `42%` efficacy for preventing infection,
    * reduce the viral load of infected people by around `50%`, and
    * have around `50%` efficacy for preventing secondary infection.
 
* **Duration of simulation**: We simulate the operation of schools for a month (28 days, 20 working days).
 
* **Mask wearing**: We simulate three scenarios:
   * no mask;
   * with mask, which causes an 80% reduction in the transmission of the virus;
   * partial mask, which corresponds to the case of no mask mandate, where we assume that 50% of people wear masks, leading to a 40% reduction in the transmission of the virus.
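
The transmission assumptions above combine multiplicatively. As a minimal sketch, and not the simulator's own code, the reproduction-number range that applies at school can be approximated by scaling the delta range by the mask factor and by the share of interaction time spent at school. The helper name `effective_r0_range` is hypothetical; the factor `0.75` for asymptomatic carriers is the value that appears in the simulation parameters (`asym_r0=[5*0.75, 8*0.75]`).

```python
# Fraction of transmission that remains under each mask scenario.
MASK_FACTOR = {
    "no_mask": 1.0,
    "with_mask": 1 - 0.8,           # masks reduce transmission by 80%
    "partial_mask": 1 - 0.5 * 0.8,  # 50% of people wear masks -> 40% reduction
}
SCHOOL_TIME_SHARE = 0.5            # half of the interacting time is spent at school
SYM_R0_RANGE = (5.0, 8.0)          # symptomatic delta cases
ASYM_RELATIVE_INFECTIVITY = 0.75   # value used in the simulation parameters


def effective_r0_range(mask, symptomatic=True):
    """Return the approximate (low, high) reproduction number at school."""
    scale = MASK_FACTOR[mask] * SCHOOL_TIME_SHARE
    if not symptomatic:
        scale *= ASYM_RELATIVE_INFECTIVITY
    return tuple(round(r0 * scale, 3) for r0 in SYM_R0_RANGE)


for mask in MASK_FACTOR:
    print(mask, effective_r0_range(mask), effective_r0_range(mask, symptomatic=False))
```

The scale factors `0.5`, `0.1`, and `0.3` produced here match the `distancing_multiplier` values passed to the simulator for the no-mask, with-mask, and partial-mask scenarios.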
 

We output the number of infections and symptomatic cases in both high schools and elementary schools.
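
The community infection rate used throughout can be reproduced with simple arithmetic. The sketch below uses the confirmed-case counts and populations quoted in the assumptions above; the function name `cases_per_million` is introduced here for illustration.

```python
UNDER_REPORTING_FACTOR = 4  # adjustment for under-reporting described above


def cases_per_million(daily_cases, population):
    """Daily confirmed cases per million residents."""
    return daily_cases / population * 1_000_000


harris = cases_per_million(2000, 4_700_000)           # Harris County
greater_houston = cases_per_million(3000, 9_000_000)  # Greater Houston Area

print(f"Harris County: {harris:.0f} per million")
print(f"Greater Houston Area: {greater_houston:.0f} per million")

# The report rounds the raw rate to 400 per million before adjusting.
adjusted_cir = 400 * UNDER_REPORTING_FACTOR
print(f"Adjusted CIR: {adjusted_cir} per million")
```

The two regional figures bracket the rounded raw value of `400 / million`, which becomes `1600 / million` after the adjustment.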


In [70]:
import pandas as pd

school_spec = {
    'high': {'teachers': 'T=100', 'students': 'S=800', 'vac_coverage': 'T=0.6 S=0.3' },
    'elementary': {'teachers': 'T=60', 
                   'students': ' '.join([f'S{i+1}=20' for i in range(25)]),
                   'vac_coverage': 'T=0.6 ' + ' '.join([f'S{i+1}=0' for i in range(25)]) }
}

communities = {
    'C1600': dict(community_infection_rate=1600 / 1000000),
    'C400': dict(community_infection_rate=400 / 1000000),
}


variants = {
    'delta': dict(sym_r0=[5, 8], asym_r0=[5*0.75, 8*0.75], incu_period=4.4, name='B.1.617.2', prop_asym_carriers=0.3),
}

# 86/76 for alpha; 76/42 for delta; and 75-81% protection against hospitalizations.
vaccination_specs = {
    'pfizer': dict(
            delta=dict(vac_immunity=[0.42, 0.38], vac_infectivity=[0.5, 0.5]),
        ),    
}

distancing_spec = {
    'no_mask': 1,
    'with_mask': 0.2,
    'partial_mask': 0.6
}

def get_single_context(
    school,
    distancing,
    community,
):
    variant = 'delta'
    vac_name = 'pfizer'
    return dict(
        name=f'{community}_{school}_{distancing}',
        school=school,
        mask_wearing=distancing,
        #
        community_infection_rate=communities[community]['community_infection_rate'],
        #
        vac_proportion=school_spec[school]['vac_coverage'],
        
        vac_immunity=' '.join(str(x) for x in vaccination_specs[vac_name][variant]['vac_immunity']),
        vac_infectivity=' '.join(str(x) for x in vaccination_specs[vac_name][variant]['vac_infectivity']),
                
        # constant
        pop_size=school_spec[school]['teachers'] + ' ' + school_spec[school]['students'],
        prop_asym_carriers=variants[variant]["prop_asym_carriers"],
        
        sym_r0=f'{variants[variant]["sym_r0"][0]:.2f} {variants[variant]["sym_r0"][1]:.2f}',
        asym_r0=f'{variants[variant]["asym_r0"][0]:.2f} {variants[variant]["asym_r0"][1]:.2f}',
        incu_period=variants[variant]["incu_period"],        
        # use 0.5 to assume that half of the infections will be to
        # non-school members
        distancing_multiplier=distancing_spec[distancing] * 0.5,
        duration=20,
        immunity_of_recovered=0.5,
        infectivity_of_recovered=0.5,  
        prop_recovered = 0.2,
        #
        # number of replicates (simulated schools) per context
        num_replicates=10000)


def unique_contexts(contexts):
    names = set()
    res = []
    for context in contexts:
        if not context:
            continue
        if context['name'] in names:
            continue
        res.append(context)
        names.add(context['name'])
    return res


def get_contexts(
    school_cases=(),
    distancing_cases=(),
    community_cases=(),
):
    contexts = []
    for community in community_cases:
        for school in school_cases:
            for distancing in distancing_cases:            
                contexts.append(
                                get_single_context(school, distancing, community))
    # remove duplicate
    return unique_contexts(contexts)


In [71]:
%preview cases -l 400

high_contexts = get_contexts(
    school_cases=['high'],
    community_cases=[
        'C1600', 'C400'
    ],
    distancing_cases=['with_mask', 'partial_mask', 'no_mask'],
)

elementary_contexts = get_contexts(
    school_cases=['elementary'],
    community_cases=[
        'C1600', 'C400'
    ],
    distancing_cases=['with_mask', 'partial_mask', 'no_mask'],
)

all_contexts = get_contexts(
    school_cases=['high', 'elementary'],
    community_cases=[
        'C1600', 'C400'
    ],
    distancing_cases=['with_mask', 'partial_mask', 'no_mask'],
)


cases = pd.DataFrame(all_contexts)

Unnamed: 0,name,school,mask_wearing,community_infection_rate,vac_proportion,vac_immunity,vac_infectivity,pop_size,prop_asym_carriers,sym_r0,asym_r0,incu_period,distancing_multiplier,duration,immunity_of_recovered,infectivity_of_recovered,prop_recovered,num_replicates
0,C1600_high_with_mask,high,with_mask,0.0,T=0.6 S=0.3,0.42 0.38,0.5 0.5,T=100 S=800,0.3,5.00 8.00,3.75 6.00,4.4,0.1,20,0.5,0.5,0.2,10000
1,C1600_high_partial_mask,high,partial_mask,0.0,T=0.6 S=0.3,0.42 0.38,0.5 0.5,T=100 S=800,0.3,5.00 8.00,3.75 6.00,4.4,0.3,20,0.5,0.5,0.2,10000
2,C1600_high_no_mask,high,no_mask,0.0,T=0.6 S=0.3,0.42 0.38,0.5 0.5,T=100 S=800,0.3,5.00 8.00,3.75 6.00,4.4,0.5,20,0.5,0.5,0.2,10000
3,C1600_elementary_with_mask,elementary,with_mask,0.0,T=0.6 S1=0 S2=0 S3=0 S4=0 S5=0 S6=0 S7=0 S8=0 S9=0 S10=0 S11=0 S12=0 S13=0 S14=0 S15=0 S16=0 S17=0 S18=0 S19=0 S20=0 S21=0 S22=0 S23=0 S24=0 S25=0,0.42 0.38,0.5 0.5,T=60 S1=20 S2=20 S3=20 S4=20 S5=20 S6=20 S7=20 S8=20 S9=20 S10=20 S11=20 S12=20 S13=20 S14=20 S15=20 S16=20 S17=20 S18=20 S19=20 S20=20 S21=20 S22=20 S23=20 S24=20 S25=20,0.3,5.00 8.00,3.75 6.00,4.4,0.1,20,0.5,0.5,0.2,10000
4,C1600_elementary_partial_mask,elementary,partial_mask,0.0,T=0.6 S1=0 S2=0 S3=0 S4=0 S5=0 S6=0 S7=0 S8=0 S9=0 S10=0 S11=0 S12=0 S13=0 S14=0 S15=0 S16=0 S17=0 S18=0 S19=0 S20=0 S21=0 S22=0 S23=0 S24=0 S25=0,0.42 0.38,0.5 0.5,T=60 S1=20 S2=20 S3=20 S4=20 S5=20 S6=20 S7=20 S8=20 S9=20 S10=20 S11=20 S12=20 S13=20 S14=20 S15=20 S16=20 S17=20 S18=20 S19=20 S20=20 S21=20 S22=20 S23=20 S24=20 S25=20,0.3,5.00 8.00,3.75 6.00,4.4,0.3,20,0.5,0.5,0.2,10000
5,C1600_elementary_no_mask,elementary,no_mask,0.0,T=0.6 S1=0 S2=0 S3=0 S4=0 S5=0 S6=0 S7=0 S8=0 S9=0 S10=0 S11=0 S12=0 S13=0 S14=0 S15=0 S16=0 S17=0 S18=0 S19=0 S20=0 S21=0 S22=0 S23=0 S24=0 S25=0,0.42 0.38,0.5 0.5,T=60 S1=20 S2=20 S3=20 S4=20 S5=20 S6=20 S7=20 S8=20 S9=20 S10=20 S11=20 S12=20 S13=20 S14=20 S15=20 S16=20 S17=20 S18=20 S19=20 S20=20 S21=20 S22=20 S23=20 S24=20 S25=20,0.3,5.00 8.00,3.75 6.00,4.4,0.5,20,0.5,0.5,0.2,10000
6,C400_high_with_mask,high,with_mask,0.0,T=0.6 S=0.3,0.42 0.38,0.5 0.5,T=100 S=800,0.3,5.00 8.00,3.75 6.00,4.4,0.1,20,0.5,0.5,0.2,10000
7,C400_high_partial_mask,high,partial_mask,0.0,T=0.6 S=0.3,0.42 0.38,0.5 0.5,T=100 S=800,0.3,5.00 8.00,3.75 6.00,4.4,0.3,20,0.5,0.5,0.2,10000
8,C400_high_no_mask,high,no_mask,0.0,T=0.6 S=0.3,0.42 0.38,0.5 0.5,T=100 S=800,0.3,5.00 8.00,3.75 6.00,4.4,0.5,20,0.5,0.5,0.2,10000
9,C400_elementary_with_mask,elementary,with_mask,0.0,T=0.6 S1=0 S2=0 S3=0 S4=0 S5=0 S6=0 S7=0 S8=0 S9=0 S10=0 S11=0 S12=0 S13=0 S14=0 S15=0 S16=0 S17=0 S18=0 S19=0 S20=0 S21=0 S22=0 S23=0 S24=0 S25=0,0.42 0.38,0.5 0.5,T=60 S1=20 S2=20 S3=20 S4=20 S5=20 S6=20 S7=20 S8=20 S9=20 S10=20 S11=20 S12=20 S13=20 S14=20 S15=20 S16=20 S17=20 S18=20 S19=20 S20=20 S21=20 S22=20 S23=20 S24=20 S25=20,0.3,5.00 8.00,3.75 6.00,4.4,0.1,20,0.5,0.5,0.2,10000
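
The set of contexts is the Cartesian product of the community, school, and mask scenarios. As a small sketch of that enumeration (using `itertools.product` rather than the nested loops of `get_contexts`), the names can be generated as follows; the preview above shows the first 10 of them.

```python
from itertools import product

communities = ["C1600", "C400"]
schools = ["high", "elementary"]
mask_scenarios = ["with_mask", "partial_mask", "no_mask"]

# Same nesting order as get_contexts: community, then school, then mask scenario.
names = [
    f"{community}_{school}_{mask}"
    for community, school, mask in product(communities, schools, mask_scenarios)
]

print(len(names))
print(names[0], names[-1])
```

This yields 12 contexts, from `C1600_high_with_mask` to `C400_elementary_no_mask`.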


In [55]:
input: for_each=high_contexts

task: queue='localhost', cores=4, walltime='1h', mem='4G', tags=name, workdir='.', trunk_size=1

sh: expand=True
  #rm -f {name}.log.lock
  outbreak_simulator --popsize {pop_size} \
      --stop-if 't>{duration}' -j 4 \
      --track-events INFECTION WARNING PLUGIN SHOW_SYMPTOM QUARANTINE \
      --repeat {num_replicates} \
      --symptomatic-r0 {sym_r0} T={distancing_multiplier} S={distancing_multiplier} \
      --asymptomatic-r0 {asym_r0} T={distancing_multiplier} S={distancing_multiplier} \
      --immunity-of-recovered {immunity_of_recovered} \
      --infectivity-of-recovered {infectivity_of_recovered} \
      --incubation-period {incu_period} \
      --prop-asym-carriers {prop_asym_carriers} \
      --handle-symptomatic 'quarantine?duration=10&infected=true' \
      --logfile {name}.log \
      --plugin init \
          --incidence-rate {community_infection_rate} \
          --seroprevalence {prop_recovered} --as-proportion \
      --plugin vaccinate \
          --start 0 --proportion {vac_proportion} --immunity {vac_immunity} \
          --infectivity {vac_infectivity} \
      --plugin community_infection --start 0 --interval 1 --probability {community_infection_rate} 

| Task ID | Tags | Runtime | Status |
|---|---|---|---|
| t41e9eda25cab4272 | C400_high_with_maskback-to-schoolbcddf2f845aa4bc8cell_6fe6ebeb | Ran for 3 min 39 sec | completed |
| ta469b4cb3da6d5a4 | C400_high_partial_maskback-to-schoolbcddf2f845aa4bc8cell_6fe6ebeb | Ran for 5 min 58 sec | completed |
| t762f44fc45ece11f | C1600_high_with_maskback-to-schoolbcddf2f845aa4bc8cell_6fe6ebeb | Ran for 5 min 9 sec | completed |
| ta3e29d86a8f53463 | C1600_high_partial_maskback-to-schoolbcddf2f845aa4bc8cell_6fe6ebeb | Ran for 10 min 44 sec | completed |
| tfec0e28cde5b24a3 | C1600_high_no_maskback-to-schoolbcddf2f845aa4bc8cell_6fe6ebeb | Ran for 20 min 34 sec | completed |
| tf430e1a993d4f335 | C400_high_no_maskback-to-schoolbcddf2f845aa4bc8cell_6fe6ebeb | Ran for 6 min 59 sec | completed |
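
As a rough illustration of how each context fills in the command template, the sketch below uses Python's `str.format` to mimic the `{name}`-style substitution on a shortened version of the command. It is not the SoS implementation, and the shortened template is for illustration only; the values come from the `C1600_high_with_mask` context.

```python
# A subset of the fields of the C1600_high_with_mask context.
context = {
    "name": "C1600_high_with_mask",
    "pop_size": "T=100 S=800",
    "sym_r0": "5.00 8.00",
    "distancing_multiplier": 0.1,
    "num_replicates": 10000,
}

# Shortened template containing only a few of the options used above.
template = (
    "outbreak_simulator --popsize {pop_size} "
    "--repeat {num_replicates} "
    "--symptomatic-r0 {sym_r0} T={distancing_multiplier} S={distancing_multiplier} "
    "--logfile {name}.log"
)

command = template.format(**context)
print(command)
```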


In [56]:
input: for_each=elementary_contexts

task: queue='localhost', cores=4, walltime='1h', mem='4G', tags=name, workdir='.', trunk_size=1

sh: expand=True
  #rm -f {name}.log.lock
  outbreak_simulator --popsize {pop_size} \
      --stop-if 't>{duration}' -j 4 \
      --track-events INFECTION WARNING PLUGIN SHOW_SYMPTOM QUARANTINE \
      --vicinity 'T-S*=0.5' 'T-T=10' 'S*-!&=0.25' 'S*-&=19' 'S*-T=3' \
      --repeat {num_replicates} \
      --symptomatic-r0 {sym_r0} T={distancing_multiplier} 'S*={distancing_multiplier}' \
      --asymptomatic-r0 {asym_r0} T={distancing_multiplier} 'S*={distancing_multiplier}' \
      --immunity-of-recovered {immunity_of_recovered} \
      --infectivity-of-recovered {infectivity_of_recovered} \
      --incubation-period {incu_period} \
      --prop-asym-carriers {prop_asym_carriers} \
      --handle-symptomatic 'quarantine?duration=10&infected=true' \
      --logfile {name}.log \
      --plugin init \
          --incidence-rate {community_infection_rate} \
          --seroprevalence {prop_recovered} --as-proportion \
      --plugin vaccinate \
          --start 0 --proportion {vac_proportion} --immunity {vac_immunity} \
          --infectivity {vac_infectivity} \
      --plugin community_infection --start 0 --interval 1 --probability {community_infection_rate} 

| Task ID | Tags | Runtime | Status |
|---|---|---|---|
| taced674afe1abd73 | 74f8b7075f8a3488C1600_elementary_with_maskback-to-schoolcell_4cf554b3 | Ran for 22 min 38 sec | completed |
| tb4fd858b61b844da | 74f8b7075f8a3488C400_elementary_with_maskback-to-schoolcell_4cf554b3 | Ran for 21 min 23 sec | completed |
| t27fb75aaea764b88 | 74f8b7075f8a3488C1600_elementary_partial_maskback-to-schoolcell_4cf554b3 | Ran for 24 min 10 sec | completed |
| t1c0cd150491bd54b | 74f8b7075f8a3488C400_elementary_partial_maskback-to-schoolcell_4cf554b3 | Ran for 21 min 54 sec | completed |
| t08b9bda7b7565c9e | 74f8b7075f8a3488C1600_elementary_no_maskback-to-schoolcell_4cf554b3 | Ran for 23 min 24 sec | completed |
| tb7941abd8f5ccba9 | 74f8b7075f8a3488C400_elementary_no_maskback-to-schoolcell_4cf554b3 | Ran for 16 min 17 sec | completed |


In [57]:
import pickle
import os
import re
import csv
import pandas as pd
from collections import defaultdict

def get_result(name):
    print(f'Processing {name}', flush=True)

    res = {}
    
    # the 'rU' mode was removed in Python 3.11; newline='' is what the csv module recommends
    file = open(name + '.log', newline='')
    reader = csv.reader(file, delimiter='\t')
    headers = next(reader, None)
    IDs = set()
    
    n_s_workplace_infection = 0
    n_t_workplace_infection = 0
    n_s_community_infection = 0
    n_t_community_infection = 0
    n_s_quarantine_due_to_symptom = 0
    n_t_quarantine_due_to_symptom = 0
    
    n_workplace_infections = defaultdict(int)
    n_community_infections = defaultdict(int)
    
    for row in reader:
        IDs.add(row[0])
        if row[2] == 'INFECTION':
            if row[4].startswith('by=.'):
                if row[3].startswith('S'):
                    n_s_community_infection += 1                    
                else:
                    n_t_community_infection += 1
                n_community_infections[row[0]] += 1
            else:
                if row[3].startswith('S'):
                    n_s_workplace_infection += 1                    
                else:
                    n_t_workplace_infection += 1
                n_workplace_infections[row[0]] += 1
        elif row[2] == 'QUARANTINE':
            if 'reason=show symptom' in row[4]:
                if row[3].startswith('S'):
                    n_s_quarantine_due_to_symptom += 1                    
                else:
                    n_t_quarantine_due_to_symptom += 1      
        elif row[2] == 'REMOVAL':
            if row[3].startswith('S'):
                n_s_quarantine_due_to_symptom += 1                    
            else:
                n_t_quarantine_due_to_symptom += 1                      
    file.close()
    
    replicates = len(IDs)
    
    # average over replicates; each replicate simulates one school
    scaling_factor = 1 / replicates
    school = "high" if 'high' in name else 'elementary'
    if school == "high":
        n_teachers = 100
        n_students = 800
    else:
        n_teachers = 60
        n_students = 500

    res['Proportion of schools with no infection %'] = 100 - len(n_workplace_infections) * scaling_factor * 100
    
    #res['Avg number of community acquired Infections'] = (n_s_community_infection + n_t_community_infection ) * scaling_factor    
    #res['Avg number of school acquired Infections'] = (n_s_workplace_infection + n_t_workplace_infection) * scaling_factor
    
    res['Avg proportion of students infected from community %'] = (n_s_community_infection * scaling_factor) / n_students * 100
    
    res['Avg proportion of students infected from school %'] = (n_s_workplace_infection * scaling_factor) / n_students * 100
    #res['Avg proportion of teacher infected from school %'] = (n_t_workplace_infection * scaling_factor) / n_teachers * 100
    
    res['Avg proportion of students isolated %'] = n_s_quarantine_due_to_symptom * scaling_factor  / n_students * 100
    #res['Avg proportion of symptomatic teachers %'] = n_t_quarantine_due_to_symptom * scaling_factor  / n_teachers * 100
    #res['num_replicates'] = replicates

    with open(name + '.pickle', 'wb') as outfile:
        pickle.dump(res, outfile)
    return res

In [58]:
for name in [x['name'] for x in all_contexts]:
    get_result(name)

Processing C1600_high_with_mask
Processing C400_high_with_mask
Processing C1600_high_partial_mask
Processing C400_high_partial_mask
Processing C1600_high_no_mask
Processing C400_high_no_mask
Processing C1600_elementary_with_mask
Processing C400_elementary_with_mask
Processing C1600_elementary_partial_mask
Processing C400_elementary_partial_mask
Processing C1600_elementary_no_mask
Processing C400_elementary_no_mask
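
The percentages reported by `get_result` are per-replicate averages expressed relative to the number of students. A minimal sketch of that arithmetic follows; the helper `avg_student_percentage` and the event count in the usage line are hypothetical and chosen only to show the calculation.

```python
def avg_student_percentage(n_events, n_replicates, n_students):
    """Average percentage of students affected per simulated school."""
    if n_replicates <= 0 or n_students <= 0:
        raise ValueError("n_replicates and n_students must be positive")
    events_per_school = n_events / n_replicates
    return events_per_school / n_students * 100


# Hypothetical example: 20,000 student events across 10,000 replicates
# of an 800-student high school, i.e. 2 events per school on average.
print(avg_student_percentage(20_000, 10_000, 800))
```

Dividing by the number of replicates first gives the average number of events per school, which is then expressed as a percentage of the student body.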
