<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Wat-zijn-succesvolle-strategieen-om-positieve-mensen-op-te-sporen?" data-toc-modified-id="Wat-zijn-succesvolle-strategieen-om-positieve-mensen-op-te-sporen?-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Which strategies are successful at tracing people who test positive?</a></span></li></ul></div>

This notebook combines several datasets to learn more about the effectiveness of the CoronaMelder app.

In [None]:
import matplotlib.pyplot as plt
import pandas as pd
import datetime as dt
import json
import config

## Which strategies are successful at tracing people who test positive?

Goal: compare different methods, such as:
* Percentage of positive results overall
* Percentage of positive results via BCO (source and contact tracing)
* Percentage of positive results via CoronaMelder (with and without symptoms)
* Possibly prevalence
* Possibly results for BCO household members and BCO close contacts
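
Each method in the list above comes down to the same ratio: positive results divided by results received. The following is a minimal sketch of that calculation. The helper name `positivity_percentage` and all numbers are hypothetical and are not taken from the datasets.

```python
import pandas as pd


def positivity_percentage(positive: pd.Series, tested: pd.Series) -> pd.Series:
    """Return positive / tested * 100, with NaN where nothing was tested."""
    tested = tested.where(tested > 0)  # zero tests becomes NaN instead of dividing by zero
    return positive / tested * 100


# Hypothetical counts, for illustration only
example = pd.DataFrame({
    'method': ['general', 'BCO', 'CoronaMelder'],
    'tested': [1000, 200, 0],
    'positive': [100, 50, 0],
})
example['pct_positive'] = positivity_percentage(example['positive'], example['tested'])
print(example)
```

Returning NaN for weeks without test results keeps those weeks visible in the table instead of raising an error or producing infinity.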

Several datasets are available:
1. [GGD App statistics](https://github.com/minvws/nl-covid19-notification-app-statistics)
2. [RIVM weekly COVID reports](https://www.rivm.nl/coronavirus-covid-19/actueel/wekelijkse-update-epidemiologische-situatie-covid-19-in-nederland). Tables 10 and 12 break down the BCO results.
3. [RIVM tests performed](https://data.rivm.nl/geonetwork/srv/dut/catalog.search#/metadata/0f3336f5-0f16-462c-9031-bb60adde4af1)

In [None]:
PATH = config.PATH_RIVM

In [None]:
# First load the general test figures from RIVM.

columns = ['Date_of_statistics', 'Security_region_name', 'Tested_with_result', 'Tested_positive']

algemeen = pd.read_csv(PATH + 'COVID-19_uitgevoerde_testen.csv', 
                    sep=';', 
                    usecols=columns, 
                    parse_dates=['Date_of_statistics'])

# Convert dates to ISO week numbers

algemeen['Week_number'] = algemeen['Date_of_statistics'].dt.isocalendar().week

# Rename columns

algemeen = algemeen.rename(columns={'Tested_with_result': 'RIVM_tested_with_result',
                                    'Tested_positive': 'RIVM_tested_positive'})

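# Aggregate by week number, summing only the two (already renamed) count columns.
# Note: the year is not part of the key, so equal week numbers from different years are added together.

algemeen = algemeen.groupby('Week_number')[['RIVM_tested_with_result', 'RIVM_tested_positive']].sum().reset_index()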

algemeen.head()
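
The aggregation above uses the week number alone as its key. If the data spans more than one year, one possible variation is to group on ISO year and ISO week together. The sketch below uses hypothetical daily figures; the `Year` column name is an assumption chosen to match the CoronaMelder file.

```python
import pandas as pd

# Hypothetical daily figures: two days in ISO week 52 of 2020, one in ISO week 52 of 2021
daily = pd.DataFrame({
    'Date_of_statistics': pd.to_datetime(['2020-12-22', '2020-12-23', '2021-12-29']),
    'Tested_positive': [10, 20, 5],
})

# isocalendar() returns the ISO year, week and weekday for every date
iso = daily['Date_of_statistics'].dt.isocalendar()
daily['Year'] = iso['year'].astype(int)
daily['Week_number'] = iso['week'].astype(int)

# Grouping on both columns keeps week 52 of 2020 apart from week 52 of 2021
weekly = daily.groupby(['Year', 'Week_number'], as_index=False)['Tested_positive'].sum()
print(weekly)
```

A merge with the CoronaMelder data would then need both `Year` and `Week_number` as keys.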

In [None]:
# Next, load the CoronaMelder csv

melder = pd.read_csv('data/nederland/coronamelder_positief.csv')
melder = melder.rename(columns={'Week': 'Week_number'})
melder.head()

In [None]:
# Load the BCO csv

bco = pd.read_csv(PATH + 'rivm_bco.csv')

bco.head()

In [None]:
contacten = pd.read_csv(PATH + 'rivm_contactonderzoek_nauwe_contacten.csv')

contacten.head()

In [None]:
# Merge the dataframes

tests = melder.merge(algemeen, on='Week_number', how='left')

# Convert week and year to dates (first day of the week). This will come in handy later.
# Note: %U counts weeks from the first Sunday of the year, which differs from ISO week numbering.

tests['start_week'] = pd.to_datetime(tests['Week_number'].astype(str) + ' 1' + ' ' + tests['Year'].astype(str),
                                     format='%U %w %Y').dt.strftime('%d-%m-%Y')
tests.head()
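
The cell above derives the first day of the week with `%U`. If the `Week` column of the CoronaMelder file follows ISO 8601 week numbering (an assumption, since the source file does not say), the Monday of each week could instead be derived with `datetime.date.fromisocalendar`. A minimal sketch with hypothetical rows:

```python
import datetime as dt

import pandas as pd

# Hypothetical year/week pairs
weeks = pd.DataFrame({'Year': [2020, 2021], 'Week_number': [1, 1]})

# Monday (ISO weekday 1) of each ISO week; fromisocalendar requires Python 3.8+
weeks['start_week'] = [
    dt.date.fromisocalendar(int(year), int(week), 1).strftime('%d-%m-%Y')
    for year, week in zip(weeks['Year'], weeks['Week_number'])
]
print(weeks)
```

ISO week 1 of 2020 starts on 30-12-2019, so the resulting dates can fall in the preceding calendar year. Whichever convention is used, it has to match the `start_week` values in the BCO files, otherwise the later merges will not find matching rows.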

In [None]:
df1 = tests.merge(bco, on='start_week', how='left')

In [None]:
df1.head()

In [None]:
df = df1.merge(contacten, on='start_week', how='left')
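
Because the merges above are left joins on a date string, weeks without a matching `start_week` silently end up with NaN values. One way to check for this is the `indicator` parameter of `DataFrame.merge`. The sketch below uses hypothetical frames and column names:

```python
import pandas as pd

# Hypothetical frames; only the first week has a match on the right-hand side
left = pd.DataFrame({'start_week': ['06-01-2020', '13-01-2020'], 'cm_positive': [3, 5]})
right = pd.DataFrame({'start_week': ['06-01-2020'], 'bco_positive': [7]})

# indicator=True adds a '_merge' column: 'both', 'left_only' or 'right_only'
checked = left.merge(right, on='start_week', how='left', indicator=True)

# Rows that found no partner in the right-hand frame
unmatched = checked[checked['_merge'] == 'left_only']
print(unmatched[['start_week']])
```

An unexpectedly long list of unmatched weeks usually points to a difference in date format or week convention between the files.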

In [None]:
df.columns

In [None]:
df = df.rename(columns={'Test Requests': 'CM_testaanvragen',
                        'Total Test Results': 'CM_aantal_testresultaten',
                        'Total Positive': 'CM_positieve_testen',
                        'Total %Positive': 'CM_percentage_positieve testen',
                        'Asymptomatic Test Results': 'CM_asymp_testresultaten',
                        'Asymptomatic Positive': 'CM_asymp_positieve_testen',
                        'Asymptomatic %Positive': 'CM_asymp_percentage_positieve_testen',
                        'Symptomatic Test Results': 'CM_symp_testresultaten',
                        'Symptomatic Positive': 'CM_symp_positieve_testen',
                        'Symptomatic %Positive': 'CM_symp_percentage_positieve_testen',
                        'RIVM_tested_with_result': 'RIVM_testresultaten',
                        'RIVM_tested_positive': 'RIVM_positieve_testen',
                        'start_week': 'datum',
                        'nieuwe_meldingen': 'RIVM_nieuwe_meldingen',
                        'gevonden_via_bco': 'RIVM_gevonden_via_bco',
                        'gevonden_via_bco_%': 'RIVM_percentage_gevonden_via_bco',
                        'aantal': 'RIVM_nauwe_contacten_opgevolgd',
                        'huisgenoten_positief': 'RIVM_huisgenoten_positief',
                        'huisgenoten_positief_%': 'RIVM_percentage_huisgenoten_positief',
                        'overigen_nauwe_contacten_aantal': 'RIVM_andere_nauwe_contacten',
                        'overige_nauwe_contacten_aantal_positief': 'RIVM_andere_nauwe_contacten_positief',
                        'overige_nauwe_contacten_aantal_positief_%': 'RIVM_percentage_andere_nauwe_contacten_positief'})

# Keep only the columns of interest; .copy() avoids a SettingWithCopyWarning on later assignments
df = df[['datum',
         'RIVM_testresultaten',
         'RIVM_positieve_testen',
         'RIVM_nieuwe_meldingen',
         'RIVM_gevonden_via_bco',
         'RIVM_percentage_gevonden_via_bco',
         'RIVM_nauwe_contacten_opgevolgd',
         'RIVM_huisgenoten_positief',
         'RIVM_percentage_huisgenoten_positief',
         'RIVM_andere_nauwe_contacten',
         'RIVM_andere_nauwe_contacten_positief',
         'RIVM_percentage_andere_nauwe_contacten_positief',
         'CM_testaanvragen',
         'CM_aantal_testresultaten',
         'CM_positieve_testen',
         'CM_percentage_positieve testen',
         'CM_asymp_testresultaten',
         'CM_asymp_positieve_testen',
         'CM_asymp_percentage_positieve_testen',
         'CM_symp_testresultaten',
         'CM_symp_positieve_testen',
         'CM_symp_percentage_positieve_testen']].copy()

In [None]:
df.columns

In [None]:
# The CoronaMelder percentages are strings such as '12.5%'; strip the percent sign and convert to float
df['CM_percentage_positieve testen'] = df['CM_percentage_positieve testen'].str.replace('%', '').astype(float)
df['CM_asymp_percentage_positieve_testen'] = df['CM_asymp_percentage_positieve_testen'].str.replace('%', '').astype(float)
df['CM_symp_percentage_positieve_testen'] = df['CM_symp_percentage_positieve_testen'].str.replace('%', '').astype(float)
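
The conversion above raises an error as soon as one value is not a clean number. A more tolerant variant, sketched here with hypothetical values, uses `pd.to_numeric` with `errors='coerce'` so that unparseable entries become NaN:

```python
import pandas as pd

# Hypothetical raw values, including whitespace, a missing value and a placeholder
raw = pd.Series(['12.5%', ' 3%', None, 'n/a'])

cleaned = pd.to_numeric(
    raw.str.strip().str.rstrip('%'),  # drop surrounding whitespace and the trailing percent sign
    errors='coerce',                  # anything that is not a number becomes NaN
)
print(cleaned)
```

The trade-off is that malformed values no longer stop the notebook, so it is worth counting the NaN values afterwards.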

In [None]:
df.plot(x='datum',
        y=['RIVM_percentage_gevonden_via_bco',
           'RIVM_percentage_huisgenoten_positief',
           'RIVM_percentage_andere_nauwe_contacten_positief',
           'CM_percentage_positieve testen',
           'CM_asymp_percentage_positieve_testen',
           'CM_symp_percentage_positieve_testen'],
        figsize=(20, 10))

plt.grid(True)

In [None]:
# Compute a few new percentage columns, using the column names introduced by the rename above

df['Tested_positive_after_warning'] = df['CM_positieve_testen'] / df['RIVM_testresultaten'] * 100
df['Tested_positive_all'] = df['RIVM_positieve_testen'] / df['RIVM_testresultaten'] * 100
df['Tested_positive_without_symptoms'] = df['CM_asymp_positieve_testen'] / df['RIVM_testresultaten'] * 100
df['Tested_positive_without_symptoms_as_perc_of_positives'] = df['CM_asymp_positieve_testen'] / df['RIVM_positieve_testen'] * 100
df['Tested_positive_with_symptoms'] = df['CM_symp_positieve_testen'] / df['RIVM_testresultaten'] * 100
df['Tested_positive_with_symptoms_as_perc_of_positives'] = df['CM_symp_positieve_testen'] / df['RIVM_positieve_testen'] * 100