# Getting RIVM stats on COVID-19 cases in NL and sending them via Telegram

Scraping the page:

In [1]:
import cfscrape
from lxml import etree

url="https://www.rivm.nl/coronavirus-kaart-van-nederland-per-gemeente"

header = {'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9',
          'Accept-Encoding': 'gzip, deflate, sdch',
          'Accept-Language' : 'nl-NL,nl;q=0.8,en-US;q=0.6,en;q=0.4',
          'Cache-Control' : 'max-age=0',
          'Connection': 'keep-alive',
          'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.81 Safari/537.36'}

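# cfscrape wraps requests and gets past Cloudflare's anti-bot page
scraper = cfscrape.create_scraper()
scraped_html = scraper.get(url, headers=header).content
html = etree.HTML(scraped_html)

# The map title contains '... tot en met <date>' ('up to and including <date>')
date = html.xpath("//div[@id='mapTitles']/text()")[0].split('tot en met ')[1].split('"')[0]
# The sixth whitespace-separated token of the first paragraph is the update time
time = html.xpath("//p/text()")[0].split()[5]
# The per-municipality table is embedded as ';'-separated text in div#csvData
data = html.xpath("//div[@id='csvData']/text()")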

print("Last update from the RIVM page:",date,time)

Last update from the RIVM page: 21-3-2020 14.00
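
The chained `split` calls above raise an `IndexError` as soon as the page wording changes. A minimal sketch of a more defensive extraction follows, assuming the title still contains the phrase "tot en met" followed by a day-month-year date. The helper name `extract_update_date` and the sample string are hypothetical, not part of the original script.

```python
import re
from typing import Optional

# Matches a date such as 21-3-2020 that follows the phrase "tot en met".
DATE_PATTERN = re.compile(r"tot en met\s+(\d{1,2}-\d{1,2}-\d{4})")

def extract_update_date(title_text: str) -> Optional[str]:
    """Return the date after 'tot en met', or None if the phrase is absent."""
    match = DATE_PATTERN.search(title_text)
    return match.group(1) if match else None

# Hypothetical title text; the real page content may differ.
sample = '{"title": "Aantal meldingen tot en met 21-3-2020"}'
print(extract_update_date(sample))
print(extract_update_date("no date here"))
```

Returning `None` instead of raising lets the caller decide whether to skip the run or send a warning message.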


Loading the data in a dataframe:

In [2]:
import pandas as pd
import io
df = pd.read_csv(io.StringIO('\n'.join(str(data[0]).split('\n')[1:])), sep=';')
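
The one-liner above drops the first line of the embedded text and parses the remainder as `;`-separated values. The same two steps can be sketched with only the standard library, assuming the first line is not part of the table. The function name `parse_rivm_block` and the sample block are hypothetical, and the sample uses only a subset of the real columns.

```python
import csv
import io

def parse_rivm_block(raw: str) -> list:
    """Drop the first line of the raw block, then parse the rest as ';'-separated CSV."""
    body = "\n".join(raw.split("\n")[1:])
    return list(csv.DictReader(io.StringIO(body), delimiter=";"))

# Hypothetical raw block: a leading empty line, a header, and one data row.
raw = "\nGemnr;Gemeente;Aantal\n637;Zoetermeer;4\n"
rows = parse_rivm_block(raw)
print(rows[0]["Gemeente"], rows[0]["Aantal"])
```

Unlike `pd.read_csv`, `csv.DictReader` leaves every value as a string, so numeric columns would still need converting.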

Workaround to get cases in 'unknown' municipalities:

In [3]:
import re
aantal_unknown_gemeente = int(re.findall(r'\d+',df['Gemeente'][0])[0])
df.loc[0,'Gemeente']='Unknown'
df.loc[0,'Aantal']= aantal_unknown_gemeente
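
The workaround relies on the first row's `Gemeente` text containing the case count as its first run of digits. A small sketch of that extraction with an explicit fallback is shown here. The helper `first_integer` and the sample strings are hypothetical, since the actual wording of that row is not shown in this notebook.

```python
import re
from typing import Optional

def first_integer(text: str) -> Optional[int]:
    """Return the first run of digits in text as an int, or None if there is none."""
    match = re.search(r"\d+", text)
    return int(match.group()) if match else None

# Hypothetical label texts, for illustration only.
print(first_integer("Unknown municipality: 12 reports"))
print(first_integer("no digits"))
```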

Checking whether a municipality name exists in the data:

In [134]:
df[df['Gemeente']=='Zoetermeer']

|     | Gemnr | Gemeente   | Aantal | BevAant  | Aantal per 100.000 inwoners |
|-----|-------|------------|--------|----------|-----------------------------|
| 348 | 637   | Zoetermeer | 4      | 124944.0 | 3.2                         |


Defining the list of municipalities that were requested in the Telegram group:

In [135]:
gemeentes_requested=['Utrecht',
                    'Enschede',
                    'Haarlemmermeer',
                    'Houten',
                    'Leiden',
                    'Arnhem',
                    'Ridderkerk',
                    'Zuidplas',
                    'Nieuwegein',
                    'Leusden',
                    'Rheden',
                    'Amersfoort',
                    'Woerden',
                    'Epe',
                    'Altena',
                    'Apeldoorn',
                    'Nijmegen',
                    'Zoetermeer']
gemeentes_requested.sort()

Querying the data for each municipality and composing the message:

In [139]:
message="RIVM last-update: "+ date + "\n"
message+="RIVM updates their Website at 14:00\n"
message+="https://www.rivm.nl/coronavirus-kaart-van-nederland-per-gemeente\n"
message+="We send you hourly messages (from 8-21)!\n"

aantal_total = df['Aantal'].sum()
message+="\n- TOTAL CASES IN NL: " + str(aantal_total) +"\n"

for gemeente in gemeentes_requested:
    message += "- Cases in "+gemeente+": "+str(df[df['Gemeente']==gemeente]['Aantal'].values[0])+"\n"

message+="\nNote: RIVM stated that \"the actual number of infections with COVID-19 is higher than the number of reports in this update because not everyone suspected of a COVID-19 infection is tested.\"\n"
message+="\nWhich municipality should I add here?\nYour request will appear in the next hour. Or not \U0001f600"
print(message)          


RIVM last-update: 21-3-2020
RIVM updates their Website at 14:00
https://www.rivm.nl/coronavirus-kaart-van-nederland-per-gemeente
We send you hourly messages (from 8-21)!

- TOTAL CASES IN NL: 3631
- Cases in Altena: 22
- Cases in Amersfoort: 37
- Cases in Apeldoorn: 16
- Cases in Arnhem: 15
- Cases in Enschede: 16
- Cases in Epe: 6
- Cases in Haarlemmermeer: 12
- Cases in Houten: 19
- Cases in Leiden: 16
- Cases in Leusden: 6
- Cases in Nieuwegein: 9
- Cases in Nijmegen: 60
- Cases in Rheden: 7
- Cases in Ridderkerk: 4
- Cases in Utrecht: 121
- Cases in Woerden: 6
- Cases in Zoetermeer: 4
- Cases in Zuidplas: 1

Note: RIVM stated that "the actual number of infections with COVID-19 is higher than the number of reports in this update because not everyone suspected of a COVID-19 infection is tested."

Which municipality should I add here?
Your request will appear in the next hour. Or not 😀
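
The loop above indexes `.values[0]` directly, so a requested name that is missing from the data (a typo, or a renamed municipality) stops the whole script with an `IndexError`. One possible way to handle this is sketched below, assuming the counts are available as a plain name-to-count mapping, for example via `dict(zip(df['Gemeente'], df['Aantal']))`. The function name `format_cases` and the placeholder text "not found" are hypothetical.

```python
def format_cases(counts: dict, requested: list) -> str:
    """Build one line per requested municipality, marking missing ones."""
    lines = []
    for name in sorted(requested):
        if name in counts:
            lines.append(f"- Cases in {name}: {counts[name]}")
        else:
            lines.append(f"- Cases in {name}: not found")
    return "\n".join(lines)

# Counts taken from the output above; "Atlantis" is a deliberately missing name.
counts = {"Utrecht": 121, "Zoetermeer": 4}
print(format_cases(counts, ["Zoetermeer", "Atlantis", "Utrecht"]))
```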


Function to send a Telegram message:

In [38]:
import requests

def telegram_bot_sendtext(bot_message, token, chatid):
    # Pass the fields as params so requests percent-encodes them
    # (newlines, '&', '#', and emoji are then safe inside the message)
    url = 'https://api.telegram.org/bot' + token + '/sendMessage'
    params = {'chat_id': chatid, 'parse_mode': 'Markdown', 'text': bot_message}
    response = requests.get(url, params=params)
    # The Bot API replies with JSON that contains a boolean 'ok' field
    if response.json()['ok']:
        return "Message Sent!"
    else:
        return "Message failed to be sent!"
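
Because the message contains newlines, spaces, and punctuation, the query string has to be percent-encoded before it is valid in a URL. The sketch below shows what that encoding looks like using only the standard library, without sending anything. The helper `build_query` and the sample chat ID and text are hypothetical.

```python
from urllib.parse import urlencode, parse_qs

def build_query(chat_id: str, text: str) -> str:
    """Percent-encode the sendMessage fields into a URL query string."""
    return urlencode({"chat_id": chat_id, "parse_mode": "Markdown", "text": text})

# Hypothetical chat ID and message containing characters that need encoding.
original_text = "Cases in A & B: 5\nTotal: 7"
query = build_query("-100", original_text)
print(query)

# Decoding the query string gives the original text back.
print(parse_qs(query)["text"][0] == original_text)
```

Without encoding, an `&` inside the message would be read as the start of a new parameter and the text would be cut off at that point.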

Calling the function and sending a Telegram message:

In [132]:
bot_token = '735833549:AAH40xNnmV7MzSRNtssydxKzgCHmx63YGeo'
bot_chatID = '-348129225'
telegram_bot_sendtext(message,bot_token,bot_chatID)

'Message Sent!'
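
Hard-coding the token means anyone who can read the notebook can post as the bot. A common alternative is to read the credentials from the environment, sketched below. The variable names `TELEGRAM_BOT_TOKEN` and `TELEGRAM_CHAT_ID` and the helper `load_telegram_credentials` are assumptions chosen for illustration, not something the original script defines.

```python
import os

def load_telegram_credentials(env=os.environ):
    """Return (token, chat_id) from the environment, or raise if either is missing."""
    try:
        return env["TELEGRAM_BOT_TOKEN"], env["TELEGRAM_CHAT_ID"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None

# Demonstration with a stand-in mapping instead of the real environment.
fake_env = {"TELEGRAM_BOT_TOKEN": "123:abc", "TELEGRAM_CHAT_ID": "-1"}
token, chat_id = load_telegram_credentials(fake_env)
print(chat_id)
```

When the script runs from cron, these variables would have to be set in the crontab itself or in a wrapper script, because cron does not load the interactive shell profile.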

Saving the data unless a file for this date, time, and total already exists:

In [140]:
import pathlib
file_name = pathlib.Path('/Users/santannajj/Desktop/crawling_stats_from_rivm_covid19/data/covid19-nl-'+date+'-'+time+'-'+str(aantal_total)+'.csv')
if not file_name.exists():
    df.to_csv(file_name, index=False)
else:
    "Th

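Since the script runs hourly but the file name only changes when the date, time, or total changes, the existence check keeps one file per update. A self-contained sketch of that write-once pattern follows, using a temporary directory in place of the real data folder. The helper name `save_once` is hypothetical, and the file name follows the pattern and values shown above.

```python
import pathlib
import tempfile

def save_once(path: pathlib.Path, content: str) -> bool:
    """Write content to path only if it does not exist; return True if written."""
    if path.exists():
        return False
    path.write_text(content, encoding="utf-8")
    return True

with tempfile.TemporaryDirectory() as tmp:
    target = pathlib.Path(tmp) / "covid19-nl-21-3-2020-14.00-3631.csv"
    first = save_once(target, "Gemeente;Aantal\n")
    second = save_once(target, "Gemeente;Aantal\n")
    print(first, second)
```

The second call leaves the file untouched, which is what makes hourly runs safe.
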
Crontab line:
```
5 08-21 * * * /usr/local/bin/python3 /Users/santannajj/Desktop/crawling_stats_from_rivm_covid19/crawling_stats_from_rivm_covid19.py
```