# Adjusting Timezones


In [None]:
from pandas import read_csv
tz = read_csv('../src/data/timezones.csv')
tz.head()

Unnamed: 0,CountryCode,CountryName,TimeZoneName,TimeZoneOffset
0,AD,Andorra,Europe/Andorra,UTC +02:00
1,AE,United Arab Emirates,Asia/Dubai,UTC +04:00
2,AF,Afghanistan,Asia/Kabul,UTC +04:30
3,AG,Antigua and Barbuda,America/Antigua,UTC -04:00
4,AI,Anguilla,America/Anguilla,UTC -04:00


Convert the UTC offset to a number of hours so that we can do arithmetic with it, e.g. UTC +02:00 -> 2.0 and UTC +04:30 -> 4.5

In [None]:
import re
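# Rows listed as plain 'UTC' are normalised so that every value carries a signed HH:MM part
tz.TimeZoneOffset = tz.TimeZoneOffset.replace('UTC', 'UTC +00:00')
# Drop the 'UTC ' prefix and turn HH:MM into fractional hours; the leading sign is left in place
tz['UTCOffset'] = tz.TimeZoneOffset.apply(
    lambda x: re.sub(r'(\d\d):(\d\d)', lambda g: str(int(g.group(1)) + int(g.group(2)) / 60), x[4:]))
tz['UTCOffset'] = tz.UTCOffset.astype(float)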
tz.head()

Unnamed: 0,CountryCode,CountryName,TimeZoneName,TimeZoneOffset,UTCOffset
0,AD,Andorra,Europe/Andorra,UTC +02:00,2.0
1,AE,United Arab Emirates,Asia/Dubai,UTC +04:00,4.0
2,AF,Afghanistan,Asia/Kabul,UTC +04:30,4.5
3,AG,Antigua and Barbuda,America/Antigua,UTC -04:00,-4.0
4,AI,Anguilla,America/Anguilla,UTC -04:00,-4.0
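The same conversion can be written as a standalone function, which makes the handling of the sign and of the bare `UTC` label easier to check. This is only a sketch: `parse_utc_offset` is a hypothetical helper, not part of the notebook, and it assumes that every label is either `UTC` or `UTC ±HH:MM`.

```python
import re


def parse_utc_offset(label: str) -> float:
    """Convert a label such as 'UTC +04:30' to fractional hours (4.5).

    Hypothetical helper; assumes labels look like 'UTC' or 'UTC +HH:MM' / 'UTC -HH:MM'.
    """
    label = label.strip()
    if label == 'UTC':
        return 0.0
    match = re.fullmatch(r'UTC\s*([+-])(\d{1,2}):(\d{2})', label)
    if match is None:
        raise ValueError(f'unrecognised offset label: {label!r}')
    sign = -1.0 if match.group(1) == '-' else 1.0
    hours, minutes = int(match.group(2)), int(match.group(3))
    # Apply the sign to hours and minutes together, so 'UTC -03:30' gives -3.5
    return sign * (hours + minutes / 60)


print(parse_utc_offset('UTC +04:30'))  # 4.5
print(parse_utc_offset('UTC -04:00'))  # -4.0
```

If wanted, it could be applied to the whole column with `tz.TimeZoneOffset.apply(parse_utc_offset)`.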


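The ECDC reports use Central European time. The offsets above reflect daylight saving time (Andorra is listed as UTC +02:00), when Central European time is UTC+2 (strictly speaking CEST; CET proper is UTC+1), so to go from an offset relative to GMT (UTC) to one relative to CET just subtract 2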

In [None]:
tz['CETOffset'] = tz.UTCOffset - 2
tz.head()

Unnamed: 0,CountryCode,CountryName,TimeZoneName,TimeZoneOffset,UTCOffset,CETOffset
0,AD,Andorra,Europe/Andorra,UTC +02:00,2.0,0.0
1,AE,United Arab Emirates,Asia/Dubai,UTC +04:00,4.0,2.0
2,AF,Afghanistan,Asia/Kabul,UTC +04:30,4.5,2.5
3,AG,Antigua and Barbuda,America/Antigua,UTC -04:00,-4.0,-6.0
4,AI,Anguilla,America/Anguilla,UTC -04:00,-4.0,-6.0
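Since the value to subtract depends on whether daylight saving time is in effect, it may help to make the reference offset explicit. The sketch below is an illustration only: the constant names and `hours_ahead_of_reference` are hypothetical, and the default of 2 hours simply mirrors the value used in the cell above.

```python
CET_UTC_OFFSET = 1.0   # Central European Time (winter), UTC+1
CEST_UTC_OFFSET = 2.0  # Central European Summer Time, UTC+2


def hours_ahead_of_reference(utc_offset: float,
                             reference_utc_offset: float = CEST_UTC_OFFSET) -> float:
    """Hours by which a zone is ahead of (positive) or behind (negative) the reference zone."""
    return utc_offset - reference_utc_offset


print(hours_ahead_of_reference(4.5))                  # 2.5, matches the Asia/Kabul row above
print(hours_ahead_of_reference(4.0, CET_UTC_OFFSET))  # 3.0, if the reference were at UTC+1
```

Note that a fixed reference offset is only valid while the offsets in the table and the reference zone are on the same daylight saving regime.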


ECDC reports come out at 10 AM CET. Assuming that each country's data comes from a local authority that reports by 8 PM local time, we can compute the local time at which the ECDC report is published, compare it with that 8 PM cutoff, and derive a date offset for the data included in the report

In [None]:
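# Local time (in hours) at the moment the ECDC report is published, i.e. 10 AM CET
tz['ReportOffset'] = tz.CETOffset + 10
# Day offset: -1 while the local clock has not yet reached the 8 PM cutoff at publication time
tz['ReportOffsetDays'] = tz.ReportOffset.apply(lambda x: (x - 20) // 24).astype(int)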
tz.head()

Unnamed: 0,CountryCode,CountryName,TimeZoneName,TimeZoneOffset,UTCOffset,CETOffset,ReportOffset,ReportOffsetDays
0,AD,Andorra,Europe/Andorra,UTC +02:00,2.0,0.0,10.0,-1
1,AE,United Arab Emirates,Asia/Dubai,UTC +04:00,4.0,2.0,12.0,-1
2,AF,Afghanistan,Asia/Kabul,UTC +04:30,4.5,2.5,12.5,-1
3,AG,Antigua and Barbuda,America/Antigua,UTC -04:00,-4.0,-6.0,4.0,-1
4,AI,Anguilla,America/Anguilla,UTC -04:00,-4.0,-6.0,4.0,-1
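To see why the floor division yields -1 for all of the rows shown, the same arithmetic can be worked through for a few offsets. This is a sketch under the assumptions stated above (10 AM CET publication, 8 PM local cutoff); `report_offset_days` and its parameter names are hypothetical.

```python
def report_offset_days(cet_offset: float,
                       report_hour_cet: float = 10,
                       local_cutoff_hour: float = 20) -> int:
    """Day offset of the latest local data that can make it into the report."""
    # Local clock time when the report is published
    local_hour_at_publication = cet_offset + report_hour_cet
    # Floor division: any publication hour before the cutoff maps to the previous day (-1)
    return int((local_hour_at_publication - local_cutoff_hour) // 24)


print(report_offset_days(0.0))   # -1: 10:00 local, 8 PM not yet reached
print(report_offset_days(-6.0))  # -1: 04:00 local
print(report_offset_days(10.0))  # 0: 20:00 local, the cutoff has just been reached
```

Only zones at least 10 hours ahead of CET (local time 8 PM or later at publication) get an offset of 0; every other zone reports data from the previous local day.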
