Data on vaccinated people who contracted COVID-19 during the following periods:
* Period 1 = Alpha variant (dominant)
* Period 2 = Delta variant (data available for cross-referencing all infections)
* Period 3 = Delta variant (higher levels of infection than in both Period 1 and Period 2)

Currently the Periods are set as:


1.   Period 1 = 15th Jan 2021 - 30th Jun 2021
2.   Period 2 = 1st Jul 2021 - 25th Oct 2021
3.   Period 3 = 26th Oct 2021 - current date

In [31]:
import pandas as pd

If you change the Period 2 end date in the cell below from:

```
period2_end = pd.to_datetime('2021-10-25')
```
to:
```
period2_end = pd.to_datetime('2022-01-04')
```
then Period 2 is extended to include current data, and Period 3 covers only dates after the new end date.
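
To make the effect of that change concrete, the sketch below labels a single date under both cut-offs. It is an illustration only, not the notebook's own code: `classify` is a hypothetical helper that takes the Period 2 end date as a parameter, and the fixed boundaries are copied from the period settings in the next cell.

```python
import pandas as pd

# Fixed boundaries, same values as the notebook's period settings.
PERIOD1_START = pd.to_datetime('2021-01-15')
PERIOD1_END = pd.to_datetime('2021-06-30')
PERIOD2_START = pd.to_datetime('2021-07-01')


def classify(date, period2_end):
    """Hypothetical helper: label a date, with the Period 2 end passed in."""
    if PERIOD1_START <= date <= PERIOD1_END:
        return "Period 1"
    if PERIOD2_START <= date <= period2_end:
        return "Period 2"
    if date > period2_end:
        return "Period 3"
    return "Exclude"


sample = pd.to_datetime('2021-12-01')
label_short = classify(sample, pd.to_datetime('2021-10-25'))  # original cut-off
label_long = classify(sample, pd.to_datetime('2022-01-04'))   # extended cut-off
print(label_short, '->', label_long)  # Period 3 -> Period 2
```

A date in December 2021 moves from Period 3 to Period 2 once the cut-off is extended, which is why Period 3 shrinks whenever Period 2 grows.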

In [32]:
period1_begining = pd.to_datetime('2021-01-15')
period1_end = pd.to_datetime('2021-06-30')
period2_begining = pd.to_datetime('2021-07-01')
period2_end = pd.to_datetime('2021-10-25')


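def generate_periods(date):
    """Return the period label for a pandas Timestamp, or "Exclude" if it falls in no period."""
    if period1_begining <= date.to_pydatetime() <= period1_end:
        return "Period 1"
    if period2_begining <= date.to_pydatetime() <= period2_end:
        return "Period 2"
    # Anything after the end of Period 2 counts as Period 3.
    if date.to_pydatetime() > period2_end:
        return "Period 3"
    return "Exclude"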
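
The function above labels one date at a time. As a possible alternative, the same rules can be applied to a whole column at once with `numpy.select`. This is a sketch under assumptions: `label_periods` and its parameter names are hypothetical, and the boundaries are passed in explicitly rather than read from the notebook's globals.

```python
import numpy as np
import pandas as pd


def label_periods(dates, p1_start, p1_end, p2_start, p2_end):
    """Hypothetical vectorized counterpart of generate_periods for a Series."""
    conditions = [
        dates.between(p1_start, p1_end),  # inclusive on both ends
        dates.between(p2_start, p2_end),
        dates > p2_end,
    ]
    labels = ["Period 1", "Period 2", "Period 3"]
    # Rows matching no condition (for example dates before Period 1) get "Exclude".
    return pd.Series(np.select(conditions, labels, default="Exclude"),
                     index=dates.index)


demo_dates = pd.Series(pd.to_datetime(
    ['2021-01-01', '2021-03-01', '2021-08-01', '2021-12-01']))
demo_labels = label_periods(
    demo_dates,
    pd.to_datetime('2021-01-15'), pd.to_datetime('2021-06-30'),
    pd.to_datetime('2021-07-01'), pd.to_datetime('2021-10-25'),
)
print(demo_labels.tolist())
```

The conditions are checked in order and the first match wins, mirroring the sequence of `if` statements in `generate_periods`.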

In [33]:
infected_vacc = pd.read_csv('https://data.egov.bg/resource/download/e9f795a8-0146-4cf0-9bd1-c0ba3d9aa124/csv')
hospitalized_vacc = pd.read_csv('https://data.egov.bg/resource/download/6fb4bfb1-f586-45af-8dd2-3385499c3664/csv')
deceased_vacc = pd.read_csv('https://data.egov.bg/resource/download/e6a72183-28e0-486a-b4e4-b5db8b60a900/csv')

In [34]:
infected_vacc.rename(columns={'Дата': 'date', 
                              'Ваксина':'vaccine', 
                              'Пол': 'gender', 
                              'Възрастова група': 'age_group', 
                              'Брой заразени': 'count_infected_vaccinated'}, inplace=True)

hospitalized_vacc.rename(columns={'Дата': 'date', 
                                  'Ваксина':'vaccine', 
                                  'Пол': 'gender', 
                                  'Възрастова група': 'age_group', 
                                  'Брой хоспитализирани': 'count_hospitalized_vaccinated'}, inplace=True)

deceased_vacc.rename(columns={'Дата': 'date', 
                              'Ваксина':'vaccine', 
                              'Пол': 'gender', 
                              'Възрастова група': 'age_group', 
                              'Брой починали': 'count_deceased_vaccinated'}, inplace=True)

In [35]:
infected_vacc['date'] = pd.to_datetime(infected_vacc['date']) 
hospitalized_vacc['date'] = pd.to_datetime(hospitalized_vacc['date']) 
deceased_vacc['date'] = pd.to_datetime(deceased_vacc['date'])

In [36]:
infected_vacc['Period'] = infected_vacc['date'].apply(generate_periods)
hospitalized_vacc['Period'] = hospitalized_vacc['date'].apply(generate_periods)
deceased_vacc['Period'] = deceased_vacc['date'].apply(generate_periods)

In [37]:
infected_vacc.drop(['vaccine','gender', 'date'], axis=1, inplace=True)
hospitalized_vacc.drop(['vaccine','gender', 'date'], axis=1, inplace=True)
deceased_vacc.drop(['vaccine','gender', 'date'], axis=1, inplace=True)

In [38]:
infected_vacc = infected_vacc.groupby(['Period', 'age_group'], as_index=False).sum()
hospitalized_vacc = hospitalized_vacc.groupby(['Period', 'age_group'], as_index=False).sum()
deceased_vacc = deceased_vacc.groupby(['Period', 'age_group'], as_index=False).sum()

In [39]:
mask = infected_vacc['age_group'] == '-'
infected_vacc = infected_vacc[~mask]
mask = hospitalized_vacc['age_group'] == '-'
hospitalized_vacc = hospitalized_vacc[~mask]
mask = deceased_vacc['age_group'] == '-'
deceased_vacc = deceased_vacc[~mask]

In [40]:
age_groups = ['12 - 14', '15 - 16', '17 - 19', '20 - 29', '30 - 39', 
              '40 - 49', '50 - 59', '60 - 69', '70 - 79', '80 - 89', '90+']

full_data = pd.DataFrame({
  'Period': [period for age in age_groups for period in ['Period 1', 'Period 2', 'Period 3']],
  'age_group': [age for age in age_groups for period in ['Period 1', 'Period 2', 'Period 3']]
})
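
The cell above builds one row for every combination of period and age group, so that combinations absent from the source data still appear in the final table. As a sketch of an alternative, the same grid could be built with `pd.MultiIndex.from_product`. The shortened `ages` list and the name `grid` are illustrative assumptions, not part of the notebook.

```python
import pandas as pd

periods = ['Period 1', 'Period 2', 'Period 3']
ages = ['20 - 29', '30 - 39']  # shortened list, for illustration only

# Cartesian product of age groups and periods, age group varying slowest.
grid = (
    pd.MultiIndex.from_product([ages, periods], names=['age_group', 'Period'])
    .to_frame(index=False)[['Period', 'age_group']]
)
print(grid)
```

Each age group appears once per period, giving `len(ages) * len(periods)` rows with no duplicates.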

In [41]:
full_data = full_data.merge(infected_vacc, on=['Period', 'age_group'], how='left')
full_data = full_data.merge(hospitalized_vacc, on=['Period', 'age_group'], how='left')
full_data = full_data.merge(deceased_vacc, on=['Period', 'age_group'], how='left')
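
Because these are left merges, any period and age-group combination with no matching row in the source data ends up as `NaN`, which also turns the count columns into floats. The toy example below shows that behaviour on two rows. The optional `fillna` step rests on an assumption the notebook does not make: that a missing row means zero reported cases rather than missing data.

```python
import pandas as pd

# Toy grid and counts, for illustration only.
grid = pd.DataFrame({
    'Period': ['Period 1', 'Period 1'],
    'age_group': ['12 - 14', '20 - 29'],
})
counts = pd.DataFrame({
    'Period': ['Period 1'],
    'age_group': ['20 - 29'],
    'count_infected_vaccinated': [25],
})

merged = grid.merge(counts, on=['Period', 'age_group'], how='left')
# The '12 - 14' row has no match, so its count is NaN and the column is float.
print(merged)

# Assumption: treat a missing combination as zero cases and restore integers.
filled = merged.fillna({'count_infected_vaccinated': 0}).astype(
    {'count_infected_vaccinated': int})
print(filled)
```

Whether to fill the gaps depends on how the exported CSV will be used; leaving `NaN` keeps "no data" distinguishable from a true zero.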

In [42]:
full_data = full_data.sort_values(['Period', 'age_group'])

In [43]:
full_data

Unnamed: 0,Period,age_group,count_infected_vaccinated,count_hospitalized_vaccinated,count_deceased_vaccinated
0,Period 1,12 - 14,,,
3,Period 1,15 - 16,,,
6,Period 1,17 - 19,,,
9,Period 1,20 - 29,25.0,2.0,
12,Period 1,30 - 39,58.0,2.0,
15,Period 1,40 - 49,83.0,13.0,
18,Period 1,50 - 59,132.0,17.0,
21,Period 1,60 - 69,121.0,26.0,2.0
24,Period 1,70 - 79,57.0,15.0,2.0
27,Period 1,80 - 89,34.0,16.0,1.0


In [44]:
full_data.to_csv('BG_Vaccinated_Infections_Periods[3].csv', index=False)

In [45]:
from google.colab import files
files.download("BG_Vaccinated_Infections_Periods[3].csv")
