# Covid vs India 
One of the biggest questions I have been wondering about, now that every adult is allowed to register, is: where are vaccine slots available in India? How can I answer this programmatically, given that trying to find the information on the government website is a rather Sisyphean task?

So, with this rather self-serving goal in mind, we will use publicly available data to try to answer the following questions:

1. Latest vaccine slots per district and state in India
2. Top 5 states running behind schedule
3. Top 5 states running ahead of everyone else
4. When are the next available slots in Delhi, Bombay, Chennai and Bangalore for people aged 18-45?


- toc: false
- branch: master
- badges: false,
- comments: true,
- categories: [vaccine, covid, jupyter, python],
- image: images/statistics.png,
- hide: false

Thanks to https://github.com/bhattbhavesh91/cowin-vaccination-slot-availability/blob/main/cowin-api-availability.ipynb for doing the actual work; I picked up loads of stuff from there.


In [212]:
#hide
import sys
# Install the third-party packages into the same interpreter the notebook runs on
!{sys.executable} -m pip install --user requests
!{sys.executable} -m pip install --user altair
!{sys.executable} -m pip install --user pandas
!{sys.executable} -m pip install --user geopandas



Get all the libraries in place.

In [213]:
import geopandas as gpd
import pandas as pd
import requests
import json
from collections import defaultdict
from dataclasses import dataclass, asdict
import datetime
from typing import List
import uuid

Let's make data classes to store our geographical and vaccination data.

In [214]:
@dataclass
class District:
    district_id:int = None
    district_name:str = None
    state_id:int = None


@dataclass
class Session:
    session_uuid:str = None
    date:datetime.datetime = None
    query_date:datetime.datetime= None
    available_capacity:int = None
    min_age_limit:int = None
    vaccine:str = None
    center_id:str = None
    district_id:str = None
        
@dataclass
class Center:
    # Key generated per fetched center record; the fetching loop passes it as center_uuid
    center_uuid:str = None
    center_id:int = None
    center_name:str = None
    state_name:str = None
    district_name:str = None
    block_name:str = None
    pincode:str = None
    # Coordinates show up as decimal numbers (e.g. 9.0, 92.0) in the data
    lat:float = None
    lng:float = None
    from_hour:datetime.datetime = None
    to_hour:datetime.datetime = None
    fee_type:str = None
    district_id:str = None

@dataclass
class NoSlotAvailable:
    district_id:str = None
    date:datetime.datetime = None
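
To see how these records become table rows later on, here is a minimal sketch using a trimmed-down copy of `District`. The two sample values are copied from the district table printed further down. `asdict` turns each instance into a plain dictionary, which is the shape `pd.DataFrame` accepts as one row.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class District:
    district_id: Optional[int] = None
    district_name: Optional[str] = None
    state_id: Optional[int] = None


# Two sample records (values copied from the district table in this post)
districts = [
    District(district_id=3, district_name="Nicobar", state_id=1),
    District(district_id=1, district_name="North and Middle Andaman", state_id=1),
]

# Each dataclass instance becomes one dictionary, i.e. one future DataFrame row
rows = [asdict(d) for d in districts]
print(rows[0])
```

Leaving every field defaulted to `None` keeps construction forgiving when the API omits a key, at the cost of having to check for missing values later.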

Let's call the API to populate the geographical data.

In [215]:
MOZILLA_HEADER = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'
GET_DISTRICT_DATA_API_URL = "https://cdn-api.co-vin.in/api/v2/admin/location/districts/{}"
GET_APOINTMENT_DATA_API_URL = "https://cdn-api.co-vin.in/api/v2/appointment/sessions/public/calendarByDistrict?district_id={}&date={}"

DISTRICTS = defaultdict(District)
CENTERS = defaultdict(Center)
SESSIONS = defaultdict(Session)
NO_SLOT_AVAILABLE = []

MAX_NUMBER_OF_STATES = 40

# The same browser-like header is sent with every request
headers = {'User-Agent': MOZILLA_HEADER}

for state_code in range(1, MAX_NUMBER_OF_STATES):
    response = requests.get(GET_DISTRICT_DATA_API_URL.format(state_code), headers=headers)
    if not response.ok:
        # Skip state codes for which the request fails
        continue
    districts_data = response.json()
    for district_data in districts_data['districts']:
        district_id = district_data['district_id']
        DISTRICTS[district_id] = District(district_name=district_data['district_name'],
                                          district_id=district_id,
                                          state_id=state_code)

In [216]:
DISTRICT_DF = pd.DataFrame.from_dict([asdict(district) for district in DISTRICTS.values()])

Let's now call the actual API to get the slots.

In [217]:
MAX_DAYS = 1

def get_days_in_future_from_today():
    base = datetime.datetime.today()
    date_list = [base + datetime.timedelta(days=x) for x in range(MAX_DAYS)]
    return [x.strftime("%d-%m-%Y") for x in date_list]

for district_id in DISTRICTS.keys():
    # One request per district and query date
    for slot_date in get_days_in_future_from_today():
        url = GET_APOINTMENT_DATA_API_URL.format(district_id, slot_date)
        response = requests.get(url, headers={'User-Agent': MOZILLA_HEADER})
        if response.ok:
            resp_json = response.json()
            if resp_json["centers"]:
                for center in resp_json["centers"]:
                    # A fresh key per fetched record, so the same center can appear once per query date
                    center_uuid = str(uuid.uuid4())
                    center_id = center["center_id"]
                    center_name = center["name"]
                    CENTERS[center_uuid] = Center(center_uuid=center_uuid,
                                                center_id=center_id,
                                                center_name=center_name,
                                                lat=center["lat"],
                                                lng=center["long"],
                                                from_hour=center["from"],
                                                to_hour=center["to"],
                                                district_id=district_id,
                                                state_name=center["state_name"],
                                                district_name=center["district_name"],
                                                block_name=center["block_name"],
                                                pincode=center["pincode"],
                                                fee_type=center["fee_type"])
                    for session in center["sessions"]:
                        # Sessions are keyed by the API's own session_id
                        session_id = session["session_id"]
                        SESSIONS[session_id] = Session(session_uuid=session_id,
                                                       date=session["date"],
                                                       query_date=slot_date,
                                                       available_capacity=session["available_capacity"],
                                                       min_age_limit=session["min_age_limit"],
                                                       vaccine=session["vaccine"],
                                                       district_id=district_id,
                                                       center_id=center_id)
            else:
                # Remember which district/date combinations returned no centers
                NO_SLOT_AVAILABLE.append(NoSlotAvailable(district_id=district_id, date=slot_date))
                # print("No slot on {} in district {}".format(slot_date, district_id))

No slot on 01-05-2021 in district 22
No slot on 02-05-2021 in district 22
No slot on 02-05-2021 in district 20
No slot on 02-05-2021 in district 25
No slot on 02-05-2021 in district 42
No slot on 02-05-2021 in district 24
No slot on 02-05-2021 in district 27
No slot on 01-05-2021 in district 21
No slot on 02-05-2021 in district 21
No slot on 01-05-2021 in district 33
No slot on 02-05-2021 in district 33
No slot on 01-05-2021 in district 29
No slot on 02-05-2021 in district 29
No slot on 02-05-2021 in district 40
No slot on 02-05-2021 in district 31
No slot on 01-05-2021 in district 18
No slot on 02-05-2021 in district 18
No slot on 02-05-2021 in district 36
No slot on 01-05-2021 in district 39
No slot on 02-05-2021 in district 39
No slot on 02-05-2021 in district 35
No slot on 02-05-2021 in district 37
No slot on 01-05-2021 in district 26
No slot on 02-05-2021 in district 26
No slot on 01-05-2021 in district 34
No slot on 02-05-2021 in district 34
No slot on 02-05-2021 in district 28
N
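
The nested loop above mixes network calls with parsing, which makes it awkward to check without hitting the API. As a rough sketch, the parsing step could be pulled into a pure function. `flatten_calendar_response` and `sample_payload` are hypothetical names introduced here; the payload only mimics the keys the loop above reads (`centers`, `name`, `sessions`, and so on), with values borrowed from the tables in this post.

```python
from typing import Any, Dict, List


def flatten_calendar_response(payload: Dict[str, Any], district_id: int) -> List[Dict[str, Any]]:
    """Return one flat row per (center, session) pair found in the payload."""
    rows = []
    for center in payload.get("centers", []):
        for session in center.get("sessions", []):
            rows.append({
                "district_id": district_id,
                "center_id": center["center_id"],
                "center_name": center["name"],
                "session_id": session["session_id"],
                "date": session["date"],
                "available_capacity": session["available_capacity"],
                "min_age_limit": session["min_age_limit"],
            })
    return rows


# Hypothetical payload shaped like the response the loop above consumes
sample_payload = {
    "centers": [
        {
            "center_id": 570779,
            "name": "BJR Hospital",
            "sessions": [
                {"session_id": "51738255-efad-470e-8660-e9d829d83ccb",
                 "date": "01-05-2021", "available_capacity": 50, "min_age_limit": 45},
                {"session_id": "207fd35f-888d-460d-8304-f0b99f9a13a0",
                 "date": "03-05-2021", "available_capacity": 50, "min_age_limit": 45},
            ],
        }
    ]
}

rows = flatten_calendar_response(sample_payload, district_id=3)
print(len(rows), rows[0]["center_name"])
```

A list of flat dictionaries like this can be handed to `pd.DataFrame` directly, which would avoid the later merge between sessions and centers altogether.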

In [218]:
CENTER_DF = pd.DataFrame.from_dict([asdict(center) for center in CENTERS.values()])
SESSION_DF = pd.DataFrame.from_dict([asdict(session) for session in SESSIONS.values()])
NO_SLOT_AVAILABLE_DF = pd.DataFrame.from_dict([asdict(no_slot_available) for no_slot_available in NO_SLOT_AVAILABLE])

In [219]:
NO_SLOT_AVAILABLE_DF.head()

Unnamed: 0,district_id,date
0,22,01-05-2021
1,22,02-05-2021
2,20,02-05-2021
3,25,02-05-2021
4,42,02-05-2021


In [220]:
SESSION_DF.head()

Unnamed: 0,session_id,date,query_date,available_capacity,min_age_limit,vaccine,center_id,district_id
0,51738255-efad-470e-8660-e9d829d83ccb,01-05-2021,01-05-2021,50.0,45,,570779,3
1,207fd35f-888d-460d-8304-f0b99f9a13a0,03-05-2021,02-05-2021,50.0,45,,570779,3
2,88373311-b0da-40f6-bbea-ee029f86eb1c,04-05-2021,02-05-2021,50.0,45,,570779,3
3,8ef1ec99-7318-4cb6-bc66-9ed3675f9a48,05-05-2021,02-05-2021,49.0,45,,570779,3
4,bd311aec-c4cf-45f3-8f06-2dbc7b634de4,06-05-2021,02-05-2021,50.0,45,,570779,3


In [221]:
CENTER_DF.head()

Unnamed: 0,center_id,name,state_name,district_name,block_name,lat,lng,from_hour,to_hour,fee_type,district_id
0,570779,BJR Hospital,Andaman and Nicobar Islands,Nicobar,Car Nicobar,9.0,92.0,09:00:00,17:00:00,Free,3
1,552108,Nancowry CHC,Andaman and Nicobar Islands,Nicobar,Nancowry,7.0,93.0,09:00:00,17:00:00,Free,3
2,552109,Campbellbay PHC,Andaman and Nicobar Islands,Nicobar,Campbell Bay,7.0,93.0,09:00:00,17:00:00,Free,3
3,639986,SMC 37 WING,Andaman and Nicobar Islands,Nicobar,Car Nicobar,9.0,92.0,09:00:00,18:00:00,Free,3
4,570779,BJR Hospital,Andaman and Nicobar Islands,Nicobar,Car Nicobar,0.0,0.0,09:00:00,17:00:00,Free,3


In [222]:
DISTRICT_DF.head()

Unnamed: 0,district_id,name,state_id
0,3,Nicobar,1
1,1,North and Middle Andaman,1
2,2,South Andaman,1
3,9,Anantapur,2
4,10,Chittoor,2


In [223]:
session_center_merged_df = pd.merge(SESSION_DF, CENTER_DF, on="center_id")
session_center_district_df = pd.merge(session_center_merged_df, DISTRICT_DF, left_on='district_id_x', right_on="district_id")
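
One thing to watch for: because centers are stored under a freshly generated UUID, the same `center_id` can appear in `CENTER_DF` more than once (BJR Hospital shows up twice in `CENTER_DF.head()`), and a merge on `center_id` then repeats every session once per copy. A possible remedy, sketched here on toy frames rather than the real data, is to drop duplicate centers before merging. The frame names below are illustrative only.

```python
import pandas as pd

# Toy stand-ins for SESSION_DF and CENTER_DF; the center appears twice
sessions = pd.DataFrame({
    "session_id": ["s1", "s2"],
    "center_id": [570779, 570779],
    "available_capacity": [50, 49],
})
centers = pd.DataFrame({
    "center_id": [570779, 570779],
    "center_name": ["BJR Hospital", "BJR Hospital"],
    "lat": [9.0, 0.0],
})

# Naive merge: every session is matched against both copies of the center
naive = pd.merge(sessions, centers, on="center_id")

# Keep one row per center_id, then merge; validate raises if duplicates remain
unique_centers = centers.drop_duplicates(subset="center_id", keep="first")
deduped = pd.merge(sessions, unique_centers, on="center_id", validate="many_to_one")

print(len(naive), len(deduped))
```

Which copy to keep is a judgment call; `keep="first"` here happens to retain the row with non-zero coordinates, but that is a property of the toy data, not a guarantee.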

In [224]:
session_center_district_df.to_csv("vaccination_slot_data.csv")
session_center_district_df

Unnamed: 0,session_id,date,query_date,available_capacity,min_age_limit,vaccine,center_id,district_id_x,name_x,state_name,...,block_name,lat,lng,from_hour,to_hour,fee_type,district_id_y,district_id,name_y,state_id
0,51738255-efad-470e-8660-e9d829d83ccb,01-05-2021,01-05-2021,50.0,45,,570779,3,BJR Hospital,Andaman and Nicobar Islands,...,Car Nicobar,9.0,92.0,09:00:00,17:00:00,Free,3,3,Nicobar,1
1,51738255-efad-470e-8660-e9d829d83ccb,01-05-2021,01-05-2021,50.0,45,,570779,3,BJR Hospital,Andaman and Nicobar Islands,...,Car Nicobar,0.0,0.0,09:00:00,17:00:00,Free,3,3,Nicobar,1
2,207fd35f-888d-460d-8304-f0b99f9a13a0,03-05-2021,02-05-2021,50.0,45,,570779,3,BJR Hospital,Andaman and Nicobar Islands,...,Car Nicobar,9.0,92.0,09:00:00,17:00:00,Free,3,3,Nicobar,1
3,207fd35f-888d-460d-8304-f0b99f9a13a0,03-05-2021,02-05-2021,50.0,45,,570779,3,BJR Hospital,Andaman and Nicobar Islands,...,Car Nicobar,0.0,0.0,09:00:00,17:00:00,Free,3,3,Nicobar,1
4,88373311-b0da-40f6-bbea-ee029f86eb1c,04-05-2021,02-05-2021,50.0,45,,570779,3,BJR Hospital,Andaman and Nicobar Islands,...,Car Nicobar,9.0,92.0,09:00:00,17:00:00,Free,3,3,Nicobar,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
304199,6c2dcbda-4f77-4449-b908-ee26922157ba,03-05-2021,02-05-2021,10.0,45,,551644,139,PHC Vanakbara,Daman and Diu,...,Diu,20.0,70.0,09:00:00,17:00:00,Free,139,139,Diu,37
304200,5602f122-498d-4482-a0af-266e75f07f85,04-05-2021,02-05-2021,10.0,45,,551644,139,PHC Vanakbara,Daman and Diu,...,Diu,20.0,70.0,09:00:00,17:00:00,Free,139,139,Diu,37
304201,5602f122-498d-4482-a0af-266e75f07f85,04-05-2021,02-05-2021,10.0,45,,551644,139,PHC Vanakbara,Daman and Diu,...,Diu,20.0,70.0,09:00:00,17:00:00,Free,139,139,Diu,37
304202,e91c6b85-98a2-4757-ab94-bd7da8756e16,05-05-2021,02-05-2021,9.0,45,,551644,139,PHC Vanakbara,Daman and Diu,...,Diu,20.0,70.0,09:00:00,17:00:00,Free,139,139,Diu,37


In [225]:
# Three earliest session dates per state (dates are dd-mm-yyyy strings, so they are sorted as text)
session_center_district_df[['date', 'state_name']].groupby('state_name').apply(lambda x : x.sort_values(by = 'date', ascending = True).head(3).reset_index(drop = True))

Unnamed: 0_level_0,Unnamed: 1_level_0,date,state_name
state_name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Andaman and Nicobar Islands,0,01-05-2021,Andaman and Nicobar Islands
Andaman and Nicobar Islands,1,01-05-2021,Andaman and Nicobar Islands
Andaman and Nicobar Islands,2,01-05-2021,Andaman and Nicobar Islands
Andhra Pradesh,0,01-05-2021,Andhra Pradesh
Andhra Pradesh,1,01-05-2021,Andhra Pradesh
...,...,...,...
Uttarakhand,1,01-05-2021,Uttarakhand
Uttarakhand,2,01-05-2021,Uttarakhand
West Bengal,0,01-05-2021,West Bengal
West Bengal,1,01-05-2021,West Bengal


In [226]:
session_center_district_df\
    [['date', 'state_name']]\
    .groupby('state_name')\
    .apply(lambda x : x.sort_values(by = 'date', ascending = True)\
    .head(1)\
    .reset_index(drop = True)).head(40)

Unnamed: 0_level_0,Unnamed: 1_level_0,date,state_name
state_name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Andaman and Nicobar Islands,0,01-05-2021,Andaman and Nicobar Islands
Andhra Pradesh,0,01-05-2021,Andhra Pradesh
Arunachal Pradesh,0,01-05-2021,Arunachal Pradesh
Assam,0,01-05-2021,Assam
Bihar,0,01-05-2021,Bihar
Chandigarh,0,01-05-2021,Chandigarh
Chhattisgarh,0,01-05-2021,Chhattisgarh
Dadra and Nagar Haveli,0,01-05-2021,Dadra and Nagar Haveli
Daman and Diu,0,01-05-2021,Daman and Diu
Delhi,0,01-05-2021,Delhi
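
A caveat on the sorting above: the `date` column holds strings in day-month-year order, so sorting compares the day before the month, and `01-06-2021` would sort ahead of `08-05-2021`. A small sketch on made-up rows of how parsing the column first changes the answer:

```python
import pandas as pd

# Made-up rows; dates are dd-mm-yyyy strings as in the session data
df = pd.DataFrame({
    "state_name": ["Delhi", "Delhi", "Karnataka"],
    "date": ["01-06-2021", "08-05-2021", "01-05-2021"],
})

# Text comparison picks 1 June for Delhi because "01" < "08"
text_min = df.groupby("state_name")["date"].min()

# Parsing with an explicit format gives a real chronological minimum
df["parsed_date"] = pd.to_datetime(df["date"], format="%d-%m-%Y")
earliest = df.groupby("state_name")["parsed_date"].min()

print(text_min["Delhi"], earliest["Delhi"].date())
```

With only a few days of data inside a single month the two approaches agree, which is why the issue is easy to miss.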


In [227]:
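# Combine both conditions into one mask; chaining two boolean filters triggers a pandas reindexing warning
under_45_and_free = (session_center_district_df["min_age_limit"] < 45) & (session_center_district_df["fee_type"] == "Free")

session_center_district_df\
    [under_45_and_free]\
    [['date', 'state_name']]\
    .groupby('state_name')\
    .apply(lambda x : x.sort_values(by = 'date', ascending = True)\
    .head(1)\
    .reset_index(drop = True)).head(40)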



Unnamed: 0_level_0,Unnamed: 1_level_0,date,state_name
state_name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Andhra Pradesh,0,01-05-2021,Andhra Pradesh
Assam,0,01-05-2021,Assam
Bihar,0,01-05-2021,Bihar
Chhattisgarh,0,01-05-2021,Chhattisgarh
Delhi,0,08-05-2021,Delhi
Gujarat,0,01-05-2021,Gujarat
Haryana,0,01-05-2021,Haryana
Jammu and Kashmir,0,01-05-2021,Jammu and Kashmir
Jharkhand,0,01-05-2021,Jharkhand
Karnataka,0,01-05-2021,Karnataka
