# COVID19-India-Analysis [Kaggle Notebook](https://www.kaggle.com/samacker77k/covid19-india-analysis)
A notebook dedicated to data visualization and analysis of COVID19 Pandemic in India.

---

This notebook visualizes the spread of the COVID19 pandemic in India to help understand the demographic impact of the outbreak.

Maintained by:
* Shivani Tyagi [LinkedIn](https://www.linkedin.com/in/shivani-tyagi-09/) [Github](https://github.com/shivitg)
* Nitika Kamboj [LinkedIn](https://linkedin.com/in/nitika-kamboj) [Github](https://github.com/nitika-kamboj)
* Samar Srivastava [LinkedIn](https://linkedin.com/in/samacker77l) [Github](https://github.com/samacker77)
 


---

### Fetching Data 
---

In [82]:
import requests
import pandas as pd
import logging
import datetime

### Enable logging

In [83]:
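loggers = {}

def get_logger(name):
    """Return a cached logger for `name`, creating and configuring it on first use."""
    if name in loggers:
        return loggers[name]

    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    # Attach a handler only once. Re-running this cell resets `loggers`, but the
    # logging module keeps the logger object, so a second handler would be added
    # and every message would be printed twice.
    if not logger.handlers:
        handler = logging.StreamHandler()
        formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s')
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    loggers[name] = logger
    return logger

logger = get_logger('COVID19 India Analysis Logger')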

In [84]:
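def fetch_data():
    """Download the patient-level export and return it as a DataFrame (None on failure)."""
    url = 'http://portal.covid19india.org/export?_export=json'
    # A timeout keeps the call from hanging indefinitely if the server does not respond.
    response = requests.get(url=url, timeout=30)
    if response.status_code == 200:
        logger.info('Connection enabled. Fetching data...')
        fetched_data = response.json()
        data = pd.DataFrame(fetched_data)
        print("Data fetched.")
        return data
    else:
        logger.error('Connection failed with status %s. Please retry.', response.status_code)
        return None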

In [85]:
data = fetch_data()

2020-03-24 14:20:05,020 INFO Connection enabled. Fetching data...


Data fetched.


---
#### The data has been fetched successfully. Next, we inspect it.

In [86]:
print("Data Shape ~ Rows = {} | Columns = {}".format(data.shape[0],data.shape[1]))

Data Shape ~ Rows = 504 | Columns = 19


#### Checking dtypes

In [87]:
data.dtypes

ID                       int64
Unique id               object
Government id           object
Diagnosed date          object
Age                    float64
Gender                  object
Detected city           object
Detected city pt        object
Detected district       object
Detected state          object
Nationality             object
Current status          object
Status change date      object
Notes                   object
Current location        object
Current location pt     object
Created on              object
Updated on              object
Contacts                object
dtype: object
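
The dtype listing shows that the date columns are stored as `object` (plain strings). As a minimal sketch, assuming the formats seen in this notebook's output (`30/Jan/2020` for 'Diagnosed date' and `02/14/2020` for 'Status change date'), they could be converted with `pd.to_datetime`. The helper name `parse_date_columns` and the toy rows are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Toy rows that copy the date formats visible in this notebook (not real records).
sample = pd.DataFrame({
    "Diagnosed date": ["30/Jan/2020", "02/Mar/2020", None],
    "Status change date": ["02/14/2020", "03/15/2020", ""],
})

def parse_date_columns(df):
    """Return a copy of df with both date columns converted to datetime64.

    Hypothetical helper: values that do not match the assumed format
    become NaT instead of raising an error.
    """
    out = df.copy()
    # '30/Jan/2020' -> day / abbreviated English month name / year
    out["Diagnosed date"] = pd.to_datetime(
        out["Diagnosed date"], format="%d/%b/%Y", errors="coerce"
    )
    # '02/14/2020' -> month / day / year
    out["Status change date"] = pd.to_datetime(
        out["Status change date"], format="%m/%d/%Y", errors="coerce"
    )
    return out

parsed = parse_date_columns(sample)
print(parsed.dtypes)
```

Using `errors="coerce"` is a design choice for exploratory work: missing or malformed dates show up as `NaT` and can be counted afterwards with `isna()`, rather than stopping the conversion.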

> At first glance, the 'ID' and 'Unique id' columns appear to be identical, so we check whether any of their values differ. Because 'Unique id' is stored as object, we first convert it to int64 and then compare.

In [89]:
data['Unique id'] = data['Unique id'].astype('int64')

In [90]:
data[data['ID'] == data['Unique id']]

,ID,Unique id,Government id,Diagnosed date,Age,Gender,Detected city,Detected city pt,Detected district,Detected state,Nationality,Current status,Status change date,Notes,Current location,Current location pt,Created on,Updated on,Contacts
0,1,1,KL-TS-P1,30/Jan/2020,20.0,Female,Thrissur,SRID=4326;POINT (76.21325419999999 10.5256264),Thrissur,Kerala,India,Recovered,02/14/2020,Travelled from Wuhan.\nStudent from Wuhan,,SRID=4326;POINT (76.21325419999999 10.5256264),03/23/2020 12:20 p.m.,03/23/2020 12:20 p.m.,
1,2,2,KL-AL-P1,02/Feb/2020,,Unknown,Alappuzha,SRID=4326;POINT (76.333482 9.498000100000001),Alappuzha,Kerala,India,Recovered,02/14/2020,Travelled from Wuhan.\nStudent from Wuhan,,SRID=4326;POINT (76.333482 9.498000100000001),03/23/2020 12:20 p.m.,03/23/2020 12:20 p.m.,
2,3,3,KL-KS-P1,03/Feb/2020,,Unknown,Kasargode,SRID=4326;POINT (80 20),Kasaragod,Kerala,India,Recovered,02/14/2020,Travelled from Wuhan.\nStudent from Wuhan,,SRID=4326;POINT (80 20),03/23/2020 12:20 p.m.,03/23/2020 12:20 p.m.,
3,4,4,DL-P1,02/Mar/2020,45.0,Male,East Delhi (Mayur Vihar),SRID=4326;POINT (80 20),East Delhi,Delhi,India,Recovered,03/15/2020,"Travelled from Austria, Italy.\nTravel history...",,SRID=4326;POINT (80 20),03/23/2020 12:20 p.m.,03/23/2020 12:20 p.m.,"Patient 22:, Patient 23:, Patient 24:, Patient..."
4,5,5,TS-P1,02/Mar/2020,,Unknown,Hyderabad,SRID=4326;POINT (78.4349398685041 17.4263524),Hyderabad,Telangana,India,Recovered,03/02/2020,"Travelled from Dubai, Singapore contact.\nTrav...",,SRID=4326;POINT (78.4349398685041 17.4263524),03/23/2020 12:20 p.m.,03/23/2020 12:20 p.m.,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
499,500,500,,23/Mar/2020,25.0,Male,Chennai,SRID=4326;POINT (80.2838331 13.0801721),Chennai,Tamil Nadu,,Hospitalized,02/23/2020,Travelled from London - RGGH,,SRID=4326;POINT (80.2838331 13.0801721),03/24/2020 8:15 a.m.,03/24/2020 8:15 a.m.,
500,501,501,,23/Mar/2020,48.0,Male,Tiruppur,SRID=4326;POINT (77.52604780096844 10.78322705),Tiruppur,Tamil Nadu,,Hospitalized,02/23/2020,Travelled from London - ESI,,SRID=4326;POINT (77.52604780096844 10.78322705),03/24/2020 8:15 a.m.,03/24/2020 8:15 a.m.,
501,502,502,,23/Mar/2020,54.0,Male,Madurai,SRID=4326;POINT (78.11409829999999 9.926115299...,Madurai,Tamil Nadu,,Hospitalized,02/23/2020,Annanagar at Rajaji Hosp. No mention of travel...,,SRID=4326;POINT (78.11409829999999 9.926115299...,03/24/2020 8:15 a.m.,03/24/2020 8:15 a.m.,
502,503,503,,23/Mar/2020,24.0,Male,,SRID=4326;POINT (85.906508 25.6440845),Patna,Bihar,,Hospitalized,02/23/2020,Travelled from Scotland,,SRID=4326;POINT (85.906508 25.6440845),03/24/2020 8:15 a.m.,03/24/2020 8:15 a.m.,
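
Filtering on equality, as above, displays the rows that match, so the reader still has to compare row counts to know whether anything differs. A minimal sketch of a more direct check is to build a mismatch mask and reduce it to a single yes/no answer. The helper name `columns_match` and the toy frame are assumptions made for illustration.

```python
import pandas as pd

# Toy frame standing in for `data`; the values are illustrative only.
toy = pd.DataFrame({
    "ID": [1, 2, 3],
    "Unique id": ["1", "2", "3"],  # strings, hence dtype object
})

def columns_match(df, left, right):
    """Return (all_equal, mismatching_rows) after casting `right` to int64.

    Hypothetical helper, not part of the original notebook.
    """
    right_as_int = df[right].astype("int64")
    mismatch_mask = df[left] != right_as_int
    return not bool(mismatch_mask.any()), df[mismatch_mask]

all_equal, mismatches = columns_match(toy, "ID", "Unique id")
print(all_equal)        # whether every row agrees
print(len(mismatches))  # number of rows where the two columns differ
```

Returning the mismatching rows alongside the boolean makes it easy to inspect any offending records before deciding to drop a column.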


> Since both columns hold the same values, we can drop one of them and use the other as the index.

In [92]:
data.drop('Unique id',axis=1,inplace=True)

In [96]:
data.set_index('ID',inplace=True)

,ID,Government id,Diagnosed date,Age,Gender,Detected city,Detected city pt,Detected district,Detected state,Nationality,Current status,Status change date,Notes,Current location,Current location pt,Created on,Updated on,Contacts
0,1,KL-TS-P1,30/Jan/2020,20.0,Female,Thrissur,SRID=4326;POINT (76.21325419999999 10.5256264),Thrissur,Kerala,India,Recovered,02/14/2020,Travelled from Wuhan.\nStudent from Wuhan,,SRID=4326;POINT (76.21325419999999 10.5256264),03/23/2020 12:20 p.m.,03/23/2020 12:20 p.m.,
1,2,KL-AL-P1,02/Feb/2020,,Unknown,Alappuzha,SRID=4326;POINT (76.333482 9.498000100000001),Alappuzha,Kerala,India,Recovered,02/14/2020,Travelled from Wuhan.\nStudent from Wuhan,,SRID=4326;POINT (76.333482 9.498000100000001),03/23/2020 12:20 p.m.,03/23/2020 12:20 p.m.,
2,3,KL-KS-P1,03/Feb/2020,,Unknown,Kasargode,SRID=4326;POINT (80 20),Kasaragod,Kerala,India,Recovered,02/14/2020,Travelled from Wuhan.\nStudent from Wuhan,,SRID=4326;POINT (80 20),03/23/2020 12:20 p.m.,03/23/2020 12:20 p.m.,
3,4,DL-P1,02/Mar/2020,45.0,Male,East Delhi (Mayur Vihar),SRID=4326;POINT (80 20),East Delhi,Delhi,India,Recovered,03/15/2020,"Travelled from Austria, Italy.\nTravel history...",,SRID=4326;POINT (80 20),03/23/2020 12:20 p.m.,03/23/2020 12:20 p.m.,"Patient 22:, Patient 23:, Patient 24:, Patient..."
4,5,TS-P1,02/Mar/2020,,Unknown,Hyderabad,SRID=4326;POINT (78.4349398685041 17.4263524),Hyderabad,Telangana,India,Recovered,03/02/2020,"Travelled from Dubai, Singapore contact.\nTrav...",,SRID=4326;POINT (78.4349398685041 17.4263524),03/23/2020 12:20 p.m.,03/23/2020 12:20 p.m.,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
499,500,,23/Mar/2020,25.0,Male,Chennai,SRID=4326;POINT (80.2838331 13.0801721),Chennai,Tamil Nadu,,Hospitalized,02/23/2020,Travelled from London - RGGH,,SRID=4326;POINT (80.2838331 13.0801721),03/24/2020 8:15 a.m.,03/24/2020 8:15 a.m.,
500,501,,23/Mar/2020,48.0,Male,Tiruppur,SRID=4326;POINT (77.52604780096844 10.78322705),Tiruppur,Tamil Nadu,,Hospitalized,02/23/2020,Travelled from London - ESI,,SRID=4326;POINT (77.52604780096844 10.78322705),03/24/2020 8:15 a.m.,03/24/2020 8:15 a.m.,
501,502,,23/Mar/2020,54.0,Male,Madurai,SRID=4326;POINT (78.11409829999999 9.926115299...,Madurai,Tamil Nadu,,Hospitalized,02/23/2020,Annanagar at Rajaji Hosp. No mention of travel...,,SRID=4326;POINT (78.11409829999999 9.926115299...,03/24/2020 8:15 a.m.,03/24/2020 8:15 a.m.,
502,503,,23/Mar/2020,24.0,Male,,SRID=4326;POINT (85.906508 25.6440845),Patna,Bihar,,Hospitalized,02/23/2020,Travelled from Scotland,,SRID=4326;POINT (85.906508 25.6440845),03/24/2020 8:15 a.m.,03/24/2020 8:15 a.m.,


In [97]:
print("Data Shape ~ Rows = {} | Columns = {}".format(data.shape[0],data.shape[1]))

Data Shape ~ Rows = 504 | Columns = 17
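
The column count falls from 19 to 17 because one column was dropped and another became the index, which `shape` does not count as a column. As a sketch of the same two steps without `inplace=True`, `drop` and `set_index` can be chained so that the original frame is left untouched. The toy frame below is an assumption for illustration and has only three columns.

```python
import pandas as pd

# Toy frame with the same two identifier columns; values are illustrative only.
toy = pd.DataFrame({
    "ID": [1, 2, 3],
    "Unique id": [1, 2, 3],
    "Detected state": ["Kerala", "Kerala", "Delhi"],
})

# drop() and set_index() each return a new DataFrame, so they can be chained
# and the original `toy` stays unchanged.
indexed = toy.drop(columns="Unique id").set_index("ID")

print(indexed.shape)         # (rows, columns); the index is not counted
print(list(toy.columns))     # the original frame still has all three columns
```

Keeping the original frame around can be convenient in a notebook, since re-running a cell that uses `inplace=True` on an already modified frame raises a `KeyError`.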


In [98]:
data.head()

ID,Government id,Diagnosed date,Age,Gender,Detected city,Detected city pt,Detected district,Detected state,Nationality,Current status,Status change date,Notes,Current location,Current location pt,Created on,Updated on,Contacts
1,KL-TS-P1,30/Jan/2020,20.0,Female,Thrissur,SRID=4326;POINT (76.21325419999999 10.5256264),Thrissur,Kerala,India,Recovered,02/14/2020,Travelled from Wuhan.\nStudent from Wuhan,,SRID=4326;POINT (76.21325419999999 10.5256264),03/23/2020 12:20 p.m.,03/23/2020 12:20 p.m.,
2,KL-AL-P1,02/Feb/2020,,Unknown,Alappuzha,SRID=4326;POINT (76.333482 9.498000100000001),Alappuzha,Kerala,India,Recovered,02/14/2020,Travelled from Wuhan.\nStudent from Wuhan,,SRID=4326;POINT (76.333482 9.498000100000001),03/23/2020 12:20 p.m.,03/23/2020 12:20 p.m.,
3,KL-KS-P1,03/Feb/2020,,Unknown,Kasargode,SRID=4326;POINT (80 20),Kasaragod,Kerala,India,Recovered,02/14/2020,Travelled from Wuhan.\nStudent from Wuhan,,SRID=4326;POINT (80 20),03/23/2020 12:20 p.m.,03/23/2020 12:20 p.m.,
4,DL-P1,02/Mar/2020,45.0,Male,East Delhi (Mayur Vihar),SRID=4326;POINT (80 20),East Delhi,Delhi,India,Recovered,03/15/2020,"Travelled from Austria, Italy.\nTravel history...",,SRID=4326;POINT (80 20),03/23/2020 12:20 p.m.,03/23/2020 12:20 p.m.,"Patient 22:, Patient 23:, Patient 24:, Patient..."
5,TS-P1,02/Mar/2020,,Unknown,Hyderabad,SRID=4326;POINT (78.4349398685041 17.4263524),Hyderabad,Telangana,India,Recovered,03/02/2020,"Travelled from Dubai, Singapore contact.\nTrav...",,SRID=4326;POINT (78.4349398685041 17.4263524),03/23/2020 12:20 p.m.,03/23/2020 12:20 p.m.,


#### Now the data is ready for analysis and preprocessing

---