## A basic S.I.R model for the Covid situation in Bangalore

### Flattening the curve? Infection is peaking? Have you wondered what that actually means?
This notebook is an attempt to explain, in simple terms, how to model a pandemic using the S.I.R model (Susceptible, Infected, Recovered). The statistics predicted here may not match real-life observations; the intention of this notebook is to show how available data can be used for this purpose.

### If you do not wish to download and run this notebook but want to view the results, you can find the SIR curve plots [here](https://pythonista7.github.io/plot) and the geo-plot of the Bengaluru map [here](https://pythonista7.github.io/kepler_infection_map) (be sure to read the last few cells of this notebook to use the Kepler map for visualization).

This notebook comprises 3 main parts:
    1. Data clean-up and extrapolation
    2. SIR Model
    3. Plotting using KeplerGL

## 1. Data Clean-up 

In [45]:
import pandas as pd
import numpy as np

<p> The file below contains ward-wise data about the containment zones and their severity as of late April/early May 2020.<br> Any graph showing day-0, day-1, ..., day-n can be read as approximately May 1st + n days.
</p>

In [46]:
df=pd.read_csv('BBMP.csv')

In [47]:
df.dropna(inplace=True) #Has a couple of rows containing na values which we can remove 
df.head()

Unnamed: 0,Ward Number,Ward Name,Zone,Assembly Constituency,Covid Zone,Containment Zone
0,135.0,Padarayanapura,West,Chamajpet,Red,Yes
1,189.0,Hongasandra,Bommanahalli,Bommanahalli,Red,Yes
2,93.0,Vasanth Nagar,East,Shivajinagar,Orange,Yes
3,133.0,Hampinagara,South,Vijayanagar,Orange,Yes
4,134.0,Bapuji Nagar,South,Vijayanagar,Orange,Yes


### Convert the ward numbers to integers instead of float

In [48]:
data=df
data['Ward Number']=data['Ward Number'].astype('int64') 

In [49]:
data.set_index('Ward Number',inplace=True)
data.sort_index(inplace=True)
data.head() # Now we have a cleaner representation

Unnamed: 0_level_0,Ward Name,Zone,Assembly Constituency,Covid Zone,Containment Zone
Ward Number,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
1,Kempegowda,Yelahanka,Yelahanka,Green,No
2,Chowdeshwari,Yelahanka,Yelahanka,Green,No
3,Atturu,Yelahanka,Yelahanka,Green,No
4,Yelahanka Satellite To,Yelahanka,Yelahanka,Green,No
5,Jakkuru,Yelahanka,Bytarayanapura,Green,No
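
The integer conversion above only works once rows with missing ward numbers are gone, because a missing value cannot be cast to an integer. A minimal, dependency-free sketch of the same clean-up, assuming a tiny made-up CSV sample (the `sample` string and the `load_wards` helper are illustrative and not part of this notebook's data files):

```python
import csv
import io

# Hypothetical sample mimicking the layout of BBMP.csv; the empty row is made up.
sample = """Ward Number,Ward Name,Covid Zone
135.0,Padarayanapura,Red
,Unknown,Green
93.0,Vasanth Nagar,Orange
"""

def load_wards(text):
    """Return {ward_number: row} with integer keys, skipping rows without a ward number."""
    wards = {}
    for row in csv.DictReader(io.StringIO(text)):
        raw = row["Ward Number"].strip()
        if not raw:  # roughly what dropna() does for this column
            continue
        wards[int(float(raw))] = row  # "135.0" -> 135
    # Sort by ward number, similar to set_index followed by sort_index
    return dict(sorted(wards.items()))

wards = load_wards(sample)
print(list(wards))
```

Going through `float` first is what lets a string such as `"135.0"` become the integer `135`.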



### Now that I have a list of all the places and their status, I would also like to know details about each of these wards/sectors. I was able to find census data on a government website which could help with this. This data is stored in the cencus.csv file.


In [50]:
# Let's import the data and perform a little cleanup before we use it
census=pd.read_csv('cencus.csv',names=['ward_no', 'representative', 'female', 'population', 'category', 'area',
       'Ward Name', 'st', 'localities', 'sc', 'constituency', 'male'])
census.drop(0,inplace=True)
census['ward_no']=census['ward_no'].astype('int64') 

In [51]:
census.head()

Unnamed: 0,ward_no,representative,female,population,category,area,Ward Name,st,localities,sc,constituency,male
1,147,S.M.Murugesh Modaliyar (INC),14269,29945,General,1.6,Adugodi,411,"['Rajendra Nagar', ' Rajendra Nagar slum', ' V...",5381,B T M Layout,15676
2,114,Sarala (INC),15524,35632,Scheduled Caste,11.26,Agaram,167,"['Cambridge layout', ' Jeevan Kendra Layout', ...",12434,Shanthinagar,20108
3,105,Roopa Devi (BJP),12716,26857,Backward Category A (Women),0.8,Agrahara Dasarahalli,433,"['Indira Colony', ' West of Chord road 4th Sta...",3172,Govindrajnagar,14141
4,56,S.S.Prasad (BJP),14014,29420,Backward Category A,2.15,A Narayanapura,118,"['Jyothipuram', ' Mahadevapura', ' A Narayanpu...",5827,K R Puram,15406
5,196,S.Gangadhar (INC),10293,21080,General,11.92,Anjanapura,226,"['Soudhamini Layout', ' Muneshwar Nagar', ' Av...",2671,Bangalore South,10787


In [52]:
# Let's pick a few useful columns for now
census=census[['area', 'ward_no', 'male','female','population']]
census.head()

Unnamed: 0,area,ward_no,male,female,population
1,1.6,147,15676,14269,29945
2,11.26,114,20108,15524,35632
3,0.8,105,14141,12716,26857
4,2.15,56,15406,14014,29420
5,11.92,196,10787,10293,21080


In [53]:
#Great, now we have some background data about each ward to work with
census.set_index('ward_no',inplace=True)
census.sort_index(inplace=True)
census.head()

Unnamed: 0_level_0,area,male,female,population
ward_no,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
1,10.47,11490,10376,21866
2,7.06,10402,9224,19626
3,10.15,13129,10891,24020
4,4.9,13457,12325,25782
5,23.96,10906,10058,20964


In [54]:
#Now that we have both dataframes clean and indexed, let's join them
all_data=data.join(census)
all_data

Unnamed: 0_level_0,Ward Name,Zone,Assembly Constituency,Covid Zone,Containment Zone,area,male,female,population
Ward Number,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
1,Kempegowda,Yelahanka,Yelahanka,Green,No,10.47,11490,10376,21866
2,Chowdeshwari,Yelahanka,Yelahanka,Green,No,7.06,10402,9224,19626
3,Atturu,Yelahanka,Yelahanka,Green,No,10.15,13129,10891,24020
4,Yelahanka Satellite To,Yelahanka,Yelahanka,Green,No,4.9,13457,12325,25782
5,Jakkuru,Yelahanka,Bytarayanapura,Green,No,23.96,10906,10058,20964
...,...,...,...,...,...,...,...,...,...
194,Gottigere,Bommanahalli,Bangalore South,Green,No,6.49,11402,10124,21526
195,Konanakunte,Bommanahalli,Bangalore South,Green,No,3.42,10662,9519,20181
196,Anjanapura,Bommanahalli,Bangalore South,Green,No,11.92,10787,10293,21080
197,Vasanthpura,Bommanahalli,Bangalore South,Green,No,5.34,13057,11465,24522
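
`DataFrame.join` performs a left join on the index by default, so a ward present in the zone table but missing from the census table would silently end up with empty census columns. A quick way to check for such gaps is sketched here with plain dictionaries (`zone_rows` and `census_rows` are hypothetical stand-ins for the two dataframes, and ward 3 is deliberately left out of the census side):

```python
# Hypothetical stand-ins for the two indexed dataframes (ward number -> record).
zone_rows = {
    1: {"Covid Zone": "Green"},
    2: {"Covid Zone": "Green"},
    3: {"Covid Zone": "Green"},
}
census_rows = {
    1: {"population": 21866},
    2: {"population": 19626},
}

def left_join(left, right):
    """Mimic a left join on the key: keep every left row, add right fields when present."""
    joined = {}
    for key, record in left.items():
        joined[key] = {**record, **right.get(key, {})}
    return joined

# Ward numbers that would come out of the join without census data
missing = sorted(set(zone_rows) - set(census_rows))
joined = left_join(zone_rows, census_rows)
print("wards without census data:", missing)
```

An empty `missing` list is a cheap sanity check before relying on the joined columns.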


#### Let's export and save this dataframe 

In [55]:
all_data.to_csv('all_data.csv')

### The all_data dataframe contains a variety of information which we can use in different ways

    There may be a number of ways to model this more accurately, but here I'm going to take a simplistic approach.
    Let's say the infection rate in a ward can be represented by the S.I.R equations introduced later in this notebook.

    First, let's give a numerical representation to the categorical values in the Covid Zone and Containment Zone columns.


In [56]:
# Map each categorical label to a numeric severity score (a higher score means a more severe zone)
Zone_map={"Green":2,"Yellow":5,"Orange":7,"Red":10,"Yes":5,"No":1}

In [57]:
#We can apply this mapping to the above df by using df.replace
data=all_data.replace(Zone_map)
data.head()

Unnamed: 0_level_0,Ward Name,Zone,Assembly Constituency,Covid Zone,Containment Zone,area,male,female,population
Ward Number,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
1,Kempegowda,Yelahanka,Yelahanka,2,1,10.47,11490,10376,21866
2,Chowdeshwari,Yelahanka,Yelahanka,2,1,7.06,10402,9224,19626
3,Atturu,Yelahanka,Yelahanka,2,1,10.15,13129,10891,24020
4,Yelahanka Satellite To,Yelahanka,Yelahanka,2,1,4.9,13457,12325,25782
5,Jakkuru,Yelahanka,Bytarayanapura,2,1,23.96,10906,10058,20964
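
`DataFrame.replace` with a flat dictionary replaces matching values in every column, which is fine as long as the labels only occur in the two zone columns. A more defensive variant keeps one mapping per column. A small sketch with plain dictionaries, where `COLUMN_MAPS` and `encode_row` are illustrative names rather than part of the notebook:

```python
# Scores copied from Zone_map above, split by the column they belong to.
COLUMN_MAPS = {
    "Covid Zone": {"Green": 2, "Yellow": 5, "Orange": 7, "Red": 10},
    "Containment Zone": {"Yes": 5, "No": 1},
}

def encode_row(row):
    """Return a copy of row with the categorical zone columns replaced by their scores."""
    encoded = dict(row)
    for column, mapping in COLUMN_MAPS.items():
        # A KeyError here would flag an unexpected label instead of passing it through
        encoded[column] = mapping[row[column]]
    return encoded

row = {"Ward Name": "Vasanth Nagar", "Covid Zone": "Orange", "Containment Zone": "Yes"}
encoded = encode_row(row)
print(encoded)
```

Raising on an unknown label is a design choice: it surfaces spelling variants in the source data early.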


# The S.I.R Model
Compartmental models simplify the mathematical modelling of infectious diseases. The population is assigned to compartments with labels, for example S, I, or R. People may progress between compartments: the susceptible may get infected, and the infected may eventually recover.

In [58]:
#Initially everyone would be susceptible
data['susceptible']=data['population']

In [59]:
# We start with a probability of having an infection, taken as the product of a random number and the area of the ward.
# The motivation is that covid spreads faster in densely populated areas; note that area alone is only a rough proxy (density would be population / area).
data['infected']=np.random.rand(len(data)) * data['area'].astype('float32')

In [60]:
#Then we min-max normalize to a value between 0 and 1; if real data is available it can be plugged in here
data['infected']=(data['infected']-data['infected'].min())/(data['infected'].max()-data['infected'].min())
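
Min-max normalization rescales values to the range 0 to 1 by subtracting the minimum and dividing by the spread between the maximum and the minimum. A dependency-free sketch, assuming a fixed random seed for repeatability (`min_max` is an illustrative helper, not a function from this notebook):

```python
import random

def min_max(values):
    """Rescale values linearly so the smallest becomes 0.0 and the largest 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:  # avoid division by zero when all values are equal
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

random.seed(0)  # fixed seed so the sketch is repeatable
areas = [10.47, 7.06, 10.15, 4.9, 23.96]  # areas of wards 1-5 from the census table
seed_scores = [random.random() * area for area in areas]
normalized = min_max(seed_scores)
print(normalized)
```

Dividing by anything other than `max - min` would leave the largest value different from 1.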

In [61]:
#And number of recovered would be 0
data['recovered']=0

In [62]:
data

Unnamed: 0_level_0,Ward Name,Zone,Assembly Constituency,Covid Zone,Containment Zone,area,male,female,population,susceptible,infected,recovered
Ward Number,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1
1,Kempegowda,Yelahanka,Yelahanka,2,1,10.47,11490,10376,21866,21866,0.000147,0
2,Chowdeshwari,Yelahanka,Yelahanka,2,1,7.06,10402,9224,19626,19626,0.000204,0
3,Atturu,Yelahanka,Yelahanka,2,1,10.15,13129,10891,24020,24020,0.000290,0
4,Yelahanka Satellite To,Yelahanka,Yelahanka,2,1,4.9,13457,12325,25782,25782,0.000146,0
5,Jakkuru,Yelahanka,Bytarayanapura,2,1,23.96,10906,10058,20964,20964,0.000477,0
...,...,...,...,...,...,...,...,...,...,...,...,...
194,Gottigere,Bommanahalli,Bangalore South,2,1,6.49,11402,10124,21526,21526,0.000072,0
195,Konanakunte,Bommanahalli,Bangalore South,2,1,3.42,10662,9519,20181,20181,0.000060,0
196,Anjanapura,Bommanahalli,Bangalore South,2,1,11.92,10787,10293,21080,21080,0.000301,0
197,Vasanthpura,Bommanahalli,Bangalore South,2,1,5.34,13057,11465,24522,24522,0.000052,0


### Why is the infected count a floating point value?
It can be thought of as the initial probability of the first infection occurring in that ward.

### Equations for modeling each compartment 
![title](https://wikimedia.org/api/rest_v1/media/math/render/svg/29728a7d4bebe8197dca7d873d81b9dce954522e)

Here dS represents the rate of change of the susceptible population. It is determined by the number of infected people I, the number of susceptible people S, and the total population N.
Similarly, dI is the rate of change of the infected population and dR is the rate of change of the recovered population.


The Greek letters here can be thought of as scaling factors. In this notebook the infection rate (beta) is represented by const_a, while the rate at which people leave the infected compartment (gamma) is represented by const_b in delta_I and by const_r in delta_R. You can read more about this and how the infection rate is calculated [here](https://en.wikipedia.org/wiki/Compartmental_models_in_epidemiology).
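
For reference, the standard SIR equations described on the linked Wikipedia page can be written as follows, with β the infection rate and γ the recovery rate. The functions in this notebook omit the division by N, so their constants absorb that factor:

```latex
\begin{aligned}
\frac{dS}{dt} &= -\frac{\beta I S}{N} \\
\frac{dI}{dt} &= \frac{\beta I S}{N} - \gamma I \\
\frac{dR}{dt} &= \gamma I
\end{aligned}
```

Adding the three right-hand sides gives zero, which is why S + I + R stays equal to N over time.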


In [63]:
#CONSTANT a --> rate of infection
#const_a

#CONSTANT b --> Rate of reduction of infected people
#const_b

#CONSTANT r --> Recovery Rate
#const_r

In [64]:
# Population = S(t) + I(t) + R(t)
# Now we need to calculate the rate of change of each column in (S,I,R) with respect to time

In [65]:
#Feel free to change up these values and observe how the graph changes 
#CONSTANT a --> rate of infection
a=1.6247 
#CONSTANT b --> Rate of reduction of infected people
b=0.014286
#CONSTANT r --> Recovery Rate
r=0.3

In [66]:
def delta_S(S,I,const_a=a): 
    # here I= I[t-1] cause new susceptible number will be wrt prev infected count
    return -const_a * S * I


In [67]:
#delta_I = change in the number of infected at time t, given by --> (const_a * S * I) - (const_b * I)
# whoever was susceptible and got infected is transferred from S to I
# but there are also people who constantly leave the I state, i.e. recover or pass away; that is captured by const_b
def delta_I(I,S,const_a=a,const_b=b):
    return  (const_a * S * I) - (const_b * I)

In [68]:
# Rate of recovery
def delta_R(I,const_r=r):
    return const_r*I
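
Since Population = S(t) + I(t) + R(t) is meant to stay fixed, the three daily changes should sum to zero. That only holds when the rate at which people leave the infected compartment equals the rate at which they enter the recovered one. A self-contained sketch with illustrative values (the numbers and the lowercase helper names are assumptions chosen for this example, not fitted parameters):

```python
def delta_s(s, i, beta):
    """Daily change in susceptible: people moving from S to I."""
    return -beta * s * i

def delta_i(s, i, beta, gamma):
    """Daily change in infected: new infections minus removals."""
    return beta * s * i - gamma * i

def delta_r(i, gamma):
    """Daily change in recovered: people leaving I."""
    return gamma * i

# Illustrative state and rates (assumed for this example only).
s, i, r = 20000.0, 10.0, 0.0
beta, gamma = 0.00001, 0.003

d_s = delta_s(s, i, beta)
d_i = delta_i(s, i, beta, gamma)
d_r = delta_r(i, gamma)
total_change = d_s + d_i + d_r  # zero up to floating point rounding
print(d_s, d_i, d_r, total_change)
```

If different constants were used for the removal term in `delta_i` and for `delta_r`, `total_change` would no longer be zero and the population would drift over time.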

In [69]:
data[['susceptible','infected','recovered']]=data[['susceptible','infected','recovered']].astype('float32')

In [70]:
for c in data.columns:
    print(c,' -- ',data[c].dtype)

Ward Name  --  object
Zone  --  object
Assembly Constituency  --  object
Covid Zone  --  int64
Containment Zone  --  int64
area  --  object
male  --  object
female  --  object
population  --  object
susceptible  --  float32
infected  --  float32
recovered  --  float32


### Let's test out these equations for a few wards

In [71]:
d=data.sample(frac=0.05)
wards=d.values.tolist()
d

Unnamed: 0_level_0,Ward Name,Zone,Assembly Constituency,Covid Zone,Containment Zone,area,male,female,population,susceptible,infected,recovered
Ward Number,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1
183,Chikkalsandra,South,Padmanabhanagar,2,1,1.09,12995,11682,24677,24677.0,2.5e-05,0.0
69,Laggere,RR Nagar,Rajarajeswarinagar,2,1,1.58,13425,11945,25370,25370.0,8e-06,0.0
30,Kadugondanahalli,East,Sarvagnanagar,2,1,0.71,17902,16940,34842,34842.0,1e-05,0.0
128,Nagarabhavi,West,Govindarjnagara,2,1,1.6,10408,9861,20269,20269.0,3.4e-05,0.0
43,Nandini Layout,West,Mahalakshmi Layout,2,1,1.41,18113,16205,34318,34318.0,1e-05,0.0
87,Hal Airport,Mahadevapura,K.R.Puram,2,1,6.8,17594,15472,33066,33066.0,0.000111,0.0
3,Atturu,Yelahanka,Yelahanka,2,1,10.15,13129,10891,24020,24020.0,0.00029,0.0
31,Kushal Nagar,East,Pulakeshinagar,5,1,0.65,17506,16540,34046,34046.0,0.0,0.0
142,Sunkenahalli,South,Chikpet,2,1,1.49,18645,17513,36158,36158.0,2e-05,0.0
115,Vannarpete,East,Shanthinagar,2,1,0.74,17991,17541,35532,35532.0,9e-06,0.0


In [72]:
#Example data to refer for below
wards[0]

['Chikkalsandra',
 'South',
 'Padmanabhanagar',
 2,
 1,
 '1.09',
 '12995',
 '11682',
 '24677',
 24677.0,
 2.483824937371537e-05,
 0.0]

In [73]:
#Let us setup a time-frame to model the spread 
days=90

In [74]:
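def get_deltas_for_ward(ward,const_a= 0.00001,const_b=0.0000021,const_r=0.003,days=25):
    # ward is a full dataframe row; its last three entries are [susceptible, infected, recovered]
    log=[ward]
    delta_log=[]
    
    for day in range(0,days):
        # d_s uses the previous day's infected count (for day 0, log[-1] is the ward row itself)
        d_s=delta_S(log[day][-3],log[day-1][-2],const_a=const_a) 
        d_i=delta_I(log[day][-2],log[day][-3],const_a=const_a,const_b=const_b) 
        d_r=delta_R(log[day][-2],const_r=const_r)
        
        # Update the state, clamping S and I at zero so they never go negative
        log.append([ log[day][-3] + d_s if log[day][-3] + d_s >= 0 else 0 ,
                    log[day][-2] + d_i if log[day][-2] + d_i >= 0 else 0,
                    log[day][-1] + d_r])
        
        delta_log.append([d_s,d_i,d_r])
        
    # delta_log holds the daily changes; log holds the states (log[0] is the original row)
    return delta_log,log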
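
The loop above is a forward Euler integration: each day's state is the previous day's state plus that day's deltas. A compact standalone sketch of the same idea follows. It differs from the notebook's function in two stated ways: it uses the current day's infected count for every delta, and it assumes a single removal rate `gamma` for both leaving I and entering R. `simulate_sir` and its arguments are hypothetical names, and the starting infected count of 1 is an assumption:

```python
def simulate_sir(s0, i0, r0, beta=0.00001, gamma=0.003, days=25):
    """Forward Euler SIR simulation; returns a list of (S, I, R) tuples, one per day."""
    history = [(float(s0), float(i0), float(r0))]
    for _ in range(days):
        s, i, r = history[-1]
        new_infections = min(beta * s * i, s)  # cannot infect more people than remain in S
        recoveries = min(gamma * i, i)         # cannot remove more people than are in I
        history.append((s - new_infections,
                        i + new_infections - recoveries,
                        r + recoveries))
    return history

# Population of one sampled ward, with an assumed single initial infection.
history = simulate_sir(s0=24677, i0=1, r0=0, days=90)
final_s, final_i, final_r = history[-1]
print(round(final_s), round(final_i), round(final_r))
```

Because every person leaving one compartment enters another, the total stays at its starting value throughout the run.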

In [75]:
#Here we store only the deltas, i.e. the rate of change/spread for each day of each ward, hence we discard the second return value with _
ward_wise=[]
for ward in wards:
    dat,_=get_deltas_for_ward(ward,days=days)
    ward_wise.append(dat)

In [76]:
# we will recast this into a numpy array for now
dat=np.array(ward_wise)
# shape ==> (no_of_wards,no_of_days,SIR_values)
dat.shape

(10, 90, 3)

Here let's plot all the 3 parameters: 

In [77]:
import plotly.graph_objects as go
fig = go.Figure()
for wa in range(len(ward_wise)):
    #Infection rate - RED LINE
    fig.add_trace(go.Scatter(y=dat[wa][:,1],
                    mode='lines',
                    name='lines',line={"color":'red'}))
    #Sus rate - ORANGE LINE
    fig.add_trace(go.Scatter(y=dat[wa][:,0],
                    mode='lines',
                    name='lines',line={"dash":'dot','color':'orange'}))
    #Recovered rate - GREEN LINE
    fig.add_trace(go.Scatter(y=dat[wa][:,2],
                    mode='lines',
                    name='lines',line={"dash":'dashdot','color':'green'},fill='toself'))

In [78]:
fig.show()

In [79]:
fig.write_html('plot1.html')

### If you are not running this locally and still want to view the graph click [here](https://pythonista7.github.io/plot1.html)

You may be confused as to why part of the plot lies in the negative region. That is because we are plotting only the change per unit time of each quantity. As expected, the number of susceptible people keeps reducing at an increasing rate as the infection spreads, since susceptible people are transferred to the infected category.
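
To turn daily changes back into head-counts, add them up cumulatively starting from the initial value. A minimal sketch using made-up daily changes (`daily_change_s` and `initial_s` are hypothetical values, not notebook output):

```python
from itertools import accumulate

initial_s = 1000.0
# Made-up daily changes in the susceptible count: negative and growing in size.
daily_change_s = [-1.0, -2.0, -4.0, -8.0]

# accumulate() yields the running total, starting from the initial value.
susceptible = list(accumulate(daily_change_s, initial=initial_s))
print(susceptible)
```

The deltas are all negative, yet the cumulative count stays positive; it simply shrinks faster each day.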

## Cleaning it up a bit and computing for all the wards

In [35]:
d=data
wards=d.values.tolist()
days=140 #change if needed
ward_wise=[]
for ward in wards:
    dat,_=get_deltas_for_ward(ward,days=days)
    ward_wise.append(dat)

In [36]:
dat=np.array(ward_wise)
dat.shape

(198, 140, 3)

In [37]:
fig = go.Figure()
for wa in range(len(ward_wise)):
    fig.add_trace(go.Scatter(y=dat[wa][:,1],
                    mode='lines'))

In [38]:
fig.show()

In [39]:
#Let's save this for reference; uncomment the line below
#fig.write_html('plot.html')

# As we can see, the above image is missing a lot of details that we may be expecting to see, such as the ward name. Let's add the details now

Remember that the generated graphs are interactive plotly graphs: you may hover over them to see data, click on legend elements to isolate them on the display, and more. Do check them out.

In [40]:
fig = go.Figure()
for wa in range(len(ward_wise)):
    fig.add_trace(go.Scatter(y=dat[wa][:,1],
                    mode='lines',
                    name='infection'+" "+wards[wa][0]+" "+wards[wa][1]))
    fig.add_trace(go.Scatter(y=dat[wa][:,2],
                    mode='markers', name='recovery'+ " " + wards[wa][0]+" "+wards[wa][1]))

In [41]:
#Be sure to use the interactive features of the plot by hovering, clicking on a legend item, zoom, pan etc.
fig.show()

In [42]:
fig.write_html('plot.html')

## From here on let's pick one attribute, the Infection Rate, to plot, as we have seen it can get really messy

In [42]:
d=data.sort_values(by="Ward Number")
wards=d.values.tolist()

days=100  #Change if needed

ward_wise_inf=[]
for ward in wards:
    dat,_=get_deltas_for_ward(ward,days=days)
    ward_wise_inf.append(np.array(dat)[:,1])

In [43]:
ward_wise_inf=np.array(ward_wise_inf)
ward_wise_inf.shape

(198, 100)

In [44]:
time_series=pd.DataFrame(ward_wise_inf,index=list(range(1,199)))

In [45]:
# Here we have a time-series representation of the data where the rows represent the WARDNUMBERS and columns are days
time_series.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,90,91,92,93,94,95,96,97,98,99
1,0.000192,0.000233,0.000285,0.000347,0.000423,0.000515,0.000628,0.000765,0.000932,0.001136,...,1138.699171,1009.55382,872.845978,737.474173,610.438241,496.355165,397.533509,314.411434,246.134753,191.105793
2,7.9e-05,9.4e-05,0.000113,0.000135,0.000161,0.000193,0.000231,0.000276,0.000331,0.000396,...,588.601378,665.801006,745.889745,826.431376,904.257442,975.552511,1036.073177,1081.515041,1108.007903,1112.674001
3,0.000148,0.000183,0.000227,0.000282,0.00035,0.000434,0.000538,0.000667,0.000827,0.001026,...,497.188818,382.065166,289.731542,217.440539,161.884479,119.789953,88.23396,64.766183,47.416598,34.646149
4,3e-06,4e-06,5e-06,6e-06,8e-06,1e-05,1.2e-05,1.5e-05,1.9e-05,2.4e-05,...,1608.728427,1776.268485,1909.421303,1990.813664,2006.152919,1948.122608,1819.305809,1632.669165,1408.959852,1171.810693
5,0.000532,0.000644,0.000778,0.000942,0.001139,0.001378,0.001667,0.002016,0.002439,0.00295,...,776.055644,659.040258,549.590535,451.125984,365.344441,292.556968,232.106966,182.765367,143.045043,111.419557
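
A time series in this shape makes it easy to ask when new infections peak in each ward, which is the "infection is peaking" question from the introduction. A sketch on a tiny made-up table (`series_by_ward` is hypothetical data, not taken from the output above):

```python
# Hypothetical daily infection deltas per ward number (rows = wards, columns = days).
series_by_ward = {
    1: [0.1, 0.5, 2.0, 1.2, 0.4],
    2: [0.1, 0.2, 0.6, 1.9, 2.5],
}

def peak_day(values):
    """Return the index of the day with the largest value (the first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)

peaks = {ward: peak_day(values) for ward, values in series_by_ward.items()}
print(peaks)
```

A peak on the last day, as for ward 2 here, means the curve is still rising at the end of the simulated window.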


In [46]:
#time_series.to_csv('TS.csv')

## Now let us try to combine this with geographical data and plot it on a map
 We'll be using GeoJSON data of Bangalore, and we will plot the combined geo+stat data on a map using KeplerGL.

In [47]:
with open('BBMP.GeoJSON', 'r') as f:
    geojson = f.read()

In [48]:
# Pandas wrapper for handling geographical data
import geopandas as gpd

In [49]:
gdf=gpd.read_file('BBMP.GeoJSON')
gdf.head(2)

Unnamed: 0,OBJECTID,ASS_CONST_,ASS_CONST1,WARD_NO,WARD_NAME,POP_M,POP_F,POP_SC,POP_ST,POP_TOTAL,AREA_SQ_KM,LAT,LON,RESERVATIO,geometry
0,1,150,Yelahanka,2.0,Chowdeswari Ward,10402.0,9224.0,2630.0,286.0,19626.0,7.06,13.121709,77.580422,General,"MULTIPOLYGON (((77.59229 13.09720, 77.59094 13..."
1,2,150,Yelahanka,3.0,Atturu,13129.0,10891.0,2921.0,665.0,24020.0,10.15,13.102805,77.560038,General (Women),"MULTIPOLYGON (((77.56862 13.12705, 77.57064 13..."


In [50]:
# Converting WARD_NO to int and setting it as the row index, as that makes it easier to join the dataframe with our time-series data
gdf['WARD_NO']=gdf['WARD_NO'].astype(int)

In [51]:
gdf.set_index('WARD_NO',inplace=True)

In [52]:
gdf.sort_index(inplace=True)
gdf.head(2)

Unnamed: 0_level_0,OBJECTID,ASS_CONST_,ASS_CONST1,WARD_NAME,POP_M,POP_F,POP_SC,POP_ST,POP_TOTAL,AREA_SQ_KM,LAT,LON,RESERVATIO,geometry
WARD_NO,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1
1,186,150,Yelahanka,Kempegowda Ward,11490.0,10376.0,1912.0,430.0,21866.0,10.47,13.116188,77.599713,Backward Category - B,"MULTIPOLYGON (((77.61515 13.13117, 77.61494 13..."
2,1,150,Yelahanka,Chowdeswari Ward,10402.0,9224.0,2630.0,286.0,19626.0,7.06,13.121709,77.580422,General,"MULTIPOLYGON (((77.59229 13.09720, 77.59094 13..."


### Joining the geo and statistical data

In [53]:
final=gpd.GeoDataFrame(time_series[[2]].join(gdf['geometry']))
final.columns=["data","geometry"]

### Using KeplerGL 

In [54]:
import keplergl

In [64]:
#Add the data containing the geo-info first
map_1 = keplergl.KeplerGl(height=600)
map_1.add_data(geojson,"geodata")

User Guide: https://github.com/keplergl/kepler.gl/blob/master/docs/keplergl-jupyter/user-guide.md


In [65]:
# After running this cell you will find an html file named "infection_map.html" in the current working directory.
# Open it in the web browser to see the data visualization
map_1.save_to_html(file_name="infection_map.html")

Map saved to infection_map.html!


In [66]:
from IPython.display import IFrame

IFrame(src='infection_map.html',width=700, height=600)

In [67]:
#Now let's add the infection data onto the map

In [88]:
#Here is a small helper class which will generate an output map view for any given day
# Note here day 0 is considered as some day in early May of 2020
class Covid_Mapper():
    
    def __init__(self,filename="kepler_infection_map.html"):
        with open('BBMP.GeoJSON', 'r') as f:
            self.geojson = f.read()
            
        #The following CSV file is a stored representation of the time-series dataframe.
        # Be sure to uncomment .to_csv() in the cells above in case you are missing this file.
        self.time_series=pd.read_csv('TS.csv',index_col="Unnamed: 0")
        self.gdf=gpd.read_file('BBMP.GeoJSON')
        self.gdf['WARD_NO']=self.gdf['WARD_NO'].astype(int)
        self.gdf.set_index('WARD_NO',inplace=True)
        self.gdf.sort_index(inplace=True)
        self.filename = filename
    
    def get_for_day(self,day):
        day=str(day)
        final=gpd.GeoDataFrame(self.time_series[[day]].join(self.gdf[['LAT','LON','geometry']]))
        final.columns=["data","Lat","Lon","geometry"]
        map_1 = keplergl.KeplerGl(height=600)
        map_1.add_data(self.geojson,"geoj")
        map_1.add_data(final,name='final')
        map_1.save_to_html(file_name=self.filename)
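
The `day=str(day)` line is needed because column labels that were integers when the time series was built come back as text after a round trip through CSV. A standard-library sketch of that effect (the in-memory file and its values are made up for illustration):

```python
import csv
import io

# Write a tiny table whose day columns are the integers 0, 1, 2.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["ward", 0, 1, 2])
writer.writerow([1, 0.1, 0.2, 0.4])

# Read it back: every header comes back as text.
buffer.seek(0)
header = next(csv.reader(buffer))
print(header)
```

Looking up the integer `1` in this header would fail, while the string `"1"` succeeds.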

#### In the below cell, the helper class is instantiated and the day number for which the map plot is to be generated is passed as the "day" parameter to the get_for_day method. 

In [85]:
#Running this cell will automatically over-write any old file named "kepler_infection_map.html"
covid_map=Covid_Mapper()
covid_map.get_for_day(day=50)

User Guide: https://github.com/keplergl/kepler.gl/blob/master/docs/keplergl-jupyter/user-guide.md
Map saved to kepler_infection_map.html!


### At the time of making this notebook I couldn't find a way to preset the map settings.
### Please follow the instructions below to color the map after running the following cell:
1. Click on the arrow at the top-left of the map screen
2. Select the 'geoj' tab and disable the "Fill Color" toggle button
3. Select the 'final' tab and, under Fill Color, click on the three-dot button beside the toggle button
4. Click on the "Color Based On" field and set it to data.
<br><br>
The map should now be colored as per the covid data for that day
<p>NOTE: By default the map colors are inverted, that is, darker = less. To invert them, select the color bar below Fill Color and choose an appropriate palette.</p>

In [87]:
IFrame(src='kepler_infection_map.html',width=700, height=600)

# END