---
title: "Universal Banking, Cashless economy, Digital wallets, Jumla or a promise?"
author: "Vijayvithal"
date: "December 5, 2016"
output:
  html_document: default
  pdf_document: default
---
```{r setup, include=FALSE}
rm(list=ls())
knitr::opts_knit$set(root.dir = normalizePath('../'))
knitr::opts_chunk$set(fig.width=12, fig.height=8, echo=FALSE, warning=FALSE, message=FALSE)
options(digits=4,scipen = 9)
library(XML)
library(mapview)
library(dplyr)
library(tidyr)
library(lubridate)
library(rgdal)
library(leaflet)
library(xtable)
library(ggplot2)
library(readxl)
library(knitr)
lakh<-100000
crore<-100*lakh
lakhcrore<-lakh*crore
```
```{r getData,cache=T,include=FALSE }
#Check if data is already downloaded
if(!file.exists("data/Banks.xlsx")){
download.file("https://dbie.rbi.org.in/DBIE/MOFResultDisplayexcel.jsp?value_CENTERS=-1&value_BANK_CODE=-1&value_BANK_CODE_TYPE=-1&value_DISTRICTS=-1&value_STATES=19,80,09,01,06,39,71,69,67,68,54,34,46,44,07,84,96,89,70,60,15,02,03,14,29,16,99,30,50,17,90,81,18,20,21,10&value_BANKNAMES=-1&value_ADDRESS=-1&value_BANKWRK=rbi&sidx=C_REG_DESC&sord=asc",destfile = "data/Banks.xlsx")
}
banks<-read_excel("data/Banks.xlsx")
shp<-readOGR("../maps/Districts/Census_2011")
distProfile<-read_excel("data/A-1_NO_OF_VILLAGES_TOWNS_HOUSEHOLDS_POPULATION_AND_AREA.xlsx",skip=3)
```
```{r cleanup}
rbiDup<-read_excel("data/RBI_MAP_FIX.xlsx",3)
#Get District Details.
distProfile[16:17]<-NULL
censusrename<-read_excel("data/RBI_MAP_FIX.xlsx",5)
dp<-distProfile%>%
mutate(State=ifelse(`4`=="STATE",`5`,NA))%>%
rename(District=`5`,classification=`6`,inhabitedVillage=`7`,uninhabitedVillage=`8`,towns=`9`,households=`10`,persons=`11`,males=`12`,females=`13`,area=`13_0`,pop_per_sqkm=`14`)%>%
fill(State)%>%
mutate(District=toupper(District))%>%
filter(`4`=="DISTRICT")%>%
merge(rbiDup,all.x=T,by="District")%>%
mutate(District=ifelse(is.na(uniq),District,paste0(District,"(",State,")")),District=trimws(District,which="both"))%>%
left_join(censusrename,by=c("District"="Census District"))%>%
mutate(District=ifelse(is.na(`GIS DISTRICT`),District ,`GIS DISTRICT`))
total<-filter(dp,classification=="Total")
#Add an id field to enable reordering of the shape dataset.
id<-seq_len(nrow(shp@data))
shp@data<-mutate(shp@data,id=id)
#Perform any required cleanup.
banks1<-banks%>%select(District,Center,State)%>%merge(rbiDup,all.x=T,by="District")%>%mutate(District=ifelse(is.na(uniq),District,paste0(District,"(",State,")")))
nameVar<-read_excel("data/RBI_MAP_FIX.xlsx",2)
renameVar<-read_excel("data/RBI_MAP_FIX.xlsx",4)%>%mutate(`Census District`=toupper(`Census District`),District=toupper(District))
banks1<-merge(banks1,nameVar,by="District",all.x=T)%>%
mutate(District=ifelse(is.na(`Census DISTRICT`),District,`Census DISTRICT`))%>%
merge(renameVar,all.x=T,by="District") %>%
mutate(District=ifelse(is.na(`Census District`),District,`Census District`))
banksum<-banks1%>%ungroup()%>%group_by(District)%>%summarise(numBanks=n())
```
# Introduction
The nation is attempting a massive financial experiment. Initially called demonetisation and a war on terrorism and black money, it has slowly morphed into a war on the informal sector and an effort to bring every transaction into the banking system.
To bring every transaction into the banking system we need a robust banking infrastructure. If every employer pays his staff via bank transfers, then the staff should be able to withdraw their salary without a major expenditure of time.
For the conversion to a formal economy to be technically feasible and successful, similar convenience should be available in rural and non-metro areas.
## Banking convenience
Any service, in order to be used, should be available and convenient. A person who currently transacts in cash can, on receiving his salary, pay for goods and services with zero transaction delay. Once his employer is asked to join the formal economy and pay salaries via bank transfer, the employee has an additional step: first withdrawing the salary from the bank.
The very first question that needs to be answered is: is the new process (banking) as easy as the old? Does the employee need to spend additional time to receive his salary? If yes, how much?
The additional time required to withdraw his salary depends on two things:
1. Distance of the bank/ATM from his house.
1. Number of people standing in the queue at the Bank/ATM
The readers of this article may have 6-7 banks and around 8-10 ATMs within a 1 km radius of their home and office. Under normal circumstances their banking burden is at most a few minutes spent withdrawing cash from the nearest ATM while running errands.
If visiting the bank required a bus journey to the district/taluka headquarters and/or standing in a queue for hours, would we exhibit the same banking pattern?
A customer would prefer to have his bank or ATM next door so that he can transact with minimum hassle.
## Bank viability
For the bank to provide a service point (either a branch or an ATM) the questions are:
1. Is it financially viable? i.e. How many customers will use it? What will be the minimum deposit from them?
1. Is it secure? Does the area have major incidents of lawlessness (Naxalism, insurgency, terrorism, dacoity, goondaism etc.)?
If the answer to any of these questions is negative, the banks will not be willing to provide the required banking service in that region.
Banks are not in the business of savings/salary account maintenance. They are in the business of lending other people's money. Setting up a branch office staffed with a manager, accountant, cashier, teller, guard etc. requires lakhs of rupees per month for salaries alone; to this we must add building rental, networking infrastructure, IT, electricity, transport and other costs.
An analysis of a profitable [rural bank](http://www.nehu.ac.in/Journals/NEHUJournalJan_June2014_Art4.pdf) finds:
1. The bank branch is serving a population of ~51,000 with 4 employees.
2. The branch has an annual expenditure of `r 7802*1000/lakh` lakhs.
```{r}
rural<-dp%>%filter(classification=="Rural")%>%filter(persons>0)%>%mutate(averagePopulation=persons/inhabitedVillage)
median_village_pop<-median(rural$persons/rural$inhabitedVillage)
ggplot(rural,aes(x=averagePopulation))+
geom_histogram(binwidth = 500)+ylab("number of district")+xlab("Average village population")
```
About 82% of Indian villages have a population of less than 2,000. (The median, across districts, of the average village population is `r median_village_pop` persons.)
At that size, the branch in the above study would cover about `r 51000/median_village_pop` villages. Providing near-home banking would mean increasing the banking density to, say, one branch per village.
Each of these additional branches would cost roughly as much to run as the single branch does today. Since the population served per branch would fall from 51,000 to `r median_village_pop` people, income per branch would fall too, hurting profitability.
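The arithmetic behind this argument can be sketched as follows. The expenditure and population figures are the ones quoted from the study above; `village_pop` is a hypothetical round number used only for illustration (the report itself uses `median_village_pop`), and the helper `cost_per_person` is not part of the analysis code.

```r
# Figures quoted from the rural bank study cited above.
annual_expenditure <- 7802 * 1000   # rupees per year (78.02 lakh)
population_served  <- 51000         # people served by the one branch

# Hypothetical village size, for illustration only; the report derives
# the real value (median_village_pop) from census data.
village_pop <- 1000

# Annual running cost of a branch spread over the people it serves.
cost_per_person <- function(expenditure, population) expenditure / population

current     <- cost_per_person(annual_expenditure, population_served)
per_village <- cost_per_person(annual_expenditure, village_pop)

# Number of such villages the single branch covers today.
villages_per_branch <- population_served / village_pop

print(round(c(current = current, per_village = per_village,
              villages_per_branch = villages_per_branch), 2))
```

Under these assumptions the running cost per person served rises by the same factor as the number of villages one branch currently covers, which is why a one-branch-per-village model needs a much cheaper kind of branch.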
```{r CreateMapData}
#Read duplicate names and uniquify them.
dupNames<-read_excel("data/RBI_MAP_FIX.xlsx",1)
md<-merge(shp@data%>%mutate(DISTRICT=as.character(DISTRICT)),dupNames,by="censuscode",all.x=T)%>%
mutate(DISTRICT=ifelse(is.na(`district name`),DISTRICT,`district name`))%>%
mutate(DISTRICT=toupper(DISTRICT))%>%
left_join(total,by=c("DISTRICT"="District"))%>%
left_join(banksum,by=c("DISTRICT"="District"))%>%
mutate(areaPerBank=(area/numBanks),popPerBank=persons/numBanks)
# Read District names from RBI and match to Map.
mapdata<-anti_join(md,banksum,by=c("DISTRICT"="District"))
mapdata1<-anti_join(banksum,md,by=c("District"="DISTRICT"))
```
```{r createMap}
shp@data<-md%>%
arrange(id)%>%
select(DISTRICT,inhabitedVillage,uninhabitedVillage,towns,households,persons:pop_per_sqkm,area,State,numBanks:popPerBank)%>%
mutate(popup=paste0("<b>District</b>=",DISTRICT,
"<br/><b>Number of Banks</b>=",numBanks,
"<br/><b>Persons per Bank</b>=",popPerBank,
"<br/><b>Area (sqkm) / Bank</b>=",areaPerBank,
"<br/><b>Area (sqkm) </b>=",area,
"<br/><b>Population </b>=",persons,
"<br/><b>Population/ sqkm </b>=",pop_per_sqkm
))
md<-shp@data
```
# Area covered by each branch in a given district.
Having established nearness of the bank as one of the criteria for the transition to a formal economy, let us see what the average area covered by a bank branch is in each district.
(Note: the numbers calculated here are optimistic, as a majority of bank branches are typically found in the district headquarters, which results in greater distances for rural places in the same district.)
```{r mappingArea,screenshot.force = TRUE}
#
#Area wise distribution of Banks
a<-quantile(shp$areaPerBank,probs = seq(0,1,.1),na.rm = T)
pal<-colorBin(topo.colors(10),shp$areaPerBank,bins=a,na.color="#ff0000")
map<-leaflet(shp)%>%
addTiles(urlTemplate = "http://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}.png") %>%
addPolygons(stroke=F,fillColor =pal(shp$areaPerBank), popup=shp$popup) %>%
addLegend(position = 'topright', pal=pal,values=shp$areaPerBank, opacity = 1.0, title = 'Area (sq km) covered by each branch/District')
#mapshot(map,file="areaPerBank.png")
map
```
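The map colours districts by deciles of area per bank, so each colour holds roughly a tenth of the districts regardless of how skewed the values are. A minimal base-R sketch of that binning, using invented values in place of `shp$areaPerBank`:

```r
# Hypothetical area-per-bank values (sq km) for eleven districts; the NA
# stands for a district with no matched bank data.
area_per_bank <- c(2, 5, 9, 14, 22, 35, 60, 120, 400, 1500, NA)

# Decile break points, computed the same way as in the mapping chunk.
breaks <- quantile(area_per_bank, probs = seq(0, 1, 0.1), na.rm = TRUE)

# Assign each district to a decile bin (1 = smallest area per bank).
# include.lowest = TRUE keeps the minimum value inside the first bin.
bin <- cut(area_per_bank, breaks = breaks, include.lowest = TRUE, labels = FALSE)

print(bin)
```

Districts with missing data stay `NA` here, which corresponds to the red `na.color` on the map. Note that `cut()` stops with an error if two break points coincide, which can happen when many districts share the same value.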
Looking at the branch density per district, we find a few interesting things:
1. Banking density is good along the Gangetic plain, from Punjab, Delhi, Kanpur and Lucknow to Kolkata.
1. Banking is again good in the south, in the area below Chennai and Bangalore.
1. The third zone is the stretch from Pune-Mumbai to Ahmedabad.
1. The fourth zone is the coastal strip from Kolkata to Hyderabad.
1. Other than these, we have a few zones around cities like Nagpur, Indore, Bhopal etc. As a whole, though, the banking density in the rest of India is low.
The median banking area in India is `r median(md$areaPerBank,na.rm=T)` sq km per bank.
While the long tail runs to thousands of sq km per bank, the histogram of the area covered by banks tells us that very few districts provide banks within a 10 km radius of their citizens.
```{r}
ggplot(md%>%filter(areaPerBank<100),aes(x=areaPerBank)) + geom_histogram(binwidth = 10)+ylab("number of district")
```
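One rough way to relate the area figures to the 10 km radius mentioned above is to treat each branch's share of the district as a circle with the branch at its centre. This is an assumption made for illustration, not part of the analysis code, and `catchment_radius` is a hypothetical helper.

```r
# Radius (km) of a circle whose area equals the area served per bank (sq km).
# Assumes a circular catchment with the branch at its centre.
catchment_radius <- function(area_sqkm) sqrt(area_sqkm / pi)

# Area (sq km) of a circle with a 10 km radius, for comparison.
area_10km <- pi * 10^2

print(round(area_10km, 1))              # area equivalent to a 10 km radius
print(round(catchment_radius(100), 2))  # radius for 100 sq km per bank
```

Because branches cluster in the district headquarters, the real distance for a rural resident will usually be larger than this idealised radius.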
## Urban and rural divide in distribution of banking.
Since banks are in the business of lending other people's money, you will typically find them where their business is.
Taking the example of a place like Bhopal, we find the following.
```{r}
bhopal<-filter(banks,District=="BHOPAL")
count_centers<-bhopal%>%group_by(Center)%>%summarise(k=n())%>%arrange(desc(k))
bhopal_profile<- total%>%filter(District=="BHOPAL")
hq<-banks%>%rowwise() %>% mutate(HQ=grepl(District,Center,ignore.case=T))
hqsum<-hq%>%group_by(District)%>%summarise(HQ_sum=sum(HQ),HQ_tot=n())%>%mutate(percentage=HQ_sum*100/HQ_tot)
bhopal_rural<-dp%>%filter(District=="BHOPAL")%>%filter(classification=="Rural")
bhopalRuralBankDensity<-bhopal_rural$persons/(sum(count_centers$k) - count_centers[1,]$k)
bhopal_urban<-dp%>%filter(District=="BHOPAL")%>%filter(classification=="Urban")
bhopalUrbanBankDensity<-bhopal_urban$persons/(count_centers[1,]$k)
```
Bhopal city (area 463 sq km) is a part of Bhopal District (2772 sq km), and out of the total `r sum(count_centers$k)` branches in the district, the city has `r count_centers[1,]$k` branches. The district's `r bhopal_profile$inhabitedVillage` villages and `r bhopal_profile$towns -1` other towns share the remaining `r sum(count_centers$k) - count_centers[1,]$k` branches.
Some of the districts with more than 70% of their bank branches in the district HQ are:
```{r}
kable(hqsum%>%filter(percentage>70)%>% rename(Total_Banks=HQ_tot,Banks_at_HQ=HQ_sum))
```
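The headquarters share in the table rests on a simple heuristic: a branch is counted as being at the district HQ when its center name contains the district name. A minimal sketch of that idea on invented rows (the column names mirror the RBI data used above; the districts and centers are hypothetical):

```r
# Hypothetical branch list, for illustration only.
branches <- data.frame(
  District = c("ALPHA", "ALPHA", "ALPHA", "ALPHA", "BETA", "BETA"),
  Center   = c("Alpha", "Alpha City", "Riverside", "Alpha", "Hilltop", "Beta"),
  stringsAsFactors = FALSE
)

# A branch is "at HQ" when its center name contains the district name.
# fixed = TRUE treats the district name as literal text rather than a
# regular expression, so both sides are upper-cased to ignore case.
branches$HQ <- mapply(
  function(d, ctr) grepl(toupper(d), toupper(ctr), fixed = TRUE),
  branches$District, branches$Center
)

# Percentage of each district's branches located at the HQ.
hq_share <- aggregate(HQ ~ District, data = branches,
                      FUN = function(x) 100 * mean(x))
print(hq_share)
```

The heuristic undercounts when the headquarters town has a different name from its district, so the percentages are best read as a lower bound.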
In general, urban centers, metros etc. will have a bank every kilometer or less, and in every locality, while tens of villages share one bank among them. At the worst end of the spectrum, the nearest bank may be hundreds of km away.
# Population covered by each branch in a given district.
To answer the second question, i.e. how long the queue or wait time is, we need the theoretical number of people served by each bank.
Note: this again suffers from an optimistic estimation error, as there will be more banks in the urban sector and fewer in the rural. Taking the example of Bhopal, the urban density is `r bhopalUrbanBankDensity` people/bank and the rural density is `r bhopalRuralBankDensity` people/bank.
```{r mappingPop,screenshot.force = TRUE}
#
#Population wise distribution of Banks
a<-quantile(shp$popPerBank,probs = seq(0,1,.1),na.rm = T)
pal<-colorBin(topo.colors(10),shp$popPerBank,bins=a,na.color="#ff0000")
map<-leaflet(shp)%>%
addTiles(urlTemplate = "http://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}.png") %>%
addPolygons(stroke=F,fillColor =pal(shp$popPerBank), popup=shp$popup) %>%
addLegend(position = 'topright', pal=pal,values=shp$popPerBank, opacity = 1.0, title = 'Population covered by each branch/District')
#mapshot(map,file="popPerBank.png")
map
```
If we expect every person to hold a bank account and transact at least once each month with his bank, then we need to estimate whether the banks can serve the population properly.
Currently the median population served per bank would be `r median(md$popPerBank,na.rm=T)` people/bank.
```{r}
ggplot(md%>%filter(popPerBank<25000)%>%filter(popPerBank>1),aes(x=popPerBank)) + geom_histogram(binwidth = 2000)+ylab("number of district")
```
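To get a feel for what a people-per-bank figure means at the counter, the once-a-month assumption above can be turned into a daily load. Every number in this sketch (working days, counters, opening hours, and the 12,000 people per branch) is a hypothetical input chosen for illustration, not a measured value.

```r
# Customers per working day if everyone covered by the branch visits
# visits_per_month times. Defaults are assumptions, not measured data.
daily_load <- function(pop_per_bank, visits_per_month = 1, working_days = 25) {
  pop_per_bank * visits_per_month / working_days
}

# Counter minutes available per customer, given the number of counters
# and the hours each is open per day (both assumed).
minutes_per_customer <- function(customers_per_day, counters = 2, hours_open = 6) {
  counters * hours_open * 60 / customers_per_day
}

load <- daily_load(12000)          # hypothetical 12,000 people per branch
slot <- minutes_per_customer(load)

print(c(customers_per_day = load, minutes_each = slot))
```

Substituting a district's actual `popPerBank` for the hypothetical figure gives a first-order estimate of how crowded its branches would be.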
When the population map is compared with the area map, we find that while the north, the south and to some extent the west have a good ratio of banks to population, the Gangetic region and the east, which have a good area-wise distribution, cannot service their population.
Central India fails on both metrics, area as well as population.
Currently some of the districts with the most heavily burdened banks are
```{r}
kable(tail(md%>%arrange(popPerBank)%>%filter(State !="NCT OF DELHI")%>%select(DISTRICT,State,numBanks:popPerBank,persons)),padding=2)
```
and the least burdened ones are
```{r}
kable(head(md%>%arrange(popPerBank)%>%filter(State !="NCT OF DELHI")%>%select(DISTRICT,State,numBanks:popPerBank,persons)))
```
# Bottom line.
From a purely economic point of view, urban centers generally have better facilities (of all types, not just banking) than rural centers, because high population density means more people served per sq km covered by a service point. In rural centers, achieving the same population coverage requires each service delivery point to cover a larger area. Beyond a certain distance, access to the service delivery point becomes inconvenient and the population decides not to use the service.
Specifically for banks, when residents of small hamlets with 100-400 households have to travel tens of km to reach their bank, accessing banking services on a regular basis becomes inconvenient. (Note: the same factors that lead to a scarcity of bank branches also lead to a scarcity of bus and other transport services.)
Areas with low population density will also have police and other services at a distance, so the security of manned or unmanned ATMs is also a problem.
*Cashless Economy*
The PM's promise of a cashless economy is not possible in the near future. At best it is a Jumla to calm troubled minds in this period of economic stress; at worst it is another unplanned economic disaster.
*Universal Banking*
The banks need to invest a huge amount to create the infrastructure required for 'near home banking'. Recent efforts with banking correspondents and the setting up of micro banks are a step in the right direction; until such infrastructure exists, accounts will continue to remain dormant.
Creating the required banking infrastructure for the poorest of the poor will require creative thinking to reduce bank expenditure.
Today less than 25% of the population has an active bank account. If instead every citizen had an active account, with easy near-home access to banks and a low deposit of only a few thousand rupees, the banks would crumble under the load.
## Data Accuracy.
There are two possible sources of inaccuracy. The first is in data collection and reporting: for example, when a district is divided, the old banks may not have updated their address with RBI, so most of the banks in the new district remain associated with the old district. We see this with the banks in Delhi.
The second is in cleaning up the data. Since around 90 new districts came into existence after the 2011 census, a call had to be taken on how to merge the various datasets. The decision taken was to merge the new district data into the census districts. This merging process may result in inaccurate computation for those districts.
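The merge of new districts into their census parents follows a lookup-and-replace pattern. The sketch below shows that pattern in base R on invented names; the lookup table, district names and counts are hypothetical stand-ins for the sheets in `RBI_MAP_FIX.xlsx`.

```r
# Hypothetical lookup: districts created after 2011 and their census parents.
lookup <- data.frame(
  District       = c("NEWDIST A", "NEWDIST B"),
  CensusDistrict = c("OLDDIST X", "OLDDIST X"),
  stringsAsFactors = FALSE
)

# Hypothetical branch counts keyed by the district name reported to RBI.
branches <- data.frame(
  District = c("OLDDIST X", "NEWDIST A", "NEWDIST B", "OLDDIST Y"),
  numBanks = c(40, 5, 3, 12),
  stringsAsFactors = FALSE
)

# Left merge keeps every branch row; rows without a lookup match get NA.
merged <- merge(branches, lookup, by = "District", all.x = TRUE)

# Replace the reported name with the census parent where one exists.
merged$District <- ifelse(is.na(merged$CensusDistrict),
                          merged$District, merged$CensusDistrict)

# Re-aggregate so branches of new districts count under the parent.
census_counts <- aggregate(numBanks ~ District, data = merged, FUN = sum)
print(census_counts)
```

A new district carved out of more than one parent cannot be represented by a one-to-one lookup like this, which is one way the merged figures can become inaccurate.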
# Analysis method.
We are doing a spatial analysis of bank locations across India. Instead of taking individual bank locations, we aggregate the banks at the district level.
The data sources are:
1. Population and Area data from the 2011 census
1. Bank location data from the most recent RBI Bank list.
1. District Maps from datameet based on 2011 census
1. Internet data from TRAI
1. List of Districts formed after 2011 and their reverse mapping from blog.socialcops.com
While this is better than taking the national average, which would treat a metro like Bangalore or Mumbai as equivalent to a remote village in Bihar, a district-level analysis will still give a skewed result, as the district headquarters or a major commercial zone will attract most of the banks.
# E-Wallet, Net Banking and Credit/Debit Cards.
When we speak of a cashless economy, it requires certain components: a credit card or an e-wallet, a (mobile) broadband connection etc.
Recent reports (Nov 30 2016, http://economictimes.indiatimes.com/tech/internet/indias-low-broadband-penetration-a-concern-trai-chairman/articleshow/55705295.cms ) put the average broadband penetration in India at 7%.
If we count both broadband and narrowband internet, this number rises to 24% (12% rural and 49% urban). Places like Bihar register rural internet numbers as low as 6.77%. (Source: http://www.trai.gov.in/WriteReadData/PIRReport/Documents/Indicator-Reports-Mar12082015.pdf)
Unless broadband penetration rises from the current 7% to near 100%, the digital economy may remain just a dream.
# Open source analysis.
The code used to create this report and associated files are available at https://github.com/DataDrivenPolicy/India/blob/master/src/Banks.Rmd