# Analysis of Schools in different regions of Hyderabad

<font size='3'>
    I prepared this report as part of my Capstone Project for the Coursera course.
    </font>

## Introduction

<font size='3'>
Access to quality school education is important for everyone.<br>To make this possible, the government has to identify the regions where more infrastructure must be built so that the people living there can reach schools at affordable rates.<br>
The same analysis helps private companies identify regions with strong potential for new schools.<br>
It also helps families choose neighbourhoods that offer access to good-quality education.<br>
As part of my capstone project, this report analyzes which regions of Hyderabad lack school-level infrastructure.</font>

## Data

<font size='3'>
I took the list of neighbourhoods in Hyderabad from Wikipedia and found their latitudes and longitudes using geocoder.<br>
Then, using the FourSquare API, I identified the locations of schools in Hyderabad and plotted them with Folium to get a rough idea of their distribution.<br>
The FourSquare API returned 172 schools in Hyderabad, along with their locations.<br>
The data consists of the neighbourhood along with the school name, latitude, longitude and school category.<br>
I also obtained a list of top schools in Hyderabad by scraping https://yellowslate.com/blog/best-schools-in-hyderabad-2020/ so that these schools can be highlighted in particular.<br>
Using this information, I analyzed and identified the regions that lack school infrastructure.
</font>

In [2]:
import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation
!pip install geopy
#!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# transforming a json file into a pandas dataframe
from pandas import json_normalize # pandas >= 1.0; the pandas.io.json import path is deprecated
!pip install folium
#!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library
%matplotlib inline

import matplotlib as mpl
import matplotlib.pyplot as plt
print('Folium installed')
print('Libraries imported.')

Folium installed
Libraries imported.


In [3]:
import urllib.request

In [4]:
url="https://en.wikipedia.org/wiki/Category:Neighbourhoods_in_Hyderabad,_India"
page=urllib.request.urlopen(url)

In [5]:
!pip install beautifulsoup4
from bs4 import BeautifulSoup



In [6]:
soup = BeautifulSoup(page, "lxml")
print(soup)

<!DOCTYPE html>
<html class="client-nojs" dir="ltr" lang="en">
<head>
<meta charset="utf-8"/>
<title>Category:Neighbourhoods in Hyderabad, India - Wikipedia</title>
<script>document.documentElement.className="client-js";RLCONF={"wgBreakFrames":!1,"wgSeparatorTransformTable":["",""],"wgDigitTransformTable":["",""],"wgDefaultDateFormat":"dmy","wgMonthNames":["","January","February","March","April","May","June","July","August","September","October","November","December"],"wgRequestId":"XqAPaQpAMNYAAeCYA@EAAABD","wgCSPNonce":!1,"wgCanonicalNamespace":"Category","wgCanonicalSpecialPageName":!1,"wgNamespaceNumber":14,"wgPageName":"Category:Neighbourhoods_in_Hyderabad,_India","wgTitle":"Neighbourhoods in Hyderabad, India","wgCurRevisionId":881961440,"wgRevisionId":881961440,"wgArticleId":3839100,"wgIsArticle":!0,"wgIsRedirect":!1,"wgAction":"view","wgUserName":null,"wgUserGroups":["*"],"wgCategories":["Neighbourhoods in Telangana","Geography of Hyderabad, India","Neighbourhoods in Andhra Prad

In [7]:
#print(soup)
neighbourhood_names=[]
a_tags=soup.find_all('a')
for a_tag in a_tags:
    neighbourhood_names.append(a_tag.text)
print(neighbourhood_names)

['', 'Help', 'Jump to navigation', 'Jump to search', 'learn more', 'next page', 'A. S. Rao Nagar', 'A.C. Guards', 'Abhyudaya Nagar', 'Abids', 'Adikmet', 'Afzal Gunj', 'Aghapura', 'Aliabad, Hyderabad', 'Alijah Kotla', 'Allwyn Colony', 'Alwal', 'Amberpet', 'Ameerpet', 'Ashok Nagar, Hyderabad', 'Asif Nagar', 'Attapur', 'Azamabad, Hyderabad', 'Azampura', 'Badichowdi', 'Bagh Lingampally', 'Bairamalguda', 'Balkampet', 'Banjara Hills', 'Bank Street, Hyderabad', 'Barkas, Hyderabad', 'Barkatpura', 'Basheerbagh', 'Bazarghat', 'Begum Bazaar', 'Begumpet', 'Bharat Nagar', 'BHEL Township, Hyderabad', 'BJR Nagar', 'Boggulkunta', 'Borabanda', 'Bowenpally', 'Brahman Vaadi', 'Chaderghat', 'Champapet', 'Chanchalguda', 'Chandrayan Gutta', 'Chatta Bazaar', 'Cherlapally', 'Chikkadpally', 'Chilkalguda', 'Chintal Basti', 'Chintalakunta', 'Dabirpura', 'Dar-ul-Shifa', 'Dhoolpet', 'Dilsukhnagar', 'Domalguda', "ECIL 'X' Roads", 'Edi Bazar', 'Erragadda', 'Fateh Nagar, Hyderabad', 'Ferozguda', 'Film Nagar', 'Gachib
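Collecting every `<a>` tag also picks up navigation links such as 'Help' and 'next page', which is why the list is sliced by name in the next cell. A minimal sketch of an alternative, using only the standard library, is to keep only links that sit inside the category listing. The container class `mw-category` is an assumption about Wikipedia's markup and should be checked against the live page; `CategoryLinkParser` and `sample_html` are hypothetical names used for illustration. With BeautifulSoup, a CSS selector such as `soup.select('div.mw-category a')` would express the same idea under the same assumption.

```python
from html.parser import HTMLParser


class CategoryLinkParser(HTMLParser):
    """Collect link text found inside <div class="mw-category ..."> blocks."""

    def __init__(self, container_class="mw-category"):
        super().__init__()
        self.container_class = container_class  # assumed class name, verify on the page
        self.depth = 0        # <div> nesting depth inside the container (0 = outside)
        self.in_link = False  # True while we are between <a> and </a> in the container
        self.names = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div":
            if self.depth > 0:
                self.depth += 1
            elif self.container_class in classes:
                self.depth = 1
        elif tag == "a" and self.depth > 0:
            self.in_link = True
            self.names.append("")

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link:
            self.names[-1] += data


# Small hand-written sample that mimics the assumed page structure.
sample_html = """
<a href="/help">Help</a>
<div class="mw-category">
  <div class="mw-category-group"><ul>
    <li><a href="/wiki/Abids">Abids</a></li>
    <li><a href="/wiki/Adikmet">Adikmet</a></li>
  </ul></div>
</div>
<a href="/about">About Wikipedia</a>
"""

parser = CategoryLinkParser()
parser.feed(sample_html)
category_names = parser.names
print(category_names)
```

On the sample, only the two links inside the listing are kept, so no slicing by start and end name is needed.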

In [8]:
start=neighbourhood_names.index('A. S. Rao Nagar')
end=neighbourhood_names.index('Somajiguda')
neighbourhood_names=neighbourhood_names[start:end+1]
#print(neighbourhood_names)

In [9]:
# remaining names (alphabetically after 'Somajiguda'), added by hand
lst=['Srinagar Colony','Suchitra Center','Sultan Bazar','Tarnaka','Tilak Nagar','Tirumalagiri','Tolichowki','Uddengadda','Umdanagar','Uppal Kalan','Uppuguda','Vanasthalipuram','Vasavi Colony','Vidyanagar','Vikrampuri','Warsiguda','Yakutpura','Yapral','Yellareddyguda','Yousufguda']
neighbourhood_names.extend(lst)
print(neighbourhood_names)

['A. S. Rao Nagar', 'A.C. Guards', 'Abhyudaya Nagar', 'Abids', 'Adikmet', 'Afzal Gunj', 'Aghapura', 'Aliabad, Hyderabad', 'Alijah Kotla', 'Allwyn Colony', 'Alwal', 'Amberpet', 'Ameerpet', 'Ashok Nagar, Hyderabad', 'Asif Nagar', 'Attapur', 'Azamabad, Hyderabad', 'Azampura', 'Badichowdi', 'Bagh Lingampally', 'Bairamalguda', 'Balkampet', 'Banjara Hills', 'Bank Street, Hyderabad', 'Barkas, Hyderabad', 'Barkatpura', 'Basheerbagh', 'Bazarghat', 'Begum Bazaar', 'Begumpet', 'Bharat Nagar', 'BHEL Township, Hyderabad', 'BJR Nagar', 'Boggulkunta', 'Borabanda', 'Bowenpally', 'Brahman Vaadi', 'Chaderghat', 'Champapet', 'Chanchalguda', 'Chandrayan Gutta', 'Chatta Bazaar', 'Cherlapally', 'Chikkadpally', 'Chilkalguda', 'Chintal Basti', 'Chintalakunta', 'Dabirpura', 'Dar-ul-Shifa', 'Dhoolpet', 'Dilsukhnagar', 'Domalguda', "ECIL 'X' Roads", 'Edi Bazar', 'Erragadda', 'Fateh Nagar, Hyderabad', 'Ferozguda', 'Film Nagar', 'Gachibowli', 'Gaddiannaram', 'Golnaka', 'Goshamahal', 'Gudimalkapur', 'Gulzar Houz', 

In [10]:
n_df=pd.DataFrame(neighbourhood_names,columns=['Neighbourhood'])
n_df.head()
n_df.shape

(220, 1)

In [11]:
!pip install geocoder
import geocoder # import geocoder

# lists that will hold the coordinates of each neighbourhood
latitudes=[]
longitudes=[]
count=0
# geocode each neighbourhood, retrying until a result comes back
for index,row in n_df.iterrows():
    print(count)
    lat_lng_coords = None
    neighbourhood=row['Neighbourhood']
    while(lat_lng_coords is None):
        print('2')
        g = geocoder.arcgis('{}, Hyderabad, India'.format(neighbourhood))
        lat_lng_coords = g.latlng
    count=count+1
    latitudes.append(lat_lng_coords[0])
    longitudes.append(lat_lng_coords[1])
n_df['Latitude']=latitudes
n_df['Longitude']=longitudes
n_df.head()


Unnamed: 0,Neighbourhood,Latitude,Longitude
0,A. S. Rao Nagar,17.4112,78.50824
1,A.C. Guards,17.392977,78.456867
2,Abhyudaya Nagar,17.33765,78.56414
3,Abids,17.3898,78.47658
4,Adikmet,17.41061,78.51513
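The geocoding loop above retries without limit, so a neighbourhood that the service can never resolve would keep the cell running forever. A minimal sketch of a bounded retry follows. `geocode_with_retry`, `max_attempts` and `delay_seconds` are hypothetical names, and the lookup is passed in as a callable so the helper can be tried without network access; in the notebook it could be something like `lambda q: geocoder.arcgis(q).latlng`.

```python
import time


def geocode_with_retry(lookup, query, max_attempts=3, delay_seconds=1.0):
    """Call lookup(query) until it returns coordinates or attempts run out.

    lookup is any callable that returns [lat, lng] or None.
    Returns the coordinates, or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        coords = lookup(query)
        if coords is not None:
            return coords
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # brief pause before trying again
    return None


# Stand-in lookup for demonstration: fails once, then succeeds.
calls = {"count": 0}


def flaky_lookup(query):
    calls["count"] += 1
    return None if calls["count"] < 2 else [17.3898, 78.47658]


found = geocode_with_retry(flaky_lookup, "Abids, Hyderabad, India", delay_seconds=0)
missing = geocode_with_retry(lambda q: None, "Nowhere", max_attempts=2, delay_seconds=0)
print(found, missing)
```

A `None` result can then be stored as a missing value and the row dropped or reviewed by hand, instead of blocking the whole run.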


In [12]:
CLIENT_ID = 'VB4RCFLEFQLU3ODO20N54NBUETVANY34YKIDZ0QZ3INNOIVO' # your Foursquare ID
CLIENT_SECRET = 'SAOFLKSO3HP1ZTRUNFXO23MTCFHNU01IRMQ5T2TAAJKZ2EL0' # your Foursquare Secret
VERSION = '20200101' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: VB4RCFLEFQLU3ODO20N54NBUETVANY34YKIDZ0QZ3INNOIVO
CLIENT_SECRET:SAOFLKSO3HP1ZTRUNFXO23MTCFHNU01IRMQ5T2TAAJKZ2EL0
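Hard-coding and printing API credentials makes them visible to anyone who reads the notebook. A minimal sketch of one common alternative is to read them from environment variables and print only a masked form. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET`, the helpers `load_foursquare_credentials` and `mask`, and the demo values are all hypothetical.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Read the client id and secret from a mapping such as os.environ."""
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError(
            "Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first"
        )
    return client_id, client_secret


def mask(value, visible=4):
    """Show only the first few characters of a secret."""
    return value[:visible] + "*" * max(len(value) - visible, 0)


# Demo mapping with made-up values; in practice, call the loader without arguments.
demo_env = {
    "FOURSQUARE_CLIENT_ID": "demo-id-123",
    "FOURSQUARE_CLIENT_SECRET": "demo-secret-456",
}
client_id, client_secret = load_foursquare_credentials(demo_env)
masked_id = mask(client_id)
print("CLIENT_ID:", masked_id)
```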


In [13]:
limit=100
radius=600
#lat=17.389800
#lng=78.476580
search_query='School'
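These values become query parameters of the Foursquare venue search request. As a sketch of how the query string can be assembled with proper escaping, the standard library's `urlencode` can build it from a dictionary. `build_search_url` and the demo credentials are hypothetical; the endpoint and parameter names are the ones used in this notebook. With `requests`, passing the same dictionary as `requests.get(endpoint, params=params)` has a similar effect.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

SEARCH_ENDPOINT = "https://api.foursquare.com/v2/venues/search"


def build_search_url(lat, lng, client_id, client_secret,
                     version="20200101", query="School", radius=600, limit=100):
    """Return the search URL for one neighbourhood centre."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "ll": f"{lat},{lng}",   # latitude,longitude of the search centre
        "v": version,
        "query": query,
        "radius": radius,       # metres around the centre
        "limit": limit,         # maximum number of venues returned
    }
    return SEARCH_ENDPOINT + "?" + urlencode(params)


# Made-up credentials, coordinates of Abids taken from the table above.
url = build_search_url(17.3898, 78.47658, "DEMO_ID", "DEMO_SECRET")
parsed = parse_qs(urlsplit(url).query)
print(parsed["ll"], parsed["radius"])
```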

In [14]:
column_names=['Neighbourhood','Latitude','Longitude','School','School_Latitude','School_Longitude','School_category']
rows=[] # one dict per school; DataFrame.append was removed in pandas 2.0
for name, lat, lng in zip(n_df['Neighbourhood'],n_df['Latitude'],n_df['Longitude']):
    url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, lat, lng, VERSION, search_query, radius, limit)
    results = requests.get(url).json()
    for school in results['response']['venues']:
        # use the first Foursquare category when present, otherwise a placeholder
        category=school['categories'][0]['name'] if len(school['categories'])>0 else 'Not Mentioned'
        rows.append({'Neighbourhood':name,'Latitude':lat,'Longitude':lng,'School':school['name'],'School_Latitude':school['location']['lat'],'School_Longitude':school['location']['lng'],'School_category':category})
# build the dataframe once, after all neighbourhoods have been queried
mydf=pd.DataFrame(rows,columns=column_names)


#venues_list.append([(
#            'Abids', 
#            lat, 
#            lng, 
#            v['venue']['name'], 
#            v['venue']['location']['lat'], 
#            v['venue']['location']['lng'],  
#            v['venue']['categories'][0]['name']) for v in results])
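The loop above assumes every response contains `response` and `venues`. A minimal sketch of the same flattening step as a standalone function, which returns an empty list when those keys are missing (for example on an error response), is shown below. `venues_to_rows` is a hypothetical helper, the sample payload reuses two schools from this notebook's output, and the error payload shape is only an illustration.

```python
def venues_to_rows(name, lat, lng, payload):
    """Flatten one search response into a list of row dictionaries."""
    rows = []
    for venue in payload.get("response", {}).get("venues", []):
        categories = venue.get("categories") or []
        rows.append({
            "Neighbourhood": name,
            "Latitude": lat,
            "Longitude": lng,
            "School": venue["name"],
            "School_Latitude": venue["location"]["lat"],
            "School_Longitude": venue["location"]["lng"],
            # first category if any, otherwise the notebook's placeholder
            "School_category": categories[0]["name"] if categories else "Not Mentioned",
        })
    return rows


sample_payload = {"response": {"venues": [
    {"name": "Diamond Jubilee High School",
     "location": {"lat": 17.389743, "lng": 78.474340},
     "categories": [{"name": "High School"}]},
    {"name": "John's High School",
     "location": {"lat": 17.393322, "lng": 78.476040},
     "categories": []},
]}}

rows = venues_to_rows("Abids", 17.3898, 78.47658, sample_payload)
# Hypothetical error payload without a 'response' key.
error_rows = venues_to_rows("Abids", 17.3898, 78.47658, {"meta": {"code": 429}})
print(len(rows), len(error_rows))
```

The rows from all neighbourhoods can be concatenated into one list and passed to `pd.DataFrame` in a single call.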

In [15]:
mydf

Unnamed: 0,Neighbourhood,Latitude,Longitude,School,School_Latitude,School_Longitude,School_category
0,A.C. Guards,17.392977,78.456867,Genesis high school,17.395249,78.457319,Student Center
1,A.C. Guards,17.392977,78.456867,"Govt High School, Vijaya Nagar Colony",17.394348,78.454649,Student Center
2,A.C. Guards,17.392977,78.456867,St. Anns Girls High School For Girls,17.394340,78.453383,College Academic Building
3,A.C. Guards,17.392977,78.456867,Royal motor driving school,17.394417,78.460744,General Travel
4,A.C. Guards,17.392977,78.456867,hyderabad international residential school,17.385760,78.460598,Trade School
5,Abids,17.389800,78.476580,PANACHE SCHOOL OF SOUND,17.391128,78.477290,Music Venue
6,Abids,17.389800,78.476580,Diamond Jubilee High School,17.389743,78.474340,High School
7,Abids,17.389800,78.476580,John's High School,17.393322,78.476040,Not Mentioned
8,Abids,17.389800,78.476580,Little Flower High School,17.393032,78.474703,College Lab
9,Abids,17.389800,78.476580,St. Georges Grammar Prep School,17.393869,78.477888,Student Center


In [16]:
# drop venues whose name shows they are not schools (driving schools, sound schools, offices)
mydf=mydf[~mydf.School.str.contains("Driving",case=False)]
mydf=mydf[~mydf.School.str.contains("Sound",case=False)]
mydf=mydf[~mydf.School.str.contains("office",case=False)]
# drop venues whose Foursquare category shows they are not schools
mydf=mydf[~mydf.School_category.str.contains("bank",case=False)]
mydf=mydf[~mydf.School_category.str.contains("architecture",case=False)]
mydf=mydf[~mydf.School_category.str.contains("engineering",case=False)]
mydf=mydf[~mydf.School_category.str.contains("library",case=False)]
# drop the row with index label 21 (removed by hand)
mydf=mydf.drop(21)
mydf=mydf.reset_index(drop=True)
mydf

Unnamed: 0,Neighbourhood,Latitude,Longitude,School,School_Latitude,School_Longitude,School_category
0,A.C. Guards,17.392977,78.456867,Genesis high school,17.395249,78.457319,Student Center
1,A.C. Guards,17.392977,78.456867,"Govt High School, Vijaya Nagar Colony",17.394348,78.454649,Student Center
2,A.C. Guards,17.392977,78.456867,St. Anns Girls High School For Girls,17.394340,78.453383,College Academic Building
3,A.C. Guards,17.392977,78.456867,hyderabad international residential school,17.385760,78.460598,Trade School
4,Abids,17.389800,78.476580,Diamond Jubilee High School,17.389743,78.474340,High School
5,Abids,17.389800,78.476580,John's High School,17.393322,78.476040,Not Mentioned
6,Abids,17.389800,78.476580,Little Flower High School,17.393032,78.474703,College Lab
7,Abids,17.389800,78.476580,St. Georges Grammar Prep School,17.393869,78.477888,Student Center
8,Abids,17.389800,78.476580,Little Flower High School,17.392965,78.474798,High School
9,Abids,17.389800,78.476580,Rosary convent high school,17.393670,78.476690,Not Mentioned


In [17]:
mydf1 = mydf.drop_duplicates(subset=['School_Latitude','School_Longitude'],
                                       keep='first').reset_index(drop=True)
mydf1.shape

(169, 7)
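`drop_duplicates` only removes rows whose coordinates match exactly. The two 'Little Flower High School' rows in the earlier output differ in the fourth decimal place, so both survive. A minimal sketch of a looser check, assuming that two entries with the same name within 50 metres are the same school, uses the haversine distance. The threshold and the helper names `haversine_m` and `drop_near_duplicates` are assumptions, not part of the original analysis.

```python
import math


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two latitude/longitude points."""
    radius_m = 6371000.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def drop_near_duplicates(records, max_distance_m=50.0):
    """Keep the first of any same-named records that lie within max_distance_m."""
    kept = []
    for rec in records:
        is_duplicate = any(
            rec["School"].lower() == k["School"].lower()
            and haversine_m(rec["School_Latitude"], rec["School_Longitude"],
                            k["School_Latitude"], k["School_Longitude"]) <= max_distance_m
            for k in kept
        )
        if not is_duplicate:
            kept.append(rec)
    return kept


# Three rows taken from the output above.
records = [
    {"School": "Little Flower High School", "School_Latitude": 17.393032, "School_Longitude": 78.474703},
    {"School": "Little Flower High School", "School_Latitude": 17.392965, "School_Longitude": 78.474798},
    {"School": "Rosary convent high school", "School_Latitude": 17.393670, "School_Longitude": 78.476690},
]
unique = drop_near_duplicates(records)
gap_m = haversine_m(17.393032, 78.474703, 17.392965, 78.474798)
print(len(unique), round(gap_m, 1))
```

The pairwise comparison is quadratic in the number of rows, which is acceptable for a few hundred schools.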

In [18]:
mydf1[(mydf1.School.str.contains('Play',case=False)) | (mydf1.School.str.contains('Primary',case=False)) | (mydf1.School_category.str.contains('elementary',case=False)) | (mydf1.School.str.contains('Kids',case=False))]

Unnamed: 0,Neighbourhood,Latitude,Longitude,School,School_Latitude,School_Longitude,School_category
19,Ameerpet,17.43535,78.44861,Sister Niveditha School,17.439701,78.451701,Elementary School
24,"Ashok Nagar, Hyderabad",17.40784,78.4915,Abhignaa Play School,17.403712,78.494568,General College & University
33,Bagh Lingampally,17.39931,78.49964,Burgula Ramakrishna Rao Girls Primary School,17.39525,78.49878,General College & University
34,Bagh Lingampally,17.39931,78.49964,Arundathy Upper Primary School,17.394876,78.49918,School
35,Banjara Hills,17.41535,78.43435,OI Play School,17.417365,78.43189,Nursery School
41,Banjara Hills,17.41535,78.43435,Kangaroo Kids School,17.422061,78.435957,High School
89,"Kothapet, Hyderabad",17.36883,78.54229,Saint Martin's Play & Primary School,17.36478,78.548756,School
90,"Krishna Nagar, Hyderabad",17.42754,78.42063,Oi Play School Jubilee Hills,17.428062,78.416108,Preschool
102,Madannapet,17.35788,78.50171,Monarch School,17.354096,78.498862,Elementary School
114,Mir Alam Tank,17.355109,78.454123,Madina High school,17.355099,78.45683,Elementary School


In [19]:
mydf1.loc[(mydf1.School.str.contains('Secondary',case=False)) | (mydf1.School.str.contains('High',case=False)) | (mydf1.School_category.str.contains('High',case=False)),'School_category']='High School'
mydf1.loc[(mydf1.School.str.contains('Play',case=False)) | (mydf1.School.str.contains('Primary',case=False)) | (mydf1.School_category.str.contains('elementary',case=False)) | (mydf1.School.str.contains('Kids',case=False)),'School_category']='Primary/Play'
mydf1.loc[mydf1.School_category.str.contains('college',case=False),'School_category']='Not Mentioned'
mydf1.loc[~((mydf1.School_category.str.contains('Not Mentioned',case=False))|(mydf1.School_category.str.contains('High School',case=False))|(mydf1.School_category.str.contains('Student Center',case=False))|mydf1.School_category.str.contains('Primary/Play',case=False)),'School_category']='Not Mentioned'
mydf1

Unnamed: 0,Neighbourhood,Latitude,Longitude,School,School_Latitude,School_Longitude,School_category
0,A.C. Guards,17.392977,78.456867,Genesis high school,17.395249,78.457319,High School
1,A.C. Guards,17.392977,78.456867,"Govt High School, Vijaya Nagar Colony",17.394348,78.454649,High School
2,A.C. Guards,17.392977,78.456867,St. Anns Girls High School For Girls,17.394340,78.453383,High School
3,A.C. Guards,17.392977,78.456867,hyderabad international residential school,17.385760,78.460598,Not Mentioned
4,Abids,17.389800,78.476580,Diamond Jubilee High School,17.389743,78.474340,High School
5,Abids,17.389800,78.476580,John's High School,17.393322,78.476040,High School
6,Abids,17.389800,78.476580,Little Flower High School,17.393032,78.474703,High School
7,Abids,17.389800,78.476580,St. Georges Grammar Prep School,17.393869,78.477888,Student Center
8,Abids,17.389800,78.476580,Little Flower High School,17.392965,78.474798,High School
9,Abids,17.389800,78.476580,Rosary convent high school,17.393670,78.476690,High School
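The four `.loc` assignments above run in sequence, so later rules can override earlier ones. As a sketch, the same rules can be written as one plain function, which makes the order explicit and easy to test. `normalise_category` is a hypothetical name; it could be applied row by row, for example with `DataFrame.apply`. The sample inputs are rows from the outputs above.

```python
def normalise_category(school, category):
    """Apply the notebook's relabelling rules to one school, in the same order."""
    name = school.lower()
    # Rule 1: anything that looks like a secondary or high school.
    if "secondary" in name or "high" in name or "high" in category.lower():
        category = "High School"
    # Rule 2: play schools and primary schools (overrides rule 1).
    if any(word in name for word in ("play", "primary", "kids")) or "elementary" in category.lower():
        category = "Primary/Play"
    # Rule 3: college-style categories carry no useful school level.
    if "college" in category.lower():
        category = "Not Mentioned"
    # Rule 4: every other leftover category is also treated as unknown.
    kept = ("not mentioned", "high school", "student center", "primary/play")
    if not any(label in category.lower() for label in kept):
        category = "Not Mentioned"
    return category


examples = [
    ("Genesis high school", "Student Center"),
    ("Kangaroo Kids School", "High School"),
    ("St. Georges Grammar Prep School", "Student Center"),
    ("hyderabad international residential school", "Trade School"),
    ("Abhignaa Play School", "General College & University"),
]
labels = [normalise_category(school, category) for school, category in examples]
print(labels)
```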


In [20]:
h_latitude=17.3850
h_longitude=78.4867

school_map = folium.Map(location=[h_latitude, h_longitude], zoom_start=12) # generate map centred on Hyderabad
# add the schools as blue circle markers, placed at each school's own coordinates
for lat, lng, label in zip(mydf1['School_Latitude'],mydf1['School_Longitude'],mydf1['School']):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(school_map)

# display map
school_map
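The map gives a visual impression of the distribution. Since the goal of the report is to find regions that lack schools, one way to quantify it is to count the schools found per neighbourhood and list the neighbourhoods with none. The sketch below uses plain Python and a small illustrative subset of rows; `schools_per_neighbourhood` is a hypothetical helper, and a count of zero only means that the search returned nothing within the chosen radius, not that no school exists there.

```python
from collections import Counter


def schools_per_neighbourhood(all_neighbourhoods, school_rows):
    """Count school rows per neighbourhood, including neighbourhoods with none."""
    counts = Counter(row["Neighbourhood"] for row in school_rows)
    return {name: counts.get(name, 0) for name in all_neighbourhoods}


# Illustrative subset only, not the full dataset.
neighbourhoods = ["A. S. Rao Nagar", "A.C. Guards", "Abids"]
school_rows = [
    {"Neighbourhood": "A.C. Guards", "School": "Genesis high school"},
    {"Neighbourhood": "A.C. Guards", "School": "Govt High School, Vijaya Nagar Colony"},
    {"Neighbourhood": "Abids", "School": "Diamond Jubilee High School"},
]

counts = schools_per_neighbourhood(neighbourhoods, school_rows)
no_results = [name for name, count in counts.items() if count == 0]
print(counts)
print(no_results)
```

With the dataframes in this notebook, the equivalent inputs would be `n_df['Neighbourhood']` and the rows of `mydf1`.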

In [None]:
mydf1