# PUNE TREE CENSUS DATA 2019

Under Chapter IV, Section 7(b) of the Maharashtra (Urban Areas) Protection and Preservation of Trees Act, 1975, every civic body must carry out a census of the existing trees on all land within its jurisdiction once before December 1996 and once every five years thereafter.

Pune Municipal Corporation (PMC) planned a geo-enabled tree census survey, with every tree plotted on the city map. SAAR IT was onboarded to develop a survey dashboard that is cross-browser and cross-platform compatible. The dashboard gives the garden department a holistic view of the survey work being conducted within PMC jurisdiction, including the current progress of the project, the status of quality checks, the layout of different varieties and species, and the distribution of trees across wards.

The high-level scope includes:

 - Geo-tagging of all trees under PMC jurisdiction
 - Query search for trees by species, location, or advanced filters such as diameter, date planted, or tree characteristics
 - Tree photos
 - Real-time monitoring of survey progress
 - An optional integrated tree key to assist in identifying tree species

Questions to guide data analysis:
 - What is the ward-wise ranking of trees per sq. km?
 - How do canopy, height, girth, and species statistics vary across the city and between wards?
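
Both questions can be approached by grouping on the ward column. The following is a minimal sketch on a made-up sample that reuses the census column names (`ward`, `girth_cm`, `height_m`, `canopy_dia_m`). The `ward_area_sq_km` values are hypothetical, since ward areas are not part of this dataset and would have to come from another source.

```python
import pandas as pd

# Tiny made-up sample using the census column names.
sample = pd.DataFrame({
    'ward': [8, 8, 8, 29, 61, 61],
    'girth_cm': [115, 13, 25, 15, 10, 40],
    'height_m': [10, 2, 7, 2, 2, 6],
    'canopy_dia_m': [4, 1, 1, 2, 1, 3],
})

# Hypothetical ward areas in sq. km (NOT part of the census data).
ward_area_sq_km = pd.Series({8: 1.0, 29: 0.5, 61: 4.0})

# Per-ward summary statistics for the size measurements.
size_cols = ['girth_cm', 'height_m', 'canopy_dia_m']
ward_stats = sample.groupby('ward')[size_cols].agg(['mean', 'median'])

# Trees per sq. km, sorted from densest to sparsest ward, then ranked.
tree_count = sample.groupby('ward').size()
density = (tree_count / ward_area_sq_km).sort_values(ascending=False)
ward_rank = density.rank(ascending=False, method='min').astype(int)

print(ward_stats)
print(density)
print(ward_rank)
```

On the full data, `sample` would be replaced by `tree`, and the area lookup by real ward areas.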
 
According to the census, the civic body counted 40,09,623 trees in 2019, up from 38.60 lakh trees in the 2013 census.
Because the Maharashtra (Urban Areas) Protection and Preservation of Trees Act, 1975 requires every civic body to carry out a tree census once every five years, the PMC awarded the tree census tender to SAAR IT, a private information technology (IT) firm, in 2016 for a period of seven years.
The main objective of geo-tagging is to guard against falsified plantation claims.
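
The two counts are written in Indian digit grouping and in lakh, so a quick conversion makes them easier to compare. This is a small arithmetic sketch; the 2013 figure is only as precise as the rounded "38.60 lakh" quoted above.

```python
LAKH = 100_000  # 1 lakh = 100,000 in the Indian numbering system

count_2019 = 4_009_623            # written as 40,09,623 in Indian digit grouping
count_2013 = round(38.60 * LAKH)  # 38.60 lakh, as reported (a rounded figure)

increase = count_2019 - count_2013
pct_increase = increase / count_2013 * 100

print(f"Increase: {increase:,} trees ({pct_increase:.2f}%)")
```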

In [1]:
import numpy as np
import pandas as pd

In [2]:
# Explicit dtype for every column. The np.object, np.str and np.bool aliases
# were removed in NumPy 1.24, so the Python builtins are used instead.
types = {'FID': object, 'id': np.int64, 'geom': object, 'oid': np.float64, 'sr_no': np.float64, 'girth_cm': np.int64,
         'height_m': np.int64, 'canopy_dia_m': np.int64, 'condition': object, 'other_remarks': object, 'ownership': str,
         'society_name': str, 'road_name': str, 'northing': np.float64, 'easting': np.float64, 'balanced': bool,
         'remarks': object, 'special_collar': object, 'ward_name': np.int64, 'botanical_name': object, 'saar_uid': np.int64,
         'common_name': object, 'local_name': object, 'economic_i': object, 'phenology': object, 'flowering': object,
         'ward': np.int64, 'is_rare': object}
# Raw string so the backslashes in the Windows path are not read as escape sequences.
tree = pd.read_csv(r'E:\datasets\p1.zip', skipinitialspace=True, dtype=types, compression='infer')
tree.head()
tree.head()

Unnamed: 0,FID,id,geom,oid,sr_no,girth_cm,height_m,canopy_dia_m,condition,other_remarks,...,ward_name,botanical_name,saar_uid,common_name,local_name,economic_i,phenology,flowering,ward,is_rare
0,trees_display.fid--1672c0a_16c67517881_67e6,14038741,POINT (73.89543905900004 18.48682755200003),,,10,2,1,Healthy,,...,61,Tecoma stans (L.) Juss.ex Kunth,303912,Yellow Bells,Tecoma,Ornamental,Throughout year,Throughout year,61,False
1,trees_display.fid--1672c0a_16c67517881_67e7,14038742,POINT (73.81630115200007 18.557149113000033),,,115,10,4,Healthy,,...,8,Caryota urens Linn.,303913,Fish Tail Palm,Bherali Mad,Medicinal,Throughout year,Throughout year,8,False
2,trees_display.fid--1672c0a_16c67517881_67e8,14038743,POINT (73.79202280200006 18.505883990000022),,,15,2,2,Healthy,,...,29,Leucaena leucocephala (Lamk.)De.wit.,303914,Subabul,Subabul,Timber wood,Seasonal,July-January,29,False
3,trees_display.fid--1672c0a_16c67517881_67e9,14038744,POINT (73.81626896600004 18.557148160000054),,,13,2,1,Healthy,,...,8,Mangifera indica Linn.,303915,Mango,Amba,Fruit,Seasonal,January-March,8,False
4,trees_display.fid--1672c0a_16c67517881_67ea,14038745,POINT (73.81626192500005 18.55715006700007),,,25,7,1,Healthy,,...,8,Leucaena leucocephala (Lamk.)De.wit.,303916,Subabul,Subabul,Timber wood,Seasonal,July-January,8,False


In [3]:
tree.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000000 entries, 0 to 999999
Data columns (total 28 columns):
 #   Column          Non-Null Count    Dtype  
---  ------          --------------    -----  
 0   FID             1000000 non-null  object 
 1   id              1000000 non-null  int64  
 2   geom            1000000 non-null  object 
 3   oid             0 non-null        float64
 4   sr_no           0 non-null        float64
 5   girth_cm        1000000 non-null  int64  
 6   height_m        1000000 non-null  int64  
 7   canopy_dia_m    1000000 non-null  int64  
 8   condition       1000000 non-null  object 
 9   other_remarks   0 non-null        object 
 10  ownership       1000000 non-null  object 
 11  society_name    977186 non-null   object 
 12  road_name       996771 non-null   object 
 13  northing        1000000 non-null  float64
 14  easting         1000000 non-null  float64
 15  balanced        1000000 non-null  bool   
 16  remarks         572467 non-null   obj

In [4]:
# Percentage of missing values per column (mean of the boolean mask, so no hard-coded row count).
tree.isnull().mean() * 100

FID                 0.0000
id                  0.0000
geom                0.0000
oid               100.0000
sr_no             100.0000
girth_cm            0.0000
height_m            0.0000
canopy_dia_m        0.0000
condition           0.0000
other_remarks     100.0000
ownership           0.0000
society_name        2.2814
road_name           0.3229
northing            0.0000
easting             0.0000
balanced            0.0000
remarks            42.7533
special_collar     98.2052
ward_name           0.0000
botanical_name      1.3733
saar_uid            0.0000
common_name         0.0000
local_name          0.0000
economic_i          1.3733
phenology           1.9243
flowering           1.9402
ward                0.0000
is_rare             1.3733
dtype: float64
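
The percentages above show three columns (`oid`, `sr_no`, `other_remarks`) with no values at all, and `special_collar` almost empty. One possible follow-up, sketched here on a made-up frame, is to drop the fully empty columns and flag the sparse ones. The 50% threshold is an arbitrary assumption, not something the source prescribes.

```python
import numpy as np
import pandas as pd

# Made-up frame mimicking the missing-value pattern seen above.
sample = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'oid': [np.nan] * 4,                                # entirely empty
    'special_collar': [np.nan, np.nan, np.nan, 'Yes'],  # mostly empty
    'road_name': ['A', 'B', None, 'D'],                 # occasionally empty
})

null_pct = sample.isnull().mean() * 100

# Columns without a single value carry no information.
empty_cols = null_pct[null_pct == 100].index.tolist()

# Columns above the (assumed) 50% threshold deserve a closer look before use.
sparse_cols = null_pct[(null_pct > 50) & (null_pct < 100)].index.tolist()

cleaned = sample.drop(columns=empty_cols)
print(empty_cols, sparse_cols, list(cleaned.columns))
```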

In [5]:
tree.columns

Index(['FID', 'id', 'geom', 'oid', 'sr_no', 'girth_cm', 'height_m',
       'canopy_dia_m', 'condition', 'other_remarks', 'ownership',
       'society_name', 'road_name', 'northing', 'easting', 'balanced',
       'remarks', 'special_collar', 'ward_name', 'botanical_name', 'saar_uid',
       'common_name', 'local_name', 'economic_i', 'phenology', 'flowering',
       'ward', 'is_rare'],
      dtype='object')

In [6]:
tree.nunique()

FID               1000000
id                1000000
geom               998324
oid                     0
sr_no                   0
girth_cm              685
height_m               24
canopy_dia_m           98
condition               4
other_remarks           0
ownership              13
society_name        18503
road_name            3096
northing           344318
easting            385831
balanced                2
remarks                 3
special_collar          1
ward_name              64
botanical_name        373
saar_uid          1000000
common_name           376
local_name            376
economic_i             12
phenology               2
flowering              92
ward                   64
is_rare                 2
dtype: int64
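
`geom` has 998,324 unique values for 1,000,000 rows, so some trees share exactly the same point. A minimal sketch for listing such rows, using a made-up sample rather than the real data:

```python
import pandas as pd

# Made-up sample: rows 1 and 3 share the same point.
sample = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'geom': ['POINT (73.1 18.1)', 'POINT (73.2 18.2)',
             'POINT (73.1 18.1)', 'POINT (73.3 18.3)'],
})

# keep=False marks every member of a duplicated group, not just the repeats.
shared = sample[sample.duplicated(subset='geom', keep=False)]

# Number of rows beyond the first occurrence of each point.
n_extra = len(sample) - sample['geom'].nunique()

print(shared)
print(n_extra)
```

Whether shared coordinates indicate duplicate entries or simply trees recorded at the same spot cannot be decided from this check alone.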

In [7]:
tree.describe()

Unnamed: 0,id,oid,sr_no,girth_cm,height_m,canopy_dia_m,northing,easting,ward_name,saar_uid,ward
count,1000000.0,0.0,0.0,1000000.0,1000000.0,1000000.0,1000000.0,1000000.0,1000000.0,1000000.0,1000000.0
mean,14538740.0,,,44.093714,5.892253,3.242886,18.527854,73.859968,23.024293,503581.1,23.024293
std,288675.3,,,47.504227,3.297705,2.882861,0.024574,0.042494,15.459895,290592.5,15.459895
min,14038740.0,,,8.0,2.0,0.0,18.44693,73.764704,1.0,100.0,1.0
25%,14288740.0,,,13.0,3.0,1.0,18.511846,73.82151,11.0,251936.8,11.0
50%,14538740.0,,,28.0,5.0,2.0,18.533034,73.848525,20.0,503435.5,20.0
75%,14788740.0,,,56.0,8.0,4.0,18.545761,73.902741,33.0,755147.2,33.0
max,15038740.0,,,1400.0,25.0,98.0,18.576291,73.94636,76.0,1047217.0,76.0


In [8]:
tree[tree['ward_name'] != tree['ward']]

Unnamed: 0,FID,id,geom,oid,sr_no,girth_cm,height_m,canopy_dia_m,condition,other_remarks,...,ward_name,botanical_name,saar_uid,common_name,local_name,economic_i,phenology,flowering,ward,is_rare
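
The empty result means `ward_name` and `ward` never differ, so one of them is redundant. A short sketch of how the duplicate column could be confirmed and dropped, shown on a made-up sample (keeping `ward` rather than `ward_name` is an arbitrary choice):

```python
import pandas as pd

sample = pd.DataFrame({
    'ward_name': [61, 8, 29],
    'ward': [61, 8, 29],
    'girth_cm': [10, 115, 15],
})

# Series.equals compares values and dtype and treats NaNs in the same position as equal.
if sample['ward_name'].equals(sample['ward']):
    sample = sample.drop(columns='ward_name')

print(sample.columns.tolist())
```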


In [9]:
tree[['northing', 'easting']]

Unnamed: 0,northing,easting
0,18.486828,73.895439
1,18.557149,73.816301
2,18.505884,73.792023
3,18.557148,73.816269
4,18.557150,73.816262
...,...,...
999995,18.559890,73.925185
999996,18.566330,73.944088
999997,18.566343,73.944097
999998,18.559866,73.925478


In [10]:
tree['geom'].head()

0     POINT (73.89543905900004 18.48682755200003)
1    POINT (73.81630115200007 18.557149113000033)
2    POINT (73.79202280200006 18.505883990000022)
3    POINT (73.81626896600004 18.557148160000054)
4     POINT (73.81626192500005 18.55715006700007)
Name: geom, dtype: object

In [11]:
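# Each value looks like "POINT (lon lat)": take the text between the brackets
# and split it on whitespace into [lon, lat] (both still strings).
split_geom = tree['geom'].apply(lambda x: x[x.find('(') + 1 : x.find(')')].split())
split_geom.head()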

0     [73.89543905900004, 18.48682755200003]
1    [73.81630115200007, 18.557149113000033]
2    [73.79202280200006, 18.505883990000022]
3    [73.81626896600004, 18.557148160000054]
4     [73.81626192500005, 18.55715006700007]
Name: geom, dtype: object

If you are sure that the string consists only of text enclosed in brackets, with nothing before or after the brackets (for example "(text1)", "(text2)", …), you can simply slice off the first and last characters:

x = '(text)'
print(x[1:-1])
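
An alternative to the `apply` with a lambda is pandas' vectorized string methods. The sketch below uses `Series.str.extract` with named groups and converts the result to float in one step. The column names `lon` and `lat` are chosen here for illustration, and the pattern assumes every value has the plain `POINT (x y)` form seen above.

```python
import pandas as pd

geom = pd.Series([
    'POINT (73.89543905900004 18.48682755200003)',
    'POINT (73.81630115200007 18.557149113000033)',
])

# WKT points list x (longitude) first, then y (latitude).
pattern = r'POINT \((?P<lon>[-\d.]+)\s+(?P<lat>[-\d.]+)\)'
coords = geom.str.extract(pattern).astype(float)

print(coords)
print(coords.dtypes)
```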

In [12]:
tree.columns

Index(['FID', 'id', 'geom', 'oid', 'sr_no', 'girth_cm', 'height_m',
       'canopy_dia_m', 'condition', 'other_remarks', 'ownership',
       'society_name', 'road_name', 'northing', 'easting', 'balanced',
       'remarks', 'special_collar', 'ward_name', 'botanical_name', 'saar_uid',
       'common_name', 'local_name', 'economic_i', 'phenology', 'flowering',
       'ward', 'is_rare'],
      dtype='object')

In [13]:
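# split_geom holds [lon, lat] pairs, so index 1 is the latitude and index 0 the longitude.
# Note: the values are still strings at this point, not floats.
tree['geom_latitude'] = split_geom.apply(lambda mem: mem[1])
tree['geom_longitude'] = split_geom.apply(lambda mem: mem[0])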

In [14]:
len(tree.columns)

30

In [15]:
tree.columns

Index(['FID', 'id', 'geom', 'oid', 'sr_no', 'girth_cm', 'height_m',
       'canopy_dia_m', 'condition', 'other_remarks', 'ownership',
       'society_name', 'road_name', 'northing', 'easting', 'balanced',
       'remarks', 'special_collar', 'ward_name', 'botanical_name', 'saar_uid',
       'common_name', 'local_name', 'economic_i', 'phenology', 'flowering',
       'ward', 'is_rare', 'geom_latitude', 'geom_longitude'],
      dtype='object')

In [16]:
tree[tree['geom_longitude'] == tree['easting']]

Unnamed: 0,FID,id,geom,oid,sr_no,girth_cm,height_m,canopy_dia_m,condition,other_remarks,...,saar_uid,common_name,local_name,economic_i,phenology,flowering,ward,is_rare,geom_latitude,geom_longitude
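
The filter returns no rows, but that does not show the two columns disagree. `geom_longitude` holds strings (the output of `str.split`), while `easting` holds floats, and a string never compares equal to a float. The sketch below reproduces the effect on a made-up two-row frame and shows the comparison after converting to float.

```python
import pandas as pd

sample = pd.DataFrame({
    # Strings in an object column, as produced by apply + str.split().
    'geom_longitude': pd.Series(['73.5', '73.25'], dtype=object),
    'easting': [73.5, 73.25],
})

# String vs. float: never equal, so the filter matches nothing.
as_strings = sample[sample['geom_longitude'] == sample['easting']]

# After converting the strings to float, the same comparison can match.
sample['geom_longitude'] = sample['geom_longitude'].astype(float)
as_floats = sample[sample['geom_longitude'] == sample['easting']]

print(len(as_strings), len(as_floats))
```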


In [17]:
geo = tree[['geom_latitude', 'northing','geom_longitude', 'easting']]
geo.head(10)

Unnamed: 0,geom_latitude,northing,geom_longitude,easting
0,18.48682755200003,18.486828,73.89543905900004,73.895439
1,18.557149113000037,18.557149,73.81630115200007,73.816301
2,18.505883990000022,18.505884,73.79202280200006,73.792023
3,18.55714816000005,18.557148,73.81626896600004,73.816269
4,18.55715006700007,18.55715,73.81626192500005,73.816262
5,18.50596442800003,18.505964,73.79193596500006,73.791936
6,18.55714275600008,18.557143,73.81624482600007,73.816245
7,18.55714275600008,18.557143,73.81622739100004,73.816227
8,18.557138306000056,18.557138,73.81620124000008,73.816201
9,18.557209821000065,18.55721,73.81629779900004,73.816298
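
Side by side, the parsed `geom` coordinates and `northing`/`easting` agree to the six decimals displayed. A tolerance-based comparison is a more robust check than exact equality. In this sketch the `northing` values are copied from the rounded display above (the stored values may be more precise), and the tolerance of `1e-6` degrees is an assumption.

```python
import numpy as np
import pandas as pd

geo_sample = pd.DataFrame({
    'geom_latitude': ['18.48682755200003', '18.557149113000037'],  # strings, as parsed
    'northing': [18.486828, 18.557149],                            # as displayed (rounded)
})

lat = geo_sample['geom_latitude'].astype(float)

# Largest absolute disagreement between the two latitude sources, in degrees.
abs_diff = (lat - geo_sample['northing']).abs()

# Absolute tolerance only (rtol=0), so the threshold is a fixed fraction of a degree.
close = np.isclose(lat, geo_sample['northing'], rtol=0, atol=1e-6)

print(abs_diff.max())
print(close.all())
```

If every row passes such a check on the full data, `northing` and `easting` can be treated as the latitude and longitude already encoded in `geom`.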
