<a href="https://colab.research.google.com/github/wnyeo/ExploreSCIS_data/blob/main/ExploreSCIS_ans_notebook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<table class="table table-bordered">
    <tr>
        <th style="text-align:center; width:60%"><img src='https://drive.google.com/uc?export=view&id=1jQtfAEFJ8tsuikRWYQXizafHksXV2ka9' style="width: 600px; height: 100px; "></th>
    <th style="text-align:center;"><font size="3"> <br/>ExploreSCIS 2021 <br /> <br>EDA and Basic Geospatial Visualisation Workshop by SMU Anthill Society</font></th>
    </tr>
</table> 

In this notebook, we will use a dataset of hawker centres retrieved from data.gov.sg (https://data.gov.sg/dataset/dates-of-hawker-centres-closure) to carry out exploratory data analysis with Pandas and geospatial visualisation with Folium. By the end of the workshop, we will have generated a map like the one below:

<img src="https://drive.google.com/uc?export=view&id=1aZjERlpMM_S_sdKDa1ftFffkpAlRtRW-" 
     width="700" 
     height="400"/>

# Exploratory Data Analysis
Exploratory Data Analysis (EDA) is an approach to analysing datasets that summarises their main characteristics, usually early in an analytical process. Besides giving us a better understanding of the data, it helps us detect mistakes, check assumptions, and see relationships among the variables in the dataset.

In this workshop, we will be doing a simple form of EDA with the Pandas library.

# Introduction to Pandas
<center>
<img src='https://upload.wikimedia.org/wikipedia/commons/thumb/e/ed/Pandas_logo.svg/1200px-Pandas_logo.svg.png'
width = '350'
height = '130'/>
</center>
<br>
Pandas is a Python library mainly used for data analysis. It is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language. 

## 1. Importing the Pandas Library 

To use Pandas, we need to install and import the library. Pandas is already installed in Google Colaboratory, so we only need to import it. Importing pandas as `pd` is not required, but it is the convention (and it saves typing).

In [1]:
import pandas as pd

## 2. Reading the data into a Pandas DataFrame

### 2a. Reading in the data

We will load our data into a Pandas DataFrame, which is a 2-dimensional labelled data structure with rows and columns. It is a good visual representation of data because it looks like a table.
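As a rough illustration of what "labelled rows and columns" means, here is a minimal sketch that builds a tiny DataFrame by hand. The name `toy_df` and the values in it are made up for this example and are not part of the hawker dataset.

```python
import pandas as pd

# Hypothetical mini-table: three made-up centres and their food stall counts.
toy_df = pd.DataFrame({
    "name": ["Centre A", "Centre B", "Centre C"],
    "no_of_food_stalls": [32, 79, 134],
})

print(toy_df)                # displayed as a table
print(list(toy_df.columns))  # column labels
print(list(toy_df.index))    # row labels, numbered from 0 by default
```

Each dictionary key becomes a column label, and Pandas numbers the rows automatically when no index is given.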

To load our data, we will use the **`.read_csv()`** function, since the data is in CSV format.


In [2]:
url = 'https://raw.githubusercontent.com/wnyeo/ExploreSCIS_data/main/hawker_final.csv' # link to the cleaned dataset

hawker_df = pd.read_csv(url)  # df represents DataFrame

In [3]:
# show the data
hawker_df

Unnamed: 0,name,latitude_hc,longitude_hc,photourl,address_myenv,no_of_market_stalls,no_of_food_stalls,description_myenv,status,google_3d_view
0,Adam Road Food Centre,1.324132,103.814163,http://www.nea.gov.sg/images/default-source/Ha...,"2, Adam Road, Singapore 289876",0,32,"Built in 1974, Adam Food Centre comprises 32 c...",Existing,https://goo.gl/maps/ZMLJeP8STKpDP9so9
1,Aljunied Ave 2 Blk 117 (Blk 117 Aljunied Marke...,1.320710,103.887093,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 117, Aljunied Ave 2, Singapore 380117",82,79,Next to Geylang East Swimming Complex and othe...,Existing,https://goo.gl/maps/XVjxu6hgpcwt15JC8
2,Amoy Street Food Centre (Telok Ayer Food Centre),1.279129,103.846985,http://www.nea.gov.sg/images/default-source/Ha...,"National Development Building, Annex B, Telok ...",1,134,"Built in 1983, the two-storey food centre is l...",Existing,https://goo.gl/maps/RiX319zQXRFeHWPE6
3,Ang Mo Kio Ave 1 Blk 226D (Kebun Baru Market a...,1.366950,103.839188,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 226D, Ang Mo Kio Ave 1, Singapore 564226",101,10,"Located next to a food centre, the hawker cent...",Existing,https://goo.gl/maps/CAJGbiEU4o46xfbR8
4,Ang Mo Kio Ave 1 Blk 341 (Teck Ghee Court),1.364170,103.848320,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 341, Ang Mo Kio Ave 1, Singapore 560341",86,32,Located in the heart of Teck Ghee estate and o...,Existing,https://goo.gl/maps/quL2H4mojGZfZsHA6
...,...,...,...,...,...,...,...,...,...,...
109,Whampoa Drive Blk 90 (Whampoa Drive Makan Plac...,1.323140,103.855041,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 90, Whampoa Drive, Singapore 320090",12,80,"Known also as Whampoa Drive Makan Place, this ...",Existing,https://goo.gl/maps/PQXB1ifE7PrnMbAf7
110,Whampoa Drive Blk 91/92 (Whampoa Drive Makan P...,1.323490,103.854134,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 91/92, Whampoa Drive, Singapore 320091/320092",112,52,"Known also as Whampoa Drive Makan Place, this ...",Existing,https://goo.gl/maps/vERiFeXoGi8jDfjJ8
111,Yishun Park Hawker Centre,1.424911,103.844992,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 51, Yishun Avenue 11, Singapore 768867",0,45,"Built in 2017, the hawker centre at Yishun Par...",Existing (new),https://goo.gl/maps/jqWXYVF17BiAz7bP9
112,Yishun Ring Road Blk 104/105 (Chong Pang Marke...,1.431658,103.828072,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 104/105, Yishun Ring Road, Singapore 76010...",123,56,"Built in 1984, the hawker centre comprises 56 ...",Existing,https://goo.gl/maps/dPjV2SSaX9nEXV1u8


### 2b. `head()` Function

To show only the first few rows of data, we can use the **`.head()`** function. By default, `df.head()` returns the first 5 rows of the DataFrame. To see more or fewer rows, pass the number of rows as an integer.

In [4]:
hawker_df.head(3)

Unnamed: 0,name,latitude_hc,longitude_hc,photourl,address_myenv,no_of_market_stalls,no_of_food_stalls,description_myenv,status,google_3d_view
0,Adam Road Food Centre,1.324132,103.814163,http://www.nea.gov.sg/images/default-source/Ha...,"2, Adam Road, Singapore 289876",0,32,"Built in 1974, Adam Food Centre comprises 32 c...",Existing,https://goo.gl/maps/ZMLJeP8STKpDP9so9
1,Aljunied Ave 2 Blk 117 (Blk 117 Aljunied Marke...,1.32071,103.887093,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 117, Aljunied Ave 2, Singapore 380117",82,79,Next to Geylang East Swimming Complex and othe...,Existing,https://goo.gl/maps/XVjxu6hgpcwt15JC8
2,Amoy Street Food Centre (Telok Ayer Food Centre),1.279129,103.846985,http://www.nea.gov.sg/images/default-source/Ha...,"National Development Building, Annex B, Telok ...",1,134,"Built in 1983, the two-storey food centre is l...",Existing,https://goo.gl/maps/RiX319zQXRFeHWPE6


### 2c. `tail()` Function [Mini Exercise! - 2 mins]

To show the last few rows of data, we can use the **`.tail()`** function. Similarly, `df.tail()` returns the last 5 rows of the DataFrame by default. To see more or fewer rows, pass the number of rows as an integer.
<br>
<br>**Exercise: Show the last 7 rows of hawker_df** 

In [5]:
# Enter your code here
hawker_df.tail(7)

Unnamed: 0,name,latitude_hc,longitude_hc,photourl,address_myenv,no_of_market_stalls,no_of_food_stalls,description_myenv,status,google_3d_view
107,West Coast Drive Blk 502 (Ayer Rajah Market),1.31196,103.759102,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 502, West Coast Drive, Singapore 120502",40,0,"Also known as Ayer Rajah Market, the standalon...",Existing,https://goo.gl/maps/dWso9emez66FrUyJ8
108,West Coast Drive Blk 503 (Ayer Rajah Food Centre),1.31187,103.759804,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 503, West Coast Drive, Singapore 120503",0,80,"Also known as Ayer Rajah Food Centre, the stan...",Existing,https://goo.gl/maps/HzZpoN9MxHR4ie9v6
109,Whampoa Drive Blk 90 (Whampoa Drive Makan Plac...,1.32314,103.855041,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 90, Whampoa Drive, Singapore 320090",12,80,"Known also as Whampoa Drive Makan Place, this ...",Existing,https://goo.gl/maps/PQXB1ifE7PrnMbAf7
110,Whampoa Drive Blk 91/92 (Whampoa Drive Makan P...,1.32349,103.854134,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 91/92, Whampoa Drive, Singapore 320091/320092",112,52,"Known also as Whampoa Drive Makan Place, this ...",Existing,https://goo.gl/maps/vERiFeXoGi8jDfjJ8
111,Yishun Park Hawker Centre,1.424911,103.844992,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 51, Yishun Avenue 11, Singapore 768867",0,45,"Built in 2017, the hawker centre at Yishun Par...",Existing (new),https://goo.gl/maps/jqWXYVF17BiAz7bP9
112,Yishun Ring Road Blk 104/105 (Chong Pang Marke...,1.431658,103.828072,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 104/105, Yishun Ring Road, Singapore 76010...",123,56,"Built in 1984, the hawker centre comprises 56 ...",Existing,https://goo.gl/maps/dPjV2SSaX9nEXV1u8
113,Zion Riverside Food Centre,1.29234,103.831184,http://www.nea.gov.sg/images/default-source/Ha...,"70, Zion Road, Singapore 247792",0,32,"Located opposite Great World City, Zion Rivers...",Existing,https://goo.gl/maps/8kwxWX9gy3J3Mna58


#### Solution
Click below for the solution.

In [6]:
hawker_df.tail(7)

Unnamed: 0,name,latitude_hc,longitude_hc,photourl,address_myenv,no_of_market_stalls,no_of_food_stalls,description_myenv,status,google_3d_view
107,West Coast Drive Blk 502 (Ayer Rajah Market),1.31196,103.759102,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 502, West Coast Drive, Singapore 120502",40,0,"Also known as Ayer Rajah Market, the standalon...",Existing,https://goo.gl/maps/dWso9emez66FrUyJ8
108,West Coast Drive Blk 503 (Ayer Rajah Food Centre),1.31187,103.759804,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 503, West Coast Drive, Singapore 120503",0,80,"Also known as Ayer Rajah Food Centre, the stan...",Existing,https://goo.gl/maps/HzZpoN9MxHR4ie9v6
109,Whampoa Drive Blk 90 (Whampoa Drive Makan Plac...,1.32314,103.855041,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 90, Whampoa Drive, Singapore 320090",12,80,"Known also as Whampoa Drive Makan Place, this ...",Existing,https://goo.gl/maps/PQXB1ifE7PrnMbAf7
110,Whampoa Drive Blk 91/92 (Whampoa Drive Makan P...,1.32349,103.854134,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 91/92, Whampoa Drive, Singapore 320091/320092",112,52,"Known also as Whampoa Drive Makan Place, this ...",Existing,https://goo.gl/maps/vERiFeXoGi8jDfjJ8
111,Yishun Park Hawker Centre,1.424911,103.844992,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 51, Yishun Avenue 11, Singapore 768867",0,45,"Built in 2017, the hawker centre at Yishun Par...",Existing (new),https://goo.gl/maps/jqWXYVF17BiAz7bP9
112,Yishun Ring Road Blk 104/105 (Chong Pang Marke...,1.431658,103.828072,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 104/105, Yishun Ring Road, Singapore 76010...",123,56,"Built in 1984, the hawker centre comprises 56 ...",Existing,https://goo.gl/maps/dPjV2SSaX9nEXV1u8
113,Zion Riverside Food Centre,1.29234,103.831184,http://www.nea.gov.sg/images/default-source/Ha...,"70, Zion Road, Singapore 247792",0,32,"Located opposite Great World City, Zion Rivers...",Existing,https://goo.gl/maps/8kwxWX9gy3J3Mna58


### 2d. Checking the size of data

We can check the size of the data using the **`.shape`** attribute.

In [7]:
hawker_df.shape

(114, 10)

We can see that there are 114 rows and 10 columns in our dataset.
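Because `.shape` is a plain tuple of (rows, columns), it can be unpacked into two variables. A minimal sketch on a made-up DataFrame (`toy_df` is a placeholder, not the hawker data):

```python
import pandas as pd

# Made-up table with 3 rows and 2 columns.
toy_df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

n_rows, n_cols = toy_df.shape  # unpack the (rows, columns) tuple
print(f"{n_rows} rows, {n_cols} columns")
print(len(toy_df))             # len() on a DataFrame gives the row count
```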

## 3. Accessing data in the DataFrame

### 3a. Accessing one or more columns in the Pandas DataFrame

We can access columns in the DataFrame directly by column name, using square brackets. Passing a single name returns a Series; passing a list of names returns a DataFrame.

In [8]:
hawker_df['name']

0                                  Adam Road Food Centre
1      Aljunied Ave 2 Blk 117 (Blk 117 Aljunied Marke...
2       Amoy Street Food Centre (Telok Ayer Food Centre)
3      Ang Mo Kio Ave 1 Blk 226D (Kebun Baru Market a...
4             Ang Mo Kio Ave 1 Blk 341 (Teck Ghee Court)
                             ...                        
109    Whampoa Drive Blk 90 (Whampoa Drive Makan Plac...
110    Whampoa Drive Blk 91/92 (Whampoa Drive Makan P...
111                            Yishun Park Hawker Centre
112    Yishun Ring Road Blk 104/105 (Chong Pang Marke...
113                           Zion Riverside Food Centre
Name: name, Length: 114, dtype: object

In [9]:
hawker_df[['name', 'address_myenv']] #access multiple columns

Unnamed: 0,name,address_myenv
0,Adam Road Food Centre,"2, Adam Road, Singapore 289876"
1,Aljunied Ave 2 Blk 117 (Blk 117 Aljunied Marke...,"Blk 117, Aljunied Ave 2, Singapore 380117"
2,Amoy Street Food Centre (Telok Ayer Food Centre),"National Development Building, Annex B, Telok ..."
3,Ang Mo Kio Ave 1 Blk 226D (Kebun Baru Market a...,"Blk 226D, Ang Mo Kio Ave 1, Singapore 564226"
4,Ang Mo Kio Ave 1 Blk 341 (Teck Ghee Court),"Blk 341, Ang Mo Kio Ave 1, Singapore 560341"
...,...,...
109,Whampoa Drive Blk 90 (Whampoa Drive Makan Plac...,"Blk 90, Whampoa Drive, Singapore 320090"
110,Whampoa Drive Blk 91/92 (Whampoa Drive Makan P...,"Blk 91/92, Whampoa Drive, Singapore 320091/320092"
111,Yishun Park Hawker Centre,"Blk 51, Yishun Avenue 11, Singapore 768867"
112,Yishun Ring Road Blk 104/105 (Chong Pang Marke...,"Blk 104/105, Yishun Ring Road, Singapore 76010..."


### 3b. Accessing rows in the Pandas DataFrame
To access rows in the Pandas DataFrame, we slice by position, again using square brackets. Counting in Python starts from 0, so the first row has an index of 0. Note that the second number in the slice is exclusive.

<center>
<img src='https://cdn.programiz.com/sites/tutorial2program/files/python-list-index.png'
width = '500'
height = '220'/>
</center>
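The same start-inclusive, end-exclusive rule applies to plain Python lists. A minimal sketch with a made-up list (`letters` is just an illustrative name):

```python
# A made-up list used only to illustrate slicing.
letters = ["a", "b", "c", "d", "e"]

first_two = letters[0:2]   # items at index 0 and 1; index 2 is excluded
middle = letters[1:4]      # items at index 1, 2 and 3

print(first_two)  # ['a', 'b']
print(middle)     # ['b', 'c', 'd']
```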

In [10]:
hawker_df[0:2] # the second index is exclusive, in this case, the first 2 rows are accessed

Unnamed: 0,name,latitude_hc,longitude_hc,photourl,address_myenv,no_of_market_stalls,no_of_food_stalls,description_myenv,status,google_3d_view
0,Adam Road Food Centre,1.324132,103.814163,http://www.nea.gov.sg/images/default-source/Ha...,"2, Adam Road, Singapore 289876",0,32,"Built in 1974, Adam Food Centre comprises 32 c...",Existing,https://goo.gl/maps/ZMLJeP8STKpDP9so9
1,Aljunied Ave 2 Blk 117 (Blk 117 Aljunied Marke...,1.32071,103.887093,http://www.nea.gov.sg/images/default-source/Ha...,"Blk 117, Aljunied Ave 2, Singapore 380117",82,79,Next to Geylang East Swimming Complex and othe...,Existing,https://goo.gl/maps/XVjxu6hgpcwt15JC8


Note that to access individual rows in a Pandas DataFrame, we can use `.loc` or `.iloc`, which are not covered in this workshop. Feel free to explore the extensive [indexing and selection](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html) methods that Pandas has. 
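For the curious, here is a brief sketch of the difference between the two: `.loc` selects by label, while `.iloc` selects by integer position. The DataFrame, its letter index and its values are invented for this example.

```python
import pandas as pd

# Made-up table with letter labels so labels and positions are easy to tell apart.
toy_df = pd.DataFrame(
    {"name": ["Centre A", "Centre B", "Centre C"],
     "no_of_food_stalls": [32, 79, 134]},
    index=["a", "b", "c"],
)

row_by_label = toy_df.loc["b"]                        # row whose label is "b"
row_by_position = toy_df.iloc[1]                      # second row (position 1)
single_value = toy_df.loc["c", "no_of_food_stalls"]   # one cell: row label, column name

label_slice = toy_df.loc["a":"b"]   # label slices INCLUDE the end label
position_slice = toy_df.iloc[0:2]   # position slices EXCLUDE the end position

print(row_by_label["name"], row_by_position["name"], single_value)
print(len(label_slice), len(position_slice))
```

Note the one asymmetry: slicing with `.loc` includes the end label, whereas slicing with `.iloc` follows the usual end-exclusive rule.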

### 3c. Accessing values in the Pandas DataFrame
We can access a single value in the Pandas DataFrame via the column name and the row index, using square brackets.

In [11]:
hawker_df['name'][0]

'Adam Road Food Centre'

For a range of values, we can make use of slicing. Note that in slicing, the second number in the indexing is also exclusive.

In [12]:
hawker_df['name'][0:4] #in this case, items at index 0, 1, 2, 3 are included

0                                Adam Road Food Centre
1    Aljunied Ave 2 Blk 117 (Blk 117 Aljunied Marke...
2     Amoy Street Food Centre (Telok Ayer Food Centre)
3    Ang Mo Kio Ave 1 Blk 226D (Kebun Baru Market a...
Name: name, dtype: object

In [13]:
#Another example
hawker_df[['name', 'status']][0:4]

Unnamed: 0,name,status
0,Adam Road Food Centre,Existing
1,Aljunied Ave 2 Blk 117 (Blk 117 Aljunied Marke...,Existing
2,Amoy Street Food Centre (Telok Ayer Food Centre),Existing
3,Ang Mo Kio Ave 1 Blk 226D (Kebun Baru Market a...,Existing


### [Mini Exercise] - 3 mins
**Exercise: Show the number of market stalls and the number of food stalls for the first 3 hawker centres in the dataset**


In [14]:
# Enter your code here
hawker_df[['no_of_market_stalls', 'no_of_food_stalls']][0:3]

Unnamed: 0,no_of_market_stalls,no_of_food_stalls
0,0,32
1,82,79
2,1,134


#### Solution
Click below for the solution.

In [15]:
hawker_df[['no_of_market_stalls', 'no_of_food_stalls']][0:3]

Unnamed: 0,no_of_market_stalls,no_of_food_stalls
0,0,32
1,82,79
2,1,134


In [16]:
# Another answer
hawker_df[['no_of_market_stalls', 'no_of_food_stalls']].head(3)

Unnamed: 0,no_of_market_stalls,no_of_food_stalls
0,0,32
1,82,79
2,1,134


## 4. Understanding the descriptive statistics of the data
<center>
<img src='https://miro.medium.com/max/707/1*w2hGJO5gUD6se5yQ6Efsdw.png'
width = '350'
height = '200'/>
</center>

Descriptive statistics summarise the characteristics and distribution of the values in a dataset. They tell us about the central tendency and variability of the data. Examples include the mean, median, mode and standard deviation.
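To make these terms concrete, here is a small sketch that computes each statistic on a made-up Series of five stall counts. The numbers are invented for illustration and do not come from the hawker dataset.

```python
import pandas as pd

# Made-up stall counts for five hypothetical hawker centres.
stalls = pd.Series([10, 20, 20, 40, 60])

print(stalls.mean())     # central tendency: the arithmetic average
print(stalls.median())   # central tendency: the middle value when sorted
print(stalls.mode()[0])  # central tendency: the most frequent value
print(stalls.std())      # variability: the sample standard deviation
```

The mean is pulled upwards by the one large value (60), while the median is not, which is why it helps to look at both.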

### 4a. `mean()` function

In [17]:
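hawker_df.mean(numeric_only=True)  # average the numeric columns only; pandas 2.0+ raises a TypeError on text columns otherwise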

latitude_hc              1.325880
longitude_hc           103.844182
no_of_market_stalls     72.745614
no_of_food_stalls       56.105263
dtype: float64

### 4b. `min()` function

In [18]:
hawker_df.min()

name                                               Adam Road Food Centre
latitude_hc                                                      1.27266
longitude_hc                                                     103.697
photourl               http://www.nea.gov.sg/images/default-source/Ha...
address_myenv                          1 Tampines Walk, Singapore 528523
no_of_market_stalls                                                    0
no_of_food_stalls                                                      0
description_myenv      A popular supper destination for night owls fr...
status                                                          Existing
google_3d_view                     https://goo.gl/maps/1Prs8uB9sBv4AfZj7
dtype: object

### 4c. `max()` function

In [19]:
hawker_df.max()

name                                          Zion Riverside Food Centre
latitude_hc                                                      1.44399
longitude_hc                                                     103.988
photourl               http://www.nea.gov.sg/images/default-source/Ha...
address_myenv          National Development Building, Annex B, Telok ...
no_of_market_stalls                                                  477
no_of_food_stalls                                                    226
description_myenv      Within walking distance from the famous Chomp ...
status                                                Under Construction
google_3d_view         https://www.google.com/maps/@1.3072454,103.856...
dtype: object

### 4d. `median()` function

In [20]:
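hawker_df.median(numeric_only=True)  # median of the numeric columns only; pandas 2.0+ raises a TypeError on text columns otherwise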

latitude_hc              1.320323
longitude_hc           103.845428
no_of_market_stalls     69.500000
no_of_food_stalls       50.000000
dtype: float64

### 4e. `count()` function [Mini Exercise!] - 2 mins
We can find the number of non-missing values in each column using the **`.count()`** function.
<br>
<br>**Exercise: Find the count of values for each column in the dataframe** 

In [21]:
# Enter your code here
hawker_df.count()

name                   114
latitude_hc            114
longitude_hc           114
photourl               114
address_myenv          114
no_of_market_stalls    114
no_of_food_stalls      114
description_myenv      114
status                 114
google_3d_view         114
dtype: int64

#### Solution
Click below for the solution.

In [22]:
hawker_df.count()

name                   114
latitude_hc            114
longitude_hc           114
photourl               114
address_myenv          114
no_of_market_stalls    114
no_of_food_stalls      114
description_myenv      114
status                 114
google_3d_view         114
dtype: int64

### 4f. `describe()` function
Alternatively, we can get several descriptive statistics for the numeric columns at once using the **`.describe()`** function.

In [23]:
hawker_df.describe()

Unnamed: 0,latitude_hc,longitude_hc,no_of_market_stalls,no_of_food_stalls
count,114.0,114.0,114.0,114.0
mean,1.32588,103.844182,72.745614,56.105263
std,0.037271,0.05699,72.218549,32.928823
min,1.27266,103.697374,0.0,0.0
25%,1.302355,103.808046,0.0,36.0
50%,1.320323,103.845428,69.5,50.0
75%,1.344818,103.883722,107.25,72.0
max,1.44399,103.988182,477.0,226.0
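The `25%`, `50%` and `75%` rows are percentiles: the `50%` row is the median, and the others can be reproduced with `quantile()`. A minimal sketch on a made-up Series (the values are invented for illustration):

```python
import pandas as pd

# Made-up stall counts for five hypothetical hawker centres.
stalls = pd.Series([10, 20, 20, 40, 60])

summary = stalls.describe()  # count, mean, std, min, 25%, 50%, 75%, max
print(summary)

# The percentile rows match median() and quantile() on the same data.
print(summary["50%"] == stalls.median())
print(summary["25%"] == stalls.quantile(0.25))
print(summary["75%"] == stalls.quantile(0.75))
```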



Now that we have a basic idea of the dataset we are working with, we can start plotting the locations of the hawker centres on a map.

# Introduction to Folium
<center>
<img src='http://python-visualization.github.io/folium/_images/folium_logo.jpg'
width = '150'
height = '150'/>
</center>
<br>

Folium is a data visualization library in Python that was built primarily to help people visualize geospatial data. With Folium, one can create a map of any location in the world if its latitude and longitude values are known.

## 5. Importing the Folium library
Similarly, to use Folium, we need to install and import the library. Folium is already installed in Google Colaboratory, so we only need to import it. 

In [24]:
import folium

## 6. Initialising a map


### 6a. Map() function
We can generate a base map with the **`Map()`** function from the Folium library. 

In [25]:
map = folium.Map(location=[1.3437459, 103.8240449], # coordinates (latitude and longitude) of the map, these are the coordinates for Singapore
                 zoom_start=12,                     # initial zoom level for the map
                 control_scale=True)                # whether to add a control scale on the map.

### 6b. Displaying the map
Display the map using the **`display()`** function.

In [26]:
display(map)

## 7. Adding the locations onto the map

### 7a. Retrieving the relevant columns for the geospatial visualisation

In [27]:
hc_locations = hawker_df[["latitude_hc", "longitude_hc", 'name']]

In [28]:
hc_locations

Unnamed: 0,latitude_hc,longitude_hc,name
0,1.324132,103.814163,Adam Road Food Centre
1,1.320710,103.887093,Aljunied Ave 2 Blk 117 (Blk 117 Aljunied Marke...
2,1.279129,103.846985,Amoy Street Food Centre (Telok Ayer Food Centre)
3,1.366950,103.839188,Ang Mo Kio Ave 1 Blk 226D (Kebun Baru Market a...
4,1.364170,103.848320,Ang Mo Kio Ave 1 Blk 341 (Teck Ghee Court)
...,...,...,...
109,1.323140,103.855041,Whampoa Drive Blk 90 (Whampoa Drive Makan Plac...
110,1.323490,103.854134,Whampoa Drive Blk 91/92 (Whampoa Drive Makan P...
111,1.424911,103.844992,Yishun Park Hawker Centre
112,1.431658,103.828072,Yishun Ring Road Blk 104/105 (Chong Pang Marke...


### 7b. Adding a location onto the base map

Using **`Marker()`**, we can add a location to the base map we generated by providing its coordinates. The `popup` parameter takes the text to show when the marker is clicked.

In [29]:
folium.Marker([1.324132, 103.814163], # latitude and longitude
              popup='Adam Road Food Centre' # name of hawker centre, what the popup will display when the marker is clicked
              ).add_to(map) # add it to base map that was initialised

<folium.map.Marker at 0x7faf9adf59d0>

In [30]:
display(map)

### 7c. Adding multiple locations to the base map with the aid of a for loop
Adding every location to the map one by one is quite a hassle, but we can speed up the process with a for loop.

In [31]:
for index, location_info in hc_locations.iterrows(): # iterrows is used to iterate through the rows of the dataframe
    folium.Marker([location_info["latitude_hc"], location_info["longitude_hc"]], # latitude and longitude
                  popup=location_info["name"]  # name of hawker centre
                  ).add_to(map) # add it to the base map

For loops are used for iterating over iterable objects, such as the rows of the DataFrame.
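As a plain-Python illustration of the same pattern, the sketch below loops over a list of made-up records and unpacks each one into named variables, much like `index, location_info` above. The `locations` list and its values are invented for this example.

```python
# Made-up (name, latitude, longitude) records, purely for illustration.
locations = [
    ("Centre A", 1.30, 103.80),
    ("Centre B", 1.35, 103.85),
]

labels = []
# enumerate() supplies a running position alongside each record.
for position, (name, lat, lng) in enumerate(locations):
    labels.append(f"{position}: {name} at ({lat}, {lng})")

print(labels)
```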



**For a better idea of what `iterrows()` does:** <br> `DataFrame.iterrows` is a generator that yields each row's index together with the row itself (as a Series).

In [32]:
for index, location_info in hc_locations.iterrows(): 
  print(index)
  print('-----')
  print(location_info) # location_info is a Series
  print('-----')
  print(location_info['latitude_hc'])
  print(location_info['longitude_hc'])
  print(location_info['name'])
  break

0
-----
latitude_hc                   1.32413
longitude_hc                  103.814
name            Adam Road Food Centre
Name: 0, dtype: object
-----
1.324131966
103.8141632
Adam Road Food Centre


### 7d. Display the map [Mini Exercise!] - 2 mins
**Exercise: Display the map that has the locations added to it**

In [33]:
# Enter your code here
display(map)

#### Solution
Click below for the solution.

In [34]:
display(map)

# Extra Exercises!

## Plotting a heatmap

To understand the density of the hawker centres spatially, we can use the folium heat map to plot their locations. There are 2 main steps to creating a heatmap with the folium library:

1) Creating a folium base map of the location <br>
2) Adding the heatmap layer

### Importing the folium library and HeatMap plugin

In [35]:
import folium
from folium.plugins import HeatMap

### Preparing the data needed
To plot the heatmap, we need the latitudes and longitudes in a list. In Python, lists store collections of data in a single variable and are created with square brackets. In this case, we need a nested list (a list of lists), where each inner list holds one latitude and longitude pair. The list will look something like this:
<br> [ [1.324131966, 103.8141632],
 [1.320709944, 103.8870926],
 [1.279129028, 103.84698490000001],
 [1.366950035, 103.83918759999999] ]

In [36]:
lat_long = []
for index, location_info in hc_locations.iterrows():
  lat_long.append([location_info['latitude_hc'], location_info['longitude_hc']])
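The loop above makes each step explicit. As an alternative sketch, the same nested list can usually be built without a loop by selecting the two columns and converting them with `.values.tolist()`. The small DataFrame below is a made-up stand-in for `hc_locations`:

```python
import pandas as pd

# Made-up stand-in for hc_locations with two rows.
toy_locations = pd.DataFrame({
    "latitude_hc": [1.324132, 1.320710],
    "longitude_hc": [103.814163, 103.887093],
    "name": ["Centre A", "Centre B"],
})

# Select the two coordinate columns, then convert the rows to a list of lists.
lat_long_pairs = toy_locations[["latitude_hc", "longitude_hc"]].values.tolist()

print(lat_long_pairs)
```

Both approaches give one `[latitude, longitude]` pair per row; the column order in the selection determines the order inside each pair.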

### Creating the base map

In [37]:
map_folium = folium.Map([1.35255,103.82580], zoom_start=11, control_scale=True)

### Adding the heatmap layer
Now that we've created the base map, we will create the heatmap layer using **`HeatMap()`**

In [38]:
HeatMap(lat_long, # nested list of latitude and longitude
        radius = 8, # radius of the points
        gradient = {0.2:'blue', 0.3:'purple', 0.5:'orange', 1.0:'red'} # color gradient for the density of heatmaps
        ).add_to(map_folium) # add it to the base map

<folium.plugins.heat_map.HeatMap at 0x7faf9ac82110>

### Displaying the heatmap

In [39]:
display(map_folium)

## Geospatial visualisation with other datasets
Here is another dataset that you can try to visualise:
<br> https://raw.githubusercontent.com/wnyeo/ExploreSCIS_data/main/mrt_lrt_final.csv
<br> You can follow the steps outlined above to plot a map of the distribution of MRT/LRT stations in Singapore.

### Importing the Pandas Library

In [40]:
import pandas as pd

### Reading in the data into a Pandas DataFrame

In [41]:
mrt_df = pd.read_csv('https://raw.githubusercontent.com/wnyeo/ExploreSCIS_data/main/mrt_lrt_final.csv')
mrt_df

Unnamed: 0,station_name,type,lat,lng
0,Jurong East,MRT,1.333207,103.742308
1,Bukit Batok,MRT,1.349069,103.749596
2,Bukit Gombak,MRT,1.359043,103.751863
3,Choa Chu Kang,MRT,1.385417,103.744316
4,Yew Tee,MRT,1.397383,103.747523
...,...,...,...,...
152,Punggol Point,LRT,1.416932,103.906680
153,Samudera,LRT,1.415955,103.902185
154,Nibong,LRT,1.411865,103.900321
155,Sumang,LRT,1.408501,103.898605


### Some EDA

In [42]:
mrt_df.head(3)

Unnamed: 0,station_name,type,lat,lng
0,Jurong East,MRT,1.333207,103.742308
1,Bukit Batok,MRT,1.349069,103.749596
2,Bukit Gombak,MRT,1.359043,103.751863


### Retrieving the relevant data needed

In [43]:
mrt_loc = mrt_df[['station_name','lat', 'lng']]
mrt_loc.head(3)

Unnamed: 0,station_name,lat,lng
0,Jurong East,1.333207,103.742308
1,Bukit Batok,1.349069,103.749596
2,Bukit Gombak,1.359043,103.751863


### Importing the Folium library

In [44]:
import folium

### Creating the base map

In [45]:
map = folium.Map(location=[1.3437459, 103.8240449],
                 zoom_start=12,                     
                 control_scale=True)  

### Adding in the locations

In [46]:
for index, location_info in mrt_loc.iterrows():
    folium.Marker([location_info["lat"], location_info["lng"]], # latitude and longitude
                  popup=location_info["station_name"], # the text that appears when you click on the icon
                  icon=folium.Icon(color="green", icon='train', prefix='fa'),  # changing the color of icon to green, changing the icon on the marker to a train
                  tooltip='Click me!' # text that appears when you hover over the icon
                  ).add_to(map) # adding it to the base map

### Displaying the map

In [47]:
display(map)