
## PYTHON SOLUTIONS FINAL PROJECT - GEOCODING

By Nazif Umar

Geocoding is the process of converting a physical address or location, such as a street address or landmark, into geographic coordinates, typically latitude and longitude. This allows the location to be plotted on a map and used in various applications and analyses, such as mapping, navigation, and geospatial analysis.

## PROBLEM STATEMENT

We are given a list of mall addresses in Putrajaya, Cyberjaya, Kajang, and Bandar Bangi in Selangor, Malaysia.
How can we use geocoding with Geopy to obtain the latitude and longitude of each address?

## STEP 1 - GATHERING ADDRESSES

Gather a list of mall addresses for all of the locations mentioned above. This was done manually by searching for mall addresses online.

In [None]:
import csv
import pandas as pd

In [None]:
import os
data_pkg_path = 'data'
filename = 'mall.csv'
path = os.path.join(data_pkg_path, filename)

In [None]:
path

'data/mall.csv'
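The cells above build the path to `mall.csv` but do not show the file being written or read. The following is a minimal sketch of that round trip with the standard `csv` module. The two-column layout (`NAME OF MALL`, `Address`) is an assumption based on the DataFrame built later in this notebook, and `io.StringIO` stands in for a real file so the sketch runs without `data/mall.csv`.

```python
import csv
import io

# Two sample rows taken from the list of malls used in this notebook.
rows = [
    ["Alamanda Shopping Centre", "Jalan Alamanda, Presint 1, 62000 Putrajaya"],
    ["Bangi gateway", "Seksyen 15, 43650 Bandar Baru Bangi, Selangor"],
]


def write_malls(handle, rows):
    """Write a header line followed by one line per mall."""
    writer = csv.writer(handle)
    writer.writerow(["NAME OF MALL", "Address"])  # assumed column names
    writer.writerows(rows)


def read_malls(handle):
    """Return the malls as a list of dicts keyed by the header names."""
    return list(csv.DictReader(handle))


# io.StringIO replaces open(path, newline="") so no file is needed here.
buffer = io.StringIO()
write_malls(buffer, rows)
buffer.seek(0)
malls = read_malls(buffer)
print(malls[0]["Address"])
```

Because the addresses contain commas, the `csv` module quotes them on writing and restores them on reading, which splitting lines on `,` by hand would not do.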

## STEP 2 - INSTALL GEOPY

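Geopy provides the geocoding functionality used to convert the mall addresses to their respective latitude and longitude coordinates. If it is not already available in the environment, install it with `pip install geopy`, then import it.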

In [None]:
import geopy

In [None]:
dir(geopy)

['AlgoliaPlaces',
 'ArcGIS',
 'AzureMaps',
 'BANFrance',
 'Baidu',
 'BaiduV3',
 'Bing',
 'DataBC',
 'GeoNames',
 'GeocodeEarth',
 'Geocodio',
 'Geolake',
 'GoogleV3',
 'Here',
 'HereV7',
 'IGNFrance',
 'LiveAddress',
 'Location',
 'MapBox',
 'MapQuest',
 'MapTiler',
 'Nominatim',
 'OpenCage',
 'OpenMapQuest',
 'Pelias',
 'Photon',
 'PickPoint',
 'Point',
 'Timezone',
 'TomTom',
 'What3Words',
 'What3WordsV3',
 'Yandex',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '__version__',
 '__version_info__',
 'adapters',
 'compat',
 'exc',
 'format',
 'geocoders',
 'get_geocoder_for_service',
 'get_version',
 'location',
 'point',
 'timezone',
 'units',
 'util']

### List of Mall Names and Addresses

In [None]:
names = [['Alamanda Shopping Centre','Jalan Alamanda, Presint 1, 62000 Putrajaya'],
         ['IOI city mall Putrajaya','XP96+VJ,IOI Resort, 43000 Seri Kembangan, Selangor'],
         ['Suria mall Putrajaya', 'Jalan Tun Abdul Razak, Presint 3, 62100 Putrajaya'],
         ['Dõpulze shopping centre','Persiaran Multimedia, Cyberjaya, 63000 Cyberjaya, Selangor'],
         ['Gem in mall', 'Persiaran Sepang, Cyber 11, 63000 Cyberjaya, Selangor'],
         ['Shaftsbury square', 'Shaftsbury Square, Persiaran Multimedia, Cyber 6, 63000 Cyberjaya'],
         ['Malakat mall', 'Lingkaran Cyber Point Timur, Cyberjaya, 63000 Cyberjaya, Selangor'],
         ['Bangi gateway', 'Seksyen 15, 43650 Bandar Baru Bangi, Selangor'],
         ['De Centrum city', 'Jalan Ikram Uniten, 43000 Kajang, Selangor'],
         ['Evo Bangi', 'Jalan Medan 2D, Seksyen 9, 43650 Bandar Baru Bangi, Selangor'],
         ['Kip mall Bangi', 'Jalan Medan Bangi, Seksyen 6, 43650 Bandar Baru Bangi, Selangor'],
         ['Metro point Kajang', 'Jalan Jelok Indah 9, Taman Jelok Indah, 43000 Kajang, Selangor'],
         ['Plaza metro Kajang', 'Jalan Tun Abdul Aziz, Seksyen 1, 43000 Kajang, Selangor']

]
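One address in the list begins with `XP96+VJ`, which looks like a Plus Code (Open Location Code) copied from a map listing. Nominatim searches OpenStreetMap address data and, as far as I know, does not interpret Plus Codes, so a token like this may prevent a match. The sketch below removes such a leading token before geocoding. The helper name `strip_plus_code` and the regular expression are illustrative assumptions, not part of Geopy, and the pattern is only a rough approximation of the Plus Code format.

```python
import re

# Characters allowed in Open Location Code ("Plus Code") digits.
_OLC = "23456789CFGHJMPQRVWX"

# A leading token such as "XP96+VJ", optionally followed by a comma.
_PLUS_CODE = re.compile(
    rf"^\s*[{_OLC}]{{2,8}}\+[{_OLC}]{{2,3}}\s*,?\s*",
    re.IGNORECASE,
)


def strip_plus_code(address):
    """Return the address without a leading Plus Code token, if one is present."""
    return _PLUS_CODE.sub("", address, count=1)


cleaned = strip_plus_code("XP96+VJ,IOI Resort, 43000 Seri Kembangan, Selangor")
print(cleaned)
```

Addresses that do not start with a Plus Code pass through unchanged, so the helper could be applied to the whole address column before geocoding.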

## STEP 3 - PANDAS DATAFRAME

A pandas DataFrame keeps the mall names and addresses in labelled columns, which makes it straightforward to add the geocoding results as new columns and to filter rows later.

In [None]:
df = pd.DataFrame(names, columns = ['NAME OF MALL', 'Address'])
df

| | NAME OF MALL | Address |
|---|---|---|
| 0 | Alamanda Shopping Centre | Jalan Alamanda, Presint 1, 62000 Putrajaya |
| 1 | IOI city mall Putrajaya | XP96+VJ,IOI Resort, 43000 Seri Kembangan, Sela... |
| 2 | Suria mall Putrajaya | Jalan Tun Abdul Razak, Presint 3, 62100 Putrajaya |
| 3 | Dõpulze shopping centre | Persiaran Multimedia, Cyberjaya, 63000 Cyberja... |
| 4 | Gem in mall | Persiaran Sepang, Cyber 11, 63000 Cyberjaya, S... |
| 5 | Shaftsbury square | Shaftsbury Square, Persiaran Multimedia, Cyber... |
| 6 | Malakat mall | Lingkaran Cyber Point Timur, Cyberjaya, 63000 ... |
| 7 | Bangi gateway | Seksyen 15, 43650 Bandar Baru Bangi, Selangor |
| 8 | De Centrum city | Jalan Ikram Uniten, 43000 Kajang, Selangor |
| 9 | Evo Bangi | Jalan Medan 2D, Seksyen 9, 43650 Bandar Baru B... |


There is also an `info()` method that shows basic information about the dataframe, such as the number of rows and columns and the data type of each column.

In [None]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 13 entries, 0 to 12
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   NAME OF MALL  13 non-null     object
 1   Address       13 non-null     object
dtypes: object(2)
memory usage: 336.0+ bytes


In [None]:
from geopy.geocoders import Nominatim
from geopy.extra.rate_limiter import RateLimiter
# tip: https://towardsdatascience.com/geocode-with-python-161ec1e62b89
# Using Nominatim Geocoding service

locator = Nominatim(user_agent='myGeocoder')


A function to delay geocoding calls is important because it helps prevent overloading the geocoding service with too many requests at once, which could result in errors or being temporarily blocked from accessing the service.

In [None]:
# function to delay geocoding calls
geocode_fn = RateLimiter(locator.geocode, min_delay_seconds=2)
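
To make the idea of delaying calls concrete, here is a minimal pure-Python sketch of a wrapper that spaces out successive calls. It is not Geopy's implementation (Geopy's `RateLimiter` also retries failed calls), and the names `delayed` and `fake_geocode` are made up for illustration. The stand-in geocoder lets the sketch run offline.

```python
import time


def delayed(func, min_delay_seconds):
    """Wrap func so that successive calls start at least min_delay_seconds apart."""
    last_call = None

    def wrapper(*args, **kwargs):
        nonlocal last_call
        if last_call is not None:
            # Sleep only for the part of the delay that has not already elapsed.
            wait = min_delay_seconds - (time.monotonic() - last_call)
            if wait > 0:
                time.sleep(wait)
        last_call = time.monotonic()
        return func(*args, **kwargs)

    return wrapper


def fake_geocode(address):
    """Hypothetical stand-in for locator.geocode; makes no network request."""
    return address.upper()


slow_geocode = delayed(fake_geocode, min_delay_seconds=0.05)
start = time.monotonic()
results = [slow_geocode(a) for a in ["a", "b", "c"]]
elapsed = time.monotonic() - start
print(results, round(elapsed, 2))
```

The public Nominatim service asks for no more than one request per second, so the 2-second delay used in this notebook stays comfortably within that limit.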

## STEP 4 - USE GEOPY TO GEOCODE MALL ADDRESSES

In [None]:
# Geocoding
df['coordinates'] = df['Address'].apply(geocode_fn)
df

In [None]:
df.coordinates

0     (Jalan Alamanda, Presint 1, Putrajaya, 62000, ...
1                                                  None
2                                                  None
3     (Persiaran Multimedia, Cyber 12, Cyberjaya, Se...
4     (Persiaran Sepang, Cyber 9, Cyberjaya, Sepang,...
5     (Shaftsbury Square Shop & Retail, Persiaran Mu...
6     (Lingkaran Cyber Point Timur, Cyber 12, Cyberj...
7     (Seksyen 15, Bandar Baru Bangi, Majlis Perband...
8                                                  None
9                                                  None
10    (Jalan Medan Bangi, Seksyen 16, Bandar Baru Ba...
11                                                 None
12                                                 None
Name: coordinates, dtype: object
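
Several addresses come back as `None`. One possible way to recover some of them is to retry with progressively coarser queries, dropping the leading address component each time. The sketch below assumes this strategy; the helper names are hypothetical, and the dictionary lookup is a stand-in for `geocode_fn` so the example runs without a network call.

```python
def candidate_queries(address):
    """Yield the full address, then versions with leading components dropped."""
    parts = [p.strip() for p in address.split(",") if p.strip()]
    for start in range(len(parts)):
        yield ", ".join(parts[start:])


def geocode_with_fallback(address, geocode):
    """Return (query, result) for the first query that resolves, else (None, None)."""
    for query in candidate_queries(address):
        result = geocode(query)
        if result is not None:
            return query, result
    return None, None


# Stand-in geocoder (hypothetical): it only "knows" one coarse query.
known = {"43000 Kajang, Selangor": "match"}

query, result = geocode_with_fallback(
    "Jalan Ikram Uniten, 43000 Kajang, Selangor", known.get
)
print(query, result)
```

A coarser query usually resolves to a less precise point, such as the centre of a postcode area or town, so it is worth keeping the matched `query` next to the result to see how much of the address was actually used.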

In [None]:
df.coordinates[0]

Location(Jalan Alamanda, Presint 1, Putrajaya, 62000, Malaysia, (2.9375959, 101.7119678, 0.0))

In [None]:
df.coordinates[0].latitude

2.9375959

In [None]:
df.coordinates[0].longitude

101.7119678

Next, we add two columns, one for the latitude and one for the longitude. Rows that could not be geocoded keep a missing value in both.

In [None]:
# Extract latitude/longitude from each Location; keep None where geocoding failed
df['Latitude'] = df['coordinates'].apply(
    lambda x: x.latitude if x is not None else None)
df['Longitude'] = df['coordinates'].apply(
    lambda x: x.longitude if x is not None else None)
df

| | NAME OF MALL | Address | coordinates | Latitude | Longitude |
|---|---|---|---|---|---|
| 0 | Alamanda Shopping Centre | Jalan Alamanda, Presint 1, 62000 Putrajaya | (Jalan Alamanda, Presint 1, Putrajaya, 62000, ... | 2.937596 | 101.711968 |
| 1 | IOI city mall Putrajaya | XP96+VJ,IOI Resort, 43000 Seri Kembangan, Sela... | | | |
| 2 | Suria mall Putrajaya | Jalan Tun Abdul Razak, Presint 3, 62100 Putrajaya | | | |
| 3 | Dõpulze shopping centre | Persiaran Multimedia, Cyberjaya, 63000 Cyberja... | (Persiaran Multimedia, Cyber 12, Cyberjaya, Se... | 2.92222 | 101.645441 |
| 4 | Gem in mall | Persiaran Sepang, Cyber 11, 63000 Cyberjaya, S... | (Persiaran Sepang, Cyber 9, Cyberjaya, Sepang,... | 2.916354 | 101.634379 |
| 5 | Shaftsbury square | Shaftsbury Square, Persiaran Multimedia, Cyber... | (Shaftsbury Square Shop & Retail, Persiaran Mu... | 2.923262 | 101.660718 |
| 6 | Malakat mall | Lingkaran Cyber Point Timur, Cyberjaya, 63000 ... | (Lingkaran Cyber Point Timur, Cyber 12, Cyberj... | 2.921339 | 101.650019 |
| 7 | Bangi gateway | Seksyen 15, 43650 Bandar Baru Bangi, Selangor | (Seksyen 15, Bandar Baru Bangi, Majlis Perband... | 2.929863 | 101.768207 |
| 8 | De Centrum city | Jalan Ikram Uniten, 43000 Kajang, Selangor | | | |
| 9 | Evo Bangi | Jalan Medan 2D, Seksyen 9, 43650 Bandar Baru B... | | | |


In [None]:
df = df[df['Latitude'].notnull()].copy()
df

| | NAME OF MALL | Address | coordinates | Latitude | Longitude |
|---|---|---|---|---|---|
| 0 | Alamanda Shopping Centre | Jalan Alamanda, Presint 1, 62000 Putrajaya | (Jalan Alamanda, Presint 1, Putrajaya, 62000, ... | 2.937596 | 101.711968 |
| 3 | Dõpulze shopping centre | Persiaran Multimedia, Cyberjaya, 63000 Cyberja... | (Persiaran Multimedia, Cyber 12, Cyberjaya, Se... | 2.92222 | 101.645441 |
| 4 | Gem in mall | Persiaran Sepang, Cyber 11, 63000 Cyberjaya, S... | (Persiaran Sepang, Cyber 9, Cyberjaya, Sepang,... | 2.916354 | 101.634379 |
| 5 | Shaftsbury square | Shaftsbury Square, Persiaran Multimedia, Cyber... | (Shaftsbury Square Shop & Retail, Persiaran Mu... | 2.923262 | 101.660718 |
| 6 | Malakat mall | Lingkaran Cyber Point Timur, Cyberjaya, 63000 ... | (Lingkaran Cyber Point Timur, Cyber 12, Cyberj... | 2.921339 | 101.650019 |
| 7 | Bangi gateway | Seksyen 15, 43650 Bandar Baru Bangi, Selangor | (Seksyen 15, Bandar Baru Bangi, Majlis Perband... | 2.929863 | 101.768207 |
| 10 | Kip mall Bangi | Jalan Medan Bangi, Seksyen 6, 43650 Bandar Bar... | (Jalan Medan Bangi, Seksyen 16, Bandar Baru Ba... | 2.952677 | 101.756885 |
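
With coordinates available, simple geospatial analysis becomes possible. As one illustration, the sketch below estimates the great-circle distance between two of the geocoded malls with the haversine formula, using the latitude and longitude values from the table above. The function name `haversine_km` and the spherical-Earth radius are assumptions made for this example. Geopy also provides `geopy.distance.geodesic` for distances on an ellipsoid.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a common spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Coordinates as returned for rows 0 and 7 in the table above.
alamanda = (2.937596, 101.711968)
bangi_gateway = (2.929863, 101.768207)

distance = haversine_km(*alamanda, *bangi_gateway)
print(f"{distance:.1f} km")
```

This is a straight-line distance over the Earth's surface, not a driving distance.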


## STEP 5 - VISUALIZE THE DATA

Once the coordinates have been obtained, the mall locations can be plotted on a map. Here I use the Folium mapping library.

In [None]:
import os
import folium

In [None]:
# Mapping using folium
m = folium.Map()
m

In [None]:
from folium import Figure
fig = Figure(width=800, height=400)
m = folium.Map(location=[2.9796, 101.7386], zoom_start=15)
fig.add_child(m)


In [None]:
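my_map = folium.Map(
    location=[2.9796, 101.7386],
    # 'Stamen Toner' may no longer be bundled with recent folium releases;
    # 'CartoDB positron' is a built-in light basemap.
    tiles='CartoDB positron',
    zoom_start=11)

# Add one marker per geocoded mall, with the mall name as a popup
for _, row in df.iterrows():
    folium.Marker(
        location=[row['Latitude'], row['Longitude']],
        popup=row['NAME OF MALL']).add_to(my_map)

# Display Mall Location in MAP
my_map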
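
The map centre above is hard-coded and lies slightly north of every geocoded point. As an alternative, the centre and bounding box can be derived from the data. The sketch below does this in plain Python with the seven coordinate pairs obtained in Step 4. The helper name `center_and_bounds` is hypothetical.

```python
# (latitude, longitude) pairs for the seven malls that were geocoded successfully.
coords = [
    (2.937596, 101.711968),
    (2.92222, 101.645441),
    (2.916354, 101.634379),
    (2.923262, 101.660718),
    (2.921339, 101.650019),
    (2.929863, 101.768207),
    (2.952677, 101.756885),
]


def center_and_bounds(points):
    """Return the mean point and the [[south, west], [north, east]] bounding box."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    center = [sum(lats) / len(lats), sum(lons) / len(lons)]
    bounds = [[min(lats), min(lons)], [max(lats), max(lons)]]
    return center, bounds


center, bounds = center_and_bounds(coords)
print(center, bounds)
```

The resulting `center` could be passed as `location=` to `folium.Map`, and `my_map.fit_bounds(bounds)` would then zoom the map so that all markers are visible.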

