# Table of Contents

## [1. Libraries](#section_1)

## [2. Wrangling-Part1](#section_2)

* ### [2.1. Function to retrieve latitude and longitude based on business name](#section_3)
* ### [2.2. Function to retrieve correct address based on latitude and longitude](#section_4)

## [3. Wrangling-Part2](#section_5)

## [4. Saving the csv](#section_6)

## [5. References](#section_7)

### 1. Libraries used<a id='section_1'></a>

In [22]:
import os
import re
import warnings
from datetime import datetime

import googlemaps
import pandas as pd
from geopy.geocoders import Nominatim

warnings.filterwarnings("ignore")

### 2. Wrangling-Part1<a id='section_2'></a>
#### 2.1. Function to retrieve latitude and longitude based on business name<a id='section_3'></a>

In [2]:

def get_lat_lng(apiKey, address):
    """
    Returns the latitude and longitude of a location using the Google Maps Geocoding API.
    API: https://developers.google.com/maps/documentation/geocoding/start

    # INPUT -------------------------------------------------------------------
    apiKey                  [str]
    address                 [str]

    # RETURN ------------------------------------------------------------------
    lat                     [float]  (0 if the lookup fails)
    lng                     [float]  (0 if the lookup fails)
    """
    import requests
    url = 'https://maps.googleapis.com/maps/api/geocode/json'

    try:
        # Let requests URL-encode the address instead of replacing spaces by hand
        response = requests.get(url, params={'address': address, 'key': apiKey}, timeout=10)
        resp_json_payload = response.json()
        location = resp_json_payload['results'][0]['geometry']['location']
        lat = location['lat']
        lng = location['lng']
    except (requests.RequestException, ValueError, KeyError, IndexError):
        # Network failure, non-JSON response, or no geocoding result
        print('ERROR: {}'.format(address))
        lat = 0
        lng = 0

    return lat, lng
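
The Geocoding API response also carries a top-level `status` field (for example `OK` or `ZERO_RESULTS`), which can be checked before indexing into `results`. The following is a minimal sketch of that check, not the notebook's method: `extract_lat_lng` is a hypothetical helper, and the two payloads are hand-written samples that only mimic the shape of a real response.

```python
def extract_lat_lng(payload):
    """Return (lat, lng) from a Geocoding API JSON payload, or None if unusable."""
    # Any status other than "OK" means there is no result to read
    if payload.get("status") != "OK":
        return None
    results = payload.get("results") or []
    if not results:
        return None
    location = results[0]["geometry"]["location"]
    return location["lat"], location["lng"]


# Hand-written sample payloads (assumptions, not real API output)
sample_ok = {
    "status": "OK",
    "results": [{"geometry": {"location": {"lat": -37.8168643, "lng": 144.9661048}}}],
}
sample_empty = {"status": "ZERO_RESULTS", "results": []}

print(extract_lat_lng(sample_ok))
print(extract_lat_lng(sample_empty))
```

Returning `None` instead of `0, 0` keeps a failed lookup distinguishable from a real coordinate, at the cost of changing the filter used later in the notebook.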

#### 2.2. Function to retrieve correct address based on latitude and longitude <a id='section_4'></a>

In [3]:
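def find_address(address):
    # Read the key from the environment rather than hard-coding it in the notebook.
    # GOOGLE_MAPS_API_KEY is an assumed variable name; set it before running.
    apiKey = os.environ["GOOGLE_MAPS_API_KEY"]
    address = "{},VIC".format(address)
    lat, lng = get_lat_lng(apiKey, address)

    # Recent geopy releases require a user_agent; the value here is a placeholder
    geolocator = Nominatim(user_agent="vic_activities_wrangling", timeout=None)
    latlng = "{},{}".format(lat, lng)
    location = geolocator.reverse(latlng)
    # reverse() returns None when nothing is found at the coordinates
    postal_address = location.address if location else ""
    return pd.Series([lat, lng, postal_address])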
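
Because the function is later applied once per row, a business name that appears more than once triggers the same two web requests again. One possible way to avoid that is to memoise the lookup. The sketch below uses `functools.lru_cache` around `slow_lookup`, a hypothetical stand-in that performs no network calls and returns placeholder values; it illustrates the caching pattern only.

```python
from functools import lru_cache

calls = []  # records every uncached lookup


def slow_lookup(name):
    """Hypothetical stand-in for find_address; no network access."""
    calls.append(name)
    return (-37.8168643, 144.9661048, "placeholder address for " + name)


@lru_cache(maxsize=None)
def cached_lookup(name):
    # Return a tuple: cached values are shared, so they should be immutable
    return slow_lookup(name)


for name in ["Pool A", "Pool B", "Pool A"]:
    cached_lookup(name)

print(len(calls))  # number of lookups that actually ran
```

With the real function, the cached tuple could be wrapped in `pd.Series` at the call site so that the cache itself never holds mutable objects.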

In [27]:
# Testing find_address() on a single business name

lat,long,postal_address=find_address('The East West Overseas Aid Foundation Melbourne')
print(lat)
print(long)
print(postal_address)

-37.8168643
144.9661048
Ross House, 247-251, Flinders Lane, Melbourne City, City of Melbourne, Victoria, 3000, Australia


### 3. Wrangling-Part2<a id='section_5'></a>

In [5]:
df=pd.read_csv("final_clean_vic_activities.csv")

In [6]:
df_test=df.head()

#### Applying the functions to the previously joined data to retrieve the correct address using the Google Maps API

In [7]:
df[['lat','lng','address']]=df.apply(lambda x:find_address(x['Name']),axis=1)

ERROR: Dance of Life yoga Studio,VIC
ERROR: Jubilee Park Aquatic Centre,VIC
ERROR: Barwon Older Adults Recreation Network,VIC
ERROR: Charlton Swimming Pool,VIC
ERROR: A.B.C. Swim Centre,VIC
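
The failed names above are only printed, so they are not available for a later retry or manual check. A minimal sketch of collecting them is shown here, assuming the `0, 0` failure value that `get_lat_lng` returns. `geocode_all` and `fake_lookup` are hypothetical names, and `fake_lookup` replaces the real API call with a small dictionary.

```python
def geocode_all(names, lookup):
    """Apply lookup to each name; return (results, failed_names)."""
    results = {}
    failed = []
    for name in names:
        lat, lng = lookup(name)
        if lat == 0 and lng == 0:  # failure value used by get_lat_lng
            failed.append(name)
        else:
            results[name] = (lat, lng)
    return results, failed


def fake_lookup(name):
    """Hypothetical stand-in for the geocoding call; no network access."""
    known = {
        "The East West Overseas Aid Foundation Melbourne": (-37.8168643, 144.9661048),
    }
    return known.get(name, (0, 0))


names = ["The East West Overseas Aid Foundation Melbourne", "Charlton Swimming Pool"]
results, failed = geocode_all(names, fake_lookup)
print(failed)
```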


#### Extracting the postcode from the newly retrieved address

In [64]:
def find_postcode(text):
    regex=r'\d{4}'
    pc=re.findall(regex,text)
    if pc:
        value=int(pc[0])
    else:
        value=0
    return value
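
`find_postcode` returns the first run of four digits in the text, so an address whose street number has four or more digits would yield that number rather than the postcode. A possible stricter variant, sketched below under the assumption that the postcode is the last standalone four-digit number in the address (as in the sample output earlier), uses word boundaries and takes the final match. `find_postcode_strict` is a hypothetical name and the second address is made up for illustration.

```python
import re

# A four-digit number that is not part of a longer digit run
POSTCODE_RE = re.compile(r'\b\d{4}\b')


def find_postcode_strict(text):
    """Return the last standalone four-digit number in text, or 0 if none."""
    matches = POSTCODE_RE.findall(text)
    return int(matches[-1]) if matches else 0


real = ("Ross House, 247-251, Flinders Lane, Melbourne City, "
        "City of Melbourne, Victoria, 3000, Australia")
made_up = "1200 Example Street, Example Suburb, Victoria, 3056, Australia"

print(find_postcode_strict(real))
print(find_postcode_strict(made_up))
```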

In [85]:
# Note: clean_df is created in cell In [84] below, which was executed before this cell
clean_df['postcode']=clean_df['address'].apply(find_postcode)



#### Removing rows with incorrect or missing information

In [84]:
# Removing data with no latitude and longitude information

# .copy() makes clean_df an independent frame, so adding columns later does not touch df
clean_df = df[df['lat']!=0].copy()

In [87]:
# Keeping only rows whose address is in Australia

clean_df=clean_df[clean_df['address'].str.contains('Australia')]

In [91]:
# Removing rows with no postcode information

clean_df = clean_df[clean_df['postcode']!=0]

In [86]:
# Selecting final required columns

clean_df=clean_df[['postcode','address','lat', 'lng','Name','activity_1', 'activity_2', 'activity_3','activity_4']]

In [92]:
# Number of rows and columns in the cleaned data

clean_df.shape

(1416, 9)

### 4. Saving the file to csv <a id='section_6'></a>


In [93]:
clean_df.to_csv("final_vic_activities.csv",index=False)


### 5. References <a id='section_7'></a>


https://github.com/googlemaps/google-maps-services-python 

https://matthewkudija.com/blog/2018/11/19/google-maps-api/
