# Advanced Data Mining and Analysis in Python
## Using Google Distance Matrix API
### In this exercise, we write Python code that retrieves information about a list of destinations using the Google Distance Matrix and Geocoding APIs.

**For each destination, we retrieve the distance and travel time from Tel Aviv, along with the destination's longitude and latitude.**

**We will then store the data in a dataframe and print its contents. Finally, we will find the three cities that are furthest from Tel Aviv.**

In [4]:
import pandas as pd
import numpy as np
import requests
import json
import urllib
from urllib.error import HTTPError
import pprint

In [5]:
# Read the file and collect the city names into the 'destinations' list.
FILE_PATH = '/data/notebook_files/dests.txt'
with open(FILE_PATH, 'r') as file:  # the context manager closes the file when done
    destinations = [line.strip() for line in file if line.strip()]  # skip blank lines

### First, we'll write a function to get the distance and the duration between the destination and Tel Aviv.

In [6]:
api_key = 'YOUR_API_KEY'  # replace with your own API key
def get_distance_from_tlv(destination, api_key):
    """Return (distance text, formatted duration) from Tel Aviv to the destination."""
    origin = 'Tel Aviv'
    url = 'https://maps.googleapis.com/maps/api/distancematrix/json'
    # Passing params lets requests URL-encode names with spaces or non-ASCII characters.
    params = {'destinations': destination, 'origins': origin, 'units': 'metric', 'key': api_key}
    try:
        response = requests.get(url, params=params).json()  # parse the JSON body into a dict
        element = response['rows'][0]['elements'][0]
        distance = element['distance']['text']
        total_minutes = int(element['duration']['value'] / 60)  # 'value' is in seconds
        hours, minutes = divmod(total_minutes, 60)  # split into hours and remaining minutes
        duration = f"{hours} hours, {minutes} minutes"  # format the duration string
        return distance, duration
    except (requests.RequestException, ValueError, KeyError, IndexError) as err:
        # Return placeholders so callers that expect two values keep working.
        print(f'Something went wrong for {destination}: {err}')
        return None, None

**Let's see an example:**

In [7]:
get_distance_from_tlv('Barcelona', api_key)

('4,748 km', '50 hours, 22 minutes')
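
To make the parsing step concrete without calling the API, the sketch below applies the same logic to a hand-written dictionary shaped like a Distance Matrix response. The sample values and the helper name `parse_element` are assumptions for illustration, not real API output.

```python
# Hypothetical, hand-written payload that mimics the shape of a Distance Matrix response.
sample_response = {
    'rows': [{'elements': [{
        'distance': {'text': '4,748 km', 'value': 4748000},        # made-up sample values
        'duration': {'text': '2 days 2 hours', 'value': 181320},   # 'value' is in seconds
    }]}]
}

def parse_element(response):
    """Extract the distance text and an 'H hours, M minutes' string (illustrative helper)."""
    element = response['rows'][0]['elements'][0]
    total_minutes = element['duration']['value'] // 60  # seconds -> whole minutes
    hours, minutes = divmod(total_minutes, 60)           # minutes -> hours and remainder
    return element['distance']['text'], f"{hours} hours, {minutes} minutes"

print(parse_element(sample_response))
```

Keeping the parsing separate from the HTTP request, as here, makes it possible to check the formatting logic without spending API quota.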

### Now, we'll get the longitude and latitude of the destination.

In [8]:
def get_lat_lng(address_string, api_key):
    """Return (longitude, latitude) of the first geocoding result for the address."""
    url = 'https://maps.googleapis.com/maps/api/geocode/json'
    # Passing params lets requests URL-encode the address.
    response = requests.get(url, params={'address': address_string, 'key': api_key}).json()
    location = response['results'][0]['geometry']['location']
    return location['lng'], location['lat']

**Let's see an example:**

In [9]:
get_lat_lng('Barcelona',api_key)

(2.168568, 41.3873974)
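
Destination names can contain spaces or non-ASCII characters, which need to be encoded before they go into a query string. The following is a small sketch using the standard library's `urllib.parse.urlencode`; the helper name `build_geocode_url` and the placeholder key are assumptions made for this example.

```python
from urllib.parse import urlencode

GEOCODE_ENDPOINT = 'https://maps.googleapis.com/maps/api/geocode/json'

def build_geocode_url(address, api_key):
    """Build a geocoding request URL with an encoded query string (illustrative helper)."""
    query = urlencode({'address': address, 'key': api_key})  # spaces become '+'
    return f"{GEOCODE_ENDPOINT}?{query}"

print(build_geocode_url('Rio de Janeiro', 'YOUR_API_KEY'))
```

The same effect can be had by passing a `params` dictionary to `requests.get`, which encodes the query string in a similar way.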

### Now, we'll use our functions in order to create the desired DataFrame:

In [10]:
dataset = pd.DataFrame(columns=['Target','Distance_km','Duration','Longitude','Latitude'])
dataset.Target = destinations
distance_features = ['Distance_km','Duration']
loc_features = ['Longitude','Latitude']
dataset[distance_features] = dataset['Target'].apply(lambda x: pd.Series(get_distance_from_tlv(x,api_key)))
dataset[loc_features] = dataset['Target'].apply(lambda x: pd.Series(get_lat_lng(x,api_key)))

In [11]:
dataset

|   | Target | Distance_km | Duration | Longitude | Latitude |
|---|---|---|---|---|---|
| 0 | Istanbul | 1,815 km | 21 hours, 2 minutes | 28.978359 | 41.008238 |
| 1 | Amsterdam | 4,533 km | 48 hours, 3 minutes | 4.904139 | 52.367573 |
| 2 | Valletta | 3,793 km | 50 hours, 51 minutes | 14.5141 | 35.899237 |
| 3 | Basel | 4,093 km | 44 hours, 2 minutes | 7.588576 | 47.559599 |
| 4 | Doha | 2,164 km | 22 hours, 38 minutes | 51.53104 | 25.285447 |


### Finally, we'll retrieve the three furthest cities from Tel Aviv.

In [12]:
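# Sort on the numeric part of strings such as '1,815 km' (assumes km); a plain string sort is lexicographic.
dataset.sort_values(by='Distance_km', ascending=False, inplace=True, key=lambda s: s.str.replace(r'[^\d.]', '', regex=True).astype(float))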

In [13]:
furthest_cities = [city for city in dataset.head(3).Target]
furthest_cities = ', '.join(furthest_cities)
print(f'The three furthest cities from Tel Aviv are: {furthest_cities}')

The three furthest cities from Tel Aviv are: Amsterdam, Basel, Valletta
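
Because `Distance_km` stores text such as '4,533 km', the ranking depends on how those values are compared: as strings, '985 km' sorts above '4,533 km'. Below is a minimal pure-Python sketch of converting the text to a number before ranking. The helper `km_to_float` and the entry 'Nearby' with its distance are made up for illustration, and all values are assumed to be expressed in km.

```python
def km_to_float(distance_text):
    """Convert text like '4,533 km' to 4533.0 (assumes the unit is km)."""
    return float(distance_text.replace(',', '').replace('km', '').strip())

# The first three entries come from the table above; 'Nearby' is a made-up entry.
distances = {
    'Amsterdam': '4,533 km',
    'Basel': '4,093 km',
    'Valletta': '3,793 km',
    'Nearby': '985 km',
}

# Rank by raw text (character-by-character comparison) and by parsed number.
by_text = sorted(distances, key=distances.get, reverse=True)
by_number = sorted(distances, key=lambda city: km_to_float(distances[city]), reverse=True)

print(by_text[0])    # string comparison puts 'Nearby' first
print(by_number[0])  # numeric comparison puts 'Amsterdam' first
```

An alternative worth considering is to store the numeric `value` field that the Distance Matrix response returns alongside `text`, which avoids parsing altogether.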
