# Automating Google searches to extract information

Google can directly show us certain information when we make a search. It can show the weather, a location's address, the distance between two locations, and more.

In this notebook, we will retrieve the distances between a few locations by automating Google searches and extracting the information.

Just like before, let us first open a browser using Selenium:

In [1]:
from selenium import webdriver

# Creates an options object.
options = webdriver.ChromeOptions()

# Changes the user agent.
user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"
options.add_argument("--user-agent=%s" % user_agent)

# Optionally routes traffic through a proxy (uncomment and fill in to use).
# proxy = "<IP>:<PORT>"
# options.add_argument("--proxy-server=%s" % proxy)

# Removes certain fields that can be used to detect WebDriver.
options.add_argument("--disable-blink-features=AutomationControlled")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option("useAutomationExtension", False)

# Sets English as the only accepted language.
options.add_experimental_option("prefs", {"intl.accept_languages": "en;q=0.9"})

# Opens the browser in the incognito mode.
options.add_argument("--incognito")

# Makes the browser headless.
# options.add_argument("--headless")

# Opens the browser window with the provided options.
driver = webdriver.Chrome(options=options)

# Sets window size.
driver.set_window_size(1400, 900)

# Sets window position.
driver.set_window_position(500, 0)

# Sets timeout threshold.
driver.set_page_load_timeout(30)

Now, assume that we want to find the distances between İstanbul, Ankara, and İzmir. Let us create a location list:

In [2]:
locations = ["İstanbul", "Ankara", "İzmir"]

We can obtain the unique location pairs from this list using a nested loop. Alternatively, we can use the [itertools](https://docs.python.org/3/library/itertools.html#itertools.combinations) module:

In [3]:
from itertools import combinations

location_pairs = list(combinations(locations, 2))

location_pairs

[('İstanbul', 'Ankara'), ('İstanbul', 'İzmir'), ('Ankara', 'İzmir')]
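
For comparison, the nested-loop approach mentioned above can be sketched as follows. This is only an illustration and not part of the original notebook: the name `pairs_nested` is hypothetical, and the location list is redefined so that the snippet runs on its own.

```python
from itertools import combinations

locations = ["İstanbul", "Ankara", "İzmir"]

# Pairs each location with every location that comes after it,
# so each unordered pair appears exactly once.
pairs_nested = []
for i in range(len(locations)):
    for j in range(i + 1, len(locations)):
        pairs_nested.append((locations[i], locations[j]))

# Checks that the nested loop and combinations() agree.
print(pairs_nested == list(combinations(locations, 2)))
```

Both versions yield the pairs in the same order, so either can feed the search loop; `combinations` is simply shorter.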

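Next, we search Google for each pair, wait for the directions widget to load, and extract the distance of the first listed route:

In [4]: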
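import random
import time
from urllib.parse import quote_plus

from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

for pair in location_pairs:
    # URL-encodes the query, including non-ASCII characters such as "İ".
    query = quote_plus(f"{pair[0]} to {pair[1]} kilometers")

    driver.get(f"https://www.google.com/search?q={query}")

    # Waits up to 5 seconds for the directions widget. until() raises
    # TimeoutException rather than returning an empty list.
    wait = WebDriverWait(driver, 5)
    try:
        wait.until(EC.presence_of_all_elements_located((By.XPATH, '//div[@data-async-type="editableDirectionsSearch"]')))
        # find_elements_by_xpath was removed in Selenium 4.3; find_elements replaces it.
        # This class name is generated by Google and may change over time.
        routes = driver.find_elements(By.XPATH, '//span[@class="UdvAnf"]')
    except TimeoutException:
        routes = []

    if len(routes) > 0:
        # The distance is the parenthesized part of the first route's text.
        distance = routes[0].text.split("(")[1].replace(")", "").strip()

        # Assumes that a comma, if present, is a decimal separator.
        if distance.endswith(" km"):
            distance_processed = float(distance.replace(",", ".").replace(" km", ""))
        elif distance.endswith(" m"):
            distance_processed = float(distance.replace(",", ".").replace(" m", "")) / 1000
        else:
            distance_processed = None

        if distance_processed is not None:
            print(f"{pair[0]} to {pair[1]}: {distance_processed} km")

    # Pauses for a random duration between searches, even when no distance was found.
    sleep_duration = random.uniform(5, 7)
    time.sleep(sleep_duration)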

İstanbul to Ankara: 443.9 km
İstanbul to İzmir: 479.0 km
Ankara to İzmir: 587.9 km
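
The unit handling inside the loop can also be isolated into a small function, which makes it possible to check the parsing without opening a browser. The sketch below is hypothetical: `parse_distance_km` is not part of the notebook, and it assumes, as the loop does, that a comma in the distance text is a decimal separator.

```python
def parse_distance_km(distance):
    """Converts a distance string such as "443,9 km" or "850 m" to kilometers.

    Returns None when the unit is neither kilometers nor meters.
    Assumption: a comma is a decimal separator, as in the loop above.
    """
    text = distance.strip().replace(",", ".")
    if text.endswith(" km"):
        return float(text[:-len(" km")])
    if text.endswith(" m"):
        return float(text[:-len(" m")]) / 1000
    return None


print(parse_distance_km("443,9 km"))  # 443.9
print(parse_distance_km("850 m"))     # 0.85
print(parse_distance_km("275 mi"))    # None
```

Matching on the end of the string avoids treating other units that merely contain " m", such as miles, as meters.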
