# Route automation 

When I'm planning a trip, I like to know the distances between all the places I'll be visiting, so I can plan a route and get a general idea of how much it will cost, etc.

Picking out the distance between all the locations gets annoying fast, so I wrote this script.

Works with almost all places, even if you make spelling mistakes: as long as your spelling isn't TOO bad, Google will probably still display a result.
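
Place names usually contain spaces, commas, and periods, so the question has to be URL-encoded before it goes into the search URL. Here is a minimal sketch of that step using the standard library; `build_search_url` is a hypothetical helper name used only for illustration, and the question wording mirrors the one the script sends.

```python
from urllib.parse import urlencode

def build_search_url(start, destination):
    # Hypothetical helper: builds the same question the script asks Google
    question = f"distance from {start.lower()} to {destination.lower()} in km"
    # urlencode escapes spaces as '+' and commas as '%2C'
    return "https://www.google.com/search?" + urlencode({"hl": "en", "q": question})

print(build_search_url("New Jersey", "St. Agatha, Maine"))
# https://www.google.com/search?hl=en&q=distance+from+new+jersey+to+st.+agatha%2C+maine+in+km
```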




In [64]:
import requests 
import bs4 
import re
import pandas as pd

In [189]:
def google_search(start, destination):
    """
    Send a search request to Google and extract the answer from the quick answer box.

    The parsing relies on the format Google used for distances between
    locations as of when the code was written.

    Only works if Google shows a quick answer for the distance, so cities and
    areas will work, but nothing too complex, since the string formatting
    then becomes a bloody nightmare.
    """
    question = f"distance from {start.lower()} to {destination.lower()} in km"

    # Pass the query via params so spaces and commas get URL-encoded
    request_result = requests.get(
        "https://www.google.com/search",
        params={"hl": "en", "q": question},
    )
    soup = bs4.BeautifulSoup(request_result.text, "html.parser")

    # The quick answer box; if Google shows none, return empty strings
    answer_box = soup.find("div", class_="BNeawe")
    if answer_box is None:
        return ('', '', '')
    temp = answer_box.text

    def find_distance(temp_string):
        # Number (digits, commas, periods) directly before "km"
        return ''.join(re.findall(r'([,.\d]+)\s*(?:km)', temp_string))

    def find_time(temp_string):
        # Travel time sits on the third line, before the opening parenthesis
        split_string = temp_string.split('\n')
        if len(split_string) < 3:
            return ''
        return ''.join(re.findall(r'(.*?)\(', split_string[2]))

    def find_route(temp_string):
        # Everything after "via " up to the end of the string
        return ''.join(re.findall(r'(?<=via ).*$', temp_string))

    # Each helper returns a string (possibly empty), never None
    return (find_distance(temp), find_time(temp), find_route(temp))
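
To see what the three regular expressions pull out without hitting Google, here is a small offline sketch. The `sample` string is made up to match what the parsing expects (time, distance, and route on the third line); it is an assumption, not captured Google output, and the real layout may differ.

```python
import re

# Made-up stand-in for the quick answer box text (an assumption)
sample = "new jersey to lancaster, pa\nDriving\n2 hr 22 min (211.9 km) via I-76 W"

# Number directly before "km"
distance = ''.join(re.findall(r'([,.\d]+)\s*(?:km)', sample))

# Text on the third line before the opening parenthesis;
# .strip() is added here to drop the trailing space
third_line = sample.split('\n')[2]
travel_time = ''.join(re.findall(r'(.*?)\(', third_line)).strip()

# Everything after "via " up to the end of the string
route = ''.join(re.findall(r'(?<=via ).*$', sample))

print((distance, travel_time, route))
# ('211.9', '2 hr 22 min', 'I-76 W')
```

If Google changes the layout of the box, the third-line assumption is the first thing likely to break.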
    

    

In [182]:
df = pd.read_csv('initial_file.csv')

In [183]:
df

Unnamed: 0,From,To
0,NJ,"Lancester, PA"
1,NJ,Washington DC
2,NJ,"St. Agatha, Maine"
3,NJ,"Niagara Falls, NY"
4,NJ,"Ashville, NC"
5,NJ,
6,NJ,Florida
7,NJ,"Dallas, TX"
8,NJ,"Los Angels, CA"


In [184]:
#Process
df['From '] = df['From '].replace('NJ','new jersey')


df = df.dropna()
df = df.reset_index(drop=True)
df = df.astype("string")

In [185]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8 entries, 0 to 7
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   From    8 non-null      string
 1   To      8 non-null      string
dtypes: string(2)
memory usage: 256.0 bytes


In [186]:
# Query Google once per row (instead of three times) and unpack the tuples
results = [google_search(start, dest) for start, dest in zip(df['From '], df['To'])]
df['distance_km'], df['time'], df['via'] = zip(*results)

In [187]:
df

Unnamed: 0,From,To,distance_km,time,via
0,new jersey,"Lancester, PA",211.9,2 hr 22 min,I-76 W
1,new jersey,Washington DC,311.0,3 hr 22 min,NJ Tpke S and I-95 S
2,new jersey,"St. Agatha, Maine",1151.5,11 hr 31 min,I-95 N
3,new jersey,"Niagara Falls, NY",720.2,7 hr 23 min,I-476 N
4,new jersey,"Ashville, NC",1068.1,10 hr 17 min,I-81 S
5,new jersey,Florida,1791.4,16 hr 35 min,I-95 S
6,new jersey,"Dallas, TX",2448.1,22 hr 42 min,I-40 W
7,new jersey,"Los Angels, CA",4444.7,41 hr,I-40 W
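
The distance and time come back as strings, so they need converting before you can sort, sum, or estimate costs. A minimal sketch, assuming the strings look like the ones in the table above; `to_km` and `to_minutes` are hypothetical helper names, only `hr` and `min` units are handled, and an empty distance string would raise a `ValueError`.

```python
import re

def to_km(text):
    """Turn a distance string such as '4,444.7' into a float."""
    return float(text.replace(',', ''))

def to_minutes(text):
    """Turn '2 hr 22 min' or '41 hr' into total minutes."""
    hours = re.search(r'(\d+)\s*hr', text)
    minutes = re.search(r'(\d+)\s*min', text)
    total = 60 * int(hours.group(1)) if hours else 0
    if minutes:
        total += int(minutes.group(1))
    return total

print(to_km('211.9'), to_minutes('2 hr 22 min'))  # 211.9 142
print(to_km('4,444.7'), to_minutes('41 hr'))      # 4444.7 2460
```

On the DataFrame this could be applied with something like `df['distance_km'].map(to_km)` and `df['time'].map(to_minutes)`.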
