For each hotel listed in the "Manhattan_selected_to_SLR" workbook, we'll calculate its distance to every other hotel and store the results in a new workbook titled "Hotel Pairwise Distances".

## Imports...

In [1]:
# data manipulation package (provides "dataframe" data structure)
import pandas as pd

# linear algebra / matrix package (provides "ndarray" data structure)
import numpy as np

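# Vincenty distance formula
# (note: geopy 2.0 removed `vincenty`; on newer versions, `geodesic` from the same module is the replacement)
from geopy.distance import vincenty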

## Getting the Data

First, let's read in the "Hotel Names, Addresses, ID, and Coordinates" workbook I put together earlier.

In [2]:
# read in the data
hotel_data = pd.read_excel('../data/Hotel Names, Addresses, ID, and Coordinates.xlsx')

# take a look at the result
hotel_data

Unnamed: 0,Share ID,Name,Address,Coordinates
0,107972,1 Hotel Central Park,1414 Avenue Of The Americas,"(40.7647485, -73.9764725)"
1,5639782,11 Howard,11 Howard St,"(40.7191766, -73.99992759999999)"
2,120886,Ace Hotel New York,20 W 29th St,"(40.7457432, -73.9882603)"
3,2252008,Ameritania Hotel,230 W 54th St,"(40.7640045, -73.9829075)"
4,105038,Andaz Wall Street,75 Wall St,"(40.7051404, -74.0079954)"
5,119324,Archer Hotel New York,45 W 38th St,"(40.752027, -73.9847224)"
6,91126,Arlo Hudson Square,231 Hudson St,"(40.7244391, -74.00823749999999)"
7,112473,Ascend Collection Distrikt Hotel New York City,342 W 40th St,"(40.7566314, -73.9928546)"
8,7221375,Autograph Collection Carlton Hotel,88 Madison Ave,"(40.7444291, -73.9857968)"
9,8341862,Autograph Collection The Lexington New York City,511 Lexington Ave,"(40.7548069, -73.9731814)"


## Calculate Pairwise Distances

For each hotel, calculate the distance in miles to every other hotel. The distances are stored in a 178 x 178 matrix (one row and one column per hotel), which is then written to an Excel workbook.

In [3]:
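# one row and one column per hotel (178 hotels in this workbook)
n_hotels = len(hotel_data)
distances = np.zeros((n_hotels, n_hotels), dtype=np.float32)

# i and k are the dataframe's index labels; they double as matrix positions
# because hotel_data carries the default 0..n-1 index
for i, hotel1 in hotel_data.iterrows():
    for k, hotel2 in hotel_data.iterrows():
        distances[i, k] = vincenty(hotel1['Coordinates'], hotel2['Coordinates']).miles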

# cast the numpy matrix of distances into a pandas dataframe (rows and columns labelled with hotel names)
distances = pd.DataFrame(distances, index=hotel_data['Name'], columns=hotel_data['Name'])
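
The double loop above computes every pair twice, once in each direction, and also computes each hotel's distance to itself. Since distance is symmetric, half of that work can be skipped. Below is a minimal sketch of that idea, not the notebook's actual method. It rests on several assumptions: that the "Coordinates" column holds strings such as "(40.7647485, -73.9764725)", that a pure-Python haversine (great-circle) formula is an acceptable stand-in for Vincenty so the example runs without geopy, and that the mean Earth radius constant is accurate enough. The helper names `parse_coordinates`, `haversine_miles`, and `pairwise_miles` are hypothetical. Haversine treats the Earth as a sphere, so its results will differ slightly from Vincenty's ellipsoidal distances.

```python
import ast
import math

# Mean Earth radius in miles; an assumed constant for this sketch
EARTH_RADIUS_MILES = 3958.8


def parse_coordinates(text):
    """Turn a string such as "(40.7647485, -73.9764725)" into a (lat, lon) tuple of floats."""
    lat, lon = ast.literal_eval(text)
    return float(lat), float(lon)


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def pairwise_miles(points):
    """Build a symmetric distance matrix, computing each pair only once."""
    n = len(points)
    matrix = [[0.0] * n for _ in range(n)]  # the diagonal stays at zero
    for i in range(n):
        for k in range(i + 1, n):  # upper triangle only
            d = haversine_miles(points[i], points[k])
            matrix[i][k] = d
            matrix[k][i] = d  # mirror into the lower triangle
    return matrix


# Three coordinate strings copied from the table above
raw = [
    "(40.7647485, -73.9764725)",         # 1 Hotel Central Park
    "(40.7191766, -73.99992759999999)",  # 11 Howard
    "(40.7457432, -73.9882603)",         # Ace Hotel New York
]
matrix = pairwise_miles([parse_coordinates(s) for s in raw])
```

The resulting nested list can be passed to `pd.DataFrame` in the same way as the numpy array above. For 178 hotels this reduces the number of distance calls from 178 x 178 = 31,684 to 178 x 177 / 2 = 15,753.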

In [4]:
# write the distances dataframe to an Excel workbook
writer = pd.ExcelWriter('../data/Hotel Pairwise Distances.xlsx')
distances.to_excel(writer, sheet_name='Pairwise Distances')
writer.close()