For each hotel listed in the "Manhattan_selected_to_SLR" workbook, we'll calculate the distance from it to every other hotel and store the results in a new workbook entitled "Hotel Pairwise Distances".

## Imports

In [28]:
# data manipulation package (provides "dataframe" data structure)
import pandas as pd

# linear algebra / matrix package (provides "ndarray" data structure)
import numpy as np

# Vincenty distance formula
# (geopy 2.0 removed `vincenty`; on newer versions, `geodesic` is the replacement)
from geopy.distance import vincenty

## Getting the Data

First, let's read in the "Names, Addresses, ID, and Coordinates" workbook I put together earlier.

In [20]:
# read in the data
hotel_data = pd.read_excel('../data/Hotel Names, Addresses, ID, and Coordinates.xlsx')

# take a look at the result
hotel_data

Unnamed: 0,Name,Address,Share ID,Coordinates
0,Homewood Suites New York Midtown Manhattan Tim...,312 W 37th St New York NY 10018-4208,80307,"(40.7542902, -73.9930667)"
1,Hilton New York Fashion District,152 W 26th St New York NY 10001-6801,81620,"(40.7455199, -73.993757)"
2,Holiday Inn New York City Times Square,585 8th Ave New York NY 10018-3003,82671,"(40.755229, -73.99171199999999)"
3,Doubletree New York Times Square South,341 W 36th St New York NY 10018-6401,83645,"(40.7543497, -73.99411900000001)"
4,Hampton Inn ManhattanTimes Square South,337 W 39th St New York NY 10018-1401,83706,"(40.7562619, -73.9928494)"
5,The Ludlow Hotel,180 Ludlow St New York NY 10002-1514,84151,"(40.7218313, -73.98722149999999)"
6,Hotel Le Soleil,38 W 36th St New York NY 10018-8078,84346,"(40.7502658, -73.98548)"
7,W Hotel New York Downtown,8 Albany St New York NY 10006-1001,84511,"(40.7091948, -74.0135809)"
8,Hyatt Times Square,135 W 45th St New York NY 10036-4004,84541,"(40.7575491, -73.98418389999999)"
9,Hilton Garden Inn New York West 35th Street,63 W 35th St New York NY 10001-2202,85007,"(40.7504261, -73.9865346)"
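The "Coordinates" column is displayed as parenthesized pairs. If it comes back from Excel as text such as `"(40.7542902, -73.9930667)"` rather than as tuples (an assumption; the notebook does not show the column's dtype), a small helper along these lines could convert each value before any distance is computed. The name `parse_coordinates` is hypothetical and not part of the original notebook.

```python
from ast import literal_eval


def parse_coordinates(value):
    """Return a (latitude, longitude) tuple of floats.

    Accepts either a tuple/list or its string form, e.g. "(40.75, -73.99)".
    Hypothetical helper; not part of the original notebook.
    """
    if isinstance(value, str):
        # literal_eval safely evaluates the string as a Python literal
        value = literal_eval(value)
    lat, lon = value
    return float(lat), float(lon)


# first row of the table above
example = parse_coordinates("(40.7542902, -73.9930667)")
print(example)
```

In the notebook this could be applied once with something like `hotel_data['Coordinates'].apply(parse_coordinates)`, so that later cells always receive numeric pairs.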


## Calculate Pairwise Distances

For each hotel, we calculate the distance in miles to every other hotel. The distances are stored in a 178 x 178 matrix (one row and one column per hotel), which is then written to an Excel workbook.

In [47]:
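# size the matrix from the data rather than hard-coding 178
n_hotels = len(hotel_data)
distances = np.zeros((n_hotels, n_hotels), dtype=np.float32)

# assumes hotel_data has the default 0..n-1 index, so row labels double as matrix positions
for i, hotel1 in hotel_data.iterrows():
    for k, hotel2 in hotel_data.iterrows():
        distances[i, k] = vincenty(hotel1['Coordinates'], hotel2['Coordinates']).miles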

# cast the numpy matrix of distances into a pandas dataframe (rows and columns labeled with hotel names)
distances = pd.DataFrame(distances, index=hotel_data['Name'], columns=hotel_data['Name'])
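Because the distance from hotel A to hotel B equals the distance from B to A, and every hotel is zero miles from itself, only the pairs above the diagonal need to be computed. The sketch below illustrates that idea in plain Python. It swaps in the haversine (spherical Earth) formula so that it runs without geopy, which means its values will differ slightly from the Vincenty results. The names `haversine_miles` and `pairwise_miles` and the Earth radius constant are assumptions made for this illustration, not part of the original notebook.

```python
import math

# mean Earth radius in miles; an assumed constant for the spherical model
EARTH_RADIUS_MILES = 3958.8


def haversine_miles(p, q):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def pairwise_miles(points, distance=haversine_miles):
    """Build a symmetric distance matrix, computing each pair only once."""
    n = len(points)
    matrix = [[0.0] * n for _ in range(n)]  # diagonal stays at zero
    for i in range(n):
        for k in range(i + 1, n):  # upper triangle only
            d = distance(points[i], points[k])
            matrix[i][k] = d
            matrix[k][i] = d  # mirror into the lower triangle
    return matrix


# first three hotels from the table above
sample_points = [
    (40.7542902, -73.9930667),
    (40.7455199, -73.993757),
    (40.755229, -73.99171199999999),
]
sample_matrix = pairwise_miles(sample_points)
```

The `distance` parameter is there so that a different distance function, such as one wrapping geopy, could be passed in instead. This roughly halves the number of distance calls compared with visiting every ordered pair.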

In [48]:
# write the distances dataframe to an Excel workbook
writer = pd.ExcelWriter('../data/Hotel Pairwise Distances.xlsx')
distances.to_excel(writer, sheet_name='Pairwise Distances')
writer.close()