# Flight distances

We have latitude and longitude values for all 81 provinces, and we need to calculate the distances between them.

## Get Coordinates

Run the cell below first to load the _coordinates.csv_ file into the _coordinates_ data frame.

In [6]:
coordinates <- read.csv("coordinates.csv", row.names = 1)
head(coordinates)

Unnamed: 0,lat,lng
Adana,36.99142,35.33083
Adıyaman,37.76365,38.27726
Afyonkarahisar,38.75689,30.5387
Ağrı,39.71907,43.05059
Amasya,40.65646,35.83735
Ankara,39.93336,32.85974


## Calculate Distance

To calculate the distance between two points on Earth, we use the haversine formula below:

\begin{align}
a &= \sin^2(\Delta\varphi/2) + \cos\varphi_1 \cdot \cos\varphi_2 \cdot \sin^2(\Delta\lambda/2) \\
c &= 2 \cdot \operatorname{atan2}(\sqrt{a}, \sqrt{1-a}) \\
d &= R \cdot c
\end{align}

where $\varphi$ is latitude, $\lambda$ is longitude, and $R$ is the mean radius of the Earth, 6,371 km.

We will write a function named *calculate\_distance* which will calculate the distance between two points on earth using the formula above.

 - $\Delta$ means the difference between the two latitude (or longitude) values
 - $d$ is the distance we are trying to find
 
**Note**: The formula above works in radians, but the given coordinates are in degrees. To convert them, use $\theta_{rad} = \theta_{deg} \cdot \pi / 180$, where $\theta_{deg}$ is the angle in degrees and $\theta_{rad}$ is the same angle in radians.

**Hint**:

```R
# Function in R
atan2()
```
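
As a small illustration of the note and hint above, the sketch below converts degrees to radians and shows the argument order of `atan2`. The helper name `deg2rad` is hypothetical and not part of the exercise.

```R
# Hypothetical helper (not part of the exercise): degrees to radians
deg2rad <- function(deg) deg * pi / 180

half_turn <- deg2rad(180)  # 180 degrees is pi radians
print(half_turn)

# atan2(y, x) takes the numerator first; atan2(1, 1) is pi / 4
print(atan2(1, 1))

# For 0 <= a <= 1, the atan2 form in the formula matches 2 * asin(sqrt(a))
a <- 0.25
print(all.equal(2 * atan2(sqrt(a), sqrt(1 - a)), 2 * asin(sqrt(a))))
```
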
#### Example Usage
```R
> calculate_distance(coordinates["Adana",], coordinates["Adıyaman",])
274.130329948729
```

In [2]:
# http://www.movable-type.co.uk/scripts/latlong.html
calculate_distance <- function(prov1, prov2)
{
  rad_earth <- 6371 # mean radius of Earth, in km
  
  # Convert both coordinate rows from degrees to radians
  coor1 <- prov1 * pi / 180
  coor2 <- prov2 * pi / 180
  
  lat1 <- coor1[[1]]
  lng1 <- coor1[[2]]
  
  lat2 <- coor2[[1]]
  lng2 <- coor2[[2]]
  
  dlat <- lat1 - lat2
  dlng <- lng1 - lng2
  
  a <- sin(dlat / 2)^2 + cos(lat1) * cos(lat2) * sin(dlng / 2)^2
  
  cc <- 2 * atan2(sqrt(a), sqrt(1-a))
  
  distance <- rad_earth * cc
  
  return(distance)
}
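
One way to gain confidence in a distance function like this is to test it on cases whose answers are known from geometry. The sketch below does that with a scalar helper, `haversine_km`, which is a hypothetical stand-in that takes four numbers in degrees instead of data frame rows; the same checks could be run against `calculate_distance` by passing one-row data frames.

```R
# Hypothetical scalar helper mirroring the formula above (inputs in degrees)
haversine_km <- function(lat1, lng1, lat2, lng2, radius = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlng <- (lng2 - lng1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlng / 2)^2
  2 * radius * atan2(sqrt(a), sqrt(1 - a))
}

# Same point twice: the distance should be zero
same_point <- haversine_km(39.93336, 32.85974, 39.93336, 32.85974)

# Equator to North Pole along a meridian: a quarter of a great circle
quarter <- haversine_km(0, 0, 90, 0)
print(all.equal(quarter, 6371 * pi / 2))

# One degree of longitude on the equator: R * pi / 180, about 111.19 km
one_degree <- haversine_km(0, 0, 0, 1)
print(all.equal(one_degree, 6371 * pi / 180))

# Swapping the two points should not change the result
print(all.equal(haversine_km(0, 0, 10, 20), haversine_km(10, 20, 0, 0)))
```
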

## Generate Distance Matrix

We will write the function *distance_mat_gen*, which calculates the distance between every pair of provinces in the given coordinates and returns the result as a data frame. In our case, the row and column names should be the corresponding province names.

#### Example Usage

In [4]:
distance_mat <- distance_mat_gen(coordinates)
head(distance_mat)

Unnamed: 0,Adana,Adıyaman,Afyonkarahisar,Ağrı,Amasya,Ankara,Antalya,Artvin,Aydın,Balıkesir,⋯,Batman,Şırnak,Bartın,Ardahan,Iğdır,Yalova,Karabük,Kilis,Osmaniye,Düzce
Adana,0.0,274.1303,464.0904,737.967,409.8865,391.4989,410.4493,728.1713,611.2858,713.0693,⋯,521.485,632.982,577.2951,783.8771,825.12815,663.2922,517.2614,161.6306,81.79594,559.3729
Adıyaman,274.1303,0.0,684.4088,467.5222,384.225,527.4184,675.4953,486.6287,860.0036,924.8506,⋯,250.873,368.7094,666.1153,531.9325,553.83735,838.3575,608.6093,155.4003,195.02316,700.8734
Afyonkarahisar,464.0904,684.4088,0.0,1081.9708,499.9536,238.6276,207.3897,997.6861,207.6774,248.9987,⋯,928.3631,1050.3429,354.7323,1068.6621,1167.4624,236.7608,329.0461,620.9925,534.37865,237.5446
Ağrı,737.967,467.5222,1081.9708,0.0,621.3479,870.1178,1120.2095,193.0006,1278.9226,1295.7407,⋯,262.863,250.0735,927.8855,157.7716,87.70378,1172.8182,886.8424,616.5411,661.61071,1015.3268
Amasya,409.8865,384.225,499.9536,621.3479,0.0,265.009,609.7558,506.0193,707.416,684.2904,⋯,549.5353,668.8897,312.9597,579.1866,700.58974,552.6755,266.6926,451.8998,399.85096,394.2761
Ankara,391.4989,527.4184,238.6276,870.1178,265.009,0.0,385.9364,769.2796,446.0049,425.658,⋯,750.3888,873.9542,194.6059,841.7343,953.04311,313.6586,139.8082,515.3534,433.39402,175.4831


In [3]:
distance_mat_gen <- function(coordinates)
{
  nprov <- nrow(coordinates)
  # Generate an empty dataframe with proper row names
  distance_mat <- as.data.frame(matrix(nrow = nprov, ncol = nprov),
                                row.names = row.names(coordinates))
  # Set column names of generated empty data frame
  colnames(distance_mat) <- row.names(coordinates)
  # Iterate over provinces
  for(row in 1:nprov) {
    # Since symmetric matrix, calculating one side is enough
    for(col in row:nprov) {
      actual_dist <- calculate_distance(coordinates[row,], coordinates[col,])
      # since distances are symmetric...
      distance_mat[row, col] <- actual_dist
      distance_mat[col, row] <- actual_dist
    }
  }
  
  return(distance_mat)
}

## Another Solution with New Representation

This version looks provinces up by row index, wraps the scalar distance function with `Vectorize` so that `outer` can build the whole matrix in one call, and then melts the matrix into a long list with one row per pair of provinces.

In [8]:
calculate_distance2 <- function(prov1, prov2, coordinates1 = coordinates)
{
    rad_earth <- 6371 # mean radius of Earth, in km

    coor1 <- coordinates1[prov1,] * pi / 180
    coor2 <- coordinates1[prov2,] * pi / 180

    lat1 <- coor1[[1]]
    lng1 <- coor1[[2]]

    lat2 <- coor2[[1]]
    lng2 <- coor2[[2]]

    dlat <- lat1 - lat2
    dlng <- lng1 - lng2

    a <- sin(dlat / 2)^2 + cos(lat1) * cos(lat2) * sin(dlng / 2)^2

    cc <- 2 * atan2(sqrt(a), sqrt(1-a))

    distance <- rad_earth * cc

    return(distance)
}

calculate_distanceV <- Vectorize(calculate_distance2)

distance_mat_gen2 <- function(coordinates1 = coordinates)
{
    nprov <- nrow(coordinates1)
    distance_mat <- outer(1:nprov, 1:nprov, calculate_distanceV)

    return(distance_mat)
}

melt_distances <- function(distance_mat)
{
    nprov <- nrow(distance_mat)
    combinations <- t(combn(nprov, 2))
    
    dists <- apply(combinations, 1, function(x) distance_mat[x[1],x[2]])

    distance_mat_long <- cbind(combinations, dists)

    return(distance_mat_long)
}

distance_matt2 <- distance_mat_gen2()
melted_dist <- melt_distances(distance_matt2)
head(distance_matt2)
head(melted_dist)

0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
0.0,274.1303,464.0904,737.967,409.8865,391.4989,410.4493,728.1713,611.2858,713.0693,⋯,521.485,632.982,577.2951,783.8771,825.12815,663.2922,517.2614,161.6306,81.79594,559.3729
274.1303,0.0,684.4088,467.5222,384.225,527.4184,675.4953,486.6287,860.0036,924.8506,⋯,250.873,368.7094,666.1153,531.9325,553.83735,838.3575,608.6093,155.4003,195.02316,700.8734
464.0904,684.4088,0.0,1081.9708,499.9536,238.6276,207.3897,997.6861,207.6774,248.9987,⋯,928.3631,1050.3429,354.7323,1068.6621,1167.4624,236.7608,329.0461,620.9925,534.37865,237.5446
737.967,467.5222,1081.9708,0.0,621.3479,870.1178,1120.2095,193.0006,1278.9226,1295.7407,⋯,262.863,250.0735,927.8855,157.7716,87.70378,1172.8182,886.8424,616.5411,661.61071,1015.3268
409.8865,384.225,499.9536,621.3479,0.0,265.009,609.7558,506.0193,707.416,684.2904,⋯,549.5353,668.8897,312.9597,579.1866,700.58974,552.6755,266.6926,451.8998,399.85096,394.2761
391.4989,527.4184,238.6276,870.1178,265.009,0.0,385.9364,769.2796,446.0049,425.658,⋯,750.3888,873.9542,194.6059,841.7343,953.04311,313.6586,139.8082,515.3534,433.39402,175.4831


Unnamed: 0,Unnamed: 1,dists
1,2,274.1303
1,3,464.0904
1,4,737.967
1,5,409.8865
1,6,391.4989
1,7,410.4493
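
The matrix and the long table above are indexed by position only, so the province names are lost. A minimal sketch of one way to keep them is shown below, assuming a toy data frame `coords` with three provinces copied from the coordinates table. The names `coords`, `haversine_km`, `named_mat` and `named_long` are hypothetical. Because the helper is written with vectorized arithmetic, `outer` can call it directly without `Vectorize`.

```R
# Toy input: three provinces copied from the coordinates table
coords <- data.frame(
  lat = c(36.99142, 37.76365, 39.93336),
  lng = c(35.33083, 38.27726, 32.85974),
  row.names = c("Adana", "Adıyaman", "Ankara")
)

# Hypothetical helper: haversine distance in km, vectorized over its arguments
haversine_km <- function(lat1, lng1, lat2, lng2, radius = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlng <- (lng2 - lng1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlng / 2)^2
  2 * radius * atan2(sqrt(a), sqrt(1 - a))
}

# outer() hands the function two index vectors covering every (row, column) pair
idx <- seq_len(nrow(coords))
named_mat <- outer(idx, idx, function(i, j) {
  haversine_km(coords$lat[i], coords$lng[i], coords$lat[j], coords$lng[j])
})
dimnames(named_mat) <- list(rownames(coords), rownames(coords))

# Long format: one row per unordered pair, labelled by province name
upper <- which(upper.tri(named_mat), arr.ind = TRUE)
named_long <- data.frame(
  from = rownames(named_mat)[upper[, "row"]],
  to   = colnames(named_mat)[upper[, "col"]],
  dist = named_mat[upper]
)
print(named_long)
```

Note that `which(upper.tri(...))` walks the matrix column by column, so for larger inputs the pairs may come out in a different order than with `combn`.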
