# Weird Al Concert Location Analysis

People live all over the place, so choosing where to meet up is always difficult. This analysis should alleviate any hesitation about choosing the "wrong" Weird Al venue.

In [1]:
import pandas as pd
from geopy.distance import distance
from geopy.geocoders import Nominatim
from geopy.extra.rate_limiter import RateLimiter
from shapely.geometry import shape

## Load and prepare the datasets

`people.csv` contains the approximate closest large city for each person. `weirdal.csv` contains the location and date of each Weird Al concert this year.

In [2]:
people_df = pd.read_csv("people.csv")
people_df["city_state"] = people_df[["City", "State"]].agg(', '.join, axis=1)
weirdal_df = pd.read_csv("weirdal.csv")
weirdal_df["city_state"] = weirdal_df[["City", "State"]].agg(', '.join, axis=1)

[geopy](https://geopy.readthedocs.io/en/latest) geocodes by making API calls to an external service. To run bulk lookups while handling error responses gracefully and adding delays between requests, we wrap the geocoder in a rate limiter.
More information is available in the [geopy rate limiter documentation](https://geopy.readthedocs.io/en/latest/#module-geopy.extra.rate_limiter).

In [3]:
geolocator = Nominatim(user_agent="pc_tracker")
geocode = RateLimiter(geolocator.geocode, min_delay_seconds=1)
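
As a rough illustration of what a rate limiter does, here is a minimal standard-library sketch. The helper `rate_limited` is hypothetical and much simpler than geopy's `RateLimiter`; its parameter names only mirror geopy's for readability, and the `flaky` function stands in for a real geocoder so that no network access is needed.

```python
import time


def rate_limited(func, min_delay_seconds=1.0, max_retries=2, error_wait_seconds=5.0):
    """Wrap func so call attempts are spaced out and failures are retried.

    Hypothetical helper for illustration only; not geopy's implementation.
    """
    last_attempt = [None]  # monotonic timestamp of the most recent attempt

    def wrapper(*args, **kwargs):
        for attempt in range(max_retries + 1):
            # Wait until min_delay_seconds has passed since the last attempt.
            if last_attempt[0] is not None:
                wait = min_delay_seconds - (time.monotonic() - last_attempt[0])
                if wait > 0:
                    time.sleep(wait)
            last_attempt[0] = time.monotonic()
            try:
                return func(*args, **kwargs)
            except Exception:
                if attempt == max_retries:
                    return None  # out of retries: swallow the error
                time.sleep(error_wait_seconds)

    return wrapper


# Stand-in for a geocoder: fails on the first call, then succeeds.
calls = []

def flaky(query):
    calls.append(query)
    if len(calls) == 1:
        raise RuntimeError("temporary failure")
    return query.upper()


limited = rate_limited(flaky, min_delay_seconds=0.05, error_wait_seconds=0.0)
first = limited("kansas city, mo")   # retried once after the failure
second = limited("topeka, ks")
```

Returning `None` after the last retry is a design choice in this sketch: it lets a bulk `apply` finish even if a few lookups fail, at the cost of having to check for missing results afterwards.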

Using the geolocator, figure out the latitude and longitude of each location so that we can use that information for our analysis.

In [4]:
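# Geocode each "City, State" string; a failed lookup returns None.
people_df["location"] = people_df["city_state"].apply(geocode)
# Split each result into (latitude, longitude, altitude); pad failed lookups so every row has three values.
people_df[["latitude", "longitude", "altitude"]] = people_df["location"].apply(lambda loc: tuple(loc.point) if loc else (None, None, None)).tolist()
weirdal_df["location"] = weirdal_df["city_state"].apply(geocode)
weirdal_df[["latitude", "longitude", "altitude"]] = weirdal_df["location"].apply(lambda loc: tuple(loc.point) if loc else (None, None, None)).tolist()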
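
Several people may share the same nearest city, so the same string can be geocoded more than once. The sketch below looks up each distinct name only once. `geocode_unique` is a hypothetical helper, and `fake_geocode` is a stand-in that mimics the `.latitude` and `.longitude` attributes of a geopy result, with made-up coordinates, so the example runs without network access.

```python
from types import SimpleNamespace


def geocode_unique(names, geocode_fn):
    """Geocode each distinct name once; return {name: (lat, lon) or None}."""
    results = {}
    for name in names:
        if name in results:
            continue  # already looked up; skip the repeat API call
        location = geocode_fn(name)
        # A result exposes .latitude and .longitude; None means no match.
        results[name] = (location.latitude, location.longitude) if location else None
    return results


# Stand-in geocoder with made-up coordinates (not real lookups).
lookups = []

def fake_geocode(name):
    lookups.append(name)
    table = {"Springfield, MO": SimpleNamespace(latitude=37.0, longitude=-93.0)}
    return table.get(name)


coords = geocode_unique(
    ["Springfield, MO", "Nowhere, ZZ", "Springfield, MO"], fake_geocode
)
```

With the real data, the returned dictionary could be applied with something like `people_df["city_state"].map(coords)`, assuming the rate-limited `geocode` callable is passed as `geocode_fn`.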

Figure out the mean location of all people.

In [5]:
mean_latitude = people_df.latitude.mean()
mean_longitude = people_df.longitude.mean()
mean_location = geolocator.reverse(f"{mean_latitude}, {mean_longitude}")
print(f"The mean location is: {mean_location.address}")
print(f"Coordinates: ({mean_location.latitude}, {mean_location.longitude})")

The mean location is: County Road V, Amos, Vernon County, Missouri, United States
Coordinates: (38.04241013607125, -94.57380978106599)
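
Averaging latitudes and longitudes directly is a flat-map approximation. It is reasonable for points within one country, but it breaks down near the antimeridian or over very long distances. One alternative, sketched here under the assumption of a spherical Earth, is to average the points as 3D unit vectors and convert the result back to degrees. `spherical_mean` is a hypothetical helper, not part of geopy or pandas.

```python
import math


def spherical_mean(points):
    """Return (lat, lon) in degrees for the direction of the summed unit vectors.

    points is an iterable of (latitude, longitude) pairs in degrees.
    """
    x = y = z = 0.0
    for lat_deg, lon_deg in points:
        lat, lon = math.radians(lat_deg), math.radians(lon_deg)
        x += math.cos(lat) * math.cos(lon)
        y += math.cos(lat) * math.sin(lon)
        z += math.sin(lat)
    # Only the direction of the summed vector matters, not its length.
    lon = math.atan2(y, x)
    lat = math.atan2(z, math.hypot(x, y))
    return math.degrees(lat), math.degrees(lon)
```

For two points at longitudes 170 and -170, the arithmetic mean longitude is 0, on the opposite side of the globe, while this function returns a longitude at the antimeridian. If the points cancel out exactly, the mean direction is undefined and the function simply returns (0.0, 0.0).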


Now, for each concert location, figure out how far from the mean location we are.

In [6]:
def calculate_distance_to_mean(x):
    # x is one row holding the concert's latitude and longitude.
    return distance((mean_location.latitude, mean_location.longitude),
                    (x["latitude"], x["longitude"]))

weirdal_df["distance_to_mean"] = weirdal_df[["latitude", "longitude"]].apply(calculate_distance_to_mean, axis=1)
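
In recent geopy versions, the `distance` call above computes a geodesic distance on an ellipsoidal Earth model. To make the idea concrete, here is a simpler great-circle version using the haversine formula. It assumes a perfectly spherical Earth, so its results can differ from geopy's by a fraction of a percent. `haversine_km` and `EARTH_RADIUS_KM` are names made up for this sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption of the spherical model


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

For ranking venues by distance, the spherical approximation would likely give the same ordering unless two venues are nearly equidistant. geopy also ships a `great_circle` distance if you prefer to stay within the library.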

Let's get the 20 concert locations closest to the mean.

In [7]:
weirdal_df[["City", "State", "Date", "distance_to_mean"]].sort_values(by=['distance_to_mean']).head(20)

|    | City         | State | Date     | distance_to_mean      |
|----|--------------|-------|----------|-----------------------|
| 84 | Kansas City  | MO    | 20220902 | 117.41248014649945 km |
| 81 | Springfield  | MO    | 20220830 | 145.60323663893539 km |
| 82 | Topeka       | KS    | 20220831 | 147.45709111821236 km |
| 56 | Columbia     | MO    | 20220720 | 218.52997863299154 km |
| 25 | Tulsa        | OK    | 20220601 | 244.44580424688246 km |
| 83 | Wichita      | KS    | 20220901 | 246.25894625141873 km |
| 80 | Chesterfield | MO    | 20220828 | 357.12373673581175 km |
| 57 | Lincoln      | NE    | 20220722 | 357.88042187739086 km |
| 85 | Midwest City | OK    | 20220904 | 382.51544963523804 km |
| 24 | Little Rock  | AR    | 20220531 | 419.1914515574612 km  |


There you have it: Kansas City, MO, better be ready.