In [183]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
%matplotlib inline

## Read in the data

In [184]:
bombing = pd.read_csv("WW2_Bombing_Data.csv")
weather = pd.read_csv("SummaryOfWeather.csv")
stations = pd.read_csv("WeatherStationsGermany.csv")

#### Merging data sets
We need to determine which weather stations were in the vicinity of a given bombing mission so that we can extract the weather recorded at those stations.  To do this, we'll take the latitude and longitude at which the attack took place, and then define a region around that location within which the weather is assumed to be the same.

At first glance, it makes sense to consider the range that the aircraft can see to the horizon.  The formula for distance in miles is 1.22 times the square root of the height (in feet) of the aircraft.

\begin{equation}
\text{Distance}_{\text{Horizon}}\ (\text{miles}) = 1.22\times\sqrt{\text{Height (feet)}}
\end{equation}

WW2 bombing aircraft generally flew at an altitude of 10,000 feet when delivering their payload.  Using 10,000 feet for height, the formula gives a horizon distance of approximately 122 miles in all directions.
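
As a quick check of the arithmetic, the formula can be evaluated directly. This is a minimal sketch; the function name `horizon_distance_miles` is introduced here for illustration and is not used elsewhere in the notebook.

```python
import math

def horizon_distance_miles(height_feet):
    """Approximate distance to the horizon (miles) for an observer at height_feet."""
    return 1.22 * math.sqrt(height_feet)

# Altitude assumed in the text: 10,000 feet
print(round(horizon_distance_miles(10_000), 1))  # 122.0
```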

Each degree of latitude is approximately 69 miles, which is well within the 122-mile horizon distance.  We therefore assume that the weather is likely to be unchanged across a region 1 degree of latitude (69 miles) in diameter.
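
The 69-mile figure applies to latitude. A degree of longitude covers less ground away from the equator, so a box that is 1 degree on each side is narrower east to west than north to south. The sketch below estimates both dimensions using the common spherical approximation (miles per degree of longitude ≈ 69 × cos(latitude)). The function name and the example latitude of 51°N, roughly central Germany, are assumptions made for illustration.

```python
import math

MILES_PER_DEGREE_LATITUDE = 69.0  # approximation used in the text

def box_size_miles(latitude_deg, box_degrees=1.0):
    """Return (north-south, east-west) size in miles of a box spanning
    box_degrees of latitude and longitude, centered at latitude_deg."""
    north_south = box_degrees * MILES_PER_DEGREE_LATITUDE
    # Spherical approximation: longitude degrees shrink by cos(latitude)
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west

ns, ew = box_size_miles(51.0)  # 51 degrees north is an assumed example latitude
print(round(ns, 1), round(ew, 1))  # 69.0 43.4
```

Both dimensions stay well inside the 122-mile horizon distance, so the assumption of uniform weather inside the box is not weakened by the narrower east-west extent.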

In [188]:
# Define a function that lists all station IDs within a 1-degree-wide box centered on the attack
def FindStations(Latitude, Longitude):
    in_box = ((stations["Latitude"] >= Latitude - 0.5) & (stations["Latitude"] <= Latitude + 0.5) &
              (stations["Longitude"] >= Longitude - 0.5) & (stations["Longitude"] <= Longitude + 0.5))
    return list(stations.loc[in_box, "StationID"])
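
To see how the box filter behaves, here is a self-contained sketch on a toy table. The station IDs and coordinates are invented for illustration and do not come from `WeatherStationsGermany.csv`. The helper `find_stations_in_box` is a hypothetical variant of `FindStations` that takes the stations table as an argument so it can run on its own.

```python
import pandas as pd

# Hypothetical stations, invented for illustration only
toy_stations = pd.DataFrame({
    "StationID": ["A", "B", "C"],
    "Latitude": [52.4, 52.9, 48.1],
    "Longitude": [13.2, 13.6, 11.6],
})

def find_stations_in_box(stations_df, latitude, longitude, half_width=0.5):
    """Return IDs of stations within +/- half_width degrees of the given point."""
    in_box = (
        stations_df["Latitude"].between(latitude - half_width, latitude + half_width)
        & stations_df["Longitude"].between(longitude - half_width, longitude + half_width)
    )
    return list(stations_df.loc[in_box, "StationID"])

print(find_stations_in_box(toy_stations, 52.5, 13.4))  # ['A', 'B']
print(find_stations_in_box(toy_stations, 40.0, 0.0))   # []
```

An attack with no station inside its box returns an empty list, so later steps need to handle missions that have no matching weather station.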

In [189]:
# Suppress warnings because they're annoying.
warnings.simplefilter(action="ignore")

In [190]:
# Only look at bombings in Germany; take a copy so later edits don't act on a slice of `bombing`
bombing_stations = bombing[bombing["DefCountry"] == "Germany"].copy()
bombing_stations = bombing_stations.drop(columns=["Details"])  # We don't need the Details column, so drop it.
bombing_stations["StationIDs"] = ""  # Placeholder (object dtype) so each cell can hold a list
# Iterate through each row, and pass the latitude and longitude into the FindStations function
for index, row in bombing_stations.iterrows():
    bombing_stations.at[index, "StationIDs"] = FindStations(float(row["Latitude"]), float(row["Longitude"]))
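
One way to sanity-check a column of station lists is to count how many stations were matched for each mission. The sketch below does this on toy data. The tables, the helper `stations_in_box`, and the `StationCount` column are all hypothetical and only illustrate the idea. The coordinates are stored as strings to mirror the `float(...)` conversion used above.

```python
import pandas as pd

# Hypothetical data, invented for illustration only
toy_stations = pd.DataFrame({
    "StationID": ["A", "B"],
    "Latitude": [52.4, 48.1],
    "Longitude": [13.2, 11.6],
})
toy_missions = pd.DataFrame({
    "Latitude": ["52.5", "40.0"],
    "Longitude": ["13.4", "0.0"],
})

def stations_in_box(lat, lon, half_width=0.5):
    """Return IDs of toy stations within +/- half_width degrees of (lat, lon)."""
    near = toy_stations[
        (toy_stations["Latitude"] - lat).abs().le(half_width)
        & (toy_stations["Longitude"] - lon).abs().le(half_width)
    ]
    return list(near["StationID"])

# Build the list-valued column in one pass instead of row-by-row assignment
toy_missions["StationIDs"] = pd.Series(
    [stations_in_box(float(lat), float(lon))
     for lat, lon in zip(toy_missions["Latitude"], toy_missions["Longitude"])],
    index=toy_missions.index,
)

# Number of matched stations per mission; zero means no weather data nearby
toy_missions["StationCount"] = toy_missions["StationIDs"].map(len)
print(toy_missions[["StationIDs", "StationCount"]])
```

Missions with a count of zero have no station inside the box and would need a wider box or a different matching rule.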