In [1]:
import location_handling
import h5pyd

This notebook explores the location_handling.py module for PlanIt. First, let's load our dataset from the API.

In [2]:
f = h5pyd.File("/nrel/wtk-us.h5", 'r')  

`location_handling.get_loc(city, state)` takes a city name and a two-letter state abbreviation (e.g. OR instead of Oregon) and looks up that city's latitude and longitude in our location directory.

In [3]:
salem = location_handling.get_loc('Salem', 'OR')
salem

(44.9232, -123.0245)
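To make the lookup concrete, here is a minimal sketch of how a function like `get_loc` could work. It is not the module's actual code: the column names, the in-memory `SAMPLE_CSV`, and the name `get_loc_sketch` are all assumptions, and the second row is a made-up placeholder.

```python
import csv
import io

# Stand-in for the real .csv directory. Column names are assumptions,
# and the "Exampleville" row is a placeholder, not real data.
SAMPLE_CSV = """city,state,lat,lon,population
Salem,OR,44.9232,-123.0245,259816
Exampleville,ZZ,0.0,0.0,1
"""


def get_loc_sketch(city, state, csv_text=SAMPLE_CSV):
    """Return (lat, lon) for the first row matching city and state."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        same_city = row["city"].lower() == city.lower()
        same_state = row["state"].upper() == state.upper()
        if same_city and same_state:
            return float(row["lat"]), float(row["lon"])
    raise KeyError(f"{city}, {state} not found in the directory")


salem_sketch = get_loc_sketch("Salem", "OR")
```

Raising an error for an unknown city is one possible design choice; returning `None` would work too if the caller prefers to handle missing cities itself.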

To interface with the API, we need to convert a city's latitude and longitude into the indices of its nearest neighbor in the API's dataset. The API does not have data for every location we could come up with, so the goal of this function is to find the nearest point where data was collected and use that point for our wind and solar energy calculations.

At first, we used SciPy's `distance` module to do this. Each latitude and longitude pair in the dataset was fed through two nested `for` loops that searched for the smallest Euclidean distance to the city. However, this function took anywhere from 4 to 7 minutes to return a tuple of the indices in question.

When troubleshooting this wait time, we realized that calling `distance.euclidean` for each of nearly 5 million possible neighbors was unwieldy. We decided to compute the squared distance between points by hand and skip the square root, since the point with the smallest squared distance is also the point with the smallest distance. This shaved the time down to approximately 3 minutes per calculation.
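The brute-force search described above can be sketched roughly as follows. This is an illustration rather than our original code: the function name and the tiny toy grids are assumptions, and the real search ran over the API's full coordinate dataset.

```python
def nearest_index_bruteforce(lat_grid, lon_grid, target):
    """Return (row, col) of the grid point closest to target=(lat, lon).

    Compares squared distances, so no square root is ever taken.
    """
    target_lat, target_lon = target
    best_index = None
    best_sq_dist = float("inf")
    for i, (lat_row, lon_row) in enumerate(zip(lat_grid, lon_grid)):
        for j, (lat, lon) in enumerate(zip(lat_row, lon_row)):
            sq_dist = (lat - target_lat) ** 2 + (lon - target_lon) ** 2
            if sq_dist < best_sq_dist:
                best_sq_dist = sq_dist
                best_index = (i, j)
    return best_index


# Toy 2 x 3 grid (hypothetical values) standing in for the real dataset.
toy_lats = [[44.0, 44.0, 44.0], [45.0, 45.0, 45.0]]
toy_lons = [[-124.0, -123.0, -122.0], [-124.0, -123.0, -122.0]]
nearest = nearest_index_bruteforce(toy_lats, toy_lons, (44.9232, -123.0245))
```

Even without the square root, every grid point is still visited once, which is why this approach stays slow on a grid with millions of points.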

Almost ready to give up, we did some digging into how the API itself handled this. While its maintainers did not use the same latitude and longitude directory we did, they still had a way of finding the nearest neighbor in their dataset. It turns out they were using a library called `pyproj`, which takes latitude and longitude coordinates and applies the Lambert Conformal Conic projection to account for the curvature of the earth, so the nearest grid coordinates can be calculated directly instead of searched for. We implemented this in the `location_handling.wtk_locator(f, location)` function and the run time went down to a few seconds per calculation. This made a huge difference for our users' experience.

The function takes the `h5pyd` file that holds the API datasets and the `location`, a tuple of (latitude, longitude). Inside the function, we had to reverse this tuple because `pyproj` operates on (longitude, latitude).

In [4]:
loc = location_handling.wtk_locator(f, salem)
loc

(1320, 480)
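To show why the projection approach is so much faster, here is a dependency-free sketch of the idea behind `wtk_locator`. It is not the function's actual code: a hand-written spherical Lambert Conformal Conic formula stands in for `pyproj`, and the projection parameters, the grid spacing, the grid origin, and all names are assumptions chosen for illustration. Once both points are projected onto a flat, evenly spaced grid, the indices come from a single division instead of a scan.

```python
import math

# Assumed, illustrative parameters (not read from the dataset).
EARTH_RADIUS_M = 6_370_997.0
STD_PARALLEL_1 = 30.0
STD_PARALLEL_2 = 60.0
ORIGIN_LAT = 38.5
ORIGIN_LON = -96.0
GRID_SPACING_M = 2000.0


def lcc_forward(lat, lon):
    """Project (lat, lon) in degrees to (x, y) in meters on a spherical LCC."""
    phi, phi0 = math.radians(lat), math.radians(ORIGIN_LAT)
    phi1, phi2 = math.radians(STD_PARALLEL_1), math.radians(STD_PARALLEL_2)

    def t(angle):
        return math.tan(math.pi / 4 + angle / 2)

    n = math.log(math.cos(phi1) / math.cos(phi2)) / math.log(t(phi2) / t(phi1))
    big_f = math.cos(phi1) * t(phi1) ** n / n
    rho = EARTH_RADIUS_M * big_f / t(phi) ** n
    rho0 = EARTH_RADIUS_M * big_f / t(phi0) ** n
    theta = n * math.radians(lon - ORIGIN_LON)
    return rho * math.sin(theta), rho0 - rho * math.cos(theta)


def grid_indices(location, grid_origin):
    """Return (row, col) of the grid cell nearest to location=(lat, lon)."""
    x, y = lcc_forward(*location)
    x0, y0 = lcc_forward(*grid_origin)
    col = int(round((x - x0) / GRID_SPACING_M))
    row = int(round((y - y0) / GRID_SPACING_M))
    return row, col


# A point one degree east of a hypothetical grid origin.
row, col = grid_indices((38.5, -95.0), grid_origin=(38.5, -96.0))
```

Because the cost no longer depends on the number of grid points, the lookup itself is a handful of arithmetic operations.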

The final function that deals with our location directories is `location_handling.get_pop(location)`. It returns the population of a given city, taking as input the latitude and longitude tuple returned by `get_loc(city, state)`. It uses the same .csv directory as that function, so it does not need to interface with the API.

In [6]:
pop = location_handling.get_pop(salem)
pop

259816
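A population lookup keyed on coordinates could look something like the sketch below. As before, this is an assumption-laden illustration and not the module's code: the column names, the in-memory `SAMPLE_CSV`, the tolerance, and the name `get_pop_sketch` are hypothetical, and the second row is a made-up placeholder. Since the key is a pair of floats, the sketch compares with a small tolerance instead of exact equality.

```python
import csv
import io
import math

# Stand-in for the real .csv directory. Column names are assumptions,
# and the "Exampleville" row is a placeholder, not real data.
SAMPLE_CSV = """city,state,lat,lon,population
Salem,OR,44.9232,-123.0245,259816
Exampleville,ZZ,0.0,0.0,1
"""


def get_pop_sketch(location, csv_text=SAMPLE_CSV, tol=1e-4):
    """Return the population of the row whose coordinates match location."""
    lat, lon = location
    for row in csv.DictReader(io.StringIO(csv_text)):
        lat_matches = math.isclose(float(row["lat"]), lat, abs_tol=tol)
        lon_matches = math.isclose(float(row["lon"]), lon, abs_tol=tol)
        if lat_matches and lon_matches:
            return int(row["population"])
    raise KeyError(f"No city found at {location}")


pop_sketch = get_pop_sketch((44.9232, -123.0245))
```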