# Queens Library Finder
## Part 1: Which libraries are open?

Hello, and welcome to my first post in Jupyter. I'm writing a Python program that reports which Queens Public Library branches are open at the moment it runs, using data from the NYC Open Data portal.

In [112]:
import pandas as pd
import json
from datetime import datetime

The NYC Open Data website has tons of useful data about New York City. Today, we will be using their [Queens Library Branches](https://data.cityofnewyork.us/Recreation/Queens-Library-Branches/kh3d-xhq7) SODA API. In Python, we can read their JSON endpoint straight into a pandas `DataFrame` with `pd.read_json()`.

In [113]:
response = pd.read_json("https://data.cityofnewyork.us/resource/b67a-vkqb.json")
response.head(5)

,:@computed_region_92fq_4b7q,:@computed_region_efsh_h5xi,:@computed_region_f5dn_yrer,:@computed_region_sbqj_enih,:@computed_region_yeji_bk3q,address,bbl,bin,borough,census_tract,...,name,notification,nta,phone,postcode,sa,su,th,tu,we
0,47.0,20530.0,51.0,59.0,3.0,312 Beach 54 Street,4158900000.0,4158900000.0,QUEENS,97204.0,...,Arverne,,Hammels-Arverne-Edgemere ...,(718) 634-4784,11692,10:00AM - 5:30PM,closed,10:00AM - 6:00PM,1:00PM - 6:00PM,10:00AM - 6:00PM
1,4.0,16859.0,39.0,72.0,3.0,14-01 Astoria Boulevard,4005400000.0,4005400000.0,QUEENS,83.0,...,Astoria,,Old Astoria ...,(718) 278-2220,11102,10:00AM - 5:30PM,closed,10:00AM - 6:00PM,1:00PM - 6:00PM,10:00AM - 6:00PM
2,20.0,14193.0,22.0,67.0,3.0,25-55 Francis Lewis Boulevard,4057690000.0,4057690000.0,QUEENS,1017.0,...,Auburndale,,Ft. Totten-Bay Terrace-Clearview ...,(718) 352-2027,11358,10:00AM - 5:30PM,closed,1:00PM - 8:00PM,1:00PM - 6:00PM,10:00AM - 6:00PM
3,46.0,24671.0,41.0,71.0,3.0,117-11 Sutphin Boulevard,4122040000.0,4122040000.0,QUEENS,288.0,...,Baisley Park,<b>This library is currently CLOSED for renova...,Baisley Park ...,(718) 529-1590,11436,closed,closed,closed,closed,closed
4,20.0,14195.0,22.0,67.0,3.0,18-36 Bell Boulevard,4058650000.0,4058650000.0,QUEENS,99704.0,...,Bay Terrace,,Ft. Totten-Bay Terrace-Clearview ...,(718) 423-7004,11360,10:00AM - 5:30PM,closed,1:00AM - 8:00PM,1:00PM - 6:00PM,10:00AM - 6:00PM
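
The endpoint also accepts query parameters, which can trim the response before it ever reaches pandas. Here is a minimal sketch, assuming the standard SoQL parameters `$select` and `$limit` and the column names used in this post. The helper `build_query_url` is a hypothetical name for illustration, not part of any library.

```python
from urllib.parse import urlencode

BASE_URL = "https://data.cityofnewyork.us/resource/b67a-vkqb.json"

def build_query_url(columns, limit):
    """Return the endpoint URL restricted to some columns and a row count."""
    params = {"$select": ",".join(columns), "$limit": limit}
    # Keep "$" and "," readable instead of percent-encoding them.
    return BASE_URL + "?" + urlencode(params, safe="$,")

url = build_query_url(["name", "mn", "tu"], limit=5)
print(url)
```

The resulting URL should be usable with `pd.read_json()` in place of the bare endpoint.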


This data needs a lot of cleaning up. For example, we want to be able to tell when libraries are open or closed, but the hours are listed as strings, like this arbitrary example:

In [114]:
response["mn"][34]

' 1:00PM -  8:00PM'

As is, the opening and closing times are stuck in the same string, which also contains extra whitespace. Python can't compare a string like that to the current time, which is stored in a `datetime.datetime` object. We need to split the string where the space-dash-space connects the two times, and strip the leading whitespace from the separate opening and closing times.

In [115]:
def split_and_strip(hours):
    split = str(hours).split(" - ") #returns list of two times
    output = [element.strip() for element in split]
    return output

split_and_strip(response["mn"][34])

['1:00PM', '8:00PM']
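
It helps to see what the same function does with values that are not a pair of times. The sketch below repeats the function so it runs on its own, and assumes missing hours arrive as a float NaN, which is what the `"nan"` checks elsewhere in this post suggest.

```python
def split_and_strip(hours):
    parts = str(hours).split(" - ")  # one piece if there is no " - "
    return [part.strip() for part in parts]

print(split_and_strip(" 1:00PM -  8:00PM"))  # ['1:00PM', '8:00PM']
print(split_and_strip("closed"))              # ['closed']
print(split_and_strip(float("nan")))          # ['nan']
```

Because a closed or missing entry yields a one-element list, reading the first and last elements gives the same marker for both opening and closing.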

Now, each element of that list must be converted into a `datetime.datetime` object. We take the hour, minute, and AM/PM marker from the string, and fill in the year, month, and day from the current date.

In [116]:
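def convert_time(time):
    # "closed" and "nan" (a missing value passed through str()) have no time to parse
    if time in ["closed", "nan"]:
        return time
    today = datetime.now()  # read the clock once so year, month, and day agree
    parsed = datetime.strptime(time, "%I:%M%p")
    return parsed.replace(year=today.year, month=today.month, day=today.day)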
    
convert_time("1:00PM")

datetime.datetime(2018, 4, 21, 13, 0)
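
An alternative way to attach today's date is `datetime.combine`, which joins a `date` and a `time` in one call. This is only a sketch of that option, and `to_today` is a hypothetical name, not something from the notebook.

```python
from datetime import date, datetime

def to_today(time_string):
    """Parse a string like '1:00PM' and place it on today's date."""
    parsed = datetime.strptime(time_string, "%I:%M%p")
    return datetime.combine(date.today(), parsed.time())

result = to_today("1:00PM")
print(result.hour, result.minute)  # 13 0
```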

Let's combine the `split_and_strip` and `convert_time` functions into one `process` function, which will take something like ' 1:00PM -  8:00PM' and output a dictionary where "opening" and "closing" each correspond to their respective `datetime.datetime` objects.

Then, we can `process` each element in a weekday's column to get a list of dictionaries that say when each library opens and closes on that day.

In [117]:
def process(hours):
    sas = split_and_strip(hours)
    return {
        "opening": convert_time(sas[0]),
        "closing": convert_time(sas[-1])
    }

def clean_column(column):
    return [process(x) for x in list(column)]

clean_column(response["mn"])

[{'closing': datetime.datetime(2018, 4, 21, 20, 0),
  'opening': datetime.datetime(2018, 4, 21, 13, 0)},
 {'closing': datetime.datetime(2018, 4, 21, 20, 0),
  'opening': datetime.datetime(2018, 4, 21, 13, 0)},
 {'closing': datetime.datetime(2018, 4, 21, 20, 0),
  'opening': datetime.datetime(2018, 4, 21, 13, 0)},
 {'closing': 'closed', 'opening': 'closed'},
 {'closing': datetime.datetime(2018, 4, 21, 20, 0),
  'opening': datetime.datetime(2018, 4, 21, 13, 0)},
 {'closing': datetime.datetime(2018, 4, 21, 20, 0),
  'opening': datetime.datetime(2018, 4, 21, 10, 0)},
 {'closing': datetime.datetime(2018, 4, 21, 20, 0),
  'opening': datetime.datetime(2018, 4, 21, 13, 0)},
 {'closing': datetime.datetime(2018, 4, 21, 20, 0),
  'opening': datetime.datetime(2018, 4, 21, 13, 0)},
 {'closing': datetime.datetime(2018, 4, 21, 20, 0),
  'opening': datetime.datetime(2018, 4, 21, 13, 0)},
 {'closing': datetime.datetime(2018, 4, 21, 20, 0),
  'opening': datetime.datetime(2018, 4, 21, 10, 0)},
 {'closing

Now, let's create a new `DataFrame` called `library_data` that has only the data we want for each library: name, latitude, longitude, and the hours for each day of the week. There are probably more compact ways to do this, but for the sake of legibility, I made a `DataFrame` out of only the library names, and then added the other categories one by one as columns.

In [121]:
library_data = pd.DataFrame(response["name"])
library_data["lat"] = response["latitude"]
library_data["lon"] = response["longitude"]
library_data[0] = clean_column(response["mn"])
library_data[1] = clean_column(response["tu"])
library_data[2] = clean_column(response["we"])
library_data[3] = clean_column(response["th"])
library_data[4] = clean_column(response["fr"])
library_data[5] = clean_column(response["sa"])
library_data[6] = clean_column(response["su"])
library_data.set_index("name", inplace = True)
library_data.head()

name,lat,lon,0,1,2,3,4,5,6
Arverne,40.593066,-73.784341,"{'opening': 2018-04-21 13:00:00, 'closing': 20...","{'opening': 2018-04-21 13:00:00, 'closing': 20...","{'opening': 2018-04-21 10:00:00, 'closing': 20...","{'opening': 2018-04-21 10:00:00, 'closing': 20...","{'opening': 2018-04-21 13:00:00, 'closing': 20...","{'opening': 2018-04-21 10:00:00, 'closing': 20...","{'opening': 'closed', 'closing': 'closed'}"
Astoria,40.772173,-73.928757,"{'opening': 2018-04-21 13:00:00, 'closing': 20...","{'opening': 2018-04-21 13:00:00, 'closing': 20...","{'opening': 2018-04-21 10:00:00, 'closing': 20...","{'opening': 2018-04-21 10:00:00, 'closing': 20...","{'opening': 2018-04-21 13:00:00, 'closing': 20...","{'opening': 2018-04-21 10:00:00, 'closing': 20...","{'opening': 'closed', 'closing': 'closed'}"
Auburndale,40.773525,-73.796552,"{'opening': 2018-04-21 13:00:00, 'closing': 20...","{'opening': 2018-04-21 13:00:00, 'closing': 20...","{'opening': 2018-04-21 10:00:00, 'closing': 20...","{'opening': 2018-04-21 13:00:00, 'closing': 20...","{'opening': 2018-04-21 10:00:00, 'closing': 20...","{'opening': 2018-04-21 10:00:00, 'closing': 20...","{'opening': 'closed', 'closing': 'closed'}"
Baisley Park,40.680318,-73.79203,"{'opening': 'closed', 'closing': 'closed'}","{'opening': 'closed', 'closing': 'closed'}","{'opening': 'closed', 'closing': 'closed'}","{'opening': 'closed', 'closing': 'closed'}","{'opening': 'closed', 'closing': 'closed'}","{'opening': 'closed', 'closing': 'closed'}","{'opening': 'closed', 'closing': 'closed'}"
Bay Terrace,40.783103,-73.777013,"{'opening': 2018-04-21 13:00:00, 'closing': 20...","{'opening': 2018-04-21 13:00:00, 'closing': 20...","{'opening': 2018-04-21 10:00:00, 'closing': 20...","{'opening': 2018-04-21 01:00:00, 'closing': 20...","{'opening': 2018-04-21 13:00:00, 'closing': 20...","{'opening': 2018-04-21 10:00:00, 'closing': 20...","{'opening': 'closed', 'closing': 'closed'}"
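
The integer column names are chosen to line up with `datetime.weekday()`, which returns 0 for Monday through 6 for Sunday. A small sketch of that mapping, where `DAY_COLUMNS` is a hypothetical helper list built from the dataset's day codes:

```python
from datetime import datetime

# Dataset day codes, ordered so that position matches weekday().
DAY_COLUMNS = ["mn", "tu", "we", "th", "fr", "sa", "su"]

sample = datetime(2018, 4, 21, 13, 0)  # the date shown in the outputs above
day_number = sample.weekday()
print(day_number, DAY_COLUMNS[day_number])  # 5 sa
```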


Finally, the data cleaning is done!

Now, let's write an `is_open` function that tells you whether the library at a given index is open. We get the current date and time with `datetime.now()` and check whether it falls after that day's opening time and before its closing time. Note that the cleaned times carry the date on which `convert_time` ran, so the cleaning cells need to be re-run on the day you check.

In [119]:
def is_open(index, now=None, data=library_data):
    # Default arguments are evaluated once, at definition time, so read the clock per call.
    if now is None:
        now = datetime.now()
    hours = data.loc[index, now.weekday()]
    open_time  = hours["opening"]
    close_time = hours["closing"]

    if (open_time not in ["closed", "nan"]) and (close_time not in ["closed", "nan"]):
        has_opened     = now > open_time
        has_not_closed = now < close_time
        return (has_opened and has_not_closed)
    else:
        return False
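
One detail about a `now` parameter is worth a quick demonstration: Python evaluates a default such as `datetime.now()` once, when the function is defined, not on each call. The sketch below uses two hypothetical functions, `frozen` and `fresh`, to show the difference.

```python
from datetime import datetime
import time

def frozen(now=datetime.now()):
    # The default was computed when this function was defined.
    return now

def fresh(now=None):
    # A None default lets each call read the clock for itself.
    return datetime.now() if now is None else now

first = frozen()
time.sleep(0.05)
print(frozen() == first)  # True: the default never advances
print(fresh() > first)    # True: a new timestamp on every call
```

In a notebook this matters because a cell defining the function may have been run long before the cell that calls it.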

All right! After all this buildup, let's see which Queens libraries are open right now. Using a list comprehension, we loop over every library name in the table's index and keep the ones for which `is_open` returns `True`.

In [120]:
[library for library in list(library_data.index.values) if is_open(library)]

['Flushing']
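
Since the result above depends on when the cell is run, a fixed timestamp makes the same filtering idea reproducible. This is a minimal sketch on a made-up schedule: the branch names, hours, and the `open_branches` helper are all hypothetical and stand in for `library_data` and `is_open`.

```python
from datetime import datetime

# Hypothetical schedule: branch name -> (opening, closing), or None if closed.
schedule = {
    "Branch A": (datetime(2018, 4, 21, 10, 0), datetime(2018, 4, 21, 17, 30)),
    "Branch B": None,
    "Branch C": (datetime(2018, 4, 21, 13, 0), datetime(2018, 4, 21, 20, 0)),
}

def open_branches(schedule, now):
    """Return the names of branches whose hours contain `now`."""
    return [name for name, hours in schedule.items()
            if hours is not None and hours[0] < now < hours[1]]

print(open_branches(schedule, datetime(2018, 4, 21, 12, 0)))  # ['Branch A']
print(open_branches(schedule, datetime(2018, 4, 21, 18, 0)))  # ['Branch C']
```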