## Introduction
Erwin de Leon and Joseph Schilling introduce their April 2017 Urban Institute research report ["Urban Blight and Public Health: Addressing the Impact of Substandard Housing, Abandoned Buildings, and Vacant Lots"](https://www.urban.org/research/publication/urban-blight-and-public-health) with the following statement: "We spend more than 2/3rds of our time where we live; thus, housing and neighborhood conditions affect our individual and family's well-being". They also discuss how poor economic conditions result in "increasing inventories of vacant homes and abandoned buildings". For example, the authors cite the Detroit Blight Removal Task Force's 2014 "strategic plan to address more than 80,000 derelict structures and vacant lots".

This Jupyter notebook illustrates the application of the Python programming language to address three questions regarding the [City of Detroit's Blight Violation Notices (BVN) public domain dataset](https://data.detroitmi.gov/Property-Parcels/Blight-Violations/ti6p-wcg4):  
1. Is the number of blight violations per month increasing, decreasing, or staying constant over time?  
2. Is the number of blight violations in collection increasing, decreasing, or staying constant over time?  
3. For 2017, are blight violation notices clustered as a function of latitude and longitude?  

## Import Dependencies

In [198]:
import numpy as np
import pandas as pd
from csv import DictReader
import os
import urllib.request
import re
import matplotlib.pyplot as plt
import seaborn as sns
from itertools import zip_longest

%load_ext pycodestyle_magic
%matplotlib inline



## Prepare Data  
- [Download *.csv from URL](https://stackoverflow.com/questions/41992223/download-csv-from-web-service-with-python-3)  
- [Verify PEP8 in IPython notebook](https://stackoverflow.com/questions/26126853/verifying-pep8-in-ipython-notebook-code)  
- [Tidy Data](https://www.jstatsoft.org/article/view/v059i10)  
- [How to read file n lines at a time in Python](https://stackoverflow.com/questions/5832856/how-to-read-file-n-lines-at-a-time-in-python)  
- [Delete last item in a list](https://stackoverflow.com/questions/18169965/how-to-delete-last-item-in-list)  
- [Remove comma from the end of a line](https://stackoverflow.com/questions/2774558/how-do-i-strip-the-comma-from-the-end-of-a-string-in-python)  
- [Appending row to Pandas DataFrame can be slow](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.append.html)  
- [DictReader parse string](https://stackoverflow.com/questions/31658115/python-csv-dictreader-parse-string)  
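
The note above about slow row-by-row appends can be illustrated with a minimal sketch. The column names and values below are made up for illustration and are not taken from the BVN dataset; the point is only that rows are accumulated in a plain Python list and the DataFrame is constructed once at the end, rather than grown one row at a time.

```python
import pandas as pd

# Hypothetical records; the values are illustrative only
records = [
    {"ticketid": 1, "fineamount": 250.0},
    {"ticketid": 2, "fineamount": 100.0},
    {"ticketid": 3, "fineamount": 50.0},
]

# Accumulate rows in a list, where appends are cheap ...
rows = []
for record in records:
    rows.append(list(record.values()))

# ... then construct the DataFrame in a single call
example_df = pd.DataFrame(rows, columns=["ticketid", "fineamount"])
print(example_df.shape)
```

The loading cell later in this notebook follows the same pattern: it appends each formatted row to a list and calls `pd.DataFrame` once.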

In [241]:
#%%pycodestyle
csv_url = "https://data.detroitmi.gov/api/views/ti6p-wcg4" +\
          "/rows.csv?accessType=DOWNLOAD"

data_dir = './Data'
csv_file = os.path.join(data_dir, 'rows.csv')

# Create the data directory if needed, then download the file only once
os.makedirs(data_dir, exist_ok=True)

if not os.path.exists(csv_file):
    urllib.request.urlretrieve(csv_url, csv_file)

In [239]:
#%%pycodestyle
def format_columns(h_file):
    """
    Formats the columns of the City of Detroit Blight Violation Notices
    (BVN) public domain dataset

    INPUT:
        h_file: BVN *.csv file handle

    OUTPUT:
        columns: formatted list of BVN *.csv file columns
    """
    columns = h_file.readline().strip()

    columns = [elem.lower() for elem in columns.split(',')]

    columns = [re.sub("[\\s\\(\\)-]", '', elem) for elem in columns]

    del columns[-1]

    return columns


def grouper(iterable, n, fillvalue=None):
    """
    Implements the "grouper() itertools recipe"

    Reference:
    --------
    https://stackoverflow.com/questions/5832856/
        how-to-read-file-n-lines-at-a-time-in-python

    INPUT:
        iterable: Iterable Python object

        n: Number of elements to group

        fillvalue: (Optional) Missing group element fill value

    OUTPUT:
        grouped: Object that stores a group of n elements
    """
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


def parse_string(string_value):
    """
    If the input string value is empty, returns "NULL". Otherwise,
    returns the input string value

    INPUT:
        string_value: String value that may be empty

    OUTPUT:
        formatted_string_value: "NULL" if the input string value
            is empty. Otherwise, is set to the input string value
    """
    if string_value == '':
        return "NULL"
    else:
        return string_value


def parse_int(int_string):
    """Parses an integer value stored in a string

    INPUT:
        int_string: String that stores an integer value

    OUTPUT:
        int_value: Integer value stored in a string"""
    try:
        int_value = int(int_string)
    except ValueError:
        int_value = np.nan

    return int_value


def parse_float(float_string):
    """Parses a floating point value stored in a string

    INPUT:
        float_string: String that stores a floating point value

    OUTPUT:
        float_value: Floating point value stored in a string"""
    try:
        float_value = float(float_string)
    except ValueError:
        float_value = np.nan

    return float_value


class RowFormatter(object):
    """
    Defines the format of a City of Detroit Blight Violation Notices
    dataset row
    """

    def __init__(self):
        """BVN dataset row formatter class object constructor

        INPUT:
            self: RowFormatter class object reference

        OUTPUT:
            self: RowFormatter class object reference
        """
        # Map each column name to the function that parses its values
        self.conversion = {
            'ticketid': parse_int,
            'ticketnumber': parse_string,
            'agencyname': parse_string,
            'inspectorname': parse_string,
            'violatorname': parse_string,
            'violatorid': parse_int,
            'violationstreetnumber': parse_int,
            'violationstreetname': parse_string,
            'violationzipcode': parse_int,
            'mailingaddressstreetnumber': parse_int,
            'mailingaddressstreetname': parse_string,
            'mailingaddresscity': parse_string,
            'mailingaddressstate': parse_string,
            'mailingaddresszipcode': parse_int,
            'mailingaddressnonusacode': parse_int,
            'mailingaddresscountry': parse_string,
            'violationdate': parse_string,
            'ticketissuedtime': parse_string,
            'hearingdate': parse_string,
            'hearingtime': parse_string,
            'violationcode': parse_string,
            'violationdescription': parse_string,
            'disposition': parse_string,
            'fineamount': parse_float,
            'adminfee': parse_float,
            'statefee': parse_float,
            'latefee': parse_float,
            'discountamount': parse_float,
            'cleanupcost': parse_float,
            'judgmentamounttotaldue': parse_float,
            'paymentamountsumofallpayments': parse_float,
            'balancedue': parse_float,
            'paymentdatemostrecent': parse_string,
            'paymentstatus': parse_string,
            'collectionstatus': parse_string,
            'violationaddress': parse_string,
            'violationparcelid': parse_string,
            'violationlatitude': parse_float,
            'violationlongitude': parse_float,
        }

    def format_row(self,
                   row):
        """
        Returns an ordered dictionary that stores a formatted
        BVN dataset row

        INPUT:
            self: RowFormatter class reference

            row: String that stores a BVN dataset row

        OUTPUT:
            row_dict: Ordered dictionary that stores a formatted
                      BVN dataset row
        """
        reader =\
            DictReader([row],
                       fieldnames=self.conversion.keys())

        row_dict = next(reader)

        for key in row_dict.keys():
            row_dict[key] = self.conversion[key](row_dict[key])

        return row_dict
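
`format_row` parses each row with `DictReader` rather than `str.split(',')`. A likely reason is that some fields, such as the agency name, contain commas inside double quotes. The following sketch uses a shortened three-column row and an assumed subset of field names to show the difference; it is an illustration, not the notebook's full 39-column conversion table.

```python
from csv import DictReader

# Shortened, illustrative row: the quoted agency name contains commas
row = '64886,"Buildings, Safety Engineering & Env Department",250.0'
fieldnames = ["ticketid", "agencyname", "fineamount"]  # assumed subset of columns

# A naive split breaks the quoted field into separate pieces
naive = row.split(",")
print(len(naive))

# DictReader honours the quoting and returns one value per field
parsed = next(DictReader([row], fieldnames=fieldnames))
print(parsed["agencyname"])

# Per-column conversion, in the spirit of RowFormatter.conversion
converters = {"ticketid": int, "agencyname": str, "fineamount": float}
typed = {key: converters[key](value) for key, value in parsed.items()}
print(typed)
```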

In [234]:
with open("./Data/rowsSubset.csv") as h_file:
    formatter_obj = RowFormatter()

    columns = format_columns(h_file)

    rows = []
    for lines in grouper(h_file, 2, ''):
        cur_line = re.sub('"location', '', lines[0].strip())
        cur_line = cur_line.rstrip(',')
        
        row_dict = formatter_obj.format_row(cur_line)

        rows.append(list(row_dict.values()))

bvn_df = pd.DataFrame(rows, columns=columns)
bvn_df

| | ticketid | ticketnumber | agencyname | inspectorname | violatorname | violatorid | violationstreetnumber | violationstreetname | violationzipcode | mailingaddressstreetnumber | ... | judgmentamounttotaldue | paymentamountsumofallpayments | balancedue | paymentdatemostrecent | paymentstatus | collectionstatus | violationaddress | violationparcelid | violationlatitude | violationlongitude |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 64886 | 06002195DAH | Buildings, Safety Engineering & Env Department | John Morris | JARRAL HENSON | 41636 | 18504 | BLOOM | | 20416 | ... | 1030.0 | | 1030.0 | | | | 18504 BLOOM | 13015190.001 | 42.429568 | -83.048822 |
| 1 | 248811 | 10022360DAH | Buildings, Safety Engineering & Env Department | John Morris | BERNICE THIGPEN | 224399 | 7663 | EPWORTH | | 411 | ... | 280.0 | | 280.0 | | | | 7663 EPWORTH | 16014949. | 42.352629 | -83.130627 |
| 2 | 429537 | 18027732DAH | Buildings, Safety Engineering & Env Department | Ernest Thompson | REGAL INVESMENTS LLC | 395993 | 10067 | GREENSBORO | 48224.0 | 5550 | ... | 0.0 | | 0.0 | | | | 10067 GREENSBORO | 21065362. | 42.412203 | -82.955854 |
| 3 | 279800 | 11020105DAH | Buildings, Safety Engineering & Env Department | Louis Smith | ... dpc holdings llc | 250375 | 9800 | GRAND RIVER | | 5404 | ... | 0.0 | 0.0 | 0.0 | | NO PAYMENT DUE | | 9800 GRAND RIVER | 16005180-9 | 42.369278 | -83.139438 |
| 4 | 429706 | 18023498DAH | Buildings, Safety Engineering & Env Department | Eric Kuuttila | MGC PROPERTIES INC | 396162 | 5910 | COVENTRY | 48224.0 | 6689 | ... | 0.0 | | 0.0 | | | | 5910 COVENTRY | | 42.44466 | -83.101793 |
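
The loading loop reads the file two lines at a time and keeps only the first line of each pair. This suggests that each record occupies two physical lines, presumably because the trailing location field contains a line break; that is an inference from the code, not something the source states. A small in-memory sketch of the pattern, with invented file content and `str.replace` standing in for the `re.sub` call:

```python
from io import StringIO
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Collect items from iterable into tuples of length n."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


# Invented stand-in for the CSV: a header, then records that each span two lines
fake_file = StringIO(
    "Ticket ID,Fine Amount,Location\n"
    '1,250.0,"location\n'
    '(42.4, -83.0)"\n'
    '2,100.0,"location\n'
    '(42.3, -83.1)"\n'
)

# Consume the header first, as format_columns does
header = fake_file.readline().strip()

first_lines = []
for lines in grouper(fake_file, 2, ""):
    # Keep the first physical line and drop the dangling '"location' fragment
    cur_line = lines[0].strip().replace('"location', "").rstrip(",")
    first_lines.append(cur_line)

print(first_lines)
```

Because `grouper` reuses a single iterator `n` times, each tuple consumes `n` consecutive lines, so the second line of every record is read and then discarded.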


In [242]:
bvn_df.iloc[0]

ticketid                                                                     64886
ticketnumber                                                           06002195DAH
agencyname                          Buildings, Safety Engineering & Env Department
inspectorname                                                          John Morris
violatorname                                                         JARRAL HENSON
violatorid                                                                   41636
violationstreetnumber                                                        18504
violationstreetname                                                          BLOOM
violationzipcode                                                               NaN
mailingaddressstreetnumber                                                   20416
mailingaddressstreetname                                                   CAMERON
mailingaddresscity                                                         DETROIT
mail