# Alachua County restaurant inspection analysis
This notebook takes restaurant inspection data from the state of Florida and formats it in a more reader-friendly way for publication in print and online. We'll filter for the most egregious current violations at restaurants in Alachua County.

__After importing Pandas and Datetime, this reads in the state's year-to-date summary report for District 5, which includes Alachua County, and handles the exception raised if the file cannot be found (the message probably needs to be stored in a variable so it can be written into the output file). The raw file has no headers and 82 columns, so this keeps only five of the columns and adds headers for them. Finally, it displays the first five rows of values.__

In [1]:
import pandas as pd
import datetime
import numpy as np

In [2]:
try:
    insp = pd.read_csv("ftp://dbprftp.state.fl.us/pub/llweb/5fdinspi.csv",
                       usecols=[2,14,18,80,81])
except IOError:
    print "The file is not accessible."
    raise ## stop here; every cell below needs insp
insp.columns = ["CountyName", "InspectDate", "NumHighVio", "LicenseID", "VisitID"]

insp.head() ## this can go away later

| | CountyName | InspectDate | NumHighVio | LicenseID | VisitID |
|---|---|---|---|---|---|
| 0 | Alachua | 11/27/2017 | 0.0 | 3713828 | 6267656 |
| 1 | Alachua | 01/11/2018 | 0.0 | 3713828 | 6432746 |
| 2 | Alachua | 11/27/2017 | 0.0 | 3713765 | 6267651 |
| 3 | Alachua | 12/13/2017 | 0.0 | 3713820 | 6267655 |
| 4 | Alachua | 03/28/2018 | 0.0 | 5399007 | 6510609 |
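
By default `read_csv` treats the first row of a file as column names, which matters for a file with no header row. The following is a minimal sketch of reading a headerless file by column position. It uses a made-up three-column sample rather than the real 82-column report, and the `sample` and `demo` names are assumptions for illustration only.

```python
import io
import pandas as pd

# Hypothetical stand-in for the headerless state file: three columns, no header row.
sample = io.StringIO(
    u"5,Alachua,11/27/2017\n"
    u"5,Alachua,01/11/2018\n"
    u"5,Bradford,12/13/2017\n"
)

# header=None keeps the first record as data instead of using it as column names;
# usecols selects columns by position.
demo = pd.read_csv(sample, header=None, usecols=[1, 2])

# Label the selected columns in file order, as the notebook does.
demo.columns = ["CountyName", "InspectDate"]

print(len(demo))
```

Whether the real report needs `header=None` depends on whether its first line is a record, which is worth checking against the raw file.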


__This creates a DataFrame named 'alachua' that keeps only the records for Alachua County:__

In [3]:
alachua = insp[insp.CountyName == 'Alachua']

In [4]:
alachua.info() ## this can go away later

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1512 entries, 0 to 1511
Data columns (total 5 columns):
CountyName     1512 non-null object
InspectDate    1512 non-null object
NumHighVio     1512 non-null float64
LicenseID      1512 non-null int64
VisitID        1512 non-null int64
dtypes: float64(1), int64(2), object(2)
memory usage: 70.9+ KB


__This drops the rows where NumHighVio is 0:__ _(to do: combine this filter with the county filter above)_

In [5]:
alachua = alachua[alachua.NumHighVio > 0].copy() ## copy so the column assignment below doesn't trigger SettingWithCopyWarning

In [6]:
alachua.info() ## this can go away later

<class 'pandas.core.frame.DataFrame'>
Int64Index: 759 entries, 5 to 1498
Data columns (total 5 columns):
CountyName     759 non-null object
InspectDate    759 non-null object
NumHighVio     759 non-null float64
LicenseID      759 non-null int64
VisitID        759 non-null int64
dtypes: float64(1), int64(2), object(2)
memory usage: 35.6+ KB
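
The county filter and the violation filter can probably be combined into a single boolean mask with `&`. This is a sketch on made-up rows shaped like `insp`; the `demo` and `high_alachua` names are assumptions, not part of the notebook.

```python
import pandas as pd

# Made-up rows with the same column names as the notebook's insp DataFrame.
demo = pd.DataFrame({
    "CountyName": ["Alachua", "Alachua", "Bradford"],
    "NumHighVio": [0.0, 2.0, 1.0],
})

# Each comparison needs its own parentheses because & binds more tightly
# than == and >.
mask = (demo.CountyName == "Alachua") & (demo.NumHighVio > 0)

# .copy() gives an independent DataFrame that is safe to modify later.
high_alachua = demo[mask].copy()

print(len(high_alachua))
```

Only the second row passes both conditions, so a single step replaces the two separate assignments.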


__Convert the 'InspectDate' strings (dtype object) to datetime values:__

In [7]:
alachua['InspectDate'] = pd.to_datetime(alachua['InspectDate'], format='%m/%d/%Y') ## done after dataframe reduced; raw dates are MM/DD/YYYY
alachua.head() ## this can go away later

| | CountyName | InspectDate | NumHighVio | LicenseID | VisitID |
|---|---|---|---|---|---|
| 5 | Alachua | 2018-03-21 | 1.0 | 5399007 | 6509808 |
| 6 | Alachua | 2018-03-20 | 2.0 | 5399007 | 6280880 |
| 24 | Alachua | 2017-11-22 | 1.0 | 6621480 | 6306950 |
| 26 | Alachua | 2018-04-30 | 1.0 | 6621480 | 6433038 |
| 30 | Alachua | 2017-07-21 | 1.0 | 6381936 | 6302632 |
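
If the state file ever contains a malformed date, the conversion would raise an error. One possible safeguard, sketched here on made-up values, is to pass `errors="coerce"` so bad entries become `NaT` and can be counted. The `raw_dates` series and its third entry are invented for illustration.

```python
import pandas as pd

# Two dates in the raw file's MM/DD/YYYY layout plus one deliberately bad value.
raw_dates = pd.Series(["11/27/2017", "01/11/2018", "not a date"])

# An explicit format avoids guessing; errors="coerce" turns unparseable
# entries into NaT instead of raising.
parsed = pd.to_datetime(raw_dates, format="%m/%d/%Y", errors="coerce")

# Count how many entries failed to parse.
bad = parsed.isnull().sum()
print(bad)
```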


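__Set the start and end of the time range to the past week:__ _(boolean indexing returns a new DataFrame, so the result has to be assigned to a name; 'alachua' itself keeps all 759 rows, as the output shows)_

In [17]:
today = pd.to_datetime('today')
lastweek = today - pd.Timedelta(days=7) ## a Timestamp, so it compares cleanly with InspectDate

recent = alachua[(alachua['InspectDate'] > lastweek) & (alachua['InspectDate'] < today)]

alachua.info() ## alachua is unchanged; the filtered rows are in recent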

<class 'pandas.core.frame.DataFrame'>
Int64Index: 759 entries, 5 to 1498
Data columns (total 5 columns):
CountyName     759 non-null object
InspectDate    759 non-null datetime64[ns]
NumHighVio     759 non-null float64
LicenseID      759 non-null int64
VisitID        759 non-null int64
dtypes: datetime64[ns](1), float64(1), int64(2), object(1)
memory usage: 35.6+ KB


__User input for the start and stop of the time range:__ _(as above, the filtered result has to be assigned to a name; 'alachua' itself keeps all 759 rows, as the output shows)_

In [9]:
startDate = pd.to_datetime(raw_input("Enter start: "))
endDate = pd.to_datetime(raw_input("Enter end: "))

selected = alachua[(alachua['InspectDate'] > startDate) & (alachua['InspectDate'] < endDate)]

alachua.info() ## alachua is unchanged; the filtered rows are in selected

Enter start: 2018-01-01
Enter end: 2018-04-01
<class 'pandas.core.frame.DataFrame'>
Int64Index: 759 entries, 5 to 1498
Data columns (total 5 columns):
CountyName     759 non-null object
InspectDate    759 non-null datetime64[ns]
NumHighVio     759 non-null float64
LicenseID      759 non-null int64
VisitID        759 non-null int64
dtypes: datetime64[ns](1), float64(1), int64(2), object(1)
memory usage: 35.6+ KB
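
Both range filters follow the same pattern, so they could share one helper that returns the filtered rows. This is a sketch under assumptions: `filter_by_date` is a hypothetical name, and the three demo rows reuse dates and visit IDs from the table shown earlier.

```python
import pandas as pd

def filter_by_date(df, start, end, column="InspectDate"):
    """Return the rows of df whose `column` falls strictly between start and end."""
    start = pd.Timestamp(start)
    end = pd.Timestamp(end)
    mask = (df[column] > start) & (df[column] < end)
    # Boolean indexing builds a new DataFrame; df itself is left untouched.
    return df[mask]

demo = pd.DataFrame({
    "InspectDate": pd.to_datetime(["2017-11-22", "2018-03-20", "2018-04-30"]),
    "VisitID": [6306950, 6280880, 6433038],
})

# The caller must keep the return value; ignoring it discards the filter.
in_range = filter_by_date(demo, "2018-01-01", "2018-04-01")
print(len(in_range))
```

The same helper would cover the past-week case by passing `pd.to_datetime('today') - pd.Timedelta(days=7)` as `start`.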


__This loop takes each row's LicenseID and VisitID and inserts them into the URL of the detailed inspection report:__

In [18]:
for index, row in alachua.iterrows():
    visitID = row['VisitID']
    licID = row['LicenseID']
    url = "https://www.myfloridalicense.com/inspectionDetail.asp?InspVisitID=%s&licid=%s" % (visitID, licID)
    print url

https://www.myfloridalicense.com/inspectionDetail.asp?InspVisitID=6509808&licid=5399007
https://www.myfloridalicense.com/inspectionDetail.asp?InspVisitID=6280880&licid=5399007
https://www.myfloridalicense.com/inspectionDetail.asp?InspVisitID=6306950&licid=6621480
https://www.myfloridalicense.com/inspectionDetail.asp?InspVisitID=6433038&licid=6621480
https://www.myfloridalicense.com/inspectionDetail.asp?InspVisitID=6302632&licid=6381936
https://www.myfloridalicense.com/inspectionDetail.asp?InspVisitID=6489116&licid=6943776
https://www.myfloridalicense.com/inspectionDetail.asp?InspVisitID=6349967&licid=6381936
https://www.myfloridalicense.com/inspectionDetail.asp?InspVisitID=6377421&licid=3638712
https://www.myfloridalicense.com/inspectionDetail.asp?InspVisitID=6396972&licid=6767936
https://www.myfloridalicense.com/inspectionDetail.asp?InspVisitID=6341203&licid=6767936
https://www.myfloridalicense.com/inspectionDetail.asp?InspVisitID=6343120&licid=2129032
https://www.myfloridalicense.com
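
For the scraping step it may be handier to collect the URLs in a list than to print them. A minimal sketch, assuming a hypothetical `detail_url` helper and a two-row demo frame built from IDs shown in the output above:

```python
import pandas as pd

# URL pattern taken from the loop above, with named placeholders.
DETAIL_URL = ("https://www.myfloridalicense.com/inspectionDetail.asp"
              "?InspVisitID={visit_id}&licid={lic_id}")

def detail_url(visit_id, lic_id):
    """Build the detailed-report URL for one inspection visit."""
    return DETAIL_URL.format(visit_id=visit_id, lic_id=lic_id)

demo = pd.DataFrame({
    "LicenseID": [5399007, 6621480],
    "VisitID": [6509808, 6306950],
})

# zip over the two columns avoids the per-row overhead of iterrows().
urls = [detail_url(v, l) for v, l in zip(demo["VisitID"], demo["LicenseID"])]
print(urls[0])
```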

__Time for some Beautiful Soup.__