In [2]:
(require '[clojupyter.javascript.alpha :as cjp-js])
(require '[clojupyter.display :as display])
(require '[clojupyter.misc.helper :as helper])
(require '[clojure.data.json :as json])
(helper/add-dependencies '[org.clojure/data.csv "1.0.0"])
(require '[clojure.data.csv :as csv])
(helper/add-dependencies '[metasoarous/oz "1.5.6"])
(require '[oz.notebook.clojupyter :as oz])
(require '[clojure.java.io :as io])
(require '[clojure.pprint :as pp])
(helper/add-dependencies '[clojure.java-time "0.3.2"])
(require '[java-time :as t])
(require '[clojure.edn :as edn])
(helper/add-dependencies '[panthera "0.1-alpha.13"])
(require '[libpython-clj.python :as py])
(require '[panthera.panthera :as pt])

nil

In [3]:
;; use panthera html display
(defn show
  [obj]
  (display/html
    (py/call-attr obj "to_html")))

(defn show-table
  [m]
  (-> m
      pt/data-frame
      show))

(show-table [{:a 1 :b 2} {:a 3 :b 4}])

Unnamed: 0,a,b
0,1,2
1,3,4


Let's continue with our NYC 311 service requests example.

In [4]:
;; Python: complaints = pd.read_csv('../data/311-service-requests.csv')

;; read data in
(def raw-data
    (with-open [reader (io/reader "../data/311-service-requests.csv")]
      (doall
        (csv/read-csv reader))))

(require '[clojure.string :as str])

(defn blank->nil [s]
  (when-not (str/blank? s) s))

(defn csv-data->maps [csv-data]
  (map zipmap
       (->> (first csv-data) ;; First row is the header
            (map keyword) ;; Drop if you want string keys instead
            repeat)
       (->> (rest csv-data)
            (map #(map blank->nil %))))) ;; Drop if you want blank strings to stay

(def complaints (csv-data->maps raw-data))

#'user/complaints
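
To make the shape of `complaints` concrete, here is a small sketch that applies the same header-zipping idea to a tiny in-memory table instead of the real CSV file. The sample rows and the names `sample-csv` and `rows->maps` are made up for illustration; only `clojure.string` from the standard library is assumed.

```clojure
(require '[clojure.string :as str])

;; Hypothetical stand-in for what csv/read-csv returns: a sequence of
;; string vectors, with the header as the first row.
(def sample-csv
  [["Complaint Type" "Borough"]
   ["Noise - Street/Sidewalk" "QUEENS"]
   ["Rodent" ""]])

(defn rows->maps
  "Turns header + rows into a sequence of maps keyed by keyword,
  replacing blank strings with nil."
  [[header & rows]]
  (let [ks (map keyword header)]
    (map (fn [row]
           (zipmap ks (map #(when-not (str/blank? %) %) row)))
         rows)))

(rows->maps sample-csv)
```

Because column names such as "Complaint Type" contain a space, the resulting keys cannot be written as keyword literals, which is why the rest of this chapter builds them with `(keyword "Complaint Type")`.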

# 3.1 Selecting only noise complaints

I'd like to know which borough has the most noise complaints. First, let's look at the first few rows of the data:

In [5]:
(->> complaints
     (take 5)
     show-table)

Unnamed: 0,Road Ramp,Resolution Action Updated Date,Bridge Highway Name,Park Facility Name,School Number,Park Borough,Taxi Pick Up Location,Address Type,Due Date,Bridge Highway Segment,School Phone Number,Cross Street 1,Vehicle Type,Bridge Highway Direction,Complaint Type,Y Coordinate (State Plane),City,School Address,Intersection Street 2,School State,Agency,Unique Key,School Code,Intersection Street 1,School Zip,Descriptor,Borough,Street Name,Incident Zip,Longitude,Agency Name,Community Board,Incident Address,Facility Type,Latitude,School City,School Region,Closed Date,Location Type,Ferry Terminal Name,Landmark,Ferry Direction,X Coordinate (State Plane),Location,School Name,Created Date,Cross Street 2,Garage Lot Name,School Not Found,Taxi Company Borough,School or Citywide Complaint,Status
0,,10/31/2013 02:35:17 AM,,Unspecified,Unspecified,QUEENS,,ADDRESS,10/31/2013 10:08:41 AM,,Unspecified,90 AVENUE,,,Noise - Street/Sidewalk,197389,JAMAICA,Unspecified,,Unspecified,NYPD,26589651,Unspecified,,Unspecified,Loud Talking,QUEENS,169 STREET,11432,-73.79160395779721,New York City Police Department,12 QUEENS,90-03 169 STREET,Precinct,40.70827532593202,Unspecified,Unspecified,,Street/Sidewalk,,,,1042027,"(40.70827532593202, -73.79160395779721)",Unspecified,10/31/2013 02:08:41 AM,91 AVENUE,,N,,,Assigned
1,,,,Unspecified,Unspecified,QUEENS,,BLOCKFACE,10/31/2013 10:01:04 AM,,Unspecified,58 PLACE,,,Illegal Parking,201984,MASPETH,Unspecified,,Unspecified,NYPD,26593698,Unspecified,,Unspecified,Commercial Overnight Parking,QUEENS,58 AVENUE,11378,-73.90945306791765,New York City Police Department,05 QUEENS,58 AVENUE,Precinct,40.72104053562831,Unspecified,Unspecified,,Street/Sidewalk,,,,1009349,"(40.721040535628305, -73.90945306791765)",Unspecified,10/31/2013 02:01:04 AM,59 STREET,,N,,,Open
2,,10/31/2013 02:39:42 AM,,Unspecified,Unspecified,MANHATTAN,,ADDRESS,10/31/2013 10:00:24 AM,,Unspecified,WEST 171 STREET,,,Noise - Commercial,246531,NEW YORK,Unspecified,,Unspecified,NYPD,26594139,Unspecified,,Unspecified,Loud Music/Party,MANHATTAN,BROADWAY,10032,-73.93914371913482,New York City Police Department,12 MANHATTAN,4060 BROADWAY,Precinct,40.84332975466513,Unspecified,Unspecified,10/31/2013 02:40:32 AM,Club/Bar/Restaurant,,,,1001088,"(40.84332975466513, -73.93914371913482)",Unspecified,10/31/2013 02:00:24 AM,WEST 172 STREET,,N,,,Closed
3,,10/31/2013 02:21:10 AM,,Unspecified,Unspecified,MANHATTAN,,BLOCKFACE,10/31/2013 09:56:23 AM,,Unspecified,COLUMBUS AVENUE,,,Noise - Vehicle,222727,NEW YORK,Unspecified,,Unspecified,NYPD,26595721,Unspecified,,Unspecified,Car/Truck Horn,MANHATTAN,WEST 72 STREET,10023,-73.98021349023975,New York City Police Department,07 MANHATTAN,WEST 72 STREET,Precinct,40.7780087446372,Unspecified,Unspecified,10/31/2013 02:21:48 AM,Street/Sidewalk,,,,989730,"(40.7780087446372, -73.98021349023975)",Unspecified,10/31/2013 01:56:23 AM,AMSTERDAM AVENUE,,N,,,Closed
4,,10/31/2013 01:59:54 AM,,Unspecified,Unspecified,MANHATTAN,,BLOCKFACE,11/30/2013 01:53:44 AM,,Unspecified,LENOX AVENUE,,,Rodent,233545,NEW YORK,Unspecified,,Unspecified,DOHMH,26590930,Unspecified,,Unspecified,Condition Attracting Rodents,MANHATTAN,WEST 124 STREET,10027,-73.94738703491433,Department of Health and Mental Hygiene,10 MANHATTAN,WEST 124 STREET,,40.80769092704951,Unspecified,Unspecified,,Vacant Lot,,,,998815,"(40.80769092704951, -73.94738703491433)",Unspecified,10/31/2013 01:53:44 AM,ADAM CLAYTON POWELL JR BOULEVARD,,N,,,Pending


To get the street noise complaints, we need to keep only the rows whose "Complaint Type" column is "Noise - Street/Sidewalk". Here is how to do that with `filter`:

In [6]:
(->> complaints
    (filter (comp #{"Noise - Street/Sidewalk"} (keyword "Complaint Type")))
    (take 5)
    show-table)

Unnamed: 0,Road Ramp,Resolution Action Updated Date,Bridge Highway Name,Park Facility Name,School Number,Park Borough,Taxi Pick Up Location,Address Type,Due Date,Bridge Highway Segment,School Phone Number,Cross Street 1,Vehicle Type,Bridge Highway Direction,Complaint Type,Y Coordinate (State Plane),City,School Address,Intersection Street 2,School State,Agency,Unique Key,School Code,Intersection Street 1,School Zip,Descriptor,Borough,Street Name,Incident Zip,Longitude,Agency Name,Community Board,Incident Address,Facility Type,Latitude,School City,School Region,Closed Date,Location Type,Ferry Terminal Name,Landmark,Ferry Direction,X Coordinate (State Plane),Location,School Name,Created Date,Cross Street 2,Garage Lot Name,School Not Found,Taxi Company Borough,School or Citywide Complaint,Status
0,,10/31/2013 02:35:17 AM,,Unspecified,Unspecified,QUEENS,,ADDRESS,10/31/2013 10:08:41 AM,,Unspecified,90 AVENUE,,,Noise - Street/Sidewalk,197389,JAMAICA,Unspecified,,Unspecified,NYPD,26589651,Unspecified,,Unspecified,Loud Talking,QUEENS,169 STREET,11432,-73.79160395779721,New York City Police Department,12 QUEENS,90-03 169 STREET,Precinct,40.70827532593202,Unspecified,Unspecified,,Street/Sidewalk,,,,1042027,"(40.70827532593202, -73.79160395779721)",Unspecified,10/31/2013 02:08:41 AM,91 AVENUE,,N,,,Assigned
1,,10/31/2013 02:07:14 AM,,Unspecified,Unspecified,STATEN ISLAND,,ADDRESS,10/31/2013 08:54:03 AM,,Unspecified,HENDERSON AVENUE,,,Noise - Street/Sidewalk,171076,STATEN ISLAND,Unspecified,,Unspecified,NYPD,26594086,Unspecified,,Unspecified,Loud Music/Party,STATEN ISLAND,CAMPBELL AVENUE,10310,-74.1161500428337,New York City Police Department,01 STATEN ISLAND,173 CAMPBELL AVENUE,Precinct,40.63618202176914,Unspecified,Unspecified,10/31/2013 02:16:39 AM,Street/Sidewalk,,,,952013,"(40.63618202176914, -74.1161500428337)",Unspecified,10/31/2013 12:54:03 AM,WINEGAR LANE,,N,,,Closed
2,,10/31/2013 01:45:17 AM,,Unspecified,Unspecified,STATEN ISLAND,,ADDRESS,10/31/2013 08:35:18 AM,,Unspecified,HAMPTON GREEN,,,Noise - Street/Sidewalk,140964,STATEN ISLAND,Unspecified,,Unspecified,NYPD,26591573,Unspecified,,Unspecified,Loud Talking,STATEN ISLAND,PRINCETON LANE,10312,-74.19674315017886,New York City Police Department,03 STATEN ISLAND,24 PRINCETON LANE,Precinct,40.55342078716953,Unspecified,Unspecified,10/31/2013 02:41:35 AM,Street/Sidewalk,,,,929577,"(40.55342078716953, -74.19674315017886)",Unspecified,10/31/2013 12:35:18 AM,DEAD END,,N,,,Closed
3,,10/31/2013 02:00:57 AM,,Unspecified,Unspecified,MANHATTAN,,ADDRESS,10/31/2013 08:32:08 AM,,Unspecified,LENOX AVENUE,,,Noise - Street/Sidewalk,231613,NEW YORK,Unspecified,,Unspecified,NYPD,26594085,Unspecified,,Unspecified,Loud Talking,MANHATTAN,WEST 116 STREET,10026,-73.95052644123253,New York City Police Department,10 MANHATTAN,121 WEST 116 STREET,Precinct,40.80238950799943,Unspecified,Unspecified,,Street/Sidewalk,,,,997947,"(40.80238950799943, -73.95052644123253)",Unspecified,10/31/2013 12:32:08 AM,7 AVENUE,,N,,,Assigned
4,,,,Unspecified,Unspecified,BROOKLYN,,BLOCKFACE,10/31/2013 08:30:36 AM,,Unspecified,EAST 80 STREET,,,Noise - Street/Sidewalk,170310,BROOKLYN,Unspecified,,Unspecified,NYPD,26595564,Unspecified,,Unspecified,Loud Music/Party,BROOKLYN,AVENUE J,11236,-73.91105541883589,New York City Police Department,18 BROOKLYN,AVENUE J,Precinct,40.634103775951736,Unspecified,Unspecified,,Street/Sidewalk,,,,1008937,"(40.634103775951736, -73.91105541883589)",Unspecified,10/31/2013 12:30:36 AM,EAST 81 STREET,,N,,,Open
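
The predicate passed to `filter` relies on two Clojure conveniences: a keyword can be called as a function that looks itself up in a map, and a set can be called as a function that returns its argument when it is a member (and `nil` otherwise). `comp` chains the two. A minimal sketch on made-up rows, where `sample-rows`, `complaint-type`, `street-noise`, and `street-noise?` are names introduced only for this illustration:

```clojure
;; Hypothetical sample rows, shaped like the maps in `complaints`.
(def sample-rows
  [{(keyword "Complaint Type") "Noise - Street/Sidewalk" :Borough "QUEENS"}
   {(keyword "Complaint Type") "Rodent"                  :Borough "BROOKLYN"}])

;; Step 1: a keyword used as a function looks itself up in the map.
(def complaint-type (keyword "Complaint Type"))
(complaint-type (first sample-rows))

;; Step 2: a set used as a function returns the element if present, else nil.
(def street-noise #{"Noise - Street/Sidewalk"})
(street-noise "Rodent")

;; Step 3: comp applies the keyword first, then the set, giving a predicate
;; that is truthy only for rows of the wanted complaint type.
(def street-noise? (comp street-noise complaint-type))

(filter street-noise? sample-rows)
```

Adding more strings to the set would match several complaint types at once without changing the rest of the pipeline.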


We can combine predicates with higher-order functions such as `every-pred` and `some-fn`.

In [31]:
(def is_noise (comp #{"Noise - Street/Sidewalk"} (keyword "Complaint Type")))

(def in_brooklyn (comp #{"BROOKLYN"} (keyword "Borough")))

(->> complaints
     (filter (every-pred is_noise in_brooklyn)) ;; both have to be true. Functional version of 'and'
     (take 5)
     show-table)

Unnamed: 0,Road Ramp,Resolution Action Updated Date,Bridge Highway Name,Park Facility Name,School Number,Park Borough,Taxi Pick Up Location,Address Type,Due Date,Bridge Highway Segment,School Phone Number,Cross Street 1,Vehicle Type,Bridge Highway Direction,Complaint Type,Y Coordinate (State Plane),City,School Address,Intersection Street 2,School State,Agency,Unique Key,School Code,Intersection Street 1,School Zip,Descriptor,Borough,Street Name,Incident Zip,Longitude,Agency Name,Community Board,Incident Address,Facility Type,Latitude,School City,School Region,Closed Date,Location Type,Ferry Terminal Name,Landmark,Ferry Direction,X Coordinate (State Plane),Location,School Name,Created Date,Cross Street 2,Garage Lot Name,School Not Found,Taxi Company Borough,School or Citywide Complaint,Status
0,,,,Unspecified,Unspecified,BROOKLYN,,BLOCKFACE,10/31/2013 08:30:36 AM,,Unspecified,EAST 80 STREET,,,Noise - Street/Sidewalk,170310,BROOKLYN,Unspecified,,Unspecified,NYPD,26595564,Unspecified,,Unspecified,Loud Music/Party,BROOKLYN,AVENUE J,11236,-73.91105541883589,New York City Police Department,18 BROOKLYN,AVENUE J,Precinct,40.634103775951736,Unspecified,Unspecified,,Street/Sidewalk,,,,1008937,"(40.634103775951736, -73.91105541883589)",Unspecified,10/31/2013 12:30:36 AM,EAST 81 STREET,,N,,,Open
1,,10/31/2013 01:29:29 AM,,Unspecified,Unspecified,BROOKLYN,,ADDRESS,10/31/2013 08:05:10 AM,,Unspecified,WASHINGTON AVENUE,,,Noise - Street/Sidewalk,180388,BROOKLYN,Unspecified,,Unspecified,NYPD,26595553,Unspecified,,Unspecified,Loud Talking,BROOKLYN,LEFFERTS AVENUE,11225,-73.95993363978067,New York City Police Department,09 BROOKLYN,25 LEFFERTS AVENUE,Precinct,40.6617931276793,Unspecified,Unspecified,10/31/2013 02:43:43 AM,Street/Sidewalk,,,,995366,"(40.6617931276793, -73.95993363978067)",Unspecified,10/31/2013 12:05:10 AM,BEDFORD AVENUE,,N,,,Closed
2,,10/31/2013 12:18:54 AM,,Unspecified,Unspecified,BROOKLYN,,INTERSECTION,10/31/2013 07:26:32 AM,,Unspecified,,,,Noise - Street/Sidewalk,203271,BROOKLYN,Unspecified,NORMAN STREET,Unspecified,NYPD,26594653,Unspecified,DOBBIN STREET,Unspecified,Loud Music/Party,BROOKLYN,,11222,-73.95427134534344,New York City Police Department,01 BROOKLYN,,Precinct,40.724599563793525,Unspecified,Unspecified,10/31/2013 12:18:54 AM,Street/Sidewalk,,,,996925,"(40.724599563793525, -73.95427134534344)",Unspecified,10/30/2013 11:26:32 PM,,,N,,,Closed
3,,10/30/2013 10:23:20 PM,,Unspecified,Unspecified,BROOKLYN,,LATLONG,10/31/2013 06:02:58 AM,,Unspecified,,,,Noise - Street/Sidewalk,171051,BROOKLYN,Unspecified,,Unspecified,NYPD,26591992,Unspecified,,Unspecified,Loud Talking,BROOKLYN,DITMAS AVENUE,11218,-73.97245504682485,New York City Police Department,01 BROOKLYN,DITMAS AVENUE,Precinct,40.63616876563881,Unspecified,Unspecified,10/30/2013 10:23:20 PM,Street/Sidewalk,,,,991895,"(40.63616876563881, -73.97245504682485)",Unspecified,10/30/2013 10:02:58 PM,,,N,,,Closed
4,,10/30/2013 10:26:28 PM,,Unspecified,Unspecified,BROOKLYN,,ADDRESS,10/31/2013 04:38:25 AM,,Unspecified,CHURCH AVENUE,,,Noise - Street/Sidewalk,173511,BROOKLYN,Unspecified,,Unspecified,NYPD,26594167,Unspecified,,Unspecified,Loud Music/Party,BROOKLYN,BEVERLY ROAD,11218,-73.97876175474585,New York City Police Department,12 BROOKLYN,126 BEVERLY ROAD,Precinct,40.6429222774404,Unspecified,Unspecified,10/30/2013 10:26:28 PM,Street/Sidewalk,,,,990144,"(40.6429222774404, -73.97876175474585)",Unspecified,10/30/2013 08:38:25 PM,EAST 2 STREET,,N,,,Closed


In [25]:
(->> complaints
     (filter (some-fn is_noise in_brooklyn)) ;; one has to be true. Functional version of 'or'
     (take 5)
     show-table)

Unnamed: 0,Road Ramp,Resolution Action Updated Date,Bridge Highway Name,Park Facility Name,School Number,Park Borough,Taxi Pick Up Location,Address Type,Due Date,Bridge Highway Segment,School Phone Number,Cross Street 1,Vehicle Type,Bridge Highway Direction,Complaint Type,Y Coordinate (State Plane),City,School Address,Intersection Street 2,School State,Agency,Unique Key,School Code,Intersection Street 1,School Zip,Descriptor,Borough,Street Name,Incident Zip,Longitude,Agency Name,Community Board,Incident Address,Facility Type,Latitude,School City,School Region,Closed Date,Location Type,Ferry Terminal Name,Landmark,Ferry Direction,X Coordinate (State Plane),Location,School Name,Created Date,Cross Street 2,Garage Lot Name,School Not Found,Taxi Company Borough,School or Citywide Complaint,Status
0,,10/31/2013 02:35:17 AM,,Unspecified,Unspecified,QUEENS,,ADDRESS,10/31/2013 10:08:41 AM,,Unspecified,90 AVENUE,,,Noise - Street/Sidewalk,197389,JAMAICA,Unspecified,,Unspecified,NYPD,26589651,Unspecified,,Unspecified,Loud Talking,QUEENS,169 STREET,11432,-73.79160395779721,New York City Police Department,12 QUEENS,90-03 169 STREET,Precinct,40.70827532593202,Unspecified,Unspecified,,Street/Sidewalk,,,,1042027,"(40.70827532593202, -73.79160395779721)",Unspecified,10/31/2013 02:08:41 AM,91 AVENUE,,N,,,Assigned
1,,10/31/2013 01:48:26 AM,,Unspecified,Unspecified,BROOKLYN,,ADDRESS,10/31/2013 09:34:41 AM,,Unspecified,UNION STREET,,,Noise - Commercial,182725,BROOKLYN,Unspecified,,Unspecified,NYPD,26594392,Unspecified,,Unspecified,Loud Music/Party,BROOKLYN,NOSTRAND AVENUE,11225,-73.95064760056546,New York City Police Department,09 BROOKLYN,835 NOSTRAND AVENUE,Precinct,40.66820406598287,Unspecified,Unspecified,10/31/2013 02:23:51 AM,Club/Bar/Restaurant,,,,997941,"(40.66820406598287, -73.95064760056546)",Unspecified,10/31/2013 01:34:41 AM,PRESIDENT STREET,,N,,,Closed
2,,,,Unspecified,Unspecified,BROOKLYN,,ADDRESS,10/31/2013 09:25:12 AM,,Unspecified,EAST 9 STREET,,,Noise - House of Worship,170399,BROOKLYN,Unspecified,,Unspecified,NYPD,26595176,Unspecified,,Unspecified,Loud Music/Party,BROOKLYN,18 AVENUE,11218,-73.96946177104543,New York City Police Department,14 BROOKLYN,3775 18 AVENUE,Precinct,40.63437840816299,Unspecified,Unspecified,,House of Worship,,,,992726,"(40.63437840816299, -73.96946177104543)",Unspecified,10/31/2013 01:25:12 AM,EAST 8 STREET,,N,,,Open
3,,10/31/2013 01:29:26 AM,,Unspecified,Unspecified,BROOKLYN,,BLOCKFACE,11/30/2013 01:19:54 AM,,Unspecified,13 AVENUE,,,Rodent,167519,BROOKLYN,Unspecified,,Unspecified,DOHMH,26590917,Unspecified,,Unspecified,Rat Sighting,BROOKLYN,63 STREET,11219,-73.99921826202639,Department of Health and Mental Hygiene,10 BROOKLYN,63 STREET,,40.6264774690411,Unspecified,Unspecified,,1-2 Family Mixed Use Building,,,,984467,"(40.6264774690411, -73.99921826202639)",Unspecified,10/31/2013 01:19:54 AM,14 AVENUE,,N,,,Pending
4,,10/31/2013 02:07:14 AM,,Unspecified,Unspecified,STATEN ISLAND,,ADDRESS,10/31/2013 08:54:03 AM,,Unspecified,HENDERSON AVENUE,,,Noise - Street/Sidewalk,171076,STATEN ISLAND,Unspecified,,Unspecified,NYPD,26594086,Unspecified,,Unspecified,Loud Music/Party,STATEN ISLAND,CAMPBELL AVENUE,10310,-74.1161500428337,New York City Police Department,01 STATEN ISLAND,173 CAMPBELL AVENUE,Precinct,40.63618202176914,Unspecified,Unspecified,10/31/2013 02:16:39 AM,Street/Sidewalk,,,,952013,"(40.63618202176914, -74.1161500428337)",Unspecified,10/31/2013 12:54:03 AM,WINEGAR LANE,,N,,,Closed
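
The difference between the two combinators is easier to see on a handful of rows than on the full table. The sketch below uses made-up rows with simplified keys (`:type`, `:borough`) and hypothetical predicate names `noise?` and `brooklyn?`; it is an illustration, not part of the analysis.

```clojure
;; Hypothetical sample rows covering all four combinations.
(def rows
  [{:type "Noise - Street/Sidewalk" :borough "BROOKLYN"}
   {:type "Noise - Street/Sidewalk" :borough "QUEENS"}
   {:type "Rodent"                  :borough "BROOKLYN"}
   {:type "Rodent"                  :borough "QUEENS"}])

(def noise?    (comp #{"Noise - Street/Sidewalk"} :type))
(def brooklyn? (comp #{"BROOKLYN"} :borough))

;; every-pred: keep rows where all predicates hold (one row here).
(count (filter (every-pred noise? brooklyn?) rows))

;; some-fn: keep rows where at least one predicate holds (three rows here).
(count (filter (some-fn noise? brooklyn?) rows))
```

Note that `every-pred` returns a boolean, while `some-fn` returns the first truthy value produced by any of its functions; `filter` only cares about truthiness, so both work as predicates.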


Or if we just wanted a few columns:

In [26]:
(->> complaints
     (filter (every-pred is_noise in_brooklyn))
     (map #(select-keys % (map keyword ["Complaint Type", "Borough", "Created Date", "Descriptor"])))
     (take 5)
     show-table)

Unnamed: 0,Complaint Type,Borough,Created Date,Descriptor
0,Noise - Street/Sidewalk,BROOKLYN,10/31/2013 12:30:36 AM,Loud Music/Party
1,Noise - Street/Sidewalk,BROOKLYN,10/31/2013 12:05:10 AM,Loud Talking
2,Noise - Street/Sidewalk,BROOKLYN,10/30/2013 11:26:32 PM,Loud Music/Party
3,Noise - Street/Sidewalk,BROOKLYN,10/30/2013 10:02:58 PM,Loud Talking
4,Noise - Street/Sidewalk,BROOKLYN,10/30/2013 08:38:25 PM,Loud Music/Party


# 3.3 So, which borough has the most noise complaints?

In [27]:
(->> complaints
    (filter (comp #{"Noise - Street/Sidewalk"} (keyword "Complaint Type")))
    (map (keyword "Borough"))
    frequencies
    pp/pprint)

{"QUEENS" 226,
 "STATEN ISLAND" 36,
 "MANHATTAN" 917,
 "BROOKLYN" 456,
 "BRONX" 292,
 "Unspecified" 1}


nil
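
A frequency map is unordered, so picking out the largest count by eye does not scale. A minimal sketch of ranking the boroughs, using the counts printed above copied into a literal map (the names `noise-counts` and `ranked` are introduced here for illustration):

```clojure
;; Counts copied from the output above.
(def noise-counts
  {"QUEENS" 226, "STATEN ISLAND" 36, "MANHATTAN" 917,
   "BROOKLYN" 456, "BRONX" 292, "Unspecified" 1})

;; Sort the [borough count] entries by count, largest first.
(def ranked (sort-by val > noise-counts))

(first ranked)
```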

It's Manhattan! But what if we divided by each borough's total number of complaints, so that boroughs which simply file more complaints overall don't dominate the comparison? That's easy too:

In [28]:
;; Python:
;; noise_complaint_counts = noise_complaints['Borough'].value_counts()
;; complaint_counts = complaints['Borough'].value_counts()

(def noise-complaint-counts 
    (->> complaints
        (filter (comp #{"Noise - Street/Sidewalk"} (keyword "Complaint Type")))
        (map (keyword "Borough"))
        frequencies))
    
(def complaint-counts 
    (->> complaints
        (map (keyword "Borough"))
        frequencies))

#'user/complaint-counts

In [29]:
;; Python:
;; noise_complaint_counts / complaint_counts

(->> (merge-with (comp float /) noise-complaint-counts complaint-counts)
     pp/pprint)

{"QUEENS" 0.010143171,
 "STATEN ISLAND" 0.007473531,
 "MANHATTAN" 0.03775527,
 "BROOKLYN" 0.013864396,
 "BRONX" 0.014832876,
 "Unspecified" 1.4070634E-4}


nil
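
One caveat with `merge-with`: the combining function is only called for keys present in both maps, so a borough with no noise complaints at all would keep its raw total instead of becoming a ratio of 0. That does not happen with this dataset, since every borough appears in both maps, but a more defensive sketch iterates over the totals and defaults a missing numerator to 0. The function name `ratios` and the sample counts are made up for illustration:

```clojure
(defn ratios
  "For every key in `totals`, divides the matching count in `part`
  (0 when absent) by the total."
  [part totals]
  (into {}
        (map (fn [[k total]]
               [k (double (/ (get part k 0) total))]))
        totals))

;; Hypothetical counts: "BRONX" has no entry in the first map.
(ratios {"QUEENS" 1 "MANHATTAN" 3}
        {"QUEENS" 4 "MANHATTAN" 4 "BRONX" 10})
```

With the same inputs, `(merge-with (comp float /) ...)` would leave `"BRONX"` mapped to 10, which would be easy to misread as a ratio.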

In [35]:
;; Python
;; (noise_complaint_counts / complaint_counts.astype(float)).plot(kind='bar')

(defn bar-graph [vs]
 {:data {:values (map (fn [[k v]] {:Index k :Value v}) vs)}
  :mark "bar"
  :encoding {:x {:field :Index
                 :type "nominal"
                 :sort false}
             :y {:field :Value
                 :type "quantitative"}}
  :width 800})

(->> (merge-with (comp float /) noise-complaint-counts complaint-counts)
     bar-graph
     oz/view!)

So street noise really does make up a larger share of complaints in Manhattan than in the other boroughs! Neat.