# The _geography_ of household waste generation

This experiment generates choropleths to help visualise the variations in the amounts of household-generated waste across geographic areas in Scotland.

In [2]:
; Add code libraries

(require '[clojupyter.misc.helper :as helper])

(helper/add-dependencies '[org.clojure/data.csv "1.0.0"])
(helper/add-dependencies '[metasoarous/oz "1.6.0-alpha24"])
(helper/add-dependencies '[clj-http/clj-http "3.10.1"])
(helper/add-dependencies '[org.apache.commons/commons-math3 "3.6.1"])

(require '[clojure.string :as str]
         '[clojure.pprint :as pp]
         '[clojure.java.io :as io]
         '[clojure.data.csv :as csv]
         '[clj-http.client :as http]
         '[oz.notebook.clojupyter :as oz]
         '[oz.core :as ozcore])
         
(import 'java.net.URLEncoder
        'org.apache.commons.math3.stat.regression.SimpleRegression)

org.apache.commons.math3.stat.regression.SimpleRegression

In [3]:
; Define convenience functions

; Convert the CSV structure to a list-of-maps structure.
(defn to-maps [csv-data]
    (map zipmap (->> (first csv-data)
                    (map keyword)
                    repeat)
                (rest csv-data)))

; Ask statistics.gov.scot to execute the given SPARQL query
; and return its result as a list-of-maps.
(defn exec-query [sparql]
    (->> (http/post "http://statistics.gov.scot/sparql"
                    ;; Name the charset explicitly; the one-argument URLEncoder/encode is deprecated.
                    {:body (str "query=" (URLEncoder/encode sparql "UTF-8"))
                     :headers {"Accept" "text/csv"
                               "Content-Type" "application/x-www-form-urlencoded"}
                     :debug false})
        :body
        csv/read-csv
        to-maps))
        
; Compute 'the trend of y'.
; (Returns the gradient of a linear approximation to the curve described by xy-pairs.)
(defn trend [xy-pairs]
    (let [regression (SimpleRegression. true)]
        (doseq [[x y] xy-pairs]
            (.addData regression x y))
        (.getSlope regression)))

#'user/trend
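
To illustrate what `to-maps` produces, here is a minimal sketch that applies it to a small, made-up parsed CSV (the `sample-csv` data is hypothetical, and `to-maps` is repeated only so that the sketch runs on its own). Note that every value stays a string, which is presumably why later cells convert values with `bigdec` before doing arithmetic.

```clojure
;; Local copy of the notebook's to-maps, repeated so this sketch is self-contained.
(defn to-maps [csv-data]
  (map zipmap
       (->> (first csv-data) (map keyword) repeat)
       (rest csv-data)))

;; Hypothetical parsed CSV: a header row followed by two data rows.
(def sample-csv
  [["council" "year" "tonnagePerCitizen"]
   ["Angus" "2017" "0.48"]
   ["Fife" "2018" "0.52"]])

(def sample-maps (to-maps sample-csv))

;; Each data row becomes a map keyed by the (keywordised) header cells.
(println (first sample-maps))
(println (count sample-maps) "rows")
```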

Use a SPARQL query against statistics.gov.scot's data cubes to find the waste tonnage generated per council citizen per year.

In [4]:
; Query for the waste tonnage generated per council citizen per year

(def sparql "

PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX pdmx: <http://purl.org/linked-data/sdmx/2009/dimension#>
PREFIX sdmx: <http://statistics.gov.scot/def/dimension/>
PREFIX snum: <http://statistics.gov.scot/def/measure-properties/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT 
    ?council 
    ?year 
    ?tonnagePerCitizen 
    (strafter(str(?areaUri), 'http://statistics.gov.scot/id/statistical-geography/') as ?councilCode) 
WHERE {
  
    ?tonnageObs qb:dataSet <http://statistics.gov.scot/data/household-waste> .
    ?tonnageObs pdmx:refArea ?areaUri .
    ?tonnageObs pdmx:refPeriod ?periodUri .
    ?tonnageObs sdmx:wasteCategory ?wasteCategoryUri .
    ?tonnageObs sdmx:wasteManagement ?wasteManagementUri .
    ?tonnageObs snum:count ?tonnage .
  
    ?wasteCategoryUri rdfs:label \"Total Waste\" .
    ?wasteManagementUri rdfs:label \"Waste Generated\" .

    ?populationObs qb:dataSet <http://statistics.gov.scot/data/population-estimates-current-geographic-boundaries> .
    ?populationObs pdmx:refArea ?areaUri .
    ?populationObs pdmx:refPeriod ?periodUri .
    ?populationObs sdmx:age <http://statistics.gov.scot/def/concept/age/all> .
    ?populationObs sdmx:sex <http://statistics.gov.scot/def/concept/sex/all> .
    ?populationObs snum:count ?population .

    ?areaUri rdfs:label ?council .
    ?periodUri rdfs:label ?year .
    BIND((xsd:integer(?tonnage)/xsd:integer(?population)) AS ?tonnagePerCitizen) .
}
")

(def tonnage-generated-per-council-citizen-per-year 
    (->> sparql
        exec-query
        (sort-by (juxt :council :year))))

(println (count tonnage-generated-per-council-citizen-per-year) "rows")

264 rows


nil
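
The query above pairs each tonnage observation with the population observation for the same area and period, then divides one by the other. A rough in-memory sketch of that join is shown below; the area names, figures and the `tonnage-per-citizen` function are all hypothetical and only illustrate the shape of the calculation, not how the SPARQL endpoint evaluates it.

```clojure
;; Made-up observations; "A" and "B" stand in for real area codes.
(def tonnage-obs
  [{:area "A" :year "2018" :tonnage 50000}
   {:area "B" :year "2018" :tonnage 12000}])

(def population-obs
  [{:area "A" :year "2018" :population 100000}
   {:area "B" :year "2018" :population 25000}
   {:area "B" :year "2017" :population 26000}])

(defn tonnage-per-citizen
  "Inner-joins the two collections on [area year] and divides tonnage by population."
  [tonnage-obs population-obs]
  (let [population-by-key (into {}
                                (map (juxt (juxt :area :year) :population))
                                population-obs)]
    (for [{:keys [area year tonnage]} tonnage-obs
          :let [population (population-by-key [area year])]
          :when population] ; drop tonnage rows with no matching population
      {:area area
       :year year
       :tonnagePerCitizen (double (/ tonnage population))})))

(def joined (tonnage-per-citizen tonnage-obs population-obs))

(println joined)
```

The 2017 population row for "B" has no tonnage counterpart, so it does not appear in the result, much as an unmatched observation would not satisfy the query's graph pattern.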

In [5]:
; Print a sample

(def ks [:council :councilCode :year :tonnagePerCitizen])
(pp/print-table ks (repeatedly 5 #(rand-nth tonnage-generated-per-council-citizen-per-year)))


|          :council | :councilCode | :year |         :tonnagePerCitizen |
|-------------------+--------------+-------+----------------------------|
|    South Ayrshire |    S12000028 |  2013 | 0.491033932843093824754142 |
|          Scotland |    S92000003 |  2017 | 0.453624096740893673499484 |
|   Argyll and Bute |    S12000035 |  2016 | 0.616194192585791346264203 |
|   Argyll and Bute |    S12000035 |  2015 | 0.596696973184486131890897 |
| South Lanarkshire |    S12000029 |  2017 | 0.476914856837539680045259 |


nil

For each council area, derive the 3 values:
* `recent` - 2018's tonnage of waste generated per council citizen.
* `average` - 2011-2018's average tonnage of waste generated per council citizen.
<br>(Calculated as the _mean_.)
* `trend` - 2011-2018's trend in tonnage of waste generated per council citizen.
<br>(Calculated as the gradient of a linear approximation to the tonnage over the years.)
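
As a worked illustration of these three values, the sketch below computes them for one hypothetical council using plain Clojure. The `mean` and `slope` helpers and the `yearly` figures are assumptions made for this example; `slope` is an ordinary least-squares gradient, which should agree with what `SimpleRegression` (with an intercept) returns for well-behaved input, although the notebook itself uses the library.

```clojure
(defn mean [ys]
  (/ (reduce + ys) (double (count ys))))

(defn slope
  "Ordinary least-squares gradient for a sequence of [x y] pairs."
  [xy-pairs]
  (let [x-bar (mean (map first xy-pairs))
        y-bar (mean (map second xy-pairs))
        ;; Sum of products of deviations, and sum of squared x deviations
        sxy (reduce + (for [[x y] xy-pairs] (* (- x x-bar) (- y y-bar))))
        sxx (reduce + (for [[x _] xy-pairs] (* (- x x-bar) (- x x-bar))))]
    (/ sxy sxx)))

;; Made-up yearly tonnage-per-citizen figures for one hypothetical council.
(def yearly [[2016 0.50] [2017 0.48] [2018 0.46]])

(println "recent: " (second (last yearly)))
(println "average:" (format "%.6f" (mean (map second yearly))))
(println "trend:  " (format "%.6f" (slope yearly)))
```

A negative `trend` means that the tonnage per citizen has been falling over the period; here it falls by roughly 0.02 tonnes per year.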

In [6]:
; For each council area, derive the 3 values: 'recent', 'average' and 'trend'.

(def stats-on-tonnage-generated-per-council-citizen
    (let [base-data tonnage-generated-per-council-citizen-per-year]
        (for [council (->> base-data (map :council) distinct)]
            {:council council
             :recent (->> base-data 
                         (filter #(and (= council (:council %)) (= "2018" (:year %)))) 
                         first 
                         :tonnagePerCitizen
                         bigdec
                         .doubleValue
                         (format "%.6f"))
             :average (->> base-data 
                         (filter #(= council (:council %))) 
                         (map #(-> % :tonnagePerCitizen bigdec .doubleValue))
                         (apply +) 
                         (#(/ % 8))
                         (format "%.6f"))
             :trend (->> base-data 
                         (filter #(= council (:council %))) 
                         (map #(vector (-> % :year bigdec .doubleValue) (-> % :tonnagePerCitizen bigdec .doubleValue)))
                         trend
                         (format "%.6f"))})))
            
(println (count stats-on-tonnage-generated-per-council-citizen) "rows")

33 rows


nil

In [7]:
; Print a sample

(def ks [:council :recent :average :trend])
(pp/print-table ks (repeatedly 5 #(rand-nth stats-on-tonnage-generated-per-council-citizen)))


|         :council |  :recent | :average |    :trend |
|------------------+----------+----------+-----------|
| Clackmannanshire | 0.508210 | 0.533003 | -0.007780 |
|      Dundee City | 0.408558 | 0.442179 | -0.005313 |
|            Angus | 0.470691 | 0.497753 | -0.006480 |
|     East Lothian | 0.473901 | 0.499116 | -0.007190 |
|            Angus | 0.470691 | 0.497753 | -0.006480 |


nil

In [8]:
; Store stats-on-tonnage-generated-per-council-citizen in a CSV file for subsequent use by the Vega chart

(def filename "stats-on-tonnage-generated-per-council-citizen.csv")

(let [file (io/file filename)
      header-row (->> stats-on-tonnage-generated-per-council-citizen
                      first
                      keys
                      (map name))
      data-rows (->> stats-on-tonnage-generated-per-council-citizen
                     (map vals))]
    (with-open [writer (io/writer file)]
      (csv/write-csv writer (cons header-row data-rows)))
      
    (println "Wrote to" (.getAbsolutePath file)))

Wrote to /Users/amc/workspace/data-commons-scotland/dcs-shorts/choropleth-generation/stats-on-tonnage-generated-per-council-citizen.csv


nil
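
The cell above takes the header from the `keys` of the first map and each data row from `vals`, which relies on every map yielding its entries in the same order. That appears to hold here because all of the maps are built from the same small literal, but the column order can also be fixed explicitly. The sketch below shows one way to do that; `columns`, `to-rows` and `sample-stats` are hypothetical names, and the comma join is for display only (the notebook's `csv/write-csv` also takes care of quoting).

```clojure
(require '[clojure.string :as str])

;; Two rows with the same keys as the notebook's stats maps.
(def sample-stats
  [{:council "Angus" :recent "0.470691" :average "0.497753" :trend "-0.006480"}
   {:council "Dundee City" :recent "0.408558" :average "0.442179" :trend "-0.005313"}])

;; The desired column order, stated once.
(def columns [:council :recent :average :trend])

(defn to-rows
  "Returns a header row followed by data rows, with cells in the order given by columns."
  [columns maps]
  (cons (map name columns)
        (map (apply juxt columns) maps)))

(def rows (to-rows columns sample-stats))

(doseq [row rows]
  (println (str/join "," row)))
```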

Use Vega to generate 3 choropleths which help visualise the
* 2018 tonnage
* 2011-2018 average tonnage
* 2011-2018 trend in tonnage

...of waste generated per council citizen against the council-oriented geography of Scotland.
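
The charts attach the statistics to the map shapes by matching each shape's council name against the `council` column of the CSV file. The sketch below imitates that cross-reference in plain Clojure, as a rough picture of what a lookup transform does. The `features` data is a simplified, hypothetical stand-in for the topoJSON features, and `lookup` is an illustrative function, not part of Vega or Oz.

```clojure
;; Simplified stand-ins for map features; the last one has no matching stats row.
(def features
  [{:properties {:LAD13NM "Angus"}}
   {:properties {:LAD13NM "Dundee City"}}
   {:properties {:LAD13NM "Unmatched Area"}}])

(def stats
  [{:council "Angus" :recent "0.470691" :average "0.497753" :trend "-0.006480"}
   {:council "Dundee City" :recent "0.408558" :average "0.442179" :trend "-0.005313"}])

(defn lookup
  "Adds the renamed stats fields to each feature, matching on council name.
   Features without a matching row get nil values."
  [features stats]
  (let [stats-by-council (into {} (map (juxt :council identity)) stats)]
    (for [feature features
          :let [row (stats-by-council (get-in feature [:properties :LAD13NM]))]]
      (merge feature
             {"2018 tonnage" (:recent row)
              "2011-2018 average tonnage" (:average row)
              "2011-2018 trend in tonnage" (:trend row)}))))

(def looked-up (lookup features stats))

(doseq [feature looked-up]
  (println (get-in feature [:properties :LAD13NM]) "->" (get feature "2018 tonnage")))
```

A name that differs between the two sources (for example through spelling or punctuation) would leave that area without values, so it is worth checking for unmatched names when a council appears uncoloured on the map.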

In [9]:
; Use Vega to generate 3 choropleths

(def repo-dir "https://raw.githubusercontent.com/data-commons-scotland/dcs-shorts/master/choropleth-generation/")

(def chart-spec {:$schema "https://vega.github.io/schema/vega-lite/v4.json"
                 :repeat {:row ["2018 tonnage" 
                                "2011-2018 average tonnage"
                                "2011-2018 trend in tonnage"]}
                 :resolve {:scale {:color "independent"}}
                 :spec {
                     :width 500
                     :height 500
                     :data {:url (str repo-dir "topo_lad.json")
                            :format {:type "topojson" :feature "lad"}}
                     :transform [;; Cross reference by council name rather than council code
                                 ;; because the topoJSON data uses some obsolete codes (etc.).
                                 {:lookup "properties['LAD13NM']" 
                                  :from {:data {:url (str repo-dir filename)}
                                         :key "council"
                                         :fields ["recent" "average" "trend"]}
                                  :as ["2018 tonnage" 
                                       "2011-2018 average tonnage"
                                       "2011-2018 trend in tonnage"]}]
                     :projection {:type "albers" :rotate [0, 0, 0]}
                     :mark {:type "geoshape" :strokeWidth 0.2 :stroke "black"}
                     :encoding {:tooltip [{:title "council" :field "properties['LAD13NM']" :type "nominal"}
                                          {:field {:repeat "row"} :type "quantitative"}]
                                :color {;:title "tonnage per citizen"
                                        :field {:repeat "row"}
                                        :type "quantitative"}}}})

; (print (json/write-str chart-spec))

; (ozcore/export! [:div [:vega-lite chart-spec]] "choropleths.html" {:from-format :hiccup :to-format :html})

(oz/view! chart-spec)

to-format: :html
[I 14:42:32.947 Clojupyter] oz.core:425 -- input: /var/folders/wl/ff7688p93t1b3tm2l0bv93yc0000gn/T/a7ff0e95-d21a-4a80-97e5-619e9f0cba826267511786189011164.vl.json
[I 14:42:33.001 Clojupyter] oz.core:426 -- output: /var/folders/wl/ff7688p93t1b3tm2l0bv93yc0000gn/T/7e17d605-ef6b-4437-9d22-cc0d7d5690a21725674012211629027.png
