This notebook extracts the Canterbury regions, together with the income brackets and median personal income for each of those regions. Run the cells in the order in which they appear in this notebook.

Load required packages

In [11]:
using Pkg, Queryverse, VegaDatasets, VegaLite

In [12]:
Pkg.add("XLSX")

[32m[1m    Updating[22m[39m registry at `C:\Users\danie\.julia\registries\General`
[32m[1m    Updating[22m[39m git-repo `https://github.com/JuliaRegistries/General.git`
[32m[1m   Resolving[22m[39m package versions...
[32m[1m   Installed[22m[39m XLSX ─ v0.7.8
[32m[1m    Updating[22m[39m `C:\Users\danie\.julia\environments\v1.6\Project.toml`
 [90m [fdbf4ff8] [39m[92m+ XLSX v0.7.8[39m
[32m[1m    Updating[22m[39m `C:\Users\danie\.julia\environments\v1.6\Manifest.toml`
 [90m [fdbf4ff8] [39m[93m↑ XLSX v0.7.6 ⇒ v0.7.8[39m
[32m[1mPrecompiling[22m[39m project...
[33m  ✓ [39mXLSX
[33m  ✓ [39mExcelFiles
[33m  ✓ [39mQueryverse
  3 dependencies successfully precompiled in 42 seconds (115 already precompiled, 2 skipped during auto due to previous errors)
  [33m3[39m dependencies precompiled but different versions are currently loaded. Restart julia to access the new versions


Load file containing suburb personal income information

In [13]:
import XLSX

region_income_data = XLSX.readxlsx("Region_data_income.xlsx")

XLSXFile("Region_data_income.xlsx") containing 9 Worksheets
            sheetname size          range        
-------------------------------------------------
             Contents 145x5         A1:E145      
       Geographic key 3929x11       A1:K3929     
Footnotes and symbol… 55x6          A1:F55       
                  SA1 3867x348      A1:MJ3867    
                  SA2 325x349       A1:MK325     
                 Ward 55x349        A1:MK55      
Territorial authorit… 22x349        A1:MK22      
District health boar… 15x349        A1:MK15      
     Regional council 12x349        A1:MK12      


Extract required columns for regions and territories in Canterbury

In [14]:
using DataFrames

region = DataFrame(region_income_data["Geographic key!C:E"], :auto)

Row,x1,x2,x3
Type,Any,Any,Any
1,missing,missing,missing
2,missing,missing,missing
3,Statistical area 2 description,Territorial authority code (2018 areas),Territorial authority description
4,Kaikoura Ranges,054,Kaikoura District
5,Kaikoura Ranges,054,Kaikoura District
6,Kaikoura Ranges,054,Kaikoura District
7,Kaikoura Ranges,054,Kaikoura District
8,Kaikoura Ranges,054,Kaikoura District
9,Kaikoura Ranges,054,Kaikoura District
10,Kaikoura Ranges,054,Kaikoura District


In [15]:
# Drop the two blank rows and the header row; keep the suburb and territory description columns
region = region[4:end,[1,3]]
rename!(region, [:Suburb, :Territory])

Row,Suburb,Territory
Type,Any,Any
1,Kaikoura Ranges,Kaikoura District
2,Kaikoura Ranges,Kaikoura District
3,Kaikoura Ranges,Kaikoura District
4,Kaikoura Ranges,Kaikoura District
5,Kaikoura Ranges,Kaikoura District
6,Kaikoura Ranges,Kaikoura District
7,Kaikoura Ranges,Kaikoura District
8,Kaikoura Ranges,Kaikoura District
9,Kaikoura Ranges,Kaikoura District
10,Kaikoura Ranges,Kaikoura District
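
The cell above hard-codes the new column names. As a minimal sketch of an alternative, the header row can be promoted to column names instead. The toy table and the names `raw`, `header` and `toy_region` are illustrative assumptions, not part of the original workbook, and the short header strings stand in for the longer descriptions in the real sheet.

```julia
using DataFrames

# Toy stand-in for a sheet whose first row holds the column headers
raw = DataFrame(
    x1 = Any["Suburb", "Kaikoura Ranges", "Kaikoura"],
    x2 = Any["Territory", "Kaikoura District", "Kaikoura District"],
)

# Turn the first row into a vector of Symbols to use as column names
header = Symbol.(Vector(raw[1, :]))

# Copy the data rows (everything after the header row) and rename the copy
toy_region = rename!(raw[2:end, :], header)
```

Because `raw[2:end, :]` makes a copy, `raw` itself keeps its header row and its automatic column names.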


Keep one row per suburb name and drop rows with missing values, so that each Canterbury suburb appears exactly once

In [16]:
# Keep the first row for each suburb name, then drop rows that contain missing values
region = unique(region, :Suburb)
region = region[completecases(region), :]

Row,Suburb,Territory
Type,Any,Any
1,Kaikoura Ranges,Kaikoura District
2,Kaikoura,Kaikoura District
3,Hanmer Range,Hurunui District
4,Amuri,Hurunui District
5,Hanmer Springs,Hurunui District
6,Parnassus,Hurunui District
7,Upper Hurunui,Hurunui District
8,Omihi,Hurunui District
9,Ashley Forest,Hurunui District
10,Balcairn,Hurunui District
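
To make the two steps above concrete, here is a small sketch on made-up rows (the table `toy` and the variable names are assumptions for illustration only). `unique` keeps the first row for each distinct value of `:Suburb`, treating `missing` as a value of its own, and `completecases` then flags the rows that have no missing entries.

```julia
using DataFrames

# Toy table with a repeated suburb and one fully missing row
toy = DataFrame(
    Suburb = Any["Kaikoura Ranges", "Kaikoura Ranges", "Kaikoura", missing, "Amuri"],
    Territory = Any["Kaikoura District", "Kaikoura District", "Kaikoura District", missing, "Hurunui District"],
)

# One row per distinct suburb; the missing suburb still counts as a distinct value here
deduped = unique(toy, :Suburb)

# Keep only the rows in which every column is present
cleaned = deduped[completecases(deduped), :]
```

In this sketch `deduped` still holds the row of missing values, which is why the `completecases` filter is applied afterwards.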


In [17]:
region[region[!, :Territory] .== "Christchurch City", :]

Row,Suburb,Territory
Type,Any,Any
1,Styx,Christchurch City
2,Brooklands-Spencerville,Christchurch City
3,Inlets other Christchurch City,Christchurch City
4,McLeans Island,Christchurch City
5,Christchurch Airport,Christchurch City
6,Harewood,Christchurch City
7,Bishopdale North,Christchurch City
8,Clearwater,Christchurch City
9,Casebrook,Christchurch City
10,Northwood,Christchurch City
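
The `.==` filter above relies on the missing rows having been removed already, because comparing `missing` with a string yields `missing` rather than `false`, and such a mask would typically fail when used as a row index. A hedged sketch of a filter that tolerates missing values, using `filter` with `isequal` on made-up rows (the table `toy` and the suburb named "Unknown" are assumptions for illustration):

```julia
using DataFrames

# Toy table in which one territory is missing
toy = DataFrame(
    Suburb = Any["Styx", "Harewood", "Kaikoura", "Unknown"],
    Territory = Any["Christchurch City", "Christchurch City", "Kaikoura District", missing],
)

# isequal returns false (not missing) when the territory is missing,
# so the row with the missing territory is simply excluded
christchurch = filter(:Territory => t -> isequal(t, "Christchurch City"), toy)
```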


Extract the columns that contain the income brackets and the mean personal income for each suburb

In [18]:
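income_data = DataFrame(region_income_data["SA2!JE10:JO324"], :auto)
# Use the first row as column names, then display the data rows (the result of [2:end,:] is not assigned, so income_data still includes the header row)
rename!(income_data, Symbol.(Vector(income_data[1,:])))[2:end,:]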

Row,"$5,000 or less","$5,001 – $10,000","$10,001 – $20,000","$20,001 – $30,000","$30,001 – $50,000"
Type,Any,Any,Any,Any,Any
1,105,57,219,201,354
2,108,69,357,414,492
3,21,6,30,21,63
4,174,57,267,207,402
5,51,39,114,132,255
6,114,57,258,189,237
7,108,45,225,189,240
8,72,18,75,60,117
9,117,42,189,147,219
10,201,84,357,303,387
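
The range `JE10:JO324` uses Excel column letters, which can be hard to relate to the sheet sizes listed earlier. As a small sketch in plain Julia, the helper below converts a column label to its 1-based column number; `excel_column_number` is a hypothetical helper written for this illustration, not a function of XLSX.jl.

```julia
# Convert an Excel column label such as "JE" to its 1-based column number.
# Labels behave like base-26 numbers whose digits run from A = 1 to Z = 26.
function excel_column_number(label::AbstractString)
    n = 0
    for c in uppercase(label)
        n = 26 * n + (c - 'A' + 1)
    end
    return n
end

first_col = excel_column_number("JE")   # 265
last_col = excel_column_number("JO")    # 275

# Number of columns and rows covered by the range JE10:JO324
ncols = last_col - first_col + 1        # 11
nrows = 324 - 10 + 1                    # 315, including the header row
```

As a cross-check, `excel_column_number("MK")` gives 349, which agrees with the `325x349` size reported for the SA2 sheet with range `A1:MK325`.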
