# Analysing Government Data

I successfully scraped a bunch of government data, so now I can take a look at what I managed to collect.

In [1]:
import json

In [4]:
# The with-block closes the file on exit, so no explicit close() is needed.
with open('govtdata.json', 'r') as f:
    govt_data = json.load(f)

In [6]:
import numpy as np
import pandas as pd

In [8]:
govt_data = pd.DataFrame(govt_data)
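
For readers without the scraped file, here is a minimal sketch of the load-and-convert step. It assumes the JSON is a list of records (one dict per dataset), which is a guess at the layout of `govtdata.json`; the two records below are invented for illustration.

```python
import io
import json

import pandas as pd

# Invented records standing in for govtdata.json; the real file's layout is an assumption.
raw = """[
  {"agency": "Example Agency", "title": "Example dataset", "formats": ["CSV", "PDF"]},
  {"agency": "Another Agency", "title": "Another dataset", "formats": ["API"]}
]"""

# json.load accepts any file-like object, so StringIO stands in for the open file.
records = json.load(io.StringIO(raw))

# A list of dicts becomes one row per dict, one column per key.
frame = pd.DataFrame(records)
print(frame)
```

Note that the `formats` column keeps each list as a single cell, which is why the format counts further down need a flattening step.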

In [11]:
govt_data.agency.value_counts()

Land Information New Zealand                          1817
Ministry for the Environment (MfE)                     705
Landcare Research                                      194
Statistics New Zealand                                 172
Canterbury Regional Council                            153
Ministry of Transport                                  152
Ministry of Health                                     145
New Zealand Defence Force                              134
Ministry of Education                                  103
Ministry of Business Innovation and Employment          80
The Treasury                                            69
Wellington City Council                                 57
Ministry for Primary Industries                         56
Department of Internal Affairs                          37
Greater Wellington Regional Council                     33
University of Otago - National School of Surveying      32
NZ Transport Agency                                     

In [44]:
# grepl is defined in cell In [30] further down; .ix was removed in pandas 1.0, so use .loc.
govt_data.loc[grepl("Maori|Iwi|Pasifika", govt_data.title), 'title']

674     Pasifika Parent Representation on the Board of...
708     Progress against Maori Education Plan targets ...
1492    Counts of Te Reo speakers of Maori descent by ...
2134                        Pasifika Education Statistics
2750                           Maori Education Statistics
2960           Schooling - Pasifika Language in Education
3063    Directory of Iwi and Maori organisations Te Ka...
3665                          Pasifika Tertiary Education
4006             Maori descent by meshblock (2013 Census)
Name: title, dtype: object
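
One caveat with a plain "Maori" pattern is that it would not match titles spelled with a macron ("Māori"). Whether any titles in the scrape use the macron is not known here, so this is only a sketch of how such spellings could be folded before matching. The helper name `strip_accents` and the sample titles are invented for illustration.

```python
import unicodedata

import pandas as pd


def strip_accents(text):
    # NFKD splits "ā" into "a" plus a combining macron; dropping combining marks leaves "a".
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Invented titles, not taken from the scraped data.
titles = pd.Series([
    "Māori Education Statistics",
    "Maori descent by meshblock",
    "Road network",
])

# Fold accents first, then match case-insensitively.
mask = titles.map(strip_accents).str.contains("maori", case=False)
print(titles[mask])
```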

In [48]:
govt_data.loc[grepl("Police", govt_data.agency), ['title', 'link']]

      title                                              link
134   Independent Police Conduct Authority, Chair's ...  https://data.govt.nz/dataset/show/1021
138   Recorded Crime Victims Statistics                  https://data.govt.nz/dataset/show/3739
301   New Zealand Police Commissioner Credit Card Ex...  https://data.govt.nz/dataset/show/3641
523   Police Stolen Vehicle Database                     https://data.govt.nz/dataset/show/2378
1344  Designated terrorist individuals and organisat...  https://data.govt.nz/dataset/show/602
3098  NZ Police District Boundaries                      https://data.govt.nz/dataset/show/2083
3419  NZ Police Area Boundaries                          https://data.govt.nz/dataset/show/2082
4057  NZ Police Station Boundaries                       https://data.govt.nz/dataset/show/2081


In [18]:
print("\n".join(sorted(govt_data.loc[govt_data.agency == 'Land Information New Zealand', 'title'])))

12 Mile Territorial Sea Limit Basepoints
12 Mile Territorial Sea Outer Limit
200 Mile Exclusive Economic Zone Outer Limits
2012 Report on the adoption of the Declaration on Open and Transparent Government
2013 Report on the adoption of the Declaration on Open and Transparent Government
24 Mile Contiguous Zone Basepoints
24 Mile Contiguous Zone Outer Limits
ASP Street SUFI to Landonline Road Name SUFI Mappings
ASP: Check Combination (Deprecated)
ASP: GED Codes (Deprecated)
ASP: MED Codes (Deprecated)
ASP: Map 10000 (Deprecated)
ASP: Map 260 (Deprecated)
ASP: Name Associations (Deprecated)
ASP: Place (Deprecated)
ASP: Place Part (Deprecated)
ASP: Processing Centres (Deprecated)
ASP: Status Types (Deprecated)
ASP: Street (Deprecated)
ASP: Street Part (Deprecated)
ASP: Street Type (Deprecated)
ASP: TLA Codes (Deprecated)
ASP: Unofficial Status (Deprecated)
Administration Area (Named) polygon (Hydro, 1:22k - 1:90k)
Administration Area (Named) polygon (Hydro, 1:90k - 1:350k)
Administration A

In [19]:
govt_data.columns

Index(['agency', 'description', 'formats', 'link', 'title', 'type'], dtype='object')

In [24]:
import itertools

In [27]:
# Flatten the per-dataset format lists into one long Series, then count each format.
pd.Series(list(itertools.chain.from_iterable(govt_data['formats']))).value_counts()

Other Geo       3291
API             3035
KML             2616
CSV             2573
PDF             2568
Spreadsheet      721
HTML             364
Other format      48
XML               27
DB                26
Datastreams        4
ASCII              1
dtype: int64
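
The same flattening can also be written with `Series.explode`, available in pandas 0.25 and later. This is a minimal sketch on an invented three-row frame (`toy` is not the scraped data), offered as an alternative rather than a correction.

```python
import pandas as pd

# Invented stand-in for govt_data: each row holds a list of formats.
toy = pd.DataFrame({
    "title": ["a", "b", "c"],
    "formats": [["API", "CSV"], ["CSV"], ["API", "CSV", "PDF"]],
})

# explode() gives every list element its own row, so value_counts()
# tallies how many datasets offer each individual format.
format_counts = toy["formats"].explode().value_counts()
print(format_counts)
```

Because a dataset can list several formats, these counts add up to more than the number of datasets.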

In [29]:
govt_data.formats.astype(str).value_counts()

['API', 'CSV', 'KML', 'Other Geo', 'PDF']                                                                1683
['API', 'Other Geo']                                                                                      381
['Spreadsheet']                                                                                           340
['API', 'KML', 'Other Geo', 'PDF']                                                                        332
['API', 'CSV', 'KML', 'Other Geo']                                                                        295
['API', 'CSV', 'Other Geo']                                                                               226
['KML', 'Other Geo', 'PDF']                                                                               152
['CSV', 'HTML']                                                                                           137
['PDF', 'Spreadsheet']                                                                                    115
['HTML', '
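
The `astype(str)` step is presumably there because lists are unhashable, so `value_counts` cannot count them directly. Another option is to join each list into a single string; sorting first means that the same formats in a different order count as one combination. A sketch on invented data:

```python
import pandas as pd

# Invented rows; the first two hold the same formats in a different order.
toy = pd.DataFrame({"formats": [["CSV", "API"], ["API", "CSV"], ["Spreadsheet"]]})

# Sort and join each list so every combination becomes one hashable label.
combo_counts = toy["formats"].map(lambda fmts: ", ".join(sorted(fmts))).value_counts()
print(combo_counts)
```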

In [30]:
def grepl(pattern, col):
    # Row-wise regex match, like R's grepl; na=False treats missing values as non-matches.
    return col.str.contains(pattern, na=False)
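
A quick check of how this helper behaves, on invented data. `Series.str.contains` treats the pattern as a regular expression by default, so `|` acts as alternation. Missing values come back as NaN unless `na=False` is passed, which matters when the result is used as a boolean mask.

```python
import pandas as pd


def grepl(pattern, col):
    # Redefined here so the snippet runs on its own;
    # na=False makes missing values count as non-matches.
    return col.str.contains(pattern, na=False)


# Invented titles, including one missing value.
titles = pd.Series([
    "Pasifika Education Statistics",
    "Road network",
    None,
    "Iwi boundaries",
])

mask = grepl("Maori|Iwi|Pasifika", titles)
print(titles[mask])
```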

In [34]:
govt_data.loc[grepl("Council", govt_data.agency), 'agency'].value_counts()

Canterbury Regional Council                      153
Wellington City Council                           57
Greater Wellington Regional Council               33
Northland Regional Council                        24
Porirua City Council                              19
Palmerston North City Council                     18
Wellington City Council                            6
Wanganui District Council                          6
Waikato Regional Council                           5
Christchurch City Council                          3
Arts Council of New Zealand Toi Aotearoa           1
Health Research Council of New Zealand             1
Standards Council                                  1
Auckland Council                                   1
The Education Council of Aotearoa New Zealand      1
Testing Laboratory Registration Council            1
Name: agency, dtype: int64
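
Wellington City Council shows up twice in this output (57 and 6), which suggests the two labels differ in a way the printed output hides, such as stray whitespace. That is a guess and has not been checked against the scraped data. A sketch of how one might inspect and normalise the labels, using invented values:

```python
import pandas as pd

# Invented labels; the second has a trailing space that print() would hide.
agency = pd.Series([
    "Wellington City Council",
    "Wellington City Council ",
    "Auckland Council",
])

# repr() makes invisible differences such as trailing spaces visible.
print([repr(name) for name in agency.unique()])

# Trim the ends and collapse internal runs of whitespace before counting.
cleaned = agency.str.strip().str.replace(r"\s+", " ", regex=True)
counts = cleaned.value_counts()
print(counts)
```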