Load our libraries

In [33]:
import json
from pathlib import Path
import mkwikidata # you may need to pip install mkwikidata

Set up our Wikidata query

In [34]:
query = """
SELECT ?typeLabel ?item ?itemLabel ?inception ?website ?twitter ?GSS ?WDTK WHERE {
  ?item wdt:P31/wdt:P279* wd:Q837766 . #Instance of or sub-class of Local Authority
  ?item wdt:P17 wd:Q145 . #in UK
  ?item wdt:P31 ?type . #get type
  MINUS {?item wdt:P576 ?abol .} #ignore abolished councils
  MINUS {?item wdt:P31 wd:Q640452 .} #ignore Area Committees
  OPTIONAL {?item wdt:P856 ?website .}
  OPTIONAL {?item wdt:P2002 ?twitter .}
  OPTIONAL {?item wdt:P836 ?GSS .}
  OPTIONAL {?item wdt:P8167 ?WDTK .}
  OPTIONAL {?item wdt:P571 ?inception .}
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

Execute the Wikidata query, loading the results into a dictionary, query_result

In [35]:
query_result = mkwikidata.run_query(query, params={})

Create a Python set, wd_set, and load it with the itemLabel values (the names of the councils)

In [36]:
wd_set = set()

for x in query_result["results"]["bindings"]:
    wd_set.add(x["itemLabel"]["value"])
    #print (x["itemLabel"]["value"])
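
For reference, the loop above relies on the standard SPARQL JSON results layout, in which each row of `results.bindings` maps a variable name to an object carrying a `value` key. Because the query asks for `?type` as well, a council with more than one type can come back as several rows, and the set quietly removes those repeats. A minimal sketch of the same extraction, using a hand-made `sample_result` (hypothetical data, not real query output) and skipping any row that lacks a label:

```python
# Hypothetical, hand-made stand-in for query_result; not real Wikidata output.
sample_result = {
    "results": {
        "bindings": [
            {"itemLabel": {"type": "literal", "value": "Example Borough Council"}},
            {"itemLabel": {"type": "literal", "value": "Sample Parish Council"}},
            # Same council again, as would happen if it had a second ?type
            {"itemLabel": {"type": "literal", "value": "Example Borough Council"}},
            # A row with no label at all (assumed possible, so it is skipped)
            {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q0"}},
        ]
    }
}

# Set comprehension: one entry per distinct label
sample_labels = {
    row["itemLabel"]["value"]
    for row in sample_result["results"]["bindings"]
    if "itemLabel" in row
}

print(len(sample_result["results"]["bindings"]), "rows,", len(sample_labels), "distinct labels")
```

This is why the size of wd_set counts distinct names rather than rows returned by the query.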

Set the path to the DC data and create a Python set to hold that data

In [37]:
path = Path.cwd().joinpath('data')
dc_council_set = set()


Read the JSON file into our set, but only if the council is current (i.e. not abolished)

In [38]:
with open(path / 'uk_local_authorities.json') as dc_file:
    data = json.load(dc_file)
    for item in data:
        # an empty end-date means the council still exists
        if item['end-date'] == "":
            dc_council_set.add(item['official-name'])
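
The filter above can be tried without the data file. The sketch below uses three made-up records with the same two keys the loop reads (`official-name` and `end-date`); the names and the date are hypothetical and do not come from the DC data.

```python
import json

# Hypothetical records shaped like the entries the loop above expects.
sample_json = """
[
  {"official-name": "Example Borough Council", "end-date": ""},
  {"official-name": "Former District Council", "end-date": "2019-03-31"},
  {"official-name": "Sample County Council", "end-date": ""}
]
"""

records = json.loads(sample_json)

# Keep only councils whose end-date is empty, i.e. still current
current_councils = {r["official-name"] for r in records if r["end-date"] == ""}

print(sorted(current_councils))
```

Only the two records with an empty `end-date` end up in the set, which mirrors how abolished councils are left out of dc_council_set.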

Check how many councils we have in the DC set

In [39]:
print(len(dc_council_set))


409


Check how many we have in our Wikidata set

In [40]:
print(len(wd_set))

2923


Using set difference, check how many councils are in DC's set but not in our Wikidata set (derived from the Wikidata query)

In [46]:
# sorted() returns a new, alphabetically ordered list (list.sort() returns None)
missing_list = sorted(dc_council_set - wd_set)
print("Councils in DC list but not in WD: ", len(missing_list))
print("======================================")
for council in missing_list:
    print(council)


Councils in DC list but not in WD:  72
Ards and North Down Borough Council
Armagh City, Banbridge and Craigavon Borough Council
Blackpool Borough Council
Bolton Metropolitan Borough Council
Borough Council of Kings Lynn and West Norfolk
Cambridgeshire and Peterborough Combined Authority
Causeway Coast and Glens Borough Council
City of Cardiff Council
City of Westminster
City of Wolverhampton Council
Derry City and Strabane District Council
Folkestone and Hythe District Council
Gateshead Metropolitan Borough Council
Greater London Authority
Greater Manchester Combined Authority
Kirklees Council
Lisburn and Castlereagh City Council
Liverpool City Region
London Borough of Barking and Dagenham
London Borough of Barnet
London Borough of Bexley
London Borough of Brent
London Borough of Bromley
London Borough of Camden
London Borough of Croydon
London Borough of Ealing
London Borough of Enfield
London Borough of Hackney
London Borough of Hammersmith & Fulham
London Borough of Haringey
London 

In [47]:
print("Councils in WD but not in DC list: ", len(wd_set - dc_council_set))

Councils in WD but not in DC list:  2586


The WD query results include parish councils, which the DC data does not contain.
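
The two differences computed above, plus the intersection, can be illustrated on toy sets. The names below are invented for the example. The last lines cross-check the figures printed earlier in this notebook: the number of names found in both sets should come out the same whether it is derived from the DC side or from the WD side.

```python
# Toy sets with made-up names, standing in for dc_council_set and wd_set.
dc_sample = {"Example Borough Council", "Sample County Council"}
wd_sample = {"Example Borough Council", "Sample Parish Council", "Other Parish Council"}

only_in_dc = dc_sample - wd_sample   # in DC but not in WD
only_in_wd = wd_sample - dc_sample   # in WD but not in DC
in_both = dc_sample & wd_sample      # exact name matches

print(sorted(only_in_dc))
print(sorted(only_in_wd))
print(sorted(in_both))

# Consistency check using the counts reported above:
# len(A & B) == len(A) - len(A - B) == len(B) - len(B - A)
matches_from_dc = 409 - 72
matches_from_wd = 2923 - 2586
print(matches_from_dc, matches_from_wd)
```

Both subtractions give the same number of exact matches, so the reported counts are consistent with each other. Since the comparison is on exact strings, a council whose name is written differently in the two sources would be counted as missing on both sides.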