<a href="https://colab.research.google.com/github/scarfboy/wetsuite-dev/blob/main/examples/dataset_gemeentes.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip --quiet install https://github.com/scarfboy/wetsuite-dev/archive/refs/heads/main.zip

In [3]:
import wetsuite.datasets
gem = wetsuite.datasets.load('gemeentes')
print( gem.description )


    This is largely the more interesting fields from https://organisaties.overheid.nl/export/Gemeenten.csv
    augmented with RDF-like data like that under https://standaarden.overheid.nl/owms/terms/Leiden_(gemeente)

    
    .data is a list of dicts, one per gemeente (currently 344 of them). Keys in that dict include:
    
    'Namen' - a list of name variants. 
       Usually just the short name, and a longer one with "Gemeente " in front
       Sometimes with alternative names, e.g. ["Den Bosch", "Gemeente 's-Hertogenbosch", "'s-Hertogenbosch"]
       We have used these as "Match one of these" to search for gemeentebeleid per gemeente
   
    Descriptions like 'Aantal inwoners', 'Oppervlakte'
    
    Organisational relations like 
      - 'Bevat plaatsen'
      - 'Overlaps with', mentioning Provinces, Waterschappen
      - 'Service area of' - things like GGD, Police, Social services  (each item is a list because we tend to have a full name and an abbreviation)
      - 'Predecesso

In [7]:
import pprint, random

pprint.pprint( random.choice( gem.data ) )  # show details for one random gemeente

{'Aantal inwoners': '31663',
 'Bevat plaatsen': ['Hendrik-Ido-Ambacht'],
 'CBSCode': '0531',
 'Namen': ['Hendrik-Ido-Ambacht', 'Gemeente Hendrik-Ido-Ambacht'],
 'OWMS URI': 'http://standaarden.overheid.nl/owms/terms/Hendrik-Ido-Ambacht_(gemeente)',
 'Oppervlakte': [12, 'km2'],
 'Organisatiecode': 'gm0531',
 'Overlaps with': [['Waterschap Hollandse Delta'], ['Zuid-Holland']],
 'Predecessors': [],
 'Raad': [['Gemeente Belangen', 4],
          ['VVD', 3],
          ['CDA', 2],
          ['D66', 2],
          ['Realistisch Ambacht', 2],
          ['PvdA', 2],
          ['E.V.A.', 1],
          ['SGP/Christenunie', 1]],
 'Service area of': [['480', 'GGD Zuid-Holland Zuid'],
                     ['Bureau Openbare Verlichting', 'Bureau OVL'],
                     ['Dienst Gezondheid & Jeugd Zuid-Holland Zuid'],
                     ['Drechtwerk', 'Sociale Werkvoorziening Drechtwerk'],
                     ['GEVUDO',
                      'Gemeenschappelijke Vuilverwerking Dordrecht en '
     

## Beleidsregels per gemeente

One reason for this not-really-a-dataset was just the names, so we can do things like look for beleidsregels per gemeente.0

The below example combines the names with a specific search into the KOOP repositories, one per municipality (see also the datacollect_koop_repos example for more introduction to the repositories).

In [27]:
import wetsuite.datacollect.koop_repositories
import wetsuite.helpers.etree as etree
import wetsuite.helpers.net



for gemeente_dict in gem.data[64:67]:  # the weird offset is tring to find Den Haag with its other name, to check that the search is not tripping over that
    query_gemeente_names = ' OR '.join( '(creator any "%s")'%naam  for naam in gemeente_dict['Namen'] )

    ## Construct a complex-looking query to mean:
    #   (match gemeente by one of its names)  AND (  mentions 'damocles'   OR   (mentions drugs or the opiumwet  AND  mentions words you likely see around putting people out of their house)
    # This is a practical consideration: we _will_ get too many results, but at least what we want is probably in there,  and filtering out can be easier than searching again 
    query = '(%s) AND ( (body any "damoclesbeleid damocles")  OR  (body any "drugs softdrugs harddrugs handelshoeveelheid opiumwet 13b") AND (body any "sluiting herstelsanctie bestuursdwang"))'%( 
        query_gemeente_names
    )
    #print( query )


    ## search and fetch only first page, just so that num_records is filled in to report
    cvdr = wetsuite.datacollect.koop_repositories.CVDR()
    cvdr.search_retrieve( query ) 
    print( "\n == %3d  hits for   %s == "%(cvdr.num_records(), ' / '.join(gemeente_dict['Namen'])) )

    ## search and fetch all, summarizing each record as we go
    def show_brief( record ): 
        ''' a brief summary of each search result.   
            Ignore how this code works for now, because we absolutely need to make this easier for you to do. '''
        #print( etree.tostring( gzd ).decode('u8') ) # for debug, seeing what's in that record
        gzd = record.find('recordData')[0]
        owmskern     = etree.kvelements_to_dict( gzd.find('originalData/meta/owmskern') )
        cvdripm      = etree.kvelements_to_dict( gzd.find('originalData/meta/cvdripm')  )
        enrichedData = etree.kvelements_to_dict( gzd.find('enrichedData')               )
        print( "  %15s  %10s..%-10s  %s"%( owmskern.get('identifier'),  cvdripm.get('inwerkingtredingDatum'),  cvdripm.get('uitwerkingtredingDatum',''),  owmskern.get('title')) )
        #print('    URL: %s'%enrichedData.get('publicatieurl_xml') ) 
        # 'publicatieurl_xml' points to text in structured XML.  There is also 'publicatieurl_xhtml' (more formatted),  and 'preferred_url' (webpage version that lokaleregelgeving.overheid.nl would also send you to)

    cvdr.search_retrieve_many( query, callback=show_brief ) # all results, and show titles



 ==  30  hits for   Delft / Gemeente Delft == 
     CVDR647015_1  2020-12-03..2022-09-21  Beleidsregel van de burgemeester van de gemeente Delft houdende regels omtrent bestuurlijke handhaving met betrekking tot de Opiumwet (Beleidsregel bestuurlijke handhaving artikel 13b Opiumwet, Delft 2020)
     CVDR681430_1  2022-09-21..            Beleidsregel bestuurlijke handhaving artikel 13b Opiumwet, Delft 2022
     CVDR663040_1  2021-10-16..            Coffeeshopbeleid Delft 2021 (met handhavingsarrangement)
     CVDR681180_1  2022-09-13..            Handhavingsarrangement mensenhandel voor Delft
         185002_1  2012-05-10..2012-05-10  Algemene Plaatselijke Verordening voor Delft
      CVDR56373_1  2010-01-01..2011-01-01  Algemene plaatselijke verordening voor Delft
          83446_1  2011-01-01..2012-04-25  Algemene plaatselijke verordening voor Delft
      CVDR83446_2  2012-01-01..2012-05-10  Algemene plaatselijke verordening voor Delft
    CVDR185002_19  2022-02-16..2022-10-27  Algem